Registered Member
|
I took a closer look at the library, and I appreciate many of its design decisions.
However, before I can make serious use of Eigen in our projects, some questions remain.

1. Is the memory layout of plain objects guaranteed to be contiguous? In a large project there can be multiple libraries working with each other. The Map class provides a feasible way to let Eigen work on external storage, but sometimes I may also have other routines operate on Eigen objects, and in those cases I need to understand the memory layout of Eigen objects more clearly. I notice that the PlainObjectBase class has a method called "data", which, I guess, returns the base address of the data contained in a plain object such as Matrix, Array, etc. The question is: is the memory in a matrix/array contiguous, such that one can use "a.data()" and "a.data() + a.size()" as the begin and end iterators when traversing the elements contained therein?

2. Is there any plan to incorporate r-value references and move semantics in Eigen? I did a quick check of several classes in the library code and found that r-value references are not used. For things such as matrices and arrays, implementing move semantics when passing r-values can sometimes lead to a substantial performance boost by reducing unnecessary copying. I understand that return-value copying is often eliminated by the compiler via RVO and NRVO. However, there are still cases where RVO isn't triggered and move semantics can play an important role. C++0x is still very new and not supported on many platforms, but I think it would still be good to let the user turn it on (say, via preprocessor macros) when the compiler supports it. For the current version, are there any facilities (helper classes, etc.) provided in Eigen that could help with this problem?

3. Support for elementary function evaluation? From the documentation, the library seems to support only a limited subset of elementary functions on arrays. Some widely used functions, such as floor, ceil, round, etc., are still not available. I'd like to see them available as soon as possible, even if the underlying implementation is just a plain for-loop. In addition, a mapping function on arrays that behaves like std::transform (performing element-wise calculation) would be very convenient in many cases. Maybe such a thing is already in the library and I am just missing it ...

4. BLAS/LAPACK back-end? I did several simple benchmarks on an Intel i7 box. The performance of Eigen is very good with "-O3 -msse4.2". What I really like is that when the matrix is small, it does not incur much overhead, as MKL does. That being said, I found that the performance of Eigen on BLAS level-3 operations is still not as good as MKL or GotoBLAS when the matrix is large (e.g. 2000 x 2000): about 30%-40% slower (I manually set #threads to 1 when invoking the external BLAS libs). It therefore makes sense to provide a BLAS (and probably LAPACK) back-end for users who need high-performance computation on large matrices. This could be tricky, though, as Eigen outperforms MKL when matrices are not very large, and the switching point seems to be machine-dependent. I am considering a temporary solution: writing an sgemm/dgemm function on an Eigen matrix that invokes an external BLAS routine, and using it in the cases where the matrix size is large. But I would really like to see this provided by Eigen. Thanks a lot. |
Moderator
|
1. Yes, the memory layout of a Matrix<> object is *always* contiguous. .data() gives you the address of the first element, and .outerStride() the number of elements between two columns (or rows), which is always equal to either rows() or cols() (depending on the storage order, of course; the default is column-major). data() and data()+size() are valid begin/end iterators.
2. Yes, someone already did some experiments, but this can only be optional because we aim at supporting today's compilers on many platforms. Also note that in Eigen we never return matrix objects, but rather lightweight objects called expressions, so as far as Eigen's own functions are concerned, r-value references and move semantics won't bring any performance gain.

3. Adding floor, ceil and the like is pretty trivial. For transform, you can do: result = input.unaryOp(functor_object); result = A.binaryOp(B, functor_object); This is more powerful than std::transform because our approach returns a lightweight expression which can be combined with others, and it also supports vectorization if the functor does.

4. Yes, on the Intel i7 we are not so good. I don't know why, because on the previous CPU generation we are on par with MKL and GotoBLAS... And yes, there is a plan to add optional BLAS/LAPACK backends; they should be available within 2-3 months. Also note that Eigen supports multithreading: -fopenmp... |
Registered Member
|
Thanks for the reply. For questions 2 and 3 in this thread, I still have two additional questions.
For Q2, I was actually considering something like
What would you suggest for writing such a thing? Due to the branching, RVO is probably not triggered even when optimization is turned on, and the unnecessary copying is undesirable, especially when X.rows() is large. For Q3, when using the functor as follows: result = input.unaryOp(my_floor_functor) Is there any requirement on the functor? Or can I just write something like
In addition, if I want to wrap it in a more friendly inline function as
What should I use here for "[SomeExpressionClass]" so that I can return an expression instead of an evaluated array? |
Registered Member
|
In terms of element-wise evaluation, the compiler (gcc 4.6.0) said that ArrayXd does not have a member called "unaryOp" ...
But I traced into the Eigen code base, got some idea of how these pieces are put together, and made my own as follows
It seems to work in a simple test. But since I only have a superficial idea of how Eigen works from briefly looking at the code, I am not sure whether this approach is efficient. By efficiency, here I am not referring to an SIMD implementation of floor (to make an SIMD impl, I would also have to define things like packetOp, predux, and a specialized version of functor_traits). At this point, what I want to know is whether the construction of the CwiseUnaryOp instance makes a deep copy of a, or just holds a reference (or pointer) to a. Thanks. |
Moderator
|
Q2: what about:

MatrixXf res;
if(...) res.noalias() = ...;
else res.noalias() = ...;
return res;

Q3: sorry, it is called unaryExpr: http://eigen.tuxfamily.org/dox/classEig ... b0525ddd65 and your implementation of floor is correct: no immediate evaluation nor useless copies. |
Registered Member
|
I think this has come up before, but would you consider adding .begin() and .end() methods to the PlainObjectBase template? I know that in my project I could save some code duplication by calling existing code, except that it requires a .begin() method. I also find myself writing data() + size() frequently enough that I would like to condense it to end(). In any case, I will try using the plugin mechanism to extend the PlainObjectBase class so that I can get experience with plugins. |
Moderator
|
There is something better: http://eigen.tuxfamily.org/bz/show_bug.cgi?id=231
|
Registered Member
|
That someone was me. The patch which enables move support can be found here: http://eigen.tuxfamily.org/bz/show_bug.cgi?id=266 Gael is of course right that you get no performance gain when you use just Eigen. But there are a few cases where move semantics do help, especially when you start extending Eigen without falling back on its internal trickery. Feel free to give the patch a try, but I am not even sure whether it still applies to the default branch. - Hauke |