Registered Member
|
Hello,
I need to do a lot of multiplications between real and complex matrices. The simple multiplication:
is two times slower (I guess it's doing a complex/complex multiplication) than if I introduce a temporary:
while the following code doesn't compile:
Did I miss some efficient way to do it with Eigen, or do I need to use the version with the temporary matrix? I'm using Eigen 2.0.11.
Thanks for your help!
Regards, Philippe |
Moderator
|
Actually, in the 2.0 branch, matrix products mixing reals and complexes do perform real * complex multiplications. The slowdown you observed is because you lost the vectorization.
Also, I confirm that with the 2.0 branch, .real() and .imag() are read-only. With the devel branch, such hybrid real-complex products are not better yet (they are even worse for large products, because in that case it really does complex-complex multiplications). I plan to make them as fast as possible when we vectorize complexes (probably for Eigen 3.1). On the other hand, the devel branch allows you to write:
res.real() = matRe * matCmplx.real();
res.imag() = matRe * matCmplx.imag(); |
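To make the cost difference concrete, here is a minimal library-free sketch of the two strategies discussed above. It does not use Eigen: the `Mat` struct and the functions `productPromoted` and `productSplit` are hypothetical names invented for this illustration, and the loops are deliberately naive. The promoted version nominally spends four real multiplications per inner step, while the split version spends two.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using Cplx = std::complex<double>;

// Minimal row-major dense matrix, for illustration only (not Eigen's storage).
template <typename T>
struct Mat {
    std::size_t rows, cols;
    std::vector<T> data;
    Mat(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, T()) {}
    T& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    const T& operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Promoted version: each real entry is first turned into a complex number,
// so every inner step is a full complex * complex multiplication.
Mat<Cplx> productPromoted(const Mat<double>& a, const Mat<Cplx>& b) {
    assert(a.cols == b.rows);
    Mat<Cplx> res(a.rows, b.cols);
    for (std::size_t i = 0; i < a.rows; ++i)
        for (std::size_t k = 0; k < a.cols; ++k) {
            const Cplx aik(a(i, k), 0.0);
            for (std::size_t j = 0; j < b.cols; ++j)
                res(i, j) += aik * b(k, j);
        }
    return res;
}

// Split version: the real factor scales the real and imaginary parts
// separately, i.e. two real multiplications per inner step.
Mat<Cplx> productSplit(const Mat<double>& a, const Mat<Cplx>& b) {
    assert(a.cols == b.rows);
    Mat<Cplx> res(a.rows, b.cols);
    for (std::size_t i = 0; i < a.rows; ++i)
        for (std::size_t k = 0; k < a.cols; ++k) {
            const double aik = a(i, k);
            for (std::size_t j = 0; j < b.cols; ++j)
                res(i, j) += Cplx(aik * b(k, j).real(), aik * b(k, j).imag());
        }
    return res;
}
```

Both functions return the same matrix; only the amount of arithmetic differs. Whether that translates into a measurable speedup depends on vectorization and matrix size, as the rest of the thread shows.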
Registered Member
|
Oops, I forgot part of the compiler flags when I tried my test cases, sorry. Thanks a lot for your answer.
Well, actually I'm a little bit confused now. I'm using gcc-4.3.4. If I use "-O2 -march=core2", the slowdown stays at around a factor of two, and "-O3" doesn't make any real difference. To get a real improvement I need to pass the "-funroll-loops" flag; then I get a difference of about 30%. Did I hit a special case, or should I always use "-funroll-loops" with Eigen? If so, it might be useful to mention it on the tutorial page. Regards, Philippe |
Registered Member
|
Normally, Eigen unrolls loops by itself so -funroll-loops shouldn't give a big performance boost. What matrix size are you using? Can you improve Eigen performance without passing -funroll-loops by defining EIGEN_UNROLLING_LIMIT to a higher value before you include Eigen? (First print this value to see its default value, or grep it in Eigen's sources).
Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list! |
Registered Member
|
The matrices are 100x100:
without -funroll-loops
  real x complex: 0.0026 s
  split product: 0.0014 s
with -funroll-loops
  real x complex: 0.0019 s
  split product: 0.0015 s

I tried now with 1000x1000:
without -funroll-loops
  real x complex: 2.72 s
  split product: 1.24 s
with -funroll-loops
  real x complex: 2.16 s
  split product: 1.17 s

Changing EIGEN_UNROLLING_LIMIT didn't make it better (tried 110, 150, 200, 500, 1000). |
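The thread does not show how these timings were taken. For anyone wanting to repeat the comparison, a small timing helper along the following lines may be enough. This is only a sketch: `bestOfSeconds` is a hypothetical name, it relies on nothing but the C++11 standard library, and reporting the fastest of several runs is one common way to reduce noise, not necessarily what was done here.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <limits>

// Runs `work` exactly `repeats` times and returns the fastest
// wall-clock duration of a single run, in seconds.
template <typename F>
double bestOfSeconds(F&& work, int repeats) {
    assert(repeats > 0);
    double best = std::numeric_limits<double>::infinity();
    for (int r = 0; r < repeats; ++r) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        const double elapsed = std::chrono::duration<double>(stop - start).count();
        best = std::min(best, elapsed);
    }
    return best;
}
```

In practice `work` would be a lambda wrapping one of the two product variants. The result matrix should be stored somewhere the compiler cannot discard, and the same compiler flags must be used for both variants, otherwise the comparison is meaningless.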
Registered Member
|
Oh OK, really big matrices. The only explanation is that -funroll-loops performs some partial unrolling, which Eigen does not do itself. That's on our TODO; we know it would be useful, and hopefully it'll eventually get done...
|
Registered Member
|
Oh, I see. It also seems that small matrices (up to ~20x20) are faster with the direct product than with the split version.
Thanks a lot for your help, and thanks for that impressive library. |