Moderator
|
Note that noalias and lazy are only useful for products, so here you should try removing them, as they might complicate the compiler's job. Here the results are consistent:
         Eigen2   Eigen3
novec    1        1
vec      0.64     0.59 |
Registered Member
|
Yeah, that's what I expected, but then I saw a difference in the timings and assumed it might be more complicated. I re-ran them, and indeed noalias makes no difference to gcc. However, for MSC:
gcc 4.4.3 + eigen2 is still faster than the custom vectorization. The results above are using -O3 and a few other flags, but I tried with plain -O2, and it's just 0.03 seconds slower, so that's not a big issue. I'm assuming that represents working as intended: it's consistently a bit faster than v2's vectorization. However, MSC isn't happy; I think inlining isn't working as expected (if it were, the profiler output shouldn't include any of the Eigen internals). |
Registered Member
|
It's indeed the inlining.
In Assign.h, I replaced the lines near line 407:
with:
And MSC now performs as expected: (benchmark version without lazy/noalias)
gcc+eigen2(novector) is still abnormally quick; it must be auto-vectorizing by itself. |
Registered Member
|
Great work!
We have a macro, EIGEN_STRONG_INLINE, exactly for that (see Macros.h). Why don't you send us a patch? That way you get credited in the hg history. http://eigen.tuxfamily.org/index.php?ti ... ng_a_patch
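For readers unfamiliar with the idea, here is a minimal sketch of what a "strong inline" macro can look like. The name MY_STRONG_INLINE and both functions are hypothetical and made up for illustration; this is not a copy of Eigen's definition in Macros.h, just the general pattern of asking MSC to inline small per-coefficient functions that it would otherwise leave as calls.

```cpp
#include <cstddef>

// Hypothetical macro (NOT Eigen's actual definition): request forced inlining
// where the compiler offers a keyword or attribute for it.
#if defined(_MSC_VER)
  #define MY_STRONG_INLINE __forceinline
#elif defined(__GNUC__)
  #define MY_STRONG_INLINE inline __attribute__((always_inline))
#else
  #define MY_STRONG_INLINE inline
#endif

// Hypothetical per-coefficient kernel: tiny, called once per element, so a
// real function call here would dominate the cost of the loop.
MY_STRONG_INLINE float add_coeff(float a, float b) {
    return a + b;
}

// Hypothetical outer loop that relies on add_coeff being inlined.
inline void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = add_coeff(a[i], b[i]);
    }
}
```

If the kernel is inlined, a profiler should attribute the time to the outer loop rather than to the small helper, which matches the symptom described above.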
Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list! |
Registered Member
|
Try -O2 then. According to the gcc man page, -ftree-vectorize is only enabled at -O3. And you can use -ftree-vectorizer-verbose=n to get info.
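A minimal sketch of how one might check this, assuming a gcc of the 4.x era as used in this thread. The function scaled_add and the file name in the comments are hypothetical; the point is only that a simple loop like this is a typical candidate for gcc's auto-vectorizer, so comparing -O2 against -O3 should show whether the "novector" timings come from the compiler vectorizing on its own.

```cpp
#include <cstddef>

// Possible compile commands (file name is made up for illustration):
//   g++ -O2 -c loop.cpp                               // tree vectorizer off
//   g++ -O3 -ftree-vectorizer-verbose=2 -c loop.cpp   // vectorizer on, with a report
// Newer gcc releases replaced the verbose flag with -fopt-info-vec.

// Hypothetical kernel: out[i] = alpha * x[i] + y[i].
// __restrict (a compiler extension) promises the arrays do not overlap,
// which makes it easier for the compiler to vectorize the loop.
void scaled_add(const float* __restrict x,
                const float* __restrict y,
                float* __restrict out,
                float alpha,
                std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = alpha * x[i] + y[i];
    }
}
```

If the -O2 build of the benchmark is clearly slower than the -O3 build with vectorization disabled in Eigen, that would support the auto-vectorization explanation.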
|
Registered Member
|
I think I'm having trouble with the eigen mailing list daemon. Can someone verify whether or not they've received two messages on the mailing list? One concerns a minor patch to BenchTimer, and the second concerns another regression from Eigen2-->Eigen3 in code like this:
Patches were attached to both. |