Registered Member
|
I am getting unusually slow performance with gcc 4.8.2 compared to clang 3.4.2 for the following piece of code:
Compile flags: -O3 -DNDEBUG -march=native

Clang 3.4.2 runtime: real 0m0.003s, user 0m0.003s, sys 0m0.000s
Gcc 4.8.2 runtime: real 0m2.247s, user 0m2.248s, sys 0m0.000s

Any insight into why there is such a big difference?
|
Moderator
|
This has nothing to do with Eigen; it's simply that in your example the compiler can aggressively remove all the loops. For instance, you can remove the inner loop as follows:
and this version is obviously 2000x faster! gcc 4.8 does not perform this optimization. Of course, you can go even further and remove all the loops, but apparently gcc 4.9 does not go that far and only removes the innermost one. Clang is able to go that far and computes 'acc' at compile time, leading to the following generated code, which is essentially a no-op:
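To illustrate the kind of transformation discussed here at the source level, below is a minimal hypothetical sketch. It is neither the code from the question nor compiler output; the function names, the loop bounds `kOuter` and `kInner`, and the accumulated expression are all assumptions chosen for illustration. It shows a nested accumulation, the same computation with the inner loop removed by hand, and the fully folded closed form that a compiler could in principle evaluate at compile time.

```cpp
#include <cstdint>

// Hypothetical loop bounds, chosen only for this illustration.
constexpr std::uint64_t kOuter = 1000;
constexpr std::uint64_t kInner = 2000;

// Naive version: the inner loop adds the loop-invariant value i, kInner times.
std::uint64_t acc_nested() {
    std::uint64_t acc = 0;
    for (std::uint64_t i = 0; i < kOuter; ++i)
        for (std::uint64_t j = 0; j < kInner; ++j)
            acc += i;
    return acc;
}

// Inner loop removed by hand: adding i kInner times equals adding kInner * i once.
std::uint64_t acc_inner_removed() {
    std::uint64_t acc = 0;
    for (std::uint64_t i = 0; i < kOuter; ++i)
        acc += kInner * i;
    return acc;
}

// All loops removed: sum of i for i in [0, kOuter) is kOuter * (kOuter - 1) / 2.
constexpr std::uint64_t acc_closed_form() {
    return kInner * (kOuter * (kOuter - 1) / 2);
}

// Returns true if the three variants compute the same value.
bool all_variants_agree() {
    return acc_nested() == acc_inner_removed() &&
           acc_inner_removed() == acc_closed_form();
}
```

Whether a given compiler performs either simplification on its own depends on its version and flags, so timing the first two functions under gcc and clang is one way to see which loops were actually eliminated.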
|