Registered Member
Without many words, here's the code:
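(The listing below is a minimal sketch of the kind of benchmark described in this post; the matrix sizes, the 3-row dimensionality, and the nested column loop are assumptions reconstructed from the replies further down, not the original code.)

    // Reconstruction sketch: sum of squared distances between every pair of
    // columns of two dynamic-size matrices that have only a few rows each.
    #include <Eigen/Dense>
    #include <iostream>

    int main()
    {
        using namespace Eigen;
        const int dim = 3;       // number of rows, only known at run time
        const int n   = 10000;   // column counts are assumptions
        MatrixXd A = MatrixXd::Random(dim, n);
        MatrixXd B = MatrixXd::Random(dim, n);

        double dAB = 0;
        for (int i = 0; i < A.cols(); ++i)
            for (int j = 0; j < B.cols(); ++j)
                dAB += (A.col(i) - B.col(j)).squaredNorm();

        std::cout << dAB << std::endl;
        return 0;
    }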
Simply compiling with g++ -lrt -DNDEBUG -O3 test.cpp takes about 2.3 seconds, but if I compile with "-msse2" it takes about 5 seconds, so it's more than twice as slow. I'm guessing this has something to do with the low dimensionality of the column vectors, so that using SSE2 introduces too much overhead? How can I make Eigen detect that and not use SSE2 in this case? Unfortunately, I cannot use fixed-size matrices here, since the dimensionality isn't known at compile time.
Moderator
Here I get 4s versus 4.5s, which is not that significant. Is the number of rows unknown at compile time? Is 3 just one example among many possibilities? I'm asking because you could use the Matrix3Xd type, which would lead to much better performance. The following version seems to be faster too:
    for(int i=0;i<A.cols();++i)
        dAB += (A.col(i).rowwise().replicate(B.cols()) - B).squaredNorm();

If the number of rows is always small, an even faster solution would be to use row-major matrices. In the future, the replicate-based code above should vectorize such cases well. In the meantime, you can help Eigen vectorize by writing:

    for(int i=0;i<A.cols();++i)
        for(int k=0;k<A.rows();++k)
            dAB += (A(k,i) - B.row(k).array()).square().sum();

where B has to be a row-major matrix, otherwise this should be slower.
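(A self-contained sketch that puts both suggestions into one compilable file; the matrix sizes are assumptions, and both loops accumulate the same total as a brute-force pairwise sum.)

    #include <Eigen/Dense>
    #include <iostream>

    int main()
    {
        using namespace Eigen;
        MatrixXd A = MatrixXd::Random(3, 10000);
        MatrixXd B = MatrixXd::Random(3, 10000);

        // 1) replicate A.col(i) to B's shape, subtract, take the squared norm
        double dAB1 = 0;
        for (int i = 0; i < A.cols(); ++i)
            dAB1 += (A.col(i).rowwise().replicate(B.cols()) - B).squaredNorm();

        // 2) a row-major copy of B makes each B.row(k) contiguous, so the
        //    inner expression can be vectorized
        Matrix<double, Dynamic, Dynamic, RowMajor> Brm = B;
        double dAB2 = 0;
        for (int i = 0; i < A.cols(); ++i)
            for (int k = 0; k < A.rows(); ++k)
                dAB2 += (A(k, i) - Brm.row(k).array()).square().sum();

        std::cout << dAB1 << " " << dAB2 << std::endl;  // same value
        return 0;
    }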
Registered Member
Thanks again, Gael! Unfortunately, I do not know the dimensions at compile time; it could be something other than 3. Also, my snippet was just a very reduced test case. I have to go through the column vectors one by one, since for each one I have to do calculations that depend on the distances computed for the previous vectors, so I cannot do it all in one sweep.
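(A sketch of how each column could still be handled with one vectorized expression even when the columns must be visited in order; the dependent per-column calculation here is only a placeholder, since the real one isn't shown in this thread.)

    #include <Eigen/Dense>

    int main()
    {
        using namespace Eigen;
        MatrixXd A = MatrixXd::Random(3, 1000);
        MatrixXd B = MatrixXd::Random(3, 1000);

        double carry = 0;  // hypothetical state carried over from earlier columns
        for (int i = 0; i < A.cols(); ++i)
        {
            // squared distances from A.col(i) to every column of B, as a row vector
            RowVectorXd d = (A.col(i).rowwise().replicate(B.cols()) - B)
                                .colwise().squaredNorm();
            // placeholder for the calculation that depends on earlier columns
            carry += (d.array() + carry).minCoeff();
        }
        return 0;
    }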