
Array multiplication (cwiseProduct product) very slow ?

lamtung
Hi,

I'm using Eigen 3 and was very impressed with its performance in vector dot products. But now I'm facing a performance problem with a simple array multiplication; hopefully it is because I did something wrong ...

Here is my test code
Code:
double* a = ei_aligned_new<double> (NCOLS * NROWS);
double* b = ei_aligned_new<double> (NCOLS * NROWS);
double* c = ei_aligned_new<double> (NROWS);
Map<RowMajorMatrixXd, Aligned> ei_a (a, NROWS, NCOLS);
Map<RowMajorMatrixXd, Aligned> ei_b (b, NROWS, NCOLS);
RowVectorXd ei_d;

// This is the code that is used in the benchmark
ei_d = (ei_a.cwiseProduct(ei_b)).rowwise().sum();

The simple line of code above takes about 30% more time to finish than a completely naive implementation. NCOLS is 4 and NROWS is 2000.
What did I do wrong ?
Thanks !
bjacob
The reasons why it's slow are:

1. There are only 4 columns, but you declared this dimension as dynamic, so Eigen is optimizing for larger sizes than that. The only way that such a small dimension can be efficient is by declaring it at compile time. So instead of RowMajorMatrixXd, use:

Matrix<double,Dynamic,4,RowMajor>

This can easily explain why it'd be slower than a naive implementation using this compile-time information.

2. You do:
Code:
RowVectorXd ei_d;

// This is the code that is used in the benchmark
ei_d = (ei_a.cwiseProduct(ei_b)).rowwise().sum();

Why did you declare this ei_d as a row-vector? Being the vector of sums in each row, it's naturally a column-vector. I'm not saying that this is wrong or slower (it's not), but this suggests that your code is not doing what you had in mind??

3. You're taking the sum inside each row, but each row is of length 4, which is too small to take advantage of SSE for computing the sum. Eigen should then realize that it's a better idea to take packets in the other direction, but the rowwise() operations are not currently smart enough for that (afaiu). So that part is Eigen's fault. You can however get Eigen to compile it right by using a matrix product:

Code:
ei_d = ei_a.cwiseProduct(ei_b).lazyProduct(Vector4d::Ones());


Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list!
lamtung
Thanks so much, things work now as expected !!!

1. Yes, it was my mistake ! I didn't quite understand what is meant in the tutorial by "larger size", is it the dimension of the matrix or the number of elements it contains, I assumed the latter.

2. In my code there is also a part where the vector is multiplied with another matrix, so I just set every vector to be a row vector. And you are right, it should be a column vector !

3. Thanks for the trick, I definitely will need it in the future !
bjacob
lamtung wrote: 1. Yes, it was my mistake ! I didn't quite understand what is meant in the tutorial by "larger size", is it the dimension of the matrix or the number of elements it contains, I assumed the latter.


It depends on the context, on what you're doing. For row-wise operations, what matters is the number of columns.

