Registered Member
|
In my program, I have a very big C-Array of size nstates * ncat * npattern ( nstates is usually 4 or 20, ncat is 4, npattern can be as big as 10K ) and a small array of size nstates * nstates.
What I have to do are pretty simple matrix operations, such as take a segment of nstates elements from the big array and multiply it with the matrix (nstates * nstates) I tried to implement it in EIGEN and then compare with a naive implementation. Here is a snippet of the code:
As you might easily see, the code does exactly what Ive written above: take small vector segments (4 elements) and use them for computation one after another. This works fine and give me 10% speed up by using EIGEN. Through Benchmarking I noticed that EIGEN does not work so well with small matrix/vector. So I think it might be better to somehow do the computation in a whole instead of extracting small vector like I did in the code above. Here is the new code, which does exactly the same, the only difference is that it extracts a very big segment (nstates*npattern element) to do the computation (I have to use Stride for this), the outer for loop therefore also disappears, which I thought would help since this loops 10K+ times.
To my big surprise, this code runs even slower than a naive implementation. Could it be that the matrix multiplication between (4x4) and (4x10K) are not so efficiently implemented ? bjacob, can you help me ? THANKS ! |
Registered Member
|
I think , in the second code, I create a dynamic vector lh_allPtn of very big size (10K elements) to store the results in the function, which causes malloc. The function is called thousands time during the run, that might be the reason for bad performance.
I will try to use a class variable for that vector, which is only malloced once at the beginning. |
Registered Member
|
Yes, that's a good direction to look into.
Your original post would take me too much time to look into; if you need help, please isolate a smaller portion of code and ask a more precise question PS. Eigen does very well on small matrices (like 4x4) provided that you tell it at compile time about these (template parameters).
Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list! |
Registered users: Bing [Bot], blue_bullet, Google [Bot], rockscient, Yahoo [Bot]