Eigen in parallel threads

mpmcl
Mon Sep 21, 2009 7:38 pm
I have been benchmarking execution times for threads in Mac OS X 10.5.8 (Leopard). Language is Objective-C++. Compiler is gcc 4.0.1.

Typically, parallel threads provide speed improvement proportional to the number of cores, at least up to four, but this is NOT TRUE when the computation involves Eigen 2.0.5. Here is such a computation:

// Matrices are row-major. Data arrays (g, w) are fixed globals.
-(void)main
{
    Matrix3d GtWGi;
    const int NREPEATS = 10000;

    for (int j = 0; j < NREPEATS; j++) {
        MatrixXd G = Map<MatrixXd>(g, 12, 3);
        MatrixXd W = Map<MatrixXd>(w, 12, 12);
        GtWGi = (G.transpose()*W*G).inverse();
    }
}

And here are two benchmarks, using 20 replicates, with two CPUs (at 2 GHz):

Execution time (ms) with 1 thread --> mean = 345, sigma = 6.8
Execution time (ms) with 2 threads --> mean = 463, sigma = 8.5 [SLOWER!!]

I am guessing that the culprit is dynamic allocation via malloc() which, to be threadsafe, is serialized.

Have you experienced anything similar and, most especially, is there some fix for this compatible with Eigen? I would very much like to use the latter but speed is important for my application.

Thanks.
bjacob
Re: Eigen in parallel threads

Mon Sep 21, 2009 7:53 pm
mpmcl wrote: Compiler is gcc 4.0.1.

Oops, this compiler gives very poor results with Eigen. Try to upgrade to at least GCC 4.2, which is available for the Mac.

mpmcl wrote: I am guessing that the culprit is dynamic allocation via malloc() which, to be threadsafe, is serialized.

Yes, I agree that is probably the explanation. If that hypothesis is true, then your code spends a lot of time waiting for malloc() to return.

The quickest way to check whether that is the problem is to replace MatrixXd with fixed-size matrix types, which are guaranteed not to cause mallocs:

Code:
// Matrices are row-major. Data arrays (g, w) are fixed globals.
-(void)main
{
    Matrix3d GtWGi;
    const int NREPEATS = 10000;

    for (int j = 0; j < NREPEATS; j++) {
        // Fixed-size matrices store their coefficients in place,
        // so these copies do not call malloc().
        Matrix<double,12,3> G = Map<Matrix<double,12,3> >(g);
        Matrix<double,12,12> W = Map<Matrix<double,12,12> >(w);
        GtWGi = (G.transpose()*W*G).inverse();
    }
}


And see if that improves performance.

It seems that you know the value "3" at compile time, but perhaps not the 12? In that case, at least declare G as Map<Matrix<double,Dynamic,3> >(g,12,3);
That alone will avoid most of the mallocs.
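
The malloc() hypothesis can also be checked without Eigen. What follows is only a minimal sketch, and it rests on several assumptions: it uses C++11 std::thread (which needs a much newer compiler than gcc 4.0.1), the names work_heap, work_stack and time_threads are made up for illustration, and the 36-double buffer merely stands in for a 12x3 temporary. One worker allocates its buffer on the heap at every iteration, the other keeps it on the stack; both compute the same checksum.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a per-iteration temporary: 36 doubles,
// the size of a 12x3 matrix.
const std::size_t kElems = 36;

// Heap version: allocates and frees a buffer on every iteration,
// as a dynamic-size matrix temporary would.
double work_heap(int iterations) {
    double sum = 0.0;
    for (int j = 0; j < iterations; ++j) {
        std::vector<double> buf(kElems, 1.0);  // heap allocation each pass
        buf[j % kElems] += j;
        for (std::size_t k = 0; k < kElems; ++k) sum += buf[k];
    }
    return sum;
}

// Stack version: same arithmetic, but the buffer has a compile-time size
// and never touches the allocator.
double work_stack(int iterations) {
    double sum = 0.0;
    for (int j = 0; j < iterations; ++j) {
        std::array<double, kElems> buf;
        buf.fill(1.0);
        buf[j % kElems] += j;
        for (std::size_t k = 0; k < kElems; ++k) sum += buf[k];
    }
    return sum;
}

// Runs `work` on `nthreads` threads at once and returns the elapsed
// wall-clock time in milliseconds.
double time_threads(int nthreads, double (*work)(int), int iterations) {
    assert(nthreads > 0);
    std::vector<double> results(nthreads, 0.0);  // keeps the work observable
    auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> pool;
    for (int t = 0; t < nthreads; ++t)
        pool.emplace_back([&results, work, iterations, t] {
            results[t] = work(iterations);
        });
    for (auto& th : pool) th.join();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Compare time_threads(1, work_heap, n) with time_threads(2, work_heap, n), then do the same with work_stack. If the heap version gets slower with the second thread while the stack version stays roughly flat, that supports the serialized-malloc explanation. Be aware that an aggressive optimizer may remove the heap allocation entirely, so check the result at the optimization level you actually use.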

