Eigen code in C++ 10-20 times slower than matlab • KDE Community Forums

parrimin (Thu Sep 27, 2012 9:16 am):

Hi, I am trying to port a Matlab prototype to a C++ program. I chose Eigen for the calculations, but the C++ version turns out to be about 10 times slower in the best case. Let me explain the situation:

- Matrix a, unsigned chars, 2782 rows x 128 cols.
- Matrix b, unsigned chars, 4000 rows x 128 cols.

For each row (vector) in a, I need to find the vector in b at minimum distance. Here is the Matlab code, which runs in 0.05 seconds:

```matlab
aa=sum(a.*a,2); bb=sum(b.*b,2); ab=a*b';
d = sqrt(abs(repmat(aa,[1 size(bb,1)]) + repmat(bb',[size(aa,1) 1]) - 2*ab));
[minz index]=min(d,[],2);
```

And here is the C++ code. I tried the same code using ints, floats and doubles:

```cpp
MatrixXf a(a_size, descrSize);
MatrixXf b(b_size, descrSize);
MatrixXf ab(a_size, b_size);

// Copy the unsigned char descriptors into float matrices.
const unsigned char* dataPtr = matrixa;
for (int i=0; i<a_size; ++i)
{
    for (int j=0; j<descrSize; ++j)
    {
        a(i,j)=(float)*dataPtr++;
    }
}
const unsigned char* vocPtr = matrixb;
for (int i=0; i<b_size; ++i)
{
    for (int j=0; j<descrSize; ++j)
    {
        b(i,j)=(float)*vocPtr++;
    }
}

ab = a*b.transpose();
// Element-wise squares; the result has to be assigned, since cwiseProduct
// on its own leaves the matrix unchanged.
a = a.cwiseProduct(a);
b = b.cwiseProduct(b);
MatrixXf aa = a.rowwise().sum();
MatrixXf bb = b.rowwise().sum().transpose();
MatrixXf d = (aa.replicate(1,b_size) + bb.replicate(a_size,1) - 2*ab).cwiseAbs2();

// Index of the closest row of b for each row of a.
indexes = new int[a_size];
for (int i=0; i<a_size; ++i)
{
    d.row(i).minCoeff(&indexes[i]);
}
```

Time using SSE2 optimizations:

- int: 0.74916s
- float: 0.502975s
- double: 0.72532s

Without SSE2:

- int: 1.25108s
- float: 0.940441s
- double: 1.02473s

The best time (float with SSE2) is far from the Matlab result. Is there something I am missing that would make this faster?

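For readers following along: both snippets in the post above appear to rely on the standard expansion of the squared Euclidean distance. As a sketch, writing $a_i$ and $b_j$ for the rows of the two matrices (notation introduced here, not used in the thread):

```latex
\|a_i - b_j\|^2 \;=\; \|a_i\|^2 + \|b_j\|^2 - 2\, a_i \cdot b_j
\qquad\text{i.e.}\qquad
d_{ij}^2 \;=\; \mathrm{aa}_i + \mathrm{bb}_j - 2\,\mathrm{ab}_{ij}
```

Since the square root is monotonic, the index of the smallest $d_{ij}$ in a row is the same as the index of the smallest $d_{ij}^2$, so only the matrix product `ab` is expensive.
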
nicaiwss (Fri Sep 28, 2012 8:11 am):

I found that Eigen's replicate is kind of slow. Can you try it like this:

```cpp
MatrixXf a(128, 2782);
MatrixXf b(128, 4000);
......
MatrixXf ab = a.transpose()*b;
a = a.cwiseProduct(a);   // element-wise square, assigned back
b = b.cwiseProduct(b);
aa = a.colwise().sum();
bb = b.colwise().sum();
MatrixXf d(a.cols(),b.cols());
for (int i=0; i<b.cols(); i++)
{
    for (int j=0; j<a.cols(); j++)
    {
        d(j,i) = (aa(0,j)+bb(0,i)-2*ab(j,i));
    }
}
```

Also, I think you can replace this part of your Matlab code

```matlab
repmat(aa,[1 size(bb,1)]) + repmat(bb',[size(aa,1) 1])
```

with `bsxfun(@plus,aa,bb')`. It should be slightly faster.

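The idea of skipping `replicate` can be taken one step further by never building the distance matrix at all. The following is only a minimal sketch in plain C++ without Eigen, so it does not benefit from an optimized matrix product and is not claimed to be faster than the code above. The function name `nearest_rows` and the row-major `std::vector<float>` layout are assumptions made for illustration, not something from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (not from the thread): for each row of `a`, return the
// index of the closest row of `b` in Euclidean distance.
// `a` holds a_rows x dim values and `b` holds b_rows x dim values, row-major.
std::vector<int> nearest_rows(const std::vector<float>& a, std::size_t a_rows,
                              const std::vector<float>& b, std::size_t b_rows,
                              std::size_t dim) {
    assert(a.size() == a_rows * dim && b.size() == b_rows * dim);

    // bb[j] = squared norm of row j of b, computed once up front.
    std::vector<float> bb(b_rows, 0.0f);
    for (std::size_t j = 0; j < b_rows; ++j)
        for (std::size_t k = 0; k < dim; ++k)
            bb[j] += b[j * dim + k] * b[j * dim + k];

    std::vector<int> index(a_rows, -1);
    for (std::size_t i = 0; i < a_rows; ++i) {
        float best = std::numeric_limits<float>::max();
        for (std::size_t j = 0; j < b_rows; ++j) {
            float dot = 0.0f;
            for (std::size_t k = 0; k < dim; ++k)
                dot += a[i * dim + k] * b[j * dim + k];
            // The squared norm of row i of a is the same for every j, so it
            // cannot change which j is smallest and is left out. No sqrt is
            // needed either, because sqrt is monotonic.
            const float score = bb[j] - 2.0f * dot;
            if (score < best) {
                best = score;
                index[i] = static_cast<int>(j);
            }
        }
    }
    return index;
}
```

Calling `nearest_rows(a, 2782, b, 4000, 128)` with the sizes from the first post would return one index per row of `a`. The minimum is tracked on the fly, so no 2782 x 4000 temporary is needed.
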
parrimin (Fri Sep 28, 2012 9:12 am):

You are right. Replicating the 2 matrices takes almost half of the time. Thanks for your solution; I did the same thing another way, but yours is better. And of course thank you for the Matlab suggestion, but my Matlab prototype is fast enough; I need speed in C++. Any clue on speeding up the matrix product? That accounts for almost the other half of the time spent in the function.

smajewski (Sat Sep 29, 2012 10:45 pm):

Hi, I am new to Eigen and the Eigen community, but maybe I'll be able to help. Make sure you use the -O2, -O3 or -Ofast options with the g++/gcc compiler. They give much better performance; some of my programs using Eigen ran as much as 20-30 times faster once I enabled compiler optimization. Cheers

manuels (Sun Sep 30, 2012 9:13 am):

Is it really necessary to copy dataPtr and vocPtr? You can probably use them directly via mappings (Eigen's Map class).

parrimin (Tue Oct 02, 2012 3:48 pm):

I am compiling with Visual Studio 2010, and optimizations are on. I can use mappings, but I would still have to cast to float, so it comes to the same thing. Copying the data takes about 0.001 s, so it's not a problem at all. I was only copying because of my lack of knowledge of the Eigen library; I made a silly mistake using them last week.

manuels (Thu Oct 04, 2012 10:10 am):

You can try gperf (http://www.gnu.org/software/gperf/) to determine which function call takes the most time.

parrimin (Thu Oct 04, 2012 10:21 am):

Thanks, but I cannot see how that software/library can help with this task. The part that takes most of the time is the line `ab = a*b.transpose();`. If I transpose the b matrix beforehand, the product time is similar, so transposing is not the problem.

twood (Fri Oct 05, 2012 1:35 pm):

How large are your matrices, roughly? What happens if you change that line to `ab.noalias() = a*b.transpose();`? (Aliasing is explained here: http://eigen.tuxfamily.org/dox/TopicAliasing.html)

hughperkins (Sat Oct 06, 2012 2:12 am):

I tried using Jeigen (http://github.com/hughperkins/jeigen), which is a Java wrapper around Eigen and therefore has wwwaayyy more overhead than calling it directly. My timings are bad, but not as bad as yours:

- matlab: 0.76 seconds
- jeigen: 2.79 seconds

matlab code:

```matlab
a = rand(2782,128);
b = rand(4000,128);
tic, a * b'; toc
```

jeigen code:

```java
DenseMatrix a = rand(2782,128);
DenseMatrix b = rand(2782,128);
tic();
DenseMatrix c = a.mmul(b.t());
toc();
```

Some thoughts:

- the optimization one has already been mentioned
- I'm using Ubuntu rather than Windows, so maybe it is something system-specific
- the one I think most likely: are you perhaps running on a multicore system? Matlab automatically parallelizes over multiple cores. So, let's say you're running on two cores (like me), then Matlab will give half the elapsed time (as we see above). If you have 12 cores, then Matlab could be up to twelve times faster...

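The thread does not say how the C++ timings were taken. As a minimal sketch of a `tic`/`toc`-style measurement in plain C++, the snippet below times a single-threaded product. The names `elapsed_seconds` and `multiply_abt` are hypothetical, and the naive triple loop is only a stand-in for an optimized routine; it is not what Eigen or Matlab do internally.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: run `work` once and return the wall-clock time in seconds.
template <typename F>
double elapsed_seconds(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Naive single-threaded c = a * b^T, with a (m x k) and b (n x k) stored
// row-major; the result is m x n, also row-major.
std::vector<double> multiply_abt(const std::vector<double>& a,
                                 const std::vector<double>& b,
                                 std::size_t m, std::size_t n, std::size_t k) {
    assert(a.size() == m * k && b.size() == n * k);
    std::vector<double> c(m * n, 0.0);
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t p = 0; p < k; ++p)
                sum += a[i * k + p] * b[j * k + p];
            c[i * n + j] = sum;
        }
    return c;
}
```

A measurement would then look like `double t = elapsed_seconds([&] { c = multiply_abt(a, b, 2782, 4000, 128); });`. Because this loop runs on one core, comparing it with a multithreaded Matlab run has the same caveat as raised in the post above.
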
hughperkins (Mon Oct 08, 2012 12:44 am):

Ooooh, I was using the wrong dimension for the second matrix in the jeigen test. With the correct dimensions, the timings become:

- matlab: 0.76 seconds
- jeigen: 3.05 seconds

I made a dummy end-to-end multiplier to test the java/jna latency, which does everything except actually ask Eigen to multiply the matrices, i.e. the native method looks like this:

```cpp
// dummy operation to measure end to end latency
void dummy_op2( int rows, int middle, int cols, double *afirst, double *asecond, double *aresult ) {
    MatrixXd first(rows,middle);
    valuesToMatrix( rows, middle, afirst, &first );
    MatrixXd second(middle,cols);
    valuesToMatrix( middle, cols, asecond, &second );
    //MatrixXd result = first * second;
    MatrixXd result(rows,cols);
    matrixToValues( rows, cols, &result, aresult );
}
```

Then, testing from java:

```java
DenseMatrix a = rand(2782,128);
DenseMatrix b = rand(4000,128);
tic();
DenseMatrix c;
c = a.dummy_mmul(b.t()); toc();
c = a.dummy_mmul(b.t()); toc();
c = a.dummy_mmul(b.t()); toc();
```

Output:

- Elapsed time: 570 ms
- Elapsed time: 442 ms
- Elapsed time: 435 ms

.... so the overhead of java/jna for these matrices is about 0.44 seconds, and the native time is about 2.6 seconds. If we assume Matlab was running multithreaded on 2 cores, then we would expect Matlab to take about 1.5 seconds on a single core. So in fact it does seem that Eigen is roughly 70% slower than Matlab/MKL in this case.

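The estimate in the post above can be written out step by step. This is only a restatement of the post's own numbers, under its assumption of perfect two-core scaling for Matlab; the symbols are introduced here for illustration:

```latex
t_{\text{native}} \approx 3.05 - 0.44 = 2.61\ \text{s}, \qquad
t_{\text{matlab, 1 core}} \approx 2 \times 0.76 = 1.52\ \text{s}, \qquad
\frac{t_{\text{native}}}{t_{\text{matlab, 1 core}}} \approx \frac{2.61}{1.52} \approx 1.7
```

So, if those assumptions hold, Eigen would take roughly 1.7 times the single-core Matlab time in this test, rather than the 10-20 times in the thread title.
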
manuels (Mon Oct 08, 2012 9:40 am):

Can you wrap the scalar product with `__asm__("# begin");` and `__asm__("# end");`, compile it with the -S flag and post the resulting assembly code here?
