This forum has been archived. All content is frozen. Please use KDE Discuss instead.

performance of eigen

PNSH (Registered Member, Posts: 2, Karma: 0)

performance of eigen

Sun Dec 13, 2009 5:43 am
According to the benchmark, Eigen is much faster than ATLAS on matrix-matrix products. However, I tested on some PCs (Core Duo / Core 2 Quad) with Eigen 2.0.10 and ATLAS 3.8, single-threaded.

The results showed that a (1500,1500) * (1500,1500) product took roughly the same time in Eigen and ATLAS. Actually, ATLAS was a little faster.

I also compared element access on uBLAS and Eigen dynamic matrices. They are almost the same.

Did I miss any optimizations?

Here are my sample code and Makefile.
Code:
#include <boost/numeric/ublas/matrix.hpp>
#include <eigen2/Eigen/Core>
#include <eigen2/Eigen/Array>

#include <iostream>
#include <ctime>

namespace ublas = boost::numeric::ublas;

using std::cout;
using std::endl;
USING_PART_OF_NAMESPACE_EIGEN

#define N 1000

int main()
{
    ublas::matrix<double, ublas::column_major, ublas::unbounded_array<double> > bm(N, N);
    MatrixXd em(N, N);
    double startt, endt;

    // 100 passes of element-wise writes through ublas::matrix::operator()
    startt = (double) clock() / CLOCKS_PER_SEC;
    for (int p = 0; p < 100; p++) {
        for (int i = 0; i < N; i++) {
            for (int j = 0; j < N; j++) {
                bm(i, j) = i + j;
            }
        }
    }
    endt = (double) clock() / CLOCKS_PER_SEC;
    cout << endt - startt << endl;

    // The same writes through Eigen's MatrixXd::operator()
    startt = (double) clock() / CLOCKS_PER_SEC;
    for (int p = 0; p < 100; p++) {
        for (int i = 0; i < N; i++) {
            for (int j = 0; j < N; j++) {
                em(i, j) = i + j;
            }
        }
    }
    endt = (double) clock() / CLOCKS_PER_SEC;
    cout << endt - startt << endl;
}

Code:
#include <eigen2/Eigen/Core>
#include <eigen2/Eigen/Array>

#include <iostream>
#include <ctime>

extern "C" {
#include <cblas.h>
}

// import most common Eigen types
using std::cout;
using std::endl;
USING_PART_OF_NAMESPACE_EIGEN

#define N 1500

int main(int, char *[])
{
    MatrixXd m4(N, N);
    MatrixXd m5(N, N);
    m4.setRandom();
    m5.setRandom();
    MatrixXd m6(N, N);
    double startt, endt;

    // ATLAS dgemm; MatrixXd stores column-major, so pass CblasColMajor
    startt = (double) clock() / CLOCKS_PER_SEC;
    cblas_dgemm(CblasColMajor, CblasNoTrans, CblasNoTrans, N, N, N,
                1.0, m4.data(), N, m5.data(), N, 0.0, m6.data(), N);
    endt = (double) clock() / CLOCKS_PER_SEC;
    cout << endt - startt << endl;

    // The same product in Eigen
    startt = (double) clock() / CLOCKS_PER_SEC;
    m6 = m4 * m5;
    endt = (double) clock() / CLOCKS_PER_SEC;
    cout << endt - startt << endl;
}


Code:
DEFINC = /usr/include/

LIBS   = -L /opt/mpich2/lib/
INC   = -I $(DEFINC)geodesic -I $(DEFINC)OGRE -I $(DEFINC)CEGUI -I $(DEFINC)libxml2 -I $(DEFINC)rlog -I $(DEFINC)log4cpp -I /opt/mpich2/include/

LINK   = -larpack -ldmumps -lmumps_common  -lscalapack -llapack -lblacs -lblacsf77 -lblacs -lmpich -lptf77blas -lptcblas_atlas -latlas -lumfpack -lamd -lsuperlu -lOpenMeshCore -lOpenMeshTools -lOgreMain -lCEGUIBase -lCEGUIOgreRenderer -lOIS -lANN -lxml2 -liniparser -lrlog -llog4cxx -lcv -lcvaux -lhighgui -lml -pthread  -lmetis -lpord -lesmumps -lfax_scotch -lsymbol -ldof -lorder -lgraph_scotch -lscotch -lscotcherr -lcommon
CC   = mpicc
CXX   = mpic++
CXXFLAGS   = -O3 #= -DRLOG_COMPONENT
CXXFLAGS   +=  -msse2
CXXFLAGS   += -DEIGEN_VECTORIZE
CXXFLAGS   += -DEIGEN_NO_DEBUG
CFLAGS   = $(CXXFLAGS)

OBJ   = test.o
SRC   = test.cpp


all : test

test : $(OBJ)
   $(CXX) $< -o $@ $(LIBS) $(LINK)


.cpp.o :
   $(CXX) $(INC) $(CXXFLAGS) -c $<


clean :
   rm -rf ./*.o test

bjacob (Registered Member, Posts: 658, Karma: 3)

Re: performance of eigen

Sun Dec 13, 2009 6:39 am
According to our benchmark (matrix-matrix product) we're 20% faster. If you only get roughly the same speed, that could be:
* either because your ATLAS is better tuned for your CPU cache size than Eigen is by default. With Eigen, you can currently control that via #defines; grep for CACHE in Core/util/Macros.h.
* or a difference between your setup and Gael's when he made the benchmark. That benchmark was on x86-64, with GCC 4.4.


Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list!
PNSH (Registered Member, Posts: 2, Karma: 0)

Re: performance of eigen

Sun Dec 13, 2009 3:25 pm
Thanks for your explanation.

Following the suggestion, I enlarged the CACHE macro, but it did not help. I experimented on both i686 and x86_64 with GCC 4.4.

Perhaps the reason is that ATLAS was better tuned and fits my machines better.
bjacob (Registered Member, Posts: 658, Karma: 3)

Re: performance of eigen

Sun Dec 13, 2009 6:12 pm
Also, the benchmark was with the development branch, which now is much faster than 2.0, and already at that time (March 2009) was a bit faster than 2.0.

The CACHE macro is currently named: EIGEN_TUNE_FOR_CPU_CACHE_SIZE. Its default value is 4*256*256.
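A minimal sketch of how such a tuning macro is typically overridden (the 512 KB value is only an illustrative assumption; pick the size of your CPU's cache):

```cpp
// Must come before the first Eigen include; value is in bytes.
// The default mentioned above is 4*256*256 = 256 KB.
#define EIGEN_TUNE_FOR_CPU_CACHE_SIZE (512 * 1024)
#include <eigen2/Eigen/Core>
```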


Seb (Registered Member, Posts: 99, Karma: 0)

Re: performance of eigen

Wed Jan 13, 2010 8:42 am
Regarding element-wise access, you should take a look at the Eigen functions coeff() and coeffRef(): they skip the bounds-checking asserts and are faster.
gaga666 (Registered Member, Posts: 4, Karma: 0)

Re: performance of eigen

Tue Jan 19, 2010 5:22 pm
Btw, what about performance in comparison with the MTL library? I can't decide which one to use in a performance-critical application with small (< 20x20) matrix computations.
bjacob (Registered Member, Posts: 658, Karma: 3)

Re: performance of eigen

Tue Jan 19, 2010 5:33 pm
MTL4 and other libraries were among our old benchmark:
http://eigen.tuxfamily.org/index.php?ti ... August2008
Later benchmarks only show the high-performance libraries.

But this benchmark tests dynamic sizes only. If your sizes are not only small but known at compile time, a different benchmark is needed. Eigen does very well there too, as this is one of our primary areas of interest, but I don't know about MTL.


gaga666 (Registered Member, Posts: 4, Karma: 0)

Re: performance of eigen

Tue Jan 19, 2010 6:07 pm
bjacob, thank you for your reply. Is it possible (I'm really sorry, but I haven't read the documentation yet) to use a statically allocated, big enough matrix, and use its sub-matrix in computations without reallocation and copying? That would be a big performance gain and, what is even more important, usable in real-time applications.
E.g. something like (I saw such syntax somewhere):
Code:
Matrix A(20,20), B(5,1);
Matrix C = map(A,0,0,5,5);
Matrix D = C*B;
bjacob (Registered Member, Posts: 658, Karma: 3)

Re: performance of eigen

Tue Jan 19, 2010 6:17 pm
Yes, we have a Map mechanism just as you describe; see class Map. But in your case we have something even better: if you don't know the exact size of a matrix at compile time, but know that it will never be bigger than 20x20 and want to avoid dynamic memory allocation, you can do:
Code:
using namespace Eigen;
// start with size 5x5 inside of a statically allocated 20x20 array
Matrix<float,Dynamic,Dynamic,0,20,20> my_matrix(5,5);
// now resize for fun
my_matrix.resize(12,13);


Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list!
gaga666 (Registered Member, Posts: 4, Karma: 0)

Re: performance of eigen

Tue Jan 19, 2010 6:22 pm
Wow, that's really great! This feature is very important, because dynamic memory allocation at runtime is unacceptable in real-time systems. Thank you very much, bjacob, for your replies and for the great job.
ggael (Moderator, Posts: 3447, Karma: 19)

Re: performance of eigen

Tue Jan 19, 2010 9:27 pm
Regarding Eigen vs MTL4: for your use case, Eigen has a significant advantage, which is explicit vectorization.
gaga666 (Registered Member, Posts: 4, Karma: 0)

Re: performance of eigen

Wed Jan 20, 2010 12:48 pm
ggael, that's not an advantage in my case, because I'm writing for an i486 @ 120 MHz using gcc 2.95 ;). But I can't compile the library yet; some errors always appear. Anyway, I'm still trying, because Eigen looks really great and perfectly suits my purpose.
bjacob (Registered Member, Posts: 658, Karma: 3)

Re: performance of eigen

Wed Jan 20, 2010 3:15 pm
We don't claim to support gcc 2.95.

The oldest gcc that we support is gcc 3.3 in Eigen 2.0. It might become gcc 3.4 in the default branch, if gcc 3.3 poses too many problems.

Notice that with such an old GCC, even if you could compile, the resulting code would still be of very poor quality: slow and bloated. If you want high performance, use GCC >= 4.2. It is perfectly able to generate code for your 486; you don't have to run it _on_ the 486.



