Registered Member
Hey all,
my problem is that MATLAB solves exactly the same linear system about three times faster than Eigen 3.2.4 with the SuiteSparse library. For the solution within my C++ code, I use one of the following two schemes.
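A minimal sketch of two such schemes, assuming the SuiteSparse-backed wrappers CholmodSupernodalLLT and UmfPackLU (the original snippets were not preserved):

Code:
#include <Eigen/Dense>
#include <Eigen/Sparse>
#include <Eigen/CholmodSupport>   // CholmodSupernodalLLT (CHOLMOD)
#include <Eigen/UmfPackSupport>   // UmfPackLU (UMFPACK)

// Scheme 1: supernodal Cholesky via CHOLMOD (requires an SPD matrix).
Eigen::VectorXd solveCholmod(const Eigen::SparseMatrix<double>& A,
                             const Eigen::VectorXd& b)
{
    Eigen::CholmodSupernodalLLT<Eigen::SparseMatrix<double> > solver(A);
    return solver.solve(b);
}

// Scheme 2: sparse LU via UMFPACK (general square matrix).
Eigen::VectorXd solveUmfpack(const Eigen::SparseMatrix<double>& A,
                             const Eigen::VectorXd& b)
{
    Eigen::UmfPackLU<Eigen::SparseMatrix<double> > solver(A);
    return solver.solve(b);
}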
Does anyone have an idea why MATLAB could solve it ~2-3 times faster? Here is the link to the linear system: LinSystem.tar.bz2
Registered Member
What compiler are you using?
Also, are you sure all the compiler optimizations are enabled?
Registered Member
compiler is gcc 4.8.2 with optimization flags: -O3 -msse2 -DNDEBUG
Moderator
In this scenario, Eigen is only a very thin layer between your code and SuiteSparse, so there is not much we can do on Eigen's side. However, the SuiteSparse solvers have to be linked to a BLAS/LAPACK library for efficient dense algebra, and MATLAB uses the highly optimized and parallelized MKL for that. If you don't have a fast BLAS installed, you can compile Eigen's own BLAS yourself:
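Something along these lines (a sketch; assumes an out-of-source CMake build, with the eigen_blas targets coming from Eigen's blas/ subdirectory):

Code:
# configure an out-of-source build of the Eigen sources
mkdir build && cd build
cmake /path/to/eigen

# build Eigen's BLAS; produces libeigen_blas and libeigen_blas_static
make eigen_blas_static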
and link your code with build/blas/libeigen_blas_static instead of the default BLAS. If your CPU supports AVX, then download the devel version of Eigen and compile with -DEIGEN_TEST_AVX=ON (x2 speedup). Another possibility is that your matrix is extremely sparse, in which case the built-in SimplicialLDLT<> solver will be significantly faster than Cholmod.
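For reference, switching to the built-in solver is a small change (a sketch, assuming A is symmetric positive definite):

Code:
#include <Eigen/Dense>
#include <Eigen/Sparse>
#include <Eigen/SparseCholesky>

// Built-in sparse LDLT; self-contained, no external BLAS needed.
Eigen::VectorXd solveSimplicial(const Eigen::SparseMatrix<double>& A,
                                const Eigen::VectorXd& b)
{
    Eigen::SimplicialLDLT<Eigen::SparseMatrix<double> > solver(A);
    return solver.solve(b);
}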
Registered Member
Thank you for your help. I had not realized that MATLAB does not use stock SuiteSparse solvers. That should be the point.
Anyway, is there by any chance a straightforward way to use, for instance, PETSc solvers with Eigen?
Moderator
I'm not sure I follow you, so to make it clear: AFAIK, MATLAB does use the SuiteSparse solvers, but SuiteSparse itself is not self-contained and has to be linked to a BLAS/LAPACK library. The speed of SuiteSparse will thus depend heavily on the performance of the chosen BLAS/LAPACK library.
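For illustration, the link line might look like the following; the exact library names are an assumption and vary across distributions:

Code:
# link against SuiteSparse plus a fast BLAS/LAPACK (here OpenBLAS)
g++ -O3 -DNDEBUG main.cpp -o solve \
    -lcholmod -lumfpack -lopenblas -llapack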
Registered Member
Dear ggael,
I have similar questions about the sparse solvers in Eigen. Their performance is not as good as it should be for an Ax=b problem. Could you please help me clarify this? My system: a Xeon E5-2609 v2 CPU with 16 GB of DDR3-1600. Currently I use MKL with Eigen, SuiteSparse, and SuperLU; I will try to compile BLAS with the AVX option as you suggested. In the first test, I run a loop over several different matrices A of size 10000x10000, each row with 25 non-zero elements. They may be symmetric but are not positive definite. Here is the time performance I got:
I don't know why the BiCGSTAB solvers are extremely slow and give me very bad accuracy, so I don't use them. In the second test, I use matrices A of size 40000x40000, each row with 25 non-zero elements.
I am not sure if the results I got are reasonable, but I expected better results from Eigen, as suggested by http://eigen.tuxfamily.org/dox/group__TopicSparseSystems.html . Or maybe I am missing something in the configuration? Could you please share the matrices you used (vector_graphics, poisson_SPD, sherman2, ...) for the benchmark routine? I think it would be great if you provided these matrices to Eigen users who want to benchmark/check the time performance.

Indeed, the UmfPackLU solver is the best one here, but as far as I know, UMFPACK works only when the matrix A is small. Could you suggest which solver I should use for a bigger and very sparse matrix A (size 500000x500000 with about 50-100 non-zero elements in each row), especially when it is not symmetric positive definite?

In my demo, sometimes I need to solve a single Ax=b problem with a very big and sparse A, and sometimes I need to solve many small dense linear problems inside an OMP parallel pragma. The question is: do I need to disable/enable multi-threading in Eigen, and how do I do it?
Moderator
Which Eigen version are you using? Please try 3.2.5, especially regarding SparseLU, for which supernodes were wrongly disabled in some 3.2.x versions.
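For reference, a typical SparseLU setup looks like this (a sketch; the COLAMD ordering is the usual choice):

Code:
#include <Eigen/Dense>
#include <Eigen/Sparse>
#include <Eigen/SparseLU>

// Built-in supernodal LU for general square sparse matrices.
Eigen::VectorXd solveSparseLU(const Eigen::SparseMatrix<double>& A,
                              const Eigen::VectorXd& b)
{
    Eigen::SparseLU<Eigen::SparseMatrix<double>,
                    Eigen::COLAMDOrdering<int> > solver;
    solver.analyzePattern(A);  // reusable for matrices with the same structure
    solver.factorize(A);
    return solver.solve(b);
}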
How are you generating the matrices? Random matrices are usually not representative of real-world problems: for sparse problems, the structure matters a lot. If BiCGSTAB is so slow, then maybe you did not use the right preconditioner. For non-diagonally-dominant problems, try IncompleteLUT<>. BTW, "50-100 non-zero elements in each row" is not "very sparse"; it is rather dense, and for such problems BiCGSTAB, LeastSquaresConjugateGradient, or ConjugateGradient are usually much faster than direct solvers. In the devel branch, these iterative solvers can even exploit OpenMP. I'll post an archive with our testbed matrices soon. Could you share yours so that we can include it in our tests? It is clearly much different from what we currently have, and a 10000x10000 one would be good enough, as what matters most for testing purposes is the structure. For OpenMP, see: http://eigen.tuxfamily.org/dox/TopicMultiThreading.html
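A minimal sketch of the suggested BiCGSTAB + IncompleteLUT<> combination (the diagnostics are illustrative):

Code:
#include <iostream>
#include <Eigen/Dense>
#include <Eigen/Sparse>
#include <Eigen/IterativeLinearSolvers>

// BiCGSTAB preconditioned with an incomplete LU factorization,
// for non-diagonally-dominant systems.
Eigen::VectorXd solveBiCGSTAB(const Eigen::SparseMatrix<double>& A,
                              const Eigen::VectorXd& b)
{
    Eigen::BiCGSTAB<Eigen::SparseMatrix<double>,
                    Eigen::IncompleteLUT<double> > solver;
    solver.compute(A);
    Eigen::VectorXd x = solver.solve(b);
    if (solver.info() != Eigen::Success)
        std::cerr << "no convergence after " << solver.iterations()
                  << " iterations (error " << solver.error() << ")\n";
    return x;
}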
Registered Member
Hi ggael,
Thanks for your support. I am using Eigen 3.2.5. In my problem, the matrix A is an approximation of the Laplace-Beltrami operator over big point sets. The number of non-zero elements in row i equals the number of neighbors within a sphere centered at point p_i. These elements are defined as A[i,j] = w_ij*f(p_i), where the w_ij are weights, and A[i,i] = -sum_of_all(A[i,j]) for j != i (thus the matrix A is not diagonally dominant, and it is not symmetric because of double precision). I tried BiCGSTAB with IncompleteLUT<double>, but it still takes a long time to complete, with bad convergence. It should work faster than the other solvers. Here is a piece of my code with the Eigen solvers.
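(That snippet was not preserved; what follows is a hypothetical sketch of the assembly described above, where neighbors, w, and f stand in for the application's own data.)

Code:
#include <vector>
#include <Eigen/Sparse>

// Hypothetical assembly of the operator described above:
// A(i,j) = w_ij * f(p_i) for each neighbor j of point p_i,
// A(i,i) = -sum_j A(i,j).
Eigen::SparseMatrix<double>
assembleOperator(int n,
                 const std::vector<std::vector<int> >& neighbors,
                 const std::vector<std::vector<double> >& w,
                 const std::vector<double>& f)
{
    std::vector<Eigen::Triplet<double> > coeffs;
    for (int i = 0; i < n; ++i) {
        double diag = 0.0;
        for (std::size_t k = 0; k < neighbors[i].size(); ++k) {
            int j = neighbors[i][k];
            double aij = w[i][k] * f[i];
            coeffs.push_back(Eigen::Triplet<double>(i, j, aij));
            diag -= aij;  // accumulate the negated row sum for the diagonal
        }
        coeffs.push_back(Eigen::Triplet<double>(i, i, diag));
    }
    Eigen::SparseMatrix<double> A(n, n);
    A.setFromTriplets(coeffs.begin(), coeffs.end());
    return A;
}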
Since the number of non-zero elements in each row is an input parameter, it can range from 25 (for small data sets with around 10k elements) to 500 (for big data sets with millions of elements). I said my matrices are sparse because fewer than 1% of their entries are non-zero. I guess I am not the only one who misunderstands the term "sparse matrix" in the Eigen documentation. Anyway, the dense solvers are still much slower than the sparse solvers in my case. Maybe the time performance I got is reasonable, and it is slow because of the big number of non-zero elements. There are some weird rules in our company, so I have to ask my boss before downloading or sharing anything from our servers. It could take time to get his answer; I will PM you later.
Moderator
Hm, I guess it is not symmetric because f(p_i) != f(p_j), but if w_ij == w_ji, then your A is just F.asDiagonal() times the symmetric matrix W with W(i,j) = w_ij, where F is a VectorXd with F[i] = f(p_i). So you can form A as A(i,j) = w_ij and solve the equivalent problem A * x = F.asDiagonal().inverse() * b instead. Now A is symmetric, and a ConjugateGradient should do the job pretty well. Again, to get the best of ConjugateGradient, use the devel branch and: ConjugateGradient<SparseMatrix<double>, Lower|Upper> cg(A); x = cg.solve(F.asDiagonal().inverse() * b); You might even compile with -fopenmp to get multithreading.
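Spelled out, the suggested reformulation looks roughly like this (a sketch; Lower|Upper and the multithreaded path require the devel branch, built with -fopenmp):

Code:
#include <Eigen/Dense>
#include <Eigen/Sparse>
#include <Eigen/IterativeLinearSolvers>

// Solve the symmetrized system A*x = F^{-1}*b, where A(i,j) = w_ij
// is symmetric and F[i] = f(p_i). Lower|Upper makes CG traverse the
// full matrix, which is also what enables the OpenMP-parallel path.
Eigen::VectorXd solveSymmetrized(const Eigen::SparseMatrix<double>& A,
                                 const Eigen::VectorXd& F,
                                 const Eigen::VectorXd& b)
{
    Eigen::ConjugateGradient<Eigen::SparseMatrix<double>,
                             Eigen::Lower | Eigen::Upper> cg(A);
    return cg.solve(F.asDiagonal().inverse() * b);
}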