Sparse solvers too slow

Board index

Page 1 of 1 (5 posts)

Tags:

Decomposed_cadaver Registered Member Posts 4 Karma 0	Sparse solvers too slow Wed Jan 17, 2018 5:29 pm Hi, this was probably asked many times, however I cannot find method that would solve my linear systems fast and no tip from forums works for me. I tried: BiCGSTAB (I terminated program after 2 mins of not returning a result) SimplicialLDLT (the same, not returning result within reasonable time limit) LeastSquaresConjugateGradient (same as before) SparseLU (about 10 seconds, which is still kinda unexpectedly long) SparseQR (just eats up more and more memory, turned it off after some time of not returning a result) The matrix comes from a PDE discretization, expected dimensions are about 100k X 100k and more, it is sparse, as only adjacent points in a grid give nonzero coefficients in each row. The matrix I'm trying to solve right now is 0.021% (2908393 specified elements, 117398 rows & cols). Unfortunately the matrix is not symmetric. I can upload it somewhere if you want to see the pattern or test the matrix itself. Should I upload the matrix somewhere so you can see it and tell if there's anything more to be done to speed it up? As to the actual procedure, the code looks like this: Code: Select all `Eigen::SparseMatrix<long double, Eigen::ColMajor> M; VectorXld res; //filling up M, filling up load Eigen::SparseLU< Eigen::SparseMatrix<long double, Eigen::ColMajor> > LUSolver; LUSolver.analyzePattern(M); LUSolver.factorize(M); res = LUSolver.solve(load);` VectorXld is defined as Code: Select all `typedef Eigen::Matrix<long double, Eigen::Dynamic, 1> VectorXld;` What could be the bottleneck? Wrong method? long double type? I'm using Visual Studio 2015, Another strange thing is that even with openmp enabled, the profiler never shows CPU load to be higher than 25% - which indicates that only one core out of 4 is utilized. Aren't linear solvers parallelizable? By the way, solvers tend to take up 2GB of memory, which is suspiciously much for this kind of problem - the whole stiffness matrix should take up MAX 200 MB (this one takes up about 100MB so memory-wise, to use those algorithms, one has to have at least 10x of memory available than really needed for the matrix). I can imagine that inefficient and/or pointless memory access can slow these algorithms down. Is there a way to help it? Thanks. P.S.: I uploaded the matrix for the reference https://file.io/nk3kE2 (AFAIK, after factorize, which takes so much time to compute, the actual solve routine is fast, so only the matrix itself is important and not the RHS vector)
ggael Moderator Posts 3447 Karma 19 OS	Re: Sparse solvers too slow Wed Jan 17, 2018 9:24 pm It seems that you are working on a 2D domain with a 5x5 stencil. For 2D domains, direct solvers are often the fastest and the most reliable. This is why SparseLU exhibit best performance. On current CPUs there is very little precision gain to use long double, but strong performance hit. Switching back to double will leverage SIMD instruction and I expect SparseLU to be 2-4 times faster. It is also important to understand that direct solvers have to factorize the input matrix, and regardless of all the effort to reduce fill-in, the factors are always significantly more dense than the input matrix. A factor 10 is something normal, which is why iterative solvers are attractive, especially for 3D problems that are denser with higher fill-in. If you want more performance, then you might try Eigen's wrapper to UmfPack, Pastix, or MKL's Pardiso. Of course you need to install one of this lib first.... Finally, regarding BiCGSTAB, you should really be able to get something in a reasonable amount time by switching back to double, using IncomplteLUT and lowering the threshold.
Decomposed_cadaver Registered Member Posts 4 Karma 0	Re: Sparse solvers too slow Thu Jan 18, 2018 12:35 am Thank you very much for the answer! I forgot to mention, this was obtained by the finite element method, but yes, the matrix is very similar to what a 5 stencil would look like. Switching to double: factorization now takes 8.8s: not much of an improvement and it's still slower than Mathematica's LinearSolve with 2.5s. I'm wondering what magical algorithm does Mathematica use so it's that much faster. Actually, I was expecting that anything raw made as a console c++ application would compute way faster than Mathematica (M can sometimes get...unreasonably slow). I have an idea: some time ago, just from the lack of imagination, I used Jacobi iteration scheme to obtain the result and it seemed fast. Is there any similar method already in Eigen, that is perhaps a bit faster (Jacobi converges slowly) and more stable (Jacobi demands a lot from the matrix - diagonal dominance). I suspect it can even now be solved by the simplest Jacobi method...in fact, the system is just a linearization of a system of nonlinear equations, so I have to iterate over that as well - by using Jacobi I can perhaps update both and don't wait with the nonlinear terms update until Jacobi converges, however I don't know whether I can expect a convergence in such drastic approach... EDIT: Oh **** I just realized that a portion of my matrix has zeroes on diagonal. So I cannot use Jacobi.
ggael Moderator Posts 3447 Karma 19 OS	Re: Sparse solvers too slow Thu Jan 18, 2018 8:49 pm I don't known about Mathematica, but Matlab relies on very advanced solvers written in C or equivalent language such as the Intel MKL library, or some parts of SuiteSparse. I guess this is the same for Mathematica so it is not surprising that Mathematica is faster for this precise task. As I said, you can call UmfPack or other more advanced solver from Eigen too. The research in fast sparse linear solver is still a very active research area with extremely sophisticated algorithms...
Decomposed_cadaver Registered Member Posts 4 Karma 0	Re: Sparse solvers too slow Sun Jan 21, 2018 8:01 pm Hi, it's me again. I'm trying hard to get it working, but I'm kinda lost...I read somewhere that in order to get umfpack, I need something called "suitesparse". So I downloaded it and put it into my Visual Studio directory so I can include it. In the suitesparse directory there is the umfpack directory, with subdirectory source and there is a bunch of header files. Which of them should I include to get this: https://eigen.tuxfamily.org/dox-devel/classEigen_1_1UmfPackLU.html working? Isn't there an easier way to include something short (like "Eigen/Sparse" instead of 20 header files)? There is one "makefile" file tho, but it gives me error upon including that one.