Registered Member
|
Hello Eigen community,
First of all, thanks for this C++ library; I really enjoy it. I wrote a C++ nonlinear finite element solver using Eigen for all the linear algebra routines. At each iteration of the Newton-Raphson algorithm, I have to solve a large sparse linear system Ax = b, where A is a 33756 x 33756 sparse matrix (1.139467536e+09 entries in total) with 1.19472e+06 nonzero entries. Matrix A is not necessarily symmetric due to boundary conditions. I tried a few solvers and got the best performance with UmfPackLU, which takes approximately 20 seconds, whereas MATLAB takes only 5 seconds. I was told that this difference may be because MATLAB's solver "mldivide" (or "\") is multithreaded. Therefore I tried the -fopenmp flag with the following compilation command:
and run the program as follows
I tried it on three different computers (with 2, 32 and 4 as the default OMP_NUM_THREADS, respectively). On the first two, it didn't change anything: there is no multithreading at all. On the third one, I observed that 4 threads are opened when the linear systems are solved. However, I get better performance (~9 s instead of ~20 s) by setting n = 1 instead of n = 2, 3, or 4!

How can we explain that? Why can I multithread my program on the third computer but not on the other two? (I checked the number of threads with the top H command.) And finally, why do I get better performance by setting n = 1 instead of leaving the default value (n = 4) on the third machine?

If necessary, I can export the matrix A and vector b (https://eigen.tuxfamily.org/dox-devel/group__TopicSparseSystems.html).

Thanks for your help, B. |
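When comparing timings such as ~20 s versus ~9 s across thread counts, it helps to measure wall-clock time around the factorization and the solve separately. Below is a minimal sketch of such a timer using only the C++ standard library; the helper name time_seconds is made up for illustration and is not part of Eigen.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper (not an Eigen API): runs `work` once and returns the
// elapsed wall-clock time in seconds. steady_clock is used because CPU time
// adds up across threads and would hide any multithreading speed-up.
template <typename Work>
double time_seconds(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Assuming solver is an Eigen::UmfPackLU object and A, b, x are the system matrix and vectors (names chosen here for illustration), one could call time_seconds([&] { solver.compute(A); }) and time_seconds([&] { x = solver.solve(b); }) to see which phase dominates.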
Moderator
|
If you're using UMFPACK for solving, then enabling OpenMP on the compiler side won't change anything. You have to enable multithreading when configuring/compiling SuiteSparse itself.
|
Registered Member
|
Thanks for your quick answer. Unfortunately I don't know how to do that (><). I'll ask Google!
Thanks again, B. |
Registered Member
|
Hello,
I managed to enable multithreading for SuiteSparse; however, I still have one small issue. I noticed that only four threads are opened, and I couldn't find how to specify the number of threads I'd like UMFPACK to use. I tried
or
but it doesn't work at all. When I use the following commands
it returns the right "n" I exported, but only four threads are opened. Any ideas? Thanks a lot. Brian. |
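One way to rule out a shell-level mix-up is to check, from inside the program itself, which thread-count variables the process actually inherited. This is only a sketch: the function names env_or and thread_env_report are made up for illustration, and whether a given SuiteSparse/BLAS build honours these variables depends on how it was compiled.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: value of environment variable `name` as seen by this
// process, or `fallback` if the variable is not set.
std::string env_or(const char* name, const std::string& fallback) {
    const char* value = std::getenv(name);
    return value ? std::string(value) : fallback;
}

// Hypothetical helper: one "NAME=value" line for each thread-count variable
// mentioned in this thread.
std::string thread_env_report() {
    static const char* const names[] = {"OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS"};
    std::string report;
    for (const char* name : names) {
        report += std::string(name) + "=" + env_or(name, "(unset)") + "\n";
    }
    return report;
}
```

Printing thread_env_report() at the start of the solver shows what the process sees. If the expected n appears there and still only four threads are opened, the limit is presumably coming from the library build rather than from the environment.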
Registered Member
|
The max number of threads depends on your CPU. What is your CPU model?
|
Registered Member
|
Hello,
this is what I get when typing lscpu:

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-11
Thread(s) per core: 1
Core(s) per socket: 6
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 44
Stepping: 2
CPU MHz: 3458.089
BogoMIPS: 6915.95
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 12288K
NUMA node0 CPU(s): 0,2,4,6,8,10
NUMA node1 CPU(s): 1,3,5,7,9,11

Seems like I could open at least 12 threads unless I'm mistaken |
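The count of 12 follows from the lscpu fields: sockets x cores per socket x threads per core = 2 x 6 x 1. A small sketch that reproduces this arithmetic and compares it with what the C++ runtime reports; the function names are made up for illustration, and std::thread::hardware_concurrency() is only a hint that may return 0 when the value cannot be determined.

```cpp
#include <thread>

// Hypothetical helper: logical CPU count implied by the three lscpu fields
// "Socket(s)", "Core(s) per socket" and "Thread(s) per core".
unsigned logical_cpus(unsigned sockets, unsigned cores_per_socket,
                      unsigned threads_per_core) {
    return sockets * cores_per_socket * threads_per_core;
}

// Hypothetical helper: number of hardware threads reported by the standard
// library. This is a hint only and may be 0 if it cannot be determined.
unsigned reported_hardware_threads() {
    return std::thread::hardware_concurrency();
}
```

For the machine above, logical_cpus(2, 6, 1) gives 12, so a cap of four threads would not come from the hardware itself.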
Registered Member
|
Did you modify the source code of SuiteSparse or your nonlinear iteration code with OpenMP directives?
If not, it can't be improved. Most of the time, a linear solver already implements its algorithm with multithreading. MATLAB is much more efficient thanks to the MKL library. |
Moderator
|
If UMFPACK does not use OpenMP, then playing with OPENBLAS_NUM_THREADS is pointless. Again, refer to the UMFPACK documentation.
|