Registered Member
|
Hi,
I'm writing a compute-intensive application that makes heavy use of the conjugate gradient method with very large (~10^6) sparse matrices (namely a VLSI circuit placer). For its simplicity, and since I use C++, I chose Eigen. However, I would like more performance, and an 8x speedup from parallelism would be particularly useful. AFAIK, there is no support for parallel sparse-matrix * vector products in Eigen at the moment. I don't know the codebase yet, but I would be willing to give some time if I can help with an implementation. Is anyone else interested in this functionality? In the meantime, do you know of other sparse-matrix libraries that use thread parallelism without the pain of using MPI - or is it already in Eigen and I missed it? Thanks. |
Moderator
|
I have local changes somewhere implementing parallel sparse*dense products, but I never pushed them because I never observed a nice speed-up. Probably because of NUMA.
|
Registered Member
|
That's surprising! I would expect an almost linear speedup for large matrices, since only the vector has to be shared and each thread accesses the sparse matrix's storage in a streaming manner. NUMA... are you benchmarking on a cluster?
I'll probably end up trying a naive parallelization with OpenMP and see how it works for me. Thank you! |
Moderator
|
Sparse matrix * vector products are memory-bound, so the performance gain probably depends a lot on the matrix structure. If you want to try something, search for "scaleAndAddTo" in Eigen/src/SparseCore/SparseDenseProduct.h and add an omp directive on the outermost loop.
|