Parallel inversion of large matrices

Board index

Page 1 of 1 (6 posts)

Tags:

MarkoFranko Registered Member Posts 11 Karma 0 OS	Parallel inversion of large matrices Sun Sep 18, 2011 12:01 pm Hello! I have in my code few extremely large matrices of dimension 2120100=24 000 (complex numbers) And I need the inverse matrices of them. LargeMatrix.inverse() takes tooo long. Is it possible to parallelize this inversion with openMP? thanks
ggael Moderator Posts 3447 Karma 19 OS	Re: Parallel inversion of large matrices Sun Sep 18, 2011 12:54 pm First, are you sure you need the inverse ? if you do computation like "A^-1 * B" then rather do X = A.lu().solve(B) and you need A^-1 multiple time create a PartialPivLU<> object. Second, a 24k^2 matrix is indeed pretty big for dense storage. Are you sure your matrix is not sparse, i.e., that it contains much more 0 entries than non zeros, like 3 to 20 non zeros per row or column ? If that's the case you should consider using a SparseMatrix<> with associated sparse solvers. Nevertheless, for dense objects you can enable parallelization by enabling OpenMP ( -fopenmp with GCC). This should speedup the LU factorization step.
MarkoFranko Registered Member Posts 11 Karma 0 OS	Re: Parallel inversion of large matrices Sun Sep 18, 2011 6:42 pm Hi ggael, thank you for your reply. ggael wrote:First, are you sure you need the inverse ? if you do computation like "A^-1 * B" then rather do X = A.lu().solve(B) and you need A^-1 multiple time create a PartialPivLU<> object. Unfortenately I need the whole inverse matrix. But one part of the code was optimized with your suggestion (X = A.lu().solve(B)). ggael wrote:Second, a 24k^2 matrix is indeed pretty big for dense storage. Are you sure your matrix is not sparse, i.e., that it contains much more 0 entries than non zeros, like 3 to 20 non zeros per row or column ? If that's the case you should consider using a SparseMatrix<> with associated sparse solvers. min 50% of the matrix are non zeros ggael wrote: Nevertheless, for dense objects you can enable parallelization by enabling OpenMP ( -fopenmp with GCC). This should speedup the LU factorization step. My for loops are parallelized with openmp Code: Select all `#pragma omp parallel { #pragma omp for for loop }` and it works quite well. But when Itry Code: Select all `#pragma omp parallel { InverseOfLargeMatrix=LargeMatrix.inverse(); }` then it takes longer to calculate the inverse than without openmp and I get wrong result. Is it the wrong way? Perhaps is it possible to split the matrix inversion in blocks and than to calculate them with a parallel loop?
ggael Moderator Posts 3447 Karma 19 OS	Re: Parallel inversion of large matrices Sun Sep 18, 2011 8:34 pm simply compile with OpenMP enabled, do not add the "#pragma omp parallel " directive. Basically, the first step of the matrix inversion is to compute a LU factorization. For large matrices this is done per block and the bottleneck appears to be a matrix-matrix product. That is this later operation which is parallelized.
MarkoFranko Registered Member Posts 11 Karma 0 OS	Re: Parallel inversion of large matrices Tue Sep 20, 2011 1:19 pm ggael wrote:simply compile with OpenMP enabled, do not add the "#pragma omp parallel " directive. Basically, the first step of the matrix inversion is to compute a LU factorization. For large matrices this is done per block and the bottleneck appears to be a matrix-matrix product. That is this later operation which is parallelized. I did it as you suggested, but I see only one cpu working when it comes to inversion of the matrix. Is there some implementation like: Code: Select all `MatrixA.inverse().block(a,b)` where the result would be a special block of the inverse of the matrix MatrixA? Because than it would be possible to make a parallelized loop and to calculate all blocks of the inverse. But I am not sure whether it is the correct way, when I look at this: http://en.wikipedia.org/wiki/Invertible_matrix#Blockwise_inversion
ggael Moderator Posts 3447 Karma 19 OS	Re: Parallel inversion of large matrices Tue Sep 20, 2011 10:13 pm Parallelizing a matrix inversion is not trivial at all. As I told you the parallelization only happens during the LU decomposition step and not during the computation of U^-1 * L^-1 that probably explain why you don't notice it.