Registered Member
|
I just upgraded from eigen 3.1.3 to 3.2, in order to use the new SparseLU solver. Until recently i've made due with the IncompleteLUT preconditioner by setting droptolerance to 0.0 and fillfactor sufficiently large.
I first run the compute function once, wich works fine. On the second run however, when i call factorize, the SparseLU solver fails with some sort of memory fault on the member variable m_glu.lusup. The source of the error is in a try/catch block in the memInit function in SparseLU_Memory.h. When a call to "expand" fails, the m_data pointer inside glu.lusup is corrupted. The actual error isn't triggered until it reaches line 130 in SparseLU_column_bmod.h: "glu.lusup(nextlu) = dense(irow);". My matrix is 17324x17324, with 223630 non-zeros. Halving the the problem size to about 9000x9000 avoids the memory bug, but even then, SparseLU runs substantially slower than the IncompleteLUT solver did. |
Registered Member
|
Example of code that fails
|
Moderator
|
Cannot reproduce with this complete code (gcc 4.7.3, Linux 64bits, -O2 -g2):
I also tested with valgrind: no memory error detected. What's you platform/compiler? |
Registered Member
|
I work in Visual Studio 2010, 32bit compilation. You could try to increase 'N' to see if that provokes the bug. I've had a look around in the code and have gotten a better idea, i think, of whats going on:
The "memInit" function in "SparseLU_Memory.h" computes a value that is used to allocate working memory. It's the third line in the function:
'm' and 'n' are the dimensions of my matrix, 'annz' are the nonzero count, and fillratio is a hardcoded performance tuning value set at 20. This means that the values of glu.nzumax and glu.nzlumax are set to somewhere between 1 and 20 times the size of the corresponding dense matrix, i.e. huge. The values are then passed on to the "expand" function, and used to allocate working memory. For my 17kx17k matrix it means more than 1GB, allocated 4 times. The first time i run compute, the call to "expand" fails a couple of times, each time halving the size it's trying to allocate. After a couple of times, the allocation succeeds, and the factorization continues, assumingly as it's supposed to. The second time around, the algorithm tries to reallocate working memory even though as far as I can see, it still has the mamory from last time. On the first call to "expand", the allocation fails as before. The data pointer inside glu.lusup is now corrupted. This time however, presumably because the vectors are allready allocated from last time, the do/while clause thinks it's finished and everything worked as planned, so it exits. I'm thinking working memory should ideally be allocated once during the "analyzePattern" step, not "factorize", and it should be and order of magnitude less than the size of the dense matrix.
|
Registered Member
|
The max on SparseLU_Memory.h line 143 should be a min.
The original SuperLU code that this file was ported from does not have the max. I think there is another issue besides this one with SparseLU for multiple calls to compute and solve with the same solver object. Only the first solve is gives the correct solution when I use SparseLU in my application. I haven't succeeded in reproducing the issue on a small test piece of code to post here, yet. |
Registered Member
|
I agree it should probably be a 'min' instead of a 'max' (indeed, my own code seems to work just fine if I change it), but the problem is just as much that the seemingly 'safe' routine for allocating memory isn't as safe as it's trying to appear.
To adress the performance differences i observe compared to incompleteLUT, it seems that the COLAMD reordering method that is default in SparseLU isn't as effective as the AMD reordering for my problem. However, when i override COLAMD with AMD, as far as I can make out the sourcecode, the solver uses the AMD-permutation to do a column reordering only. Seeing as AMD is based on a symmetric version of the input matrix (A + A^T, or A*A^T), I'm fairly certain the reordering should be applied to both sides of the matrix, keeping the symmetry property. The incompleteLUT does just this. |
Registered users: Bing [Bot], Google [Bot], Yahoo [Bot]