Registered Member
|
Hi,
Until eigen 3.1.0, I used eigen's SparseLDLT for solving problems like Ax=B, where A is sparse complex (and spd), and x and B are dense matrices. Today I switched to 3.1.0 and had to modify of course my code to use SimplicialLDLt. The problem is I notice a large 4x performance drop during the decomposition, not during solve. Is it expected for the moment? I mean do you intend to get back to SparseLDLT performance? example code:
Best regards, Romain
rbossart, proud to be a member of KDE forums since 2008-Dec.
|
Moderator
|
hm, the main (and in theory only) difference is that SimplicialLDLT performs a fill-in reducing re-ordering that, in all my experiments, significantly speeds up the factorization. Perhaps your matrix is already well ordered for factorization? Unfortunately there is currently no function to tweak the re-ordering. You could check that's indeed the reason by tweaking the file SimplicialCholesky.h:
in the function analyzePattern, comment the block line 638, and line 646, add: m_P.resize(0);. This should look like: ... /*{ CholMatrixType C; C = a.template selfadjointView<UpLo>(); // remove diagonal entries: C.prune(keep_diag()); internal::minimum_degree_ordering(C, m_P); }*/ m_P.resize(0); if(m_P.size()>0) m_Pinv = m_P.inverse(); else m_Pinv.resize(0); ... |
Registered Member
|
Thanks Gaël,
Following your instructions, performance is still behind, but way closer to SparseLDLT, (e.g. now 178ms against 112ms for SparseLDLT = a factor of about 1.5) Do you have other clues like this one to get closer to original ? Thanks, Romain
rbossart, proud to be a member of KDE forums since 2008-Dec.
|
Moderator
|
I guess your matrix has a special structure, it would be interesting if you could send it to me. You can #include <unsupported/Eigen/SparseExtra> and call Eigen::saveMarket(mat, "filename"), and gzip the resulting file. Would be very useful for our unit tests.
Now regarding the last ms to catch up, we would need more bench. For instance, it would be interesting to bench separately the analyzePattern and factorize steps (I think they were called differently in SparseLDLT). |
Registered Member
|
I can report a similar experience.
Moving from SparseLDLT to SimplicialLDLT caused a severe performance drop. I tried the codefix you suggested and it helped a good deal, but it's still not as good as it was. I have only done "eye measuring" of the performance, but the performance drop in both cases seems similar in magnitude to what rbossart reported. My matrix is 12k by 12k and contains ~70k nonzeros (140k accounting for symmetry), so it's linearly sparse. The nonzeros are evenly spread along and close to the diagonal. Could it be that the permutation reordering should be skipped/modified when the matrix becomes very sparse? |
Moderator
|
Indeed, if your entries are already very closed to the diagonal, then the somewhat costly reordering should be skipped. We are working on an interface to control the reordering.
Could you save your matrix into a file and send me a zipped version? To save the matrix: #include <unsupported/Eigen/SparseExtra> Eigen::saveMarket(mat, "filename.mtx'); |
Moderator
|
haavind, I tried your matrix (12k x 12k) and i get:
0.0016 sec for SimplciialLDLT 0.0019 sec for SparseLDLT using default eigen 3.1 code. So the new version is rather faster here! |
Registered Member
|
I did some profiling myself with different results:
SimplicialLDLT: 0.010 s - 0.011 s SimplicialLDLT wo/permutation: 0.0050 s - 0.0055 s SparseLDLT: 0.0036 s - 0.0040 s Do the new solver have any optimal preference regarding collumn/row major and upper/lower triangular matrices? |
Moderator
|
Here is my benchmark, compiled with gcc 4.6, -O2 -DNDEBUG:
Make sure you compile with optimizations enabled. Also what's your CPU? (mine is a 2.3 core i7) |
Moderator
|
nevertheless, if I disable reordering, then I get a significant speedup:
SimplicialLDLT wo/permutation: 0.00065s So it still makes sense to add this option. |
Registered Member
|
my cpu is a 2.67ghz dual core i7-620M that can clock temporarily up to 3.33ghz on strongly single threaded tasks.
i compiling with msvc, (VS2010) and optimization is set to -O2 if i remember correctly. But my matrix is column major, upper triangular. Don't SimplicialLDLT assume lower triangular unless told otherwise? It fits with your profile results beeing ~5 times better than mine. |
Moderator
|
alright, I did not see your file contained the upper half only. Now I can reproduce similar results:
SimplicialLDLT: 0.055s SimplicialLDLT wo/permutation: 0.027s SparseLDLT: 0.021s |
Registered users: Baidu [Spider], Bing [Bot], Google [Bot]