
SimplicialLDLt / SparseLDLT performance

rbossart
Registered Member
Hi,

Until Eigen 3.1.0, I used Eigen's SparseLDLT for solving problems like Ax=B, where A is sparse complex (and spd), and x and B are dense matrices.

Today I switched to 3.1.0 and of course had to modify my code to use SimplicialLDLT. The problem is that I notice a large (about 4x) performance drop during the decomposition, not during the solve.

Is this expected for the moment? I mean, do you intend to get back to SparseLDLT performance?

example code:
Code: Select all
typedef std::complex<double> char_type;
typedef Eigen::Matrix<char_type, Eigen::Dynamic, 1> vector_char;

// I have a buffer somewhere, I just map it, thanks to the clever Map
Eigen::Map<vector_char> X(vector_char::Map(&buffer[0], buffersize));
vector_char B(buffersize);


typedef Eigen::SparseMatrix<char_type> EigenSparseSelfAdjointMatrix;
//... build A and B ... (X is already mapped and ready) ...

// finally decompose and solve
Eigen::SimplicialLDLT<EigenSparseSelfAdjointMatrix> ldltOfA(A);
X = ldltOfA.solve(B);
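
To see where the time goes, I time the decomposition and the solve separately, roughly like this (just a sketch using std::chrono here; my actual timing code is not shown above):

Code: Select all
#include <chrono>
#include <iostream>

// ... A, B and X set up as above ...

auto t0 = std::chrono::steady_clock::now();
Eigen::SimplicialLDLT<EigenSparseSelfAdjointMatrix> ldltOfA(A);  // decomposition
auto t1 = std::chrono::steady_clock::now();
X = ldltOfA.solve(B);                                            // solve
auto t2 = std::chrono::steady_clock::now();

std::cout << "decompose: " << std::chrono::duration<double>(t1 - t0).count() << " s, "
          << "solve: "     << std::chrono::duration<double>(t2 - t1).count() << " s\n";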


Best regards,
Romain


ggael
Moderator
hm, the main (and in theory only) difference is that SimplicialLDLT performs a fill-in-reducing reordering that, in all my experiments, significantly speeds up the factorization. Perhaps your matrix is already well ordered for factorization? Unfortunately, there is currently no function to tweak the reordering. You could check that this is indeed the reason by tweaking the file SimplicialCholesky.h:

In the function analyzePattern, comment out the block at line 638 and, at line 646, add m_P.resize(0);. It should look like:

...

/*{
  CholMatrixType C;
  C = a.template selfadjointView<UpLo>();
  // remove diagonal entries:
  C.prune(keep_diag());
  internal::minimum_degree_ordering(C, m_P);
}*/

m_P.resize(0);

if(m_P.size() > 0)
  m_Pinv = m_P.inverse();
else
  m_Pinv.resize(0);

...
rbossart
Registered Member
Thanks Gaël,

Following your instructions, performance is still behind, but much closer to SparseLDLT (now 178 ms against 112 ms for SparseLDLT, i.e. a factor of about 1.6).

Do you have other tricks like this one to get closer to the original performance? ;-)

Thanks,
Romain


ggael
Moderator
I guess your matrix has a special structure; it would be interesting if you could send it to me. You can #include <unsupported/Eigen/SparseExtra>, call Eigen::saveMarket(mat, "filename"), and gzip the resulting file. It would be very useful for our unit tests.

Now, regarding the last few milliseconds to catch up, we would need more benchmarking. For instance, it would be interesting to benchmark the analyzePattern and factorize steps separately (I think they were called differently in SparseLDLT).
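
Something along these lines, for instance (a rough sketch using std::chrono for the timing; A is your already assembled matrix and EigenSparseSelfAdjointMatrix is your typedef from above):

Code: Select all
#include <chrono>
#include <iostream>

Eigen::SimplicialLDLT<EigenSparseSelfAdjointMatrix> ldltOfA;

auto t0 = std::chrono::steady_clock::now();
ldltOfA.analyzePattern(A);   // ordering + symbolic analysis only
auto t1 = std::chrono::steady_clock::now();
ldltOfA.factorize(A);        // numerical factorization only
auto t2 = std::chrono::steady_clock::now();

std::cout << "analyzePattern: " << std::chrono::duration<double>(t1 - t0).count() << " s, "
          << "factorize: "      << std::chrono::duration<double>(t2 - t1).count() << " s\n";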
haavind
Registered Member
I can report a similar experience.

Moving from SparseLDLT to SimplicialLDLT caused a severe performance drop.

I tried the code fix you suggested and it helped a good deal, but it's still not as good as it was. I have only eyeballed the performance, but the drop in both cases seems similar in magnitude to what rbossart reported.

My matrix is 12k by 12k and contains ~70k nonzeros (140k accounting for symmetry), so the number of nonzeros grows only linearly with the dimension.
The nonzeros are evenly spread along and close to the diagonal.

Could it be that the permutation reordering should be skipped/modified when the matrix becomes very sparse?
ggael
Moderator
Indeed, if your entries are already very close to the diagonal, then the somewhat costly reordering should be skipped. We are working on an interface to control the reordering.

Could you save your matrix into a file and send me a zipped version? To save the matrix:

#include <unsupported/Eigen/SparseExtra>


Eigen::saveMarket(mat, "filename.mtx");
ggael
Moderator
haavind, I tried your matrix (12k x 12k) and I get:

0.0016 s for SimplicialLDLT
0.0019 s for SparseLDLT

using the default Eigen 3.1 code. So the new version is actually faster here!
haavind
Registered Member
I did some profiling myself with different results:

SimplicialLDLT: 0.010 s - 0.011 s
SimplicialLDLT wo/permutation: 0.0050 s - 0.0055 s
SparseLDLT: 0.0036 s - 0.0040 s

Does the new solver have any preference regarding column-/row-major storage and upper/lower triangular matrices?
ggael
Moderator
Here is my benchmark, compiled with gcc 4.6, -O2 -DNDEBUG:

Code: Select all
...
using namespace Eigen;

typedef SparseMatrix<double> SpMat;

#include <iostream>
int main(int argc, char** argv)
{
  SpMat A;
  VectorXd x, b;
  loadMarket(A,argv[1]);
  x.setRandom(A.cols());
  b = A * x;
  std::cout << A.cols() << "\n";
  SimplicialLDLT<SpMat> ldlt;
//  SparseLDLT<SpMat> ldlt;

  BenchTimer t;
  t.start();
  ldlt.compute(A);
  t.stop();

  std::cout << t.best() << "   " << (ldlt.info()==Success) << "\n";

  t.reset();
  t.start();
  x = ldlt.solve(b);
  t.stop();
  std::cout << t.best() << "\n";
}


Make sure you compile with optimizations enabled. Also, what's your CPU? (Mine is a 2.3 GHz Core i7.)
ggael
Moderator
Nevertheless, if I disable the reordering, then I get a significant speedup:

SimplicialLDLT wo/permutation: 0.00065s

So it still makes sense to add this option.
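
For reference, this option later landed in Eigen 3.2 as a third template parameter of SimplicialLDLT; a minimal sketch assuming that interface:

Code: Select all
#include <Eigen/Sparse>

typedef Eigen::SparseMatrix<double> SpMat;

// default: fill-in reducing AMD ordering
Eigen::SimplicialLDLT<SpMat> ldltAmd;

// NaturalOrdering keeps the original order, i.e. the reordering is skipped entirely
Eigen::SimplicialLDLT<SpMat, Eigen::Lower, Eigen::NaturalOrdering<int> > ldltNoPerm;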
haavind
Registered Member
My CPU is a 2.67 GHz dual-core i7-620M that can temporarily clock up to 3.33 GHz on strongly single-threaded tasks.
I compile with MSVC (VS2010), and optimization is set to -O2 if I remember correctly.

But my matrix is column-major, upper triangular. Doesn't SimplicialLDLT assume lower triangular unless told otherwise? That would fit with your profiling results being ~5 times better than mine.
ggael
Moderator
Alright, I did not see that your file contained the upper half only. Now I can reproduce similar results:

SimplicialLDLT: 0.055s
SimplicialLDLT wo/permutation: 0.027s
SparseLDLT: 0.021s
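
And indeed, SimplicialLDLT references the lower triangle by default; a minimal sketch of selecting the stored triangle via the UpLo template parameter:

Code: Select all
#include <Eigen/Sparse>

typedef Eigen::SparseMatrix<double> SpMat;

// default: only the lower triangle of A is referenced
Eigen::SimplicialLDLT<SpMat> ldltLower;

// if only the upper triangle is stored, say so explicitly:
Eigen::SimplicialLDLT<SpMat, Eigen::Upper> ldltUpper;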

