
Simplicial LLT/L(D)LT - performance

Tags: sparse, cholesky, llt, ldlt, performance, spd
kalesony
Registered Member

Simplicial LLT/L(D)LT - performance

Fri Sep 21, 2012 9:08 am
Hi, this is my first post here and I would like to use this opportunity to thank everyone who has contributed to Eigen :).

Now, back to my question.

a. Problem: solving system of linear equations with a sparse SPD coefficient matrix (H)
b. Dimensions: H is now precisely 104x104
c. Sparsity:
____c1. H_non_zero_entries_cnt/H_all_entries_cnt = ca. 0.45
____c2. sparsity pattern is fixed for the entire scenario
d. Usage scenario: solving the linear system multiple (hundreds of thousands -> millions -> ...) times; for each factorization there are ca. 180 "solves"

[note: points above describe the actual situation which made me create this thread, but in general: H can be arbitrarily large and the proportion between factorizations and solves can be different, which does not change the fact that solves are the bottleneck]

e. Approach: I used to treat H as a dense matrix and apply dense L(D)LT solvers, but I needed MUCH shorter execution times => after noticing (c1) I switched to a sparse H and simplicial L(D)LT. To exploit (c2) I call analyzePattern() a single time and afterwards only factorize() and solve(). The same sparse matrix object is used the entire time - only its non-zero values are updated in place (via InnerIterator) before each factorization (a rough sketch of this pattern follows the question below).

f. Question: after compilation and testing I was happy to see proper results but a bit frustrated with the performance:
____+ sparse factorize() turned out to be comparable to the dense compute()
____+ sparse solve() turned out to be comparable to (actually a bit slower than) the dense solve()

Was I being naive hoping for a performance boost? Do you know a better way to exploit the sparsity of H to get that boost?
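
For reference, this is roughly the reuse pattern described in (e). It is only a sketch, assuming H is an Eigen::SparseMatrix<double> with a fixed pattern; computeEntry() is a hypothetical helper standing in for however the new values are produced.

Code:
#include <Eigen/Sparse>

Eigen::SparseMatrix<double> H;                 // filled once via setFromTriplets()
Eigen::SimplicialLDLT<Eigen::SparseMatrix<double>> solver;

// Hypothetical helper (not part of Eigen): new value of H(row, col) for this iteration.
double computeEntry(int row, int col);

void setupOnce()
{
    H.makeCompressed();
    solver.analyzePattern(H);                  // symbolic factorization, done once
}

void newIteration(const Eigen::VectorXd& b, Eigen::VectorXd& x)
{
    // Overwrite the stored non-zeros in place; the sparsity pattern is untouched.
    for (int k = 0; k < H.outerSize(); ++k)
        for (Eigen::SparseMatrix<double>::InnerIterator it(H, k); it; ++it)
            it.valueRef() = computeEntry(it.row(), it.col());

    solver.factorize(H);                       // numeric factorization only
    x = solver.solve(b);                       // repeated ~180 times per factorize()
}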

Thanks!
Dee33
Registered Member
Your matrix is actually very small.
I'm afraid you won't see any difference between dense and sparse solve.
In general, you should call analyzePattern() only once as long as the structural pattern does not change, then call factorize() once as long as the values in the matrix do not change, and then call solve() for all right-hand sides. I assume that is what you are already doing.
For your particular case, since your matrix is almost 50% full, it's better to call a dense solver.
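
For completeness, a minimal sketch of the dense variant suggested above, assuming H is kept in an Eigen::MatrixXd and Eigen::LDLT is used (the function names are just for illustration):

Code:
#include <Eigen/Dense>

Eigen::MatrixXd H(104, 104);                   // refill the changed entries each iteration
Eigen::LDLT<Eigen::MatrixXd> ldlt;

void newIteration(const Eigen::VectorXd& b, Eigen::VectorXd& x)
{
    ldlt.compute(H);                           // one dense LDLT factorization per iteration
    x = ldlt.solve(b);                         // then the ~180 cheap triangular solves
}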
kalesony
Registered Member
Dee33 - thanks for the speedy reply.
So 50% sparsity is not all that sparse, huh? This is bad news.

The thing that originally got my hopes up was this article by Roy Featherstone - it is about LTL/LTDL factorizations with no fill-in. I know this is a specific case, but I imagined (mistakenly, as it seems) that "generic" sparse solvers would also provide some speed-up. Nevertheless, my H matrix is precisely the one described in that paper, so all of Featherstone's assumptions hold.

It would be great if I could pick your brains on the following subject: is it worth spending time pursuing the aforementioned article in the hope of getting a substantial speed-up compared to Eigen's built-in L(D)LT solvers? I am asking because I imagine that Eigen's implementations (both the dense and the sparse ones) are already highly optimized, while the ideas presented in the article are abstracted from implementation reality and may not provide any improvement once implemented within Eigen. To be honest, I had a (brief) look at Eigen's code for these solvers and my eyes soon began to bleed ;) (although I know they are far less complex than other modules).

Please let me know - is it worth further investigation, or are the chances way too slim and I should look for optimizations elsewhere?
Dee33
Registered Member
I see. With a careful implementation of a Cholesky-like decomposition on your matrix, it will produce no fill-in.
Sounds good.
But I remain skeptical about the performance on such a small matrix. If your application will deal with larger matrices in the future, then you can consider spending the time to implement their proposed approach.
In your current case, it's better to keep the dense solver: it benefits from vectorization and is more cache friendly, unlike the sparse implementation with its multiple levels of indirect addressing.
ggael
Moderator
I also don't believe you'll get any performance gain with only 50% non-zeros, even with no fill-in. The fully dense case has the advantage of being vectorized (×4 for floats with SSE), of enabling direct and coherent memory access, etc.
kalesony
Registered Member
I get it - thanks a lot for your suggestions :)

