Registered Member
|
Okay, I did some research... the routines you're using here (both for dense and sparse) are very basic, with no pivoting or blocking. This is fine for hermitian positive definite matrices, but it misses a key advantage of LDLT over LLT: namely, that a good LDLT can in fact handle indefinite as well as skew-hermitian matrices.

To cope with indefinite matrices, one must go to a blocked version, where the pivot block size is normally 1x1 but goes to 2x2 if the singleton pivot is too small. Skew-hermitian matrices can be handled by premultiplying them by i (the imaginary unit), which makes them hermitian (but generally indefinite); see the sketch at the end of this post.

A very good reference for all this is Chapter 11 of: Nicholas J. Higham, Accuracy and Stability of Numerical Algorithms, 2nd Ed., SIAM, 2002. ISBN 0898715210, 9780898715217. (I got the entire chapter as a preview on Google Books.)

So I see two ways forward:
1. Hack 2x2 blocks and pivoting into Eigen's existing LDLT codes, or
2. Adopt into Eigen an existing generic dense/sparse block LDLT code with pivoting.

I'll have a look around to see if I can find a code suitable for 2. So far I found a dense one that looks interesting: http://www.alglib.net/matrixops/symmetric/ldlt.php What do you think? No luck so far for a sparse version.
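A quick illustration of the skew-hermitian trick (my own sketch, not from Higham, using Eigen's complex dense type MatrixXcd just for concreteness):
[code]
#include <complex>
#include <Eigen/Core>

// If S is skew-hermitian, i.e. S.adjoint() == -S, then H = i*S is hermitian:
//   H^* = (i*S)^* = conj(i) * S^* = (-i) * (-S) = i*S = H.
// H is generally indefinite, which is exactly why a pivoted LDLT is needed.
Eigen::MatrixXcd makeHermitian(const Eigen::MatrixXcd& S)
{
    return std::complex<double>(0.0, 1.0) * S;
}
[/code]
Cheers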
|
Moderator
|
yes I know it's very basic!
For sparse matrices and direct solvers there is a very interesting book: http://www.cise.ufl.edu/research/sparse/CSparse/ The book comes with a small C library which is open source (LGPL) and available on the web site. If I remember correctly, it includes routines for LLT, LDLT, and LU built on the same code base. Though this library is designed for teaching purposes (e.g., it supports only double), it has a very good simplicity/performance ratio, so I think it would be a very good starting point.
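For a flavor of the API, solving a sparse system with it looks roughly like this (a sketch from memory; I may be off on details, the exact signatures are in the book):
[code]
#include <stdio.h>
#include <stdlib.h>
#include "cs.h"   /* the CSparse header */

int main(void)
{
    cs *T = cs_load(stdin);      /* read a matrix in triplet form from stdin */
    cs *A = cs_compress(T);      /* convert to compressed-column form */
    cs_spfree(T);
    double *b = (double*)calloc(A->n, sizeof(double));
    for (int i = 0; i < A->n; i++) b[i] = 1.0;   /* some right-hand side */
    cs_lusol(2, A, b, 1.0);      /* sparse LU: 2 = fill-reducing ordering,
                                    tol 1.0 = partial pivoting;
                                    the solution overwrites b */
    printf("x[0] = %g\n", b[0]);
    cs_spfree(A);
    free(b);
    return 0;
}
[/code]
|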
Registered Member
|
Guys,
I'm sorry: I was wrong when I said that partial pivoting was unusable for large matrices. I experimented with partial pivoting today, and my results are that it works well for invertible matrices; it can be used to invert large invertible matrices. Where partial pivoting doesn't work is for non-invertible matrices: computing the rank/kernel/image only works with full pivoting. Sorry for the noise!
Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list! |
Registered Member
|
As far as I can tell CSparse does not include any LDLT, blocked or otherwise. The same author (Tim Davis) wrote the simple unblocked LDLT code you're already using, so this won't help us at all. |
Registered Member
|
Ah, that makes sense: if you can have an extensive nullspace, you may not be able to find a non-zero pivot at all unless you search the whole matrix. Simplest example: a matrix with only one non-zero element...
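Spelling that out (a hypothetical 2x2 case of my own):
[code]
#include <Eigen/Core>

int main()
{
    // A rank-1 matrix whose first column is entirely zero:
    //     A = [ 0 0 ]
    //         [ 0 1 ]
    // Partial pivoting searches only the current column for a pivot, so in
    // column 0 it finds nothing but zeros and gets stuck. Full pivoting
    // searches the whole remaining matrix, picks the 1 at position (1,1),
    // and correctly reports rank 1.
    Eigen::Matrix2d A;
    A << 0, 0,
         0, 1;
}
[/code]
|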
Registered Member
|
I just committed partial-pivoting LU in trunk. I restricted it to square invertible matrices -- which is the main use case for it.
So if you're interested in performance, give it a try. The methods inverse() and determinant() in MatrixBase have been updated to use it instead of full-pivoting LU, so if you use them you don't have to change anything in your code. The class name is currently PartialLU, but that is not definitive.

Here, with gcc-4.4 and SSE2 on a 1000x1000 float matrix, I get the following improvements:
inverse(): 3.0s (full-pivoting LU) --> 2.0s (partial-pivoting LU)
LU decomposition itself (e.g. if you want a single vector solve): 1.8s (full-pivoting LU) --> 0.8s (partial-pivoting LU)

Thanks to everybody for your input and for making me realize that there is a use case for partial-pivoting LU...
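PS: in code, usage looks like this (a quick sketch against today's trunk; as said above, the names are not definitive):
[code]
#include <Eigen/Core>
#include <Eigen/LU>
using namespace Eigen;

int main()
{
    MatrixXf A = MatrixXf::Random(1000, 1000);
    VectorXf b = VectorXf::Random(1000), x;
    A.partialLu().solve(b, &x);   // partial-pivoting LU, single-vector solve
    MatrixXf Ai = A.inverse();    // now uses partial-pivoting LU internally
    float d = A.determinant();    // likewise
}
[/code]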
|
Registered Member
|
Hi,
I'm new to Eigen and C++. First of all I would like to thank all the developers for this impressive library. I'm not sure how to use the new partial pivoting with lu:
Is this implemented in the latest release of Eigen (I'm using 2.0.2)? My original problem is the solution of a linear system:
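[code]
#include <iostream>
#include <ctime>
#include <Eigen/Core>
#include <Eigen/LU>
using namespace Eigen;
using namespace std;

int main()
{
    MatrixXd A = MatrixXd::Random(1000, 1000);
    MatrixXd B = MatrixXd::Random(1000, 1);
    MatrixXd X;
    double starttime = clock();
    A.lu().solve(B, &X);
    cout << (clock() - starttime) / CLOCKS_PER_SEC << endl;
}
[/code]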
Compiled with g++-4.2 -O3 -msse3, this takes 1.8 sec on my new 2GHz Intel MacBook. Equivalent codes written first in Matlab and then in C++ using the Armadillo library (BLAS/LAPACK provided by the Apple Accelerate framework) take 0.3 sec and 0.2 sec respectively. Is this difference due to the use of full pivoting in Eigen? I'm sure that I'm missing something here but I can't figure out what that is. Thanks for your help, Giacomo |
Registered Member
|
No, it is in the development branch; it will appear in 2.1.

[quote]My original problem is the solution of a linear system:
[code]
int main()
{
    MatrixXd A = MatrixXd::Random(1000, 1000);
    MatrixXd B = MatrixXd::Random(1000, 1);
    MatrixXd X;
    double starttime = clock();
    A.lu().solve(B, &X);
    cout << (clock() - starttime) / CLOCKS_PER_SEC << endl;
}
[/code][/quote]

If you have more than 1 core and your software is able to parallelize this operation, that may also explain it.

EDIT: I can see that you defined B as a MatrixXd while it is really a vector. You have some room for improvement defining B as a VectorXd instead. Same for X.
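Something like this (a sketch of the suggested change):
[code]
VectorXd B = VectorXd::Random(1000);   // instead of MatrixXd B = MatrixXd::Random(1000, 1)
VectorXd X;
A.lu().solve(B, &X);
[/code]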
|
Registered Member
|
Thanks for your fast reply!
For the record, disabling assertions and using VectorXd didn't have any significant effect. I downloaded the development version and found PartialLU.h in it, but I'm not sure how I should modify my simple code (please see above) to take advantage of partial pivoting. Any help is very much appreciated. Thank you. |
Registered Member
|
Just call partialLu() instead of lu(). And you're welcome.
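That is, in your code above:
[code]
A.partialLu().solve(B, &X);   // instead of A.lu().solve(B, &X)
[/code]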
|
Registered Member
|
Oh, OK... I was typing "PartialLu()", that's why. I now get 1.25 sec. |
Registered Member
|
Hm, so it's still 4x to 5x slower than the competition? OK, as I said we have room for improvement -- beginning with implementing the block algorithm. I'd be interested to know how many threads the other programs use, and what the performance is when you force them to use only 1 thread.
|
Registered Member
|
I'd be happy to check that for you if only I knew how to do that. I'll try to find out and keep you posted. |
Registered Member
|
If you're using Linux or Unix, you can always make it work on a huge matrix so that it takes something like 10 seconds to complete, and during the computation you run 'top' in a terminal. If it says "200%", this means it's using 2 cores of your CPU. Of course this experiment only works if you have a dual-core CPU; if you have 1 core, disregard all I said! Also notice that it is a well-known issue (see the TODO) that vectorization doesn't bring as strong a performance increase as it should with PartialLU. What's needed here is for someone to examine the assembly carefully and understand where the code we're generating is bad...
|
Registered Member
|
I do have 2 cores. I observed that Armadillo (BLAS/LAPACK) reached 191% while Matlab exceeded 200%; Eigen spiked at 101%. So it actually seems that Armadillo and Matlab are using two cores, but this happens only after a few seconds (I'm not sure if this is due to the refresh rate of "top"). I also did other tests, and Eigen turned out to be the fastest at smaller matrix sizes. These are the results for the solution of a 50x50 system:

Eigen (partial LU) -> 0.216 ms
Armadillo -> 0.238 ms
Eigen (full LU) -> 0.332 ms
Matlab -> 0.350 ms

Then, increasing the matrix size, Eigen seems to slow down more rapidly than the other two. |