Welcome to the KDE Community Forums, the official forum board for KDE.
You are currently viewing the forums as an unregistered user. Registration allows you to post and discuss topics, receive private messages, vote on ideas, subscribe to topics and many such great features. Registration is a simple process and completely free. So register now and be a part of the community!


TaMo

Registered Member

Posts: 2

Karma: 0

Segfault in Cholesky factorization

Post Sun Oct 04, 2009 11:01 am

Hi,

I am getting weird segfaults using the cholesky factorization. For example, even in this simple program:

Code: Select all

#define SIZE_P 230

int main(){
 
  MatrixXd HP = MatrixXd::Random(9,SIZE_P);

  Matrix<double,9,9> S = HP*HP.transpose()+30*Matrix<double,9,9>::Identity();
 
  Matrix<double,9,1> r = Matrix<double,9,1>::Ones();
     
  S.llt().solveInPlace(HP);


}



I get a segfault. gdb said the following:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb7d066c0 (LWP 19357)]
Eigen::ei_solve_triangular_selector<Eigen::Flagged<Eigen::Transpose<Eigen::NestByValue<Eigen::Matrix<double, 9, 9, 2, 9, 9> > >, 1024u, 0u>, Eigen::Matrix<double, 10000, 10000, 2, 10000, 10000>, 1024, 1>::run (lhs=@0xbfbd8598, other=@0xbfbd8b90)
at /usr/lib/gcc/i486-linux-gnu/4.3.2/include/emmintrin.h:150
150 *(__m128d *)__P = __A;
(gdb) up
#1 0x0804e805 in Eigen::LLT<Eigen::Matrix<double, 9, 9, 2, 9, 9> >::solveInPlace<Eigen::Matrix<double, 10000, 10000, 2, 10000, 10000> > (
this=0xbfbd8650, bAndX=@0xbfbd8b90) at ../../eigen2/Eigen/src/Core/SolveTriangular.h:48
48 ei_solve_triangular_selector<Flagged<Lhs,LhsMode,0>,Rhs>::run(lhs._expression(), other);
(gdb) up
#2 0x08049512 in main () at main2.cpp:46
46 S.llt().solveInPlace(HP);


I am compiling using g++ -O3 -msse3.

Am I doing something wrong?

Any help would be greatly appreciated.

Moderator User avatar

bjacob

Moderator

Posts: 293

Karma: 0

France

Re: Segfault in Cholesky factorization

Post Sun Oct 04, 2009 12:55 pm

Thanks for the report. After trying, I found that this crash exists only in the development version, not in 2.0. So my obvious advice would be to stick to the stable version if this is a major concern to you.

I found that the crash happens whenever SSE (2+) is enabled, and that optimization isn't playing a role here.

So I compiled with just -msse2 -g3 and no optimization, and got this backtrace:
Code: Select all
#0  0x0804e505 in _mm_load_pd (__P=0xbfffede4) at /usr/lib/gcc/i586-suse-linux/4.4/include/emmintrin.h:106
#1  Eigen::ei_pload<double> (__P=0xbfffede4) at eigen2/Eigen/src/Core/arch/SSE/PacketMath.h:176
#2  0x0804b91c in Eigen::ei_cache_friendly_product_colmajor_times_vector<false, false, double, Eigen::ei_gemv_selector<2, 0, true>::run(const ProductType&, Dest&, typename ProductType::Scalar) [with ProductType = Eigen::GeneralProduct<Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 33331, 33331, 1, 32>, Eigen::Transpose<Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 1, 33331, 1, 32> >, 3>, Dest = Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 33331, 1, 1, 32>]::ActualRhsType>(int, const double *, int, const Eigen::ei_gemv_selector<2, 0, true>::ActualRhsType &, double *, double) (size=3, lhs=0xbfffed54, lhsStride=9, rhs=..., res=0xbfffe6e0, alpha=-1)
    at eigen2/Eigen/src/Core/products/GeneralMatrixVector.h:147
#3  0x080591de in Eigen::ei_gemv_selector<2, 0, true>::run<Eigen::GeneralProduct<Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 33331, 33331, 1, 32>, Eigen::Transpose<Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 1, 33331, 1, 32> >, 3>, Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 33331, 1, 1, 32> > (prod=..., dest=..., alpha=-1)
    at eigen2/Eigen/src/Core/Product.h:331
#4  0x08058309 in Eigen::GeneralProduct<Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 33331, 33331, 1, 32>, Eigen::Transpose<Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 1, 33331, 1, 32> >, 3>::scaleAndAddTo<Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 33331, 1, 1, 32> > (this=0xbfffe870, dst=..., alpha=-1)
    at eigen2/Eigen/src/Core/Product.h:279
#5  0x08057133 in Eigen::ProductBase<Eigen::GeneralProduct<Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 33331, 33331, 1, 32>, Eigen::Transpose<Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 1, 33331, 1, 32> >, 3>, Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 33331, 33331, 1, 32>, Eigen::Transpose<Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 1, 33331, 1, 32> > >::scaleAndAddTo<Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 33331, 1, 1, 32> > (this=0xbfffe870, dst=..., alpha=-1)
    at eigen2/Eigen/src/Core/ProductBase.h:112
#6  0x08055898 in Eigen::ProductBase<Eigen::GeneralProduct<Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 33331, 33331, 1, 32>, Eigen::Transpose<Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 1, 33331, 1, 32> >, 3>, Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 33331, 33331, 1, 32>, Eigen::Transpose<Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 1, 33331, 1, 32> > >::subTo<Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 33331, 1, 1, 32> > (this=0xbfffe870, dst=...) at eigen2/Eigen/src/Core/ProductBase.h:109
#7  0x08054707 in Eigen::NoAlias<Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 33331, 1, 1, 32> >::operator-=<Eigen::GeneralProduct<Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 33331, 33331, 1, 32>, Eigen::Transpose<Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 1, 33331, 1, 32> >, 3>, Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 33331, 33331, 1, 32>, Eigen::Transpose<Eigen::Block<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 1, 33331, 1, 32> > > (this=0xbfffe87c, other=...) at eigen2/Eigen/src/Core/NoAlias.h:71
#8  0x08051e79 in Eigen::ei_llt_inplace<2048>::unblocked<Eigen::Matrix<double, 9, 9, 0, 9, 9> > (mat=...) at eigen2/Eigen/src/Cholesky/LLT.h:144
#9  0x08050853 in Eigen::ei_llt_inplace<2048>::blocked<Eigen::Matrix<double, 9, 9, 0, 9, 9> > (m=...) at eigen2/Eigen/src/Cholesky/LLT.h:156
#10 0x08050099 in Eigen::LLT_Traits<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 2048>::inplace_decomposition (m=...) at eigen2/Eigen/src/Cholesky/LLT.h:202
#11 0x0804f90b in Eigen::LLT<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 2048>::compute (this=0xbfffed24, a=...) at eigen2/Eigen/src/Cholesky/LLT.h:228
#12 0x0804f18f in Eigen::LLT<Eigen::Matrix<double, 9, 9, 0, 9, 9>, 2048>::LLT (this=0xbfffed24, matrix=...) at eigen2/Eigen/src/Cholesky/LLT.h:85
#13 0x0804eae2 in Eigen::MatrixBase<Eigen::Matrix<double, 9, 9, 0, 9, 9> >::llt (this=0xbfffe9f4) at eigen2/Eigen/src/Cholesky/LLT.h:285
#14 0x0804b06e in main () at a.cpp:16
Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list!

Moderator User avatar

bjacob

Moderator

Posts: 293

Karma: 0

France

Re: Segfault in Cholesky factorization

Post Sun Oct 04, 2009 1:08 pm

So the line that breaks is
Code: Select all
              _EIGEN_ACCUMULATE_PACKETS(d,du,d);

at line 147 in GMV.h, EvenAligned case.

I added couts to get the addresses of the packets:
Code: Select all
0xbfdd97c4 0xbfdd989c 0xbfdd9854 0xbfdd980c


So the problem is that the assumption, that the first is aligned, is broken.
Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list!

Moderator User avatar

bjacob

Moderator

Posts: 293

Karma: 0

France

Re: Segfault in Cholesky factorization

Post Sun Oct 04, 2009 1:12 pm

Gael: leaving this one to you, as it seems that you have more cases to handle here? The first is not necessarily aligned but at the same time,there are at most 4x more cases to check...

TaMo: for now, the quickest is to disable vectorization, it'll fix your issue.
Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list!

Moderator User avatar

bjacob

Moderator

Posts: 293

Karma: 0

France

Re: Segfault in Cholesky factorization

Post Sun Oct 04, 2009 1:21 pm

Ah! No, I know, sorry, the issue is elsewhere.

See my pointers starting in 0xbf ...... ?

That's _stack_ pointers. What is probably happening is that here, the size 9 is known at compile time, so the array is a static array, and since 9*9*sizeof(double) isn't a multiple of 16, the array is not aligned.

Great, we know that we don't vectorize fixed-size ops that are not fully vectorizable, but then why are SSE instructions used at all here ? That's the bug.
Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list!

Moderator User avatar

bjacob

Moderator

Posts: 293

Karma: 0

France

Re: Segfault in Cholesky factorization

Post Sun Oct 04, 2009 1:24 pm

This is confirmed:

TaMo: just replacing all matrix types by MatrixXd, works around the bug for now (the arrays are then aligned).
Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list!

Moderator User avatar

bjacob

Moderator

Posts: 293

Karma: 0

France

Re: Segfault in Cholesky factorization

Post Sun Oct 04, 2009 1:30 pm

ok one last note:

the other approach would be to get rid of the assumptions "first aligned" as anyway that doesn't apply when taking blocks...

actually, lately i've been wondering. Perhaps it always was stupid to align dynamic arrays ??? We have to treat the last few remaining scalars anyway, so why not treat the first few? (Which we already do whenever we want something that applies to blocks)
Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list!



TaMo

Registered Member

Posts: 2

Karma: 0

Re: Segfault in Cholesky factorization

Post Sun Oct 04, 2009 8:00 pm

Wow, that was quick. Thanks for the reply! I will try that and let you know if there are problems still.

In a related question: is there a way to pre-allocate the temporaries of cholesky, so that they are not re-allocated every time? For the particular example above with a 9x9 matrix it may not be too bad, but in other parts of my program I need to take many LLT's of large matrices (e.g. 200x200). Alternatively, is it possible to compute the LLT in place (i.e., to overwrite the matrix being decomposed)?

In fact, I am in the process of switching my code from MKL to eigen. In MKL you can pre-allocate all the temporaries, and it will also compute the decomposition destructively. I would like to be able to get as close as possible to that performance.

That said, I would also like to congratulate and thank you for this wonderful effort. Had this library existed a couple of years ago, I would have saved many many hours of programming.....

Thanks!!!

Moderator User avatar

bjacob

Moderator

Posts: 293

Karma: 0

France

Re: Segfault in Cholesky factorization

Post Sun Oct 04, 2009 10:18 pm

TaMo wrote:In a related question: is there a way to pre-allocate the temporaries of cholesky, so that they are not re-allocated every time? For the particular example above with a 9x9 matrix it may not be too bad, but in other parts of my program I need to take many LLT's of large matrices (e.g. 200x200). Alternatively, is it possible to compute the LLT in place (i.e., to overwrite the matrix being decomposed)?


For in-place decompositions, we don't yet have that API although that's planned (we even discussed exactly that a few days ago).

In fact, I am in the process of switching my code from MKL to eigen. In MKL you can pre-allocate all the temporaries, and it will also compute the decomposition destructively. I would like to be able to get as close as possible to that performance.


Yes, I understand. Not sure where we stand, performance wise, vs MKL, at the moment. If this is performance critical for you, I don't want to induce you into migrating too early. Do benchmarking before.

That said, I would also like to congratulate and thank you for this wonderful effort. Had this library existed a couple of years ago, I would have saved many many hours of programming.....

Thanks! we appreciate kind words.
Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list!

Moderator User avatar

ggael

Moderator

Posts: 205

Karma: 0

Gender:  Male

France

Re: Segfault in Cholesky factorization

Post Mon Oct 05, 2009 8:07 am

bjacob wrote:
TaMo wrote:In a related question: is there a way to pre-allocate the temporaries of cholesky, so that they are not re-allocated every time? For the particular example above with a 9x9 matrix it may not be too bad, but in other parts of my program I need to take many LLT's of large matrices (e.g. 200x200). Alternatively, is it possible to compute the LLT in place (i.e., to overwrite the matrix being decomposed)?


For in-place decompositions, we don't yet have that API although that's planned (we even discussed exactly that a few days ago).



Well, at least you can declare a LLT object once, and use it several time:

LLT<MatrixXd> llt;

for() {
llt.compute(mymat);
llt.solveInPlace(x);
}


In fact, I am in the process of switching my code from MKL to eigen. In MKL you can pre-allocate all the temporaries, and it will also compute the decomposition destructively. I would like to be able to get as close as possible to that performance.


Yes, I understand. Not sure where we stand, performance wise, vs MKL, at the moment. If this is performance critical for you, I don't want to induce you into migrating too early. Do benchmarking before.



In the devel branch we have a blocked LLT decomposition and fast triangular solvers, so for large matrices the overall performance of Eigen's LLT should be very close to the one of MKL. The main difference is that in Eigen we don't generate multithreaded code yet... For small matrices, Eigen outperforms MKL.

I'm looking at this issue right now...
ggael, proud to be a member of KDE forums since 2008-Dec.

Next »

Who is online

Users browsing this forum: No registered users and 2 guests