Registered Member
|
Hello
I get a "munmap_chunk: invalid pointer" error leading to a SIGABRT in my code when using the SparseLU solver. I suspect that I'm feeding the solver something fishy. However, the error arrives after 1h+ of computation in release mode, where most values are optimized away, so I don't really know what's going wrong. I'd prefer the SparseLU solver to raise an error in such cases rather than abort the program.
Unfortunately, the long computation time means that the error is hard to catch even though it's quite reproducible (it happened with Eigen 3.2.8 on 3 different machines). I ran a debug session, and an assertion was raised instead, but I forgot to copy it :/. It said that something was wrong with the factorization.
The code (huge and ugly) is here: https://bitbucket.org/specmicp/specmicp
The trouble occurs in the function "restart_timestep" (line 111) of this file: https://bitbucket.org/specmicp/specmicp ... ew-default
There is a small wrapper around the Eigen solvers that allows me to choose the one I want at runtime. The error does not occur with the SparseQR solver. The corresponding code is here: https://bitbucket.org/specmicp/specmicp ... ?at=master
I ran valgrind to get more information, the output is pasted below. Unfortunately I'm not sure what it's telling me.
My questions are:
- Is there anything I'm doing obviously wrong?
- Is there anything more I can do to track down the error?
- Does the SparseLU solver assume that the input is "correct"? What is the definition of correct in this case?
- Can the SparseLU solver be changed to catch the problem and abort the factorization rather than cause a memory error?
|
Moderator
|
Oh, the first error is happening in the solve step and not during the factorization. Please paste the assertion with the associated backtrace. Also make sure the factorization succeeded before calling solve, e.g. if(lu.info()==Eigen::Success) x=lu.solve(b); else ....
|
Registered Member
|
I thought that was what I was doing... but clearly not! With this check I'm able to stop after a failed factorization and avoid trying to solve a badly formed system. In my case I guess I'm running into numerical precision problems. The SparseQR solver is able to overcome them and the SparseLU solver is not, but I guess that is to be expected. Thanks. I'm going to inspect the problem more closely. I'll post the exact assertion that was raised when I get it again, but it may take a while for the debug version to reach the point where the problems start to appear.
|
Registered Member
|
This is the assertion, which makes sense now :
And the associated backtrace :
It was my fault for not checking the factorization step. Thank you for your answer.