
Stability issues with HybridNonLinearSolver

eudoxos
Hello,

I was successfully using HybridNonLinearSolver (hg trunk) with numerical differentiation, but sometimes the solver asks me (i.e. the functor object) to evaluate residuals for solution=[nan, ..., nan] (at which point I interrupt the computation). I checked that all inputs are finite numbers. Where should I look for possible causes?

Unfortunately, the outcome is not deterministic: for the same setup (reloading the whole simulation data), it sometimes gives a solution (which is always the same!), sometimes all NaN, and sometimes all -NaN. All my variables are properly initialized. Any further hints?
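
A minimal sketch of the kind of guard described above, in which the residual callback refuses to evaluate at a non-finite point. The names `allFinite` and `evaluateResiduals`, the placeholder residual, and the use of a negative return value as an "abort" signal are assumptions made for illustration; they are not taken from the original code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: true only if every entry is finite (no NaN, no +/-inf).
bool allFinite(const std::vector<double>& x) {
    for (double xi : x) {
        if (!std::isfinite(xi)) return false;
    }
    return true;
}

// Hypothetical residual callback. A negative return value is used here as an
// "abort" signal; that calling convention is an assumption of this sketch.
int evaluateResiduals(const std::vector<double>& x, std::vector<double>& fvec) {
    if (!allFinite(x)) return -1;  // refuse to evaluate at a non-finite point
    fvec.assign(x.size(), 0.0);
    for (std::size_t i = 0; i < x.size(); ++i) {
        fvec[i] = x[i] * x[i] - 1.0;  // placeholder residual, illustration only
    }
    return 0;
}
```

Such a guard only detects the problem at the point where the solver hands over a bad vector; it does not say where the non-finite values were produced.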

I tried running with valgrind and only got this warning, though I am not sure how relevant it is:
Code:
Conditional jump or move depends on uninitialised value(s)
==32602==    at 0xF8D41A5: void Eigen::internal::dogleg<double>(Eigen::Matrix<double, -1, -1, 0, -1, -1> const&, Eigen::Matrix<double, -1, 1, 0, -1, 1> const&, Eigen::Matrix<double, -1, 1, 0, -1, 1> const&, double, Eigen::Matrix<double, -1, 1, 0, -1, 1>&) (dogleg.h:30)


Thanks! Edx.
Koldo
Hello Edx

I am a newbie with the nonlinear solver, but perhaps I can help.

Could you post a simple demo?
eudoxos
Hi, I would like to provide an example, but unfortunately the code that computes the solution error is too complex (it searches for neighbor points and evaluates derivatives of some vector and tensor fields based on approximated derivatives, etc.)...

I was dumping the Jacobian at every operator invocation, and the difference between runs apparently comes from some numerical issue. The initial solution vector is perturbed to compute the Jacobian using forward differences, column by column. Due to boundary conditions, the last block of DoFs has a unit diagonal, like this:
Code:
 ...... 0 0 0
 ...... 0 0 0
 ...... 0 0 0
 000000 1 0 0
 000000 0 1 0
 000000 0 0 1

where the ... indicates a dense block full of numbers.
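
For illustration, a minimal sketch of a forward-difference Jacobian built column by column, as described above. It uses plain `std::vector` rather than Eigen types; the names `Vec`, `Mat` and `forwardDifferenceJacobian` and the step-size rule are assumptions of this sketch, not the solver's actual implementation.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // row-major: J[i][j] = d f_i / d x_j

// Approximate the Jacobian of f at x by perturbing one unknown at a time.
Mat forwardDifferenceJacobian(const std::function<Vec(const Vec&)>& f, Vec x) {
    const Vec f0 = f(x);  // residuals at the unperturbed point
    const double eps = std::sqrt(std::numeric_limits<double>::epsilon());
    Mat J(f0.size(), Vec(x.size(), 0.0));
    for (std::size_t j = 0; j < x.size(); ++j) {
        const double saved = x[j];
        double h = eps * std::fabs(saved);  // step relative to the unknown
        if (h == 0.0) h = eps;              // fall back to an absolute step
        x[j] = saved + h;
        const Vec f1 = f(x);                // residuals with x[j] perturbed
        x[j] = saved;                       // restore before the next column
        for (std::size_t i = 0; i < f0.size(); ++i) {
            J[i][j] = (f1[i] - f0[i]) / h;  // fill column j
        }
    }
    return J;
}
```

With a residual whose last component simply returns the last unknown (a stand-in for a boundary-condition DoF), the corresponding row comes out as zeros with a 1 on the diagonal, matching the block structure shown above.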

I am not using external scaling, so columns are scaled by their column-wise blue norm. At the next operator invocation (I suppose after column scaling), two runs which were identical up to that point give different results. In one case, the Jacobian is
Code:
........ 0 0 0
........ 0 0 0
........ 0 0 0
-0 -0  0 0 0 0
-0 -0  0 0 0 0
-0 -0  0 0 0 0

and this gives a good solution in the end (error approx. 2e-9).

In the second case I get
Code:
........ 0 0 0
........ 0 0 0
........ 0 0 0
0  0  0  inf -nan -nan
-nan .. -nan -nan -nan
-nan .. -nan -nan -nan

leading, of course, to a garbage solution.

Can someone give a hint?

Last edited by eudoxos on Mon Aug 01, 2011 2:03 pm, edited 1 time in total.
eudoxos
PS: I know that the unknowns giving 1's on the diagonal could be condensed away; I wanted something that works in the general case to start with, before optimizing it.
eudoxos
I finally found out that it is a bug in unsupported/Eigen/src/NonLinearOptimization/r1updt.h, which does not assign the v_givens and w_givens elements if the corresponding v and w entries are 0. In the code:
Code:
for (j=n-2; j>=0; --j) {
   w[j] = 0.;
   if (v[j] != 0.) {
      /* ... */
      v_givens[j]=givens;
      /* ... */
   }
   /* BUG: what if v[j]==0 ?!! */
}

Garbage then propagates to fjac in r1mpyq, and so on.

Valgrind was perhaps on the right track after all.

What would be the quick solution? Assign zeros to v_givens and w_givens in such a case?
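
One possible quick fix, sketched under assumptions: a rotation with both coefficients set to zero would wipe out whatever it is later applied to, so an identity rotation (c = 1, s = 0) looks like the more natural placeholder for the skipped entries. The `Rotation` struct and the functions below are hypothetical stand-ins written for this illustration; they are not Eigen's classes and not the actual patch.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a plane (Givens) rotation; not Eigen's class.
struct Rotation {
    double c;
    double s;
};

// Apply the rotation to the pair (a, b) in place.
void applyRotation(const Rotation& g, double& a, double& b) {
    const double ta = g.c * a + g.s * b;
    const double tb = -g.s * a + g.c * b;
    a = ta;
    b = tb;
}

// For j = n-2 down to 0, build a rotation that folds v[j] into the last entry.
// Every slot starts as the identity, so entries skipped because v[j] == 0
// still hold a well-defined rotation instead of uninitialized memory.
std::vector<Rotation> buildRotations(std::vector<double> v) {
    const std::size_t n = v.size();
    std::vector<Rotation> rotations(n, Rotation{1.0, 0.0});
    if (n < 2) return rotations;
    double& pivot = v[n - 1];
    for (std::size_t j = n - 1; j-- > 0;) {
        if (v[j] != 0.0) {
            const double r = std::hypot(pivot, v[j]);
            rotations[j] = Rotation{pivot / r, v[j] / r};
            pivot = r;   // the rotated pivot absorbs v[j]
            v[j] = 0.0;  // v[j] has been eliminated
        }
        // else: keep the identity rotation stored at construction
    }
    return rotations;
}
```

Applying the identity rotation leaves a pair unchanged, so a later pass that replays all stored rotations stays deterministic even when some entries of v are exactly zero.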

PS. Another thing I noticed is that
Code:
 wa2 = fjac.colwise().blueNorm();
is needlessly (it seems to me) evaluated twice in HybridNonlinearSolver::solveNumericalDiffOneStep, before the while loop starts.
orzel
eudoxos wrote: PS. another thing that I noticed that ... is uselessly (seems to me) evaluated twice in HybridNonlinearSolver::solveNumericalDiffOneStep, before the while loop starts.


Indeed, this is because of a confusion between two changes I made. I've fixed this in both the "default" and "3.0" branches. Tests confirm it is OK.
Thanks for reporting.
orzel
eudoxos wrote: I finally found out it is a bug in unsupported/Eigen/src/NonLinearOptimization/r1updt.h, which does not assign v_givens and w_givens elements if corresponding v and w entries are 0.

(reported as http://eigen.tuxfamily.org/bz/show_bug.cgi?id=322)

Indeed, this is now fixed in both "3.0" and "default" branches.

