
[BUG] Some decompositions calling abort() with -DNDEBUG set

emilf
Hi all,

I have been testing Eigen on an ARM Cortex-M4 CPU, and "normal" operations such as vector/matrix products and sums work fine (as expected) when compiling with -DNDEBUG.
However, if I try to use an in-place QR, "Eigen::HouseholderQR< Eigen::Ref< mat_t > > qr(mat);", I get a lot of link errors such as:

Code: Select all
/usr/bin/../lib/gcc/arm-none-eabi/6.3.1/../../../../arm-none-eabi/lib/thumb/v7e-m/fpv4-sp/hard/libc.a(lib_a-abort.o): In function `abort':
abort.c:(.text.abort+0xa): undefined reference to `_exit'
/usr/bin/../lib/gcc/arm-none-eabi/6.3.1/../../../../arm-none-eabi/lib/thumb/v7e-m/fpv4-sp/hard/libc.a(lib_a-sbrkr.o): In function `_sbrk_r':
sbrkr.c:(.text._sbrk_r+0xc): undefined reference to `_sbrk'
...

This all starts with a call to abort(), which should not happen when compiling with -DNDEBUG (it suggests Eigen is still reaching an assert somewhere).

I have also tested the other in-place decompositions. Three work, two always fail, and three fail only above a certain matrix size:
Working
- LLT
- LDLT
- FullPivLU

Failing by trying to link abort()
- PartialPivLU
- HouseholderQR

Failing by trying to link abort() when matrix size is > 7x7
- ColPivHouseholderQR
- FullPivHouseholderQR
- CompleteOrthogonalDecomposition

Test System
- arm-none-eabi-gcc v6.3.1 (https://launchpad.net/gcc-arm-embedded)
- Ubuntu 16.04
- Eigen 3.3 branch (latest commits pulled)

Test Program showing the error
Minimal project highlighting the problem: https://github.com/korken89/eigen_decomp_error
In main.cpp, just un-comment each decomposition and run make.

I hope someone can help here!
Thanks!

Last edited by emilf on Thu Jun 08, 2017 6:08 pm, edited 1 time in total.
emilf
Moreover, I have found that even though the matrix is defined with float scalars, the decompositions seem to use doubles internally (even when specifying -fsingle-precision-constant).
As soon as a decomposition is used, a lot of the library routines for doubles get linked in (add, sub, mul, etc.).

Or is there something specific I as a user must do here?
If I add the flag -ffast-math, the problem goes away. Is something happening on the inside for numerical stability?
More specifically, it is the -fno-trapping-math flag that makes all the double routines go away.
emilf
There seems to be a deeper problem somewhere.
I just did a test where I implemented a naive HouseholderQR (see below), and it stops linking, with the same reference to abort(), when the input matrix goes from size 7x7 to 8x8.
I am, however, unable to pinpoint which operation makes this happen.
Code: Select all
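#include <Eigen/Dense>

// Sign of val as -1, 0 or +1.
template <typename T>
auto sgn(T val) {
    return (T(0) < val) - (val < T(0));
}

// Naive in-place Householder QR for fixed-size matrices. On return the upper
// triangle of m holds R; Q is not stored. Assumes m(i,i) != 0 at every step.
template <typename T>
void my_inplace_QR(Eigen::MatrixBase<T> &m)
{
  typedef typename T::Scalar Scalar;
  const int rows = T::RowsAtCompileTime;
  const int cols = T::ColsAtCompileTime;
  Eigen::Matrix<Scalar, T::RowsAtCompileTime, 1> w;

  for (int i = 0; i < cols; i++)
  {
    const int n = rows - i;  // length of column i from the diagonal down

    // Build the reflector from the part of column i on and below the diagonal.
    const Scalar nx  = m.col(i).tail(n).norm();
    const Scalar s   = sgn(-m(i,i));
    const Scalar u1  = m(i,i) - s*nx;
    const Scalar tau = -s*u1/nx;

    w.head(n) = m.col(i).tail(n) / u1;
    w(0) = 1;

    // Apply the reflector to the trailing (rows - i) x (cols - i) block only.
    m.block(i, i, n, cols - i)
      -= tau
       * w.head(n)
       * (
            w.head(n).transpose()
            * m.block(i, i, n, cols - i)
         );
  }
}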
emilf
One culprit seems to be matrix multiplication.
A simple "result = mat1 * mat2" works fine up to 7x7; for anything larger the error appears.
What is happening here? It seems to start linking in exception and allocation code.

Is there a way to stop all this?
ggael
Indeed, for "large" matrix-matrix products, code calling malloc/free gets instantiated because this code path needs a working buffer to modify the storage layout for optimal performance. If the sizes are known at compile-time, then this buffer is statically allocated and calls to malloc/free are bypassed at runtime. So providing dummy malloc/free functions should do the job for you.
emilf
Thank you for your reply, ggael! So that is what is happening.
But that seems a little odd, as it is known at compile time that it won't allocate.

As a suggestion, would it be possible to add another level of indirection around the allocation expressions?
It could use SFINAE so that new/delete never appear in the code for statically sized matrices, or, if C++17 is enabled, an if constexpr (I guess it's just a normal if now).

This would make the Eigen library much more usable for us embedded folk (I have found people having similar problems while searching for a solution) :)
If you point me to the active code where this is happening, I can probably add this as well - but I don't know the Eigen code base well enough.
What are your thoughts on this?
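
For readers wondering what such a level of indirection could look like, here is a minimal sketch in plain C++ that uses partial specialization rather than SFINAE or if constexpr. It is not Eigen code: `Workspace`, `kDynamic` and `scaled_sum` are hypothetical names invented for illustration, and the sketch ignores whatever constraints the real product kernels have.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sentinel meaning "size known only at run time".
constexpr int kDynamic = -1;

// Primary template: capacity is a compile-time constant, so the buffer lives
// inside the object and no allocation code is generated for this case.
template <typename T, int N>
class Workspace {
public:
    static constexpr bool uses_heap = false;
    explicit Workspace(std::size_t size) {
        assert(size <= static_cast<std::size_t>(N));
        (void)size;
    }
    T* data() { return buffer_; }
private:
    T buffer_[N];
};

// Partial specialization: only this variant touches the heap, and it is only
// instantiated when somebody actually asks for a dynamic workspace.
template <typename T>
class Workspace<T, kDynamic> {
public:
    static constexpr bool uses_heap = true;
    explicit Workspace(std::size_t size) : buffer_(size) {}
    T* data() { return buffer_.data(); }
private:
    std::vector<T> buffer_;
};

// Toy routine that needs scratch space: returns the sum of k * in[i].
template <int N>
float scaled_sum(const float* in, std::size_t n, float k)
{
    Workspace<float, N> ws(n);
    float* tmp = ws.data();
    for (std::size_t i = 0; i < n; ++i) tmp[i] = k * in[i];
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) sum += tmp[i];
    return sum;
}
```

Calling `scaled_sum<8>(...)` only instantiates the fixed-size `Workspace`, so nothing in that call chain refers to the heap. The cost is one instantiation of the routine per distinct size.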
ggael
That's trickier than it looks. First of all, to reduce multiple instantiations of the same code (matrix product code is really heavy), the dynamic allocation part does not know the initial compile-time sizes; it only knows the scalar types and the runtime sizes, and it receives a working space pointer, say P. If P is null, the buffer is allocated on demand. Moreover, it is not always possible (or rather, it would be painful) to always preallocate P because of multi-threading, implicit transposition and nesting.
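
To illustrate the design described above, a minimal sketch follows. It is not Eigen's actual kernel: `product_kernel` and `g_kernel_heap_allocs` are hypothetical names, and the "packing" is just a transpose of square row-major matrices. The point is only that a function which sees nothing but pointers and runtime sizes has to contain the malloc/free fallback, so the linker needs those symbols even if every caller passes a preallocated buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Counts heap fallbacks so a test can check that none happened (illustrative only).
static int g_kernel_heap_allocs = 0;

// Hypothetical size-agnostic kernel: computes c = a * b for n x n row-major
// matrices. It knows the scalar type and the runtime size, nothing else.
// `workspace` must hold n * n floats, or be null to request allocation on demand.
bool product_kernel(const float* a, const float* b, float* c, int n, float* workspace)
{
    assert(n > 0);
    const std::size_t count = static_cast<std::size_t>(n) * static_cast<std::size_t>(n);

    float* packed = workspace;
    const bool owns_buffer = (packed == nullptr);
    if (owns_buffer) {
        // This branch is what drags malloc/free into the link.
        packed = static_cast<float*>(std::malloc(count * sizeof(float)));
        if (packed == nullptr) return false;
        ++g_kernel_heap_allocs;
    }

    // Change the storage layout: store b transposed so the inner loop below
    // reads both operands contiguously.
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            packed[j * n + i] = b[i * n + j];

    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (int k = 0; k < n; ++k)
                sum += a[i * n + k] * packed[j * n + k];
            c[i * n + j] = sum;
        }
    }

    if (owns_buffer) std::free(packed);
    return true;
}
```

A caller with compile-time sizes would pass a stack or static array as `workspace`, so the fallback never runs, yet `malloc` and `free` still have to resolve at link time.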
emilf
Ah, I see the predicament here. Thanks for the explanation!

The main problem seems to be that Eigen is very general and works (very well) on multiple targets, so it was never really aimed at embedded use; that came later.
There doesn't happen to be a plan for an "Eigen Light", aimed at resource-constrained embedded systems? :)
ggael
Actually, we have cared about embedded systems from the start, which is why malloc is not actually called at runtime in your case, and we have several unit tests to check that. But I agree that it would be even better if the code calling malloc were not instantiated at all. Again, you just need to provide:

void* malloc(size_t) { return 0;}
void free(void*) {}

to get it to compile and run fine. Not that bad.
emilf
Yeah, that is acceptable!
Could I suggest adding a small section on embedded use to the documentation, so people have a reference in the future? :)

