Registered Member
|
Hi all,
I've got Eigen running on an STM32F4 (Cortex-M4) Discovery board with ARM GCC 4.9.3. I've compared it with another matrix library provided by the PX4 autopilot project. I did this simple test:
Eigen takes about 30 µs to complete, whilst the PX4 matrix library takes 3 µs. The PX4 library just does two nested for loops, with no error checking; it is completely straightforward. I did a quick inspection of the assembly between #start_here and #end_here, and Eigen generates an insane amount of code, like below.
I have no idea what it is doing, but it's probably why it's running so slowly: there are just a lot of instructions to go through. Any ideas? Nghia
Moderator
|
Make sure that you compiled with optimizations (-O3 -DNDEBUG), and also make sure that C is really used afterwards, so that the compiler does not remove useful code in the PX4 version. The best approach is usually to wrap the expression of interest in a non-inlined function:
EIGEN_DONT_INLINE void foo(const Matrix2f &A, const MatrixX2f &B, Matrix2f &C) { C = A+B; }
Do the same for PX4, and then you can start looking at the assembly.
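A minimal sketch of that advice, written without either library so it compiles on its own. Everything named here (Mat2, BENCH_NOINLINE, add2x2, g_sink, run_once) is hypothetical and only stands in for the real Eigen or PX4 types; the noinline attribute assumes a GCC-compatible compiler. The nested-loop body mirrors how the PX4 routine is described above, not its actual source.

```cpp
#include <cstddef>

// Hypothetical 2x2 float matrix, standing in for either library's type.
struct Mat2 {
    float m[2][2];
};

// Assumption: GCC/Clang attribute syntax. Forbidding inlining keeps the
// function body as one contiguous block that is easy to find in the
// disassembly.
#if defined(__GNUC__)
#define BENCH_NOINLINE __attribute__((noinline))
#else
#define BENCH_NOINLINE
#endif

// Plain nested-loop addition, c = a + b, with no error checking.
BENCH_NOINLINE void add2x2(const Mat2 &a, const Mat2 &b, Mat2 &c) {
    for (std::size_t i = 0; i < 2; ++i) {
        for (std::size_t j = 0; j < 2; ++j) {
            c.m[i][j] = a.m[i][j] + b.m[i][j];
        }
    }
}

// Writing the result to a volatile object forces the compiler to keep
// the computation instead of discarding it as dead code.
volatile float g_sink;

// Runs the operation once and consumes every element of the result.
float run_once() {
    Mat2 a = {{{1.0f, 2.0f}, {3.0f, 4.0f}}};
    Mat2 b = {{{10.0f, 20.0f}, {30.0f, 40.0f}}};
    Mat2 c = {};
    add2x2(a, b, c);
    g_sink = c.m[0][0] + c.m[0][1] + c.m[1][0] + c.m[1][1];
    return g_sink;
}
```

Timing calls to run_once() and disassembling add2x2 then compares like with like: both versions are built with the same flags, neither is inlined into the caller, and neither result can be optimized away.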