
Ideas for speeding up Eigen in debug compiles?

hauptmech
The speed of unoptimized (i.e., compiled for Debug) Eigen is killing me. Does anyone have techniques for getting around this? Eigen is scattered widely throughout my code, so I can't isolate it easily. The only idea I have so far is to create an Eigen-only library that exposes the subset of Eigen I'm using through a thin wrapper with the same names, and to flip namespaces with an ifdef...
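
The namespace-flipping idea could look roughly like the sketch below. Everything in it is hypothetical: `la_prebuilt`, `la_header`, `la`, and `USE_PREBUILT_LA` are made-up names, and the two namespaces hold identical toy code so the file compiles on its own. In a real project the "prebuilt" side would be declared in a header and defined in a separate library compiled with optimizations, while the other side would be the plain header-only code.

```cpp
#include <cassert>

// Hypothetical stand-in for the thin wrapper library. In practice these
// definitions would live in a separate, optimized translation unit.
namespace la_prebuilt {
struct Vec3 { double x, y, z; };
inline double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
}  // namespace la_prebuilt

// Hypothetical stand-in for using the header-only code directly.
namespace la_header {
struct Vec3 { double x, y, z; };
inline double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
}  // namespace la_header

// Flip the alias with a single preprocessor switch (assumed name).
#ifdef USE_PREBUILT_LA
namespace la = la_prebuilt;
#else
namespace la = la_header;
#endif

// Application code only ever spells `la::`, so it does not change
// when the switch is flipped.
inline double squared_norm(const la::Vec3& v) {
    assert(v.x == v.x);  // cheap sanity check: reject NaN input
    return la::dot(v, v);
}
```

The cost of this approach is that the wrapper has to cover every Eigen type and function the application touches, which is why it only pays off when that subset is small.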

Thoughts?
ggael
What about compiling with both debug symbols and minimal optimizations, e.g. -O1 -g2?
graphicsMan
What about adding an inlining macro to Eigen (e.g. EIGEN_INLINE)? This would allow changing inline to __forceinline with the Intel compiler, setting the always_inline attribute for GCC, etc. It would also allow you to set inlining back to plain 'inline' if you actually wanted to debug Eigen itself. I believe the massive slowdown when debugging apps that use Eigen extensively is due to the deep call stacks generated at runtime.
graphicsMan
I see there is an EIGEN_STRONG_INLINE macro. Unfortunately, it does not seem to be set up to be overridden before including the Eigen headers, and it currently expands to plain 'inline' unless you are using MSVC or the Intel compiler. I'm unsure how much changing every 'inline' to always-inline, versus changing only EIGEN_STRONG_INLINE, would help, but it might be worth playing with both options.
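
For readers unfamiliar with the pattern, here is a minimal sketch of a per-compiler force-inline macro applied to toy code. `MY_FORCE_INLINE`, `Vec3`, `dot`, and `squared_norm` are illustrative names and not part of Eigen. The compiler checks mirror the Macros.h excerpt quoted in this thread, with an added GCC/Clang branch. GCC inlines `always_inline` functions even at -O0, which is what makes this relevant to debug builds.

```cpp
#include <cassert>

// MY_FORCE_INLINE is a hypothetical project macro, not an Eigen one.
#if defined(_MSC_VER) || defined(__INTEL_COMPILER)
#define MY_FORCE_INLINE __forceinline
#elif defined(__GNUC__)  // GCC and Clang
#define MY_FORCE_INLINE inline __attribute__((always_inline))
#else
#define MY_FORCE_INLINE inline  // fallback: only a hint to the compiler
#endif

struct Vec3 { double x, y, z; };

// Small, hot helpers are the functions that benefit most from
// being inlined in an otherwise unoptimized build.
MY_FORCE_INLINE double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

MY_FORCE_INLINE double squared_norm(const Vec3& v) {
    return dot(v, v);
}

// Small usage check; call it from main() or a unit test.
inline void check_dot() {
    assert(dot(Vec3{1.0, 2.0, 3.0}, Vec3{4.0, 5.0, 6.0}) == 32.0);
    assert(squared_norm(Vec3{1.0, 2.0, 2.0}) == 9.0);
}
```

Switching the macro back to plain `inline` restores normal stepping through these functions in a debugger.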
graphicsMan
Here are some timings with one instantiation of my app:

no special inlining...
dbg 00:03:38.412005
rel 00:00:12.835160

EIGEN_STRONG_INLINE set to forced inlining...
dbg 00:03:05.560514
rel 00:00:12.861347

all eigen inlining forced...
dbg 00:02:27.530674
rel 00:00:12.824934

Release mode is more or less unaffected by this change (I think the differences are within statistical noise). Debug mode takes about 32% less time when all functions are force-inlined and about 15% less time when only the EIGEN_STRONG_INLINE functions are force-inlined.
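
The percentages can be re-derived from the debug timings listed above. The sketch below is only a check of that arithmetic; the helper names `to_seconds` and `percent_saved` are made up for illustration. It measures time saved relative to the unmodified debug build, which comes out at roughly a third for the fully forced case and roughly 15% for the EIGEN_STRONG_INLINE case.

```cpp
#include <cassert>

// Convert an h:m:s reading to seconds.
inline double to_seconds(int hours, int minutes, double seconds) {
    return hours * 3600.0 + minutes * 60.0 + seconds;
}

// Percentage of wall-clock time saved relative to a baseline run.
inline double percent_saved(double baseline_s, double variant_s) {
    assert(baseline_s > 0.0);
    return 100.0 * (baseline_s - variant_s) / baseline_s;
}

// Debug-build timings quoted in the post.
const double dbg_baseline   = to_seconds(0, 3, 38.412005);  // no special inlining
const double dbg_strong     = to_seconds(0, 3, 5.560514);   // EIGEN_STRONG_INLINE forced
const double dbg_all_forced = to_seconds(0, 2, 27.530674);  // all inlining forced

// Time saved, in percent, for each variant.
const double saved_strong     = percent_saved(dbg_baseline, dbg_strong);
const double saved_all_forced = percent_saved(dbg_baseline, dbg_all_forced);
```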

It would be interesting to see how other apps are affected.

BTW, for me it was sufficient for the third test to do this:

#define inline inline __attribute__((always_inline))
#include <Eigen/Dense>
#undef inline

The second test was accomplished by modifying Macros.h:

// EIGEN_FORCE_INLINE means "inline as much as possible"
#if (defined _MSC_VER) || (defined __INTEL_COMPILER)
#define EIGEN_STRONG_INLINE __forceinline
#else
#define EIGEN_STRONG_INLINE inline
#endif

to

// EIGEN_FORCE_INLINE means "inline as much as possible"
#if (defined _MSC_VER) || (defined __INTEL_COMPILER)
#define EIGEN_STRONG_INLINE __forceinline
#else
#define EIGEN_STRONG_INLINE inline EIGEN_ALWAYS_INLINE_ATTRIB
#endif
ggael
Note that many trivial functions are not explicitly qualified with "inline" because they are defined in the class declaration. So perhaps an even better speedup could be achieved if really all functions were forced to be inlined. Perhaps we could add an EIGEN_FORCE_INLINING option doing that.
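
To illustrate the point about in-class definitions, here is a small sketch with made-up names (`MY_ALWAYS_INLINE_ATTR`, `Pair`); it is not Eigen code. A member function defined inside its class is implicitly inline, but the `inline` token never appears in the source, so a `#define inline ...` trick has nothing to rewrite there. Forcing such a function requires attaching the attribute explicitly.

```cpp
#include <cassert>

// Hypothetical helper macro; expands to nothing on compilers
// that do not understand the GCC-style attribute.
#if defined(__GNUC__)  // GCC and Clang
#define MY_ALWAYS_INLINE_ATTR __attribute__((always_inline))
#else
#define MY_ALWAYS_INLINE_ATTR
#endif

struct Pair {
    double a, b;

    // Implicitly inline because it is defined in the class body.
    // No `inline` keyword is written, so redefining `inline`
    // through the preprocessor leaves this function untouched.
    double sum() const { return a + b; }

    // Forcing inlining here needs the attribute spelled out.
    MY_ALWAYS_INLINE_ATTR double product() const { return a * b; }
};

// Small usage check; call it from main() or a unit test.
inline void check_pair() {
    const Pair p{2.0, 3.0};
    assert(p.sum() == 5.0);
    assert(p.product() == 6.0);
}
```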