
Shouldn't Eigen be faster with single precision?

philipremes
Hi guys,

I'm using Eigen for a signal processing application, with lots of matrix operations. I defined all the matrices as fixed size along these lines:

typedef Matrix<REAL,SPECTRUM_SIZE,1> vecSpectrum;
typedef Matrix<REAL,MULTIPLET_NUM,MULTIPLET_NUM> mat;

I observe that my program runs slower when the matrices and vectors are float rather than double: about 50 microseconds for double versus about 100 microseconds for float.

Most of my experience in timing program execution comes from CUDA, where going from double to single precision gives an automatic >=2x speedup from memory bandwidth considerations, i.e. from shipping data back and forth. I'm not so experienced with how precision affects speed in CPU-only execution, but I expected float to be at least as fast as double. I do have optimization enabled when compiling, which makes a >10x speed difference compared with having it off.
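
A minimal sketch of how the float/double comparison could be isolated from the rest of the application. It uses plain arrays rather than Eigen so that it compiles anywhere; the size `kN` and the function name `time_matvec_us` are hypothetical and only stand in for MULTIPLET_NUM and the real workload.

```cpp
#include <array>
#include <chrono>
#include <cstddef>

// Hypothetical size, standing in for MULTIPLET_NUM from the post.
constexpr std::size_t kN = 16;

// Runs `reps` fixed-size matrix-vector products with scalar type T and
// returns the average wall-clock time per product in microseconds.
template <typename T>
double time_matvec_us(int reps) {
    std::array<T, kN * kN> m{};
    std::array<T, kN> x{};
    std::array<T, kN> y{};
    for (std::size_t i = 0; i < kN * kN; ++i) m[i] = T(1) / T(i + 1);
    for (std::size_t i = 0; i < kN; ++i) x[i] = T(i + 1);

    volatile T sink = T(0);  // observable side effect so the loop is not optimized away
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r) {
        x[0] = T(r % 7);  // vary the input so the product cannot be hoisted out of the loop
        for (std::size_t i = 0; i < kN; ++i) {
            T acc = T(0);
            for (std::size_t j = 0; j < kN; ++j) acc += m[i * kN + j] * x[j];
            y[i] = acc;
        }
        sink = sink + y[static_cast<std::size_t>(r) % kN];
    }
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / reps;
}
```

Calling `time_matvec_us<float>(100000)` and `time_matvec_us<double>(100000)` with the same compiler flags as the real build gives a rough per-type baseline. If float is not slower here, the slowdown probably comes from the data or the algorithm rather than from the arithmetic itself.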

Does anyone know something pertinent, or has anyone played around with this themselves?

Thanks for your help!
ggael
What's your system? If it is a 32-bit system, then make sure to compile with SSE2 enabled (-msse2). Also make sure you compile with optimizations enabled (-O2). For most operations you should indeed get nearly a 2x speedup with float.

If you are using iterative algorithms, then it might be that they converge more quickly with double precision (unlikely, though). Another possible reason is that with single precision you run into overflow issues.
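
One way the overflow possibility might be checked, as a sketch in standard C++ only. The helper names `overflows_in_float` and `count_non_finite` are hypothetical; in a real program the values would be copied out of the matrices after each iteration.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// True if x is finite as a double but too large in magnitude to be
// represented as a finite float.
inline bool overflows_in_float(double x) {
    return std::isfinite(x) &&
           std::fabs(x) > static_cast<double>(std::numeric_limits<float>::max());
}

// Counts the entries that are inf or NaN, e.g. after an overflow has
// already happened in a single-precision computation.
template <typename T>
std::size_t count_non_finite(const std::vector<T>& values) {
    std::size_t n = 0;
    for (T v : values) {
        if (!std::isfinite(v)) ++n;
    }
    return n;
}
```

Running the double build and testing intermediate results with `overflows_in_float` shows whether they would leave the float range. Running the float build and calling `count_non_finite` on the results shows whether inf or NaN values have already appeared.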
philipremes
Hi ggael,

I was away on vacation, so I didn't see your reply until now, thanks. I am running on a 64-bit system. My algorithm is iterative, but I fixed the number of iterations for these tests, so convergence can't be the issue. I'm not using SSE2, though, because the final program will run on an embedded system. I don't know much about SSE2, but I don't think the embedded system supports it. The -O2 optimization is enabled, though.

I will have to investigate the overflow possibility more closely. Thanks for that.
ggael
You should check your hardware to see which floating-point operations are supported and how they are accelerated. I'm pretty sure this has nothing to do with Eigen.
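
A rough sketch of one way to see which floating-point features the compiler is targeting. It relies on the predefined macros that GCC and Clang set (`__SSE2__`, `__AVX__`, `__ARM_NEON`, `__ARM_FP`); the function name `fp_support_summary` is hypothetical. Note that these macros reflect the build flags, not what the hardware can actually do, so the processor's datasheet is still the authority.

```cpp
#include <limits>
#include <string>

// Builds a short description of the floating-point features enabled at
// compile time, plus the mantissa widths of float and double.
inline std::string fp_support_summary() {
    std::string s;
#if defined(__SSE2__)
    s += "SSE2 ";
#endif
#if defined(__AVX__)
    s += "AVX ";
#endif
#if defined(__ARM_NEON)
    s += "NEON ";
#endif
#if defined(__ARM_FP)
    s += "ARM-hardware-FP ";
#endif
    if (s.empty()) s = "no-known-FP-macro-defined ";
    s += "float_digits=" + std::to_string(std::numeric_limits<float>::digits);
    s += " double_digits=" + std::to_string(std::numeric_limits<double>::digits);
    return s;
}
```

Printing the result once in the desktop build and once in the embedded cross-build makes it easy to see whether the two are compiled for comparable floating-point support.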

