This forum has been archived. All content is frozen. Please use KDE Discuss instead.

extending Eigen with SSE functions

robstr
Registered Member
Posts
4
Karma
0

Fri Apr 01, 2016 1:20 pm
Hey,

I'm trying to extend Eigen with some SSE code. I found the "What happens inside Eigen" page, which mentions the packet method.

What is the way to go here? I need a __m128 object (4 floats). I don't need to check whether my computer supports the SSE intrinsics.

Would be nice if someone could give me a hint :)
ggael
Moderator
Posts
3447
Karma
19
I'm not sure I really understand your question. As an attempt, let me say that most operations on an Array4f boil down to a single SSE instruction, for example:

Array4f a, b, c;
a = b + c;

will compile as a movps + addps.
robstr
Registered Member
Thank you for the fast reply. Unfortunately, my question was not precise enough.
I need to calculate the correlation while taking NaNs into account. I found the Stack Overflow post with the intrinsics source code and want to integrate it into my application. All my data is stored in Eigen objects.

My question is: is it possible to use the .packet method to access 4 packed floats? Or is there a faster way of doing a loop with isnan checks?

UPDATE
Is your example also possible with a vector, processing 4 floats at once?
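
On the question of a faster loop with isnan checks, one possible approach is to compare each packet of four floats with itself: a NaN never compares as ordered, so the comparison yields a per-lane mask that can zero out the NaN lanes before accumulating. The following is only a minimal sketch under stated assumptions. The names `NanSum` and `nan_skipping_sum` are hypothetical (they come from neither Eigen nor the Stack Overflow post), the input is a raw `float` pointer, and the SSE path is compiled only when the compiler reports SSE support, with a plain scalar loop otherwise.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define NAN_SUM_SKETCH_HAS_SSE 1
#endif

// Hypothetical result type: sum of the non-NaN entries and how many there were.
struct NanSum {
    float sum;
    std::size_t count;
};

// Hypothetical helper: sums x[0..n) while skipping NaN entries.
NanSum nan_skipping_sum(const float* x, std::size_t n) {
    assert(x != nullptr || n == 0);
    float sum = 0.0f;
    std::size_t count = 0;
    std::size_t i = 0;
#ifdef NAN_SUM_SKETCH_HAS_SSE
    __m128 acc = _mm_setzero_ps();
    for (; i + 4 <= n; i += 4) {
        const __m128 v = _mm_loadu_ps(x + i);    // unaligned load of 4 floats
        const __m128 ok = _mm_cmpord_ps(v, v);   // all-ones in lanes that are not NaN
        acc = _mm_add_ps(acc, _mm_and_ps(v, ok)); // NaN lanes contribute +0.0f
        const int bits = _mm_movemask_ps(ok);    // one bit per valid lane
        count += static_cast<std::size_t>((bits & 1) + ((bits >> 1) & 1) +
                                          ((bits >> 2) & 1) + ((bits >> 3) & 1));
    }
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
    // Scalar tail (or the whole array when SSE is unavailable).
    for (; i < n; ++i) {
        if (!std::isnan(x[i])) {
            sum += x[i];
            ++count;
        }
    }
    return {sum, count};
}
```

With data held in an Eigen `VectorXf`, the pointer and length would come from `data()` and `size()`. Whether this is actually faster than a scalar loop depends on the compiler and the data, so it is worth measuring.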
ggael
Moderator
It is still unclear how you want to integrate this, but the simplest approach is to directly call the function given on SO:

Code: Select all
// Single-precision vectors, so that data() yields float* (4 floats per __m128).
VectorXf a(n), b(n);
float r = rms(a.data(), b.data(), a.size());
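
The Stack Overflow function itself is not reproduced in this thread, so as an illustration of the calling convention above (raw pointers plus a length), here is a minimal scalar sketch of a NaN-aware Pearson correlation. The name `nan_aware_correlation` is hypothetical, and the choice to drop every index pair in which either value is NaN is an assumption about what "with respect to NaNs" should mean. It is a plain reference loop, not the SSE version from SO.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical stand-in for the routine from the Stack Overflow post.
// Pearson correlation of a and b over the pairs where neither value is NaN.
// Returns NaN if fewer than two valid pairs remain or a variance is zero.
float nan_aware_correlation(const float* a, const float* b, std::size_t n) {
    assert(n == 0 || (a != nullptr && b != nullptr));
    double sa = 0.0, sb = 0.0, saa = 0.0, sbb = 0.0, sab = 0.0;
    std::size_t m = 0;  // number of valid pairs
    for (std::size_t i = 0; i < n; ++i) {
        if (std::isnan(a[i]) || std::isnan(b[i])) continue;  // skip the whole pair
        const double x = a[i];
        const double y = b[i];
        sa += x;
        sb += y;
        saa += x * x;
        sbb += y * y;
        sab += x * y;
        ++m;
    }
    const float nan = std::numeric_limits<float>::quiet_NaN();
    if (m < 2) return nan;
    const double cov = sab - sa * sb / static_cast<double>(m);
    const double var_a = saa - sa * sa / static_cast<double>(m);
    const double var_b = sbb - sb * sb / static_cast<double>(m);
    if (var_a <= 0.0 || var_b <= 0.0) return nan;
    return static_cast<float>(cov / std::sqrt(var_a * var_b));
}
```

With two `VectorXf` objects of equal size, the call would have the same shape as above: `nan_aware_correlation(a.data(), b.data(), a.size())`.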
robstr
Registered Member
How does Eigen create the SSE instructions?
From your example:
ggael wrote:
Array4f a, b, c;
a = b + c;

will compile as a movps + addps.


I want to create something similar to the + operator, like in the example from the Eigen page:
Code: Select all
for(int i = 0; i < 4*(size/4); i+=4) u.packet(i)  = v.packet(i) + w.packet(i);
for(int i = 4*(size/4); i < size; i++) u[i] = v[i] + w[i];
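
The two loops above (packets of four, then a scalar tail for the leftover elements) can be written by hand with raw SSE intrinsics on the pointers returned by `data()`. The following is a minimal sketch, not Eigen's internal implementation. The function name `add_floats` is hypothetical, unaligned loads and stores are used so that the sketch does not depend on the alignment of the pointers, and the SSE loop is compiled only when the compiler reports SSE support.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define ADD_SKETCH_HAS_SSE 1
#endif

// Hypothetical helper: u[i] = v[i] + w[i] for i in [0, size).
void add_floats(float* u, const float* v, const float* w, std::size_t size) {
    assert(size == 0 || (u != nullptr && v != nullptr && w != nullptr));
    std::size_t i = 0;
#ifdef ADD_SKETCH_HAS_SSE
    // Vectorized part: 4 floats per iteration, up to the largest multiple of 4.
    const std::size_t vec_end = 4 * (size / 4);
    for (; i < vec_end; i += 4) {
        const __m128 pv = _mm_loadu_ps(v + i);
        const __m128 pw = _mm_loadu_ps(w + i);
        _mm_storeu_ps(u + i, _mm_add_ps(pv, pw));
    }
#endif
    // Scalar tail: the remaining size % 4 elements (or everything without SSE).
    for (; i < size; ++i) {
        u[i] = v[i] + w[i];
    }
}
```

A custom operation would replace `_mm_add_ps` with its own sequence of intrinsics and mirror the same computation in the scalar tail.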

