
vectorized atan2

brebs

Fri Jun 26, 2015 7:55 pm
I saw that users who wanted a coefficient-wise atan2 function were told that it was missing, but that one could use A.binaryExpr(B, std::ptr_fun(atan2)) instead. This is not going to end up being vectorized (as in SSE/AVX), is it? I would like a vectorized version. I am currently playing with an SSE wrapper library (ut-sse) that provides an SSE implementation of atan2 (some sort of polynomial approximation formula), but since I use Eigen a lot in my code, I was thinking of using Eigen here as well. I also use PCL.

Here is my current project: I have a point cloud with 150,000 xyz points. I would like to compute for each one of them the distance and azimuth from a pivot point:
Code:
Vector2f v = pt - pivot;             // offset of the point from the pivot
float d = v.norm();                  // distance
float a = std::atan2(v.y(), v.x());  // azimuth


I think (please correct me if I am wrong) that the norm operation, sqrt(x*x + y*y), will be partially vectorized, but the gain here is probably small, since each vector is only 2D (2 floats, as opposed to 4 floats).

Instead, I was thinking that I could implement it as follows:
Code:
create one array for the x values and one for the y values
subtract the pivot
compute the squares, add them, take the square root
implement the polynomial approximation of atan in the same way
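
For reference, the four steps above could look roughly like this in plain C++, without Eigen or intrinsics. This is only a sketch under stated assumptions: atan_unit, atan2_approx and polar_from_pivot are hypothetical names, and the atan formula is a simple second-order approximation (error of roughly 0.004 rad on [0, 1]), not the one ut-sse uses. The loop body is written as a straight sequence of arithmetic and selects so that a compiler has a chance to vectorize it, but whether it actually does depends on the compiler and flags.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: second-order approximation of atan(z) for z in [0, 1].
inline float atan_unit(float z) {
    constexpr float kQuarterPi = 0.78539816f;
    return z * (kQuarterPi + 0.273f * (1.0f - z));
}

// Hypothetical helper: approximate atan2 built from atan_unit by octant reduction.
inline float atan2_approx(float y, float x) {
    constexpr float kPi = 3.14159265f;
    const float ax = std::fabs(x);
    const float ay = std::fabs(y);
    const float hi = std::max(ax, ay);
    const float lo = std::min(ax, ay);
    const float z = hi > 0.0f ? lo / hi : 0.0f;  // ratio in [0, 1]; 0 at the origin
    float a = atan_unit(z);                      // angle in [0, pi/4]
    if (ay > ax) a = 0.5f * kPi - a;             // unfold to [0, pi/2]
    if (x < 0.0f) a = kPi - a;                   // left half-plane
    if (y < 0.0f) a = -a;                        // lower half-plane
    return a;
}

// Distance and azimuth of every (xs[i], ys[i]) relative to the pivot (px, py).
// The x and y coordinates live in separate arrays (structure-of-arrays layout).
void polar_from_pivot(const std::vector<float>& xs, const std::vector<float>& ys,
                      float px, float py,
                      std::vector<float>& dist, std::vector<float>& azim) {
    assert(xs.size() == ys.size());
    const std::size_t n = xs.size();
    dist.resize(n);
    azim.resize(n);
    for (std::size_t i = 0; i < n; ++i) {
        const float dx = xs[i] - px;          // subtract the pivot
        const float dy = ys[i] - py;
        dist[i] = std::sqrt(dx * dx + dy * dy);
        azim[i] = atan2_approx(dy, dx);
    }
}
```

The temporaries (ax, ay, hi, lo, z) are each computed once per point and reused, which is the part that is awkward to express as a chain of whole-array expressions.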


Does this sound reasonable? Or am I trying to reinvent the wheel here?

I am assuming here that if I have two ArrayXf, a and b, and I compute a+b, Eigen will end up splitting the arrays into packets of 4 floats and using SSE instructions to perform the addition. Is this correct? (Assuming the compiler and hardware support it, of course.) More generally, how can I take advantage of SIMD in Eigen to process a lot of points?
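
To make the "packets of 4 floats" idea concrete, here is a conceptual sketch in plain C++ of what packet-wise evaluation of a+b amounts to. It is not Eigen's actual code: add_in_packets and kPacketSize are hypothetical names, and the inner loop over 4 elements only stands in for what would be a single SIMD add on SSE hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: 4 floats per packet, as with a 128-bit SSE register.
constexpr std::size_t kPacketSize = 4;

std::vector<float> add_in_packets(const std::vector<float>& a,
                                  const std::vector<float>& b) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    std::vector<float> out(n);

    // Largest multiple of the packet size that fits in n.
    const std::size_t packed_end = n - n % kPacketSize;

    // Main loop: one "packet" of 4 floats per iteration.
    for (std::size_t i = 0; i < packed_end; i += kPacketSize) {
        for (std::size_t k = 0; k < kPacketSize; ++k) {
            out[i + k] = a[i + k] + b[i + k];  // stands in for one SIMD add
        }
    }

    // Scalar tail: leftover elements that do not fill a whole packet.
    for (std::size_t i = packed_end; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
    return out;
}
```

With 150,000 points the scalar tail is negligible, so nearly all of the work falls in the packet loop.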
ggael

Re: vectorized atan2

Tue Jul 07, 2015 1:03 pm
Yes, if a and b are two ArrayXf then most coefficient-wise operations will be vectorized, e.g.:

norms = sqrt((a - p_x).abs2() + (b - p_y).abs2());

If you manage to implement atan2 the same way, then you're good. However, implementing atan2 will likely require temporaries that are used multiple times. In that case, it is better to write a custom functor that implements the packet-level function (packetOp() in Eigen's own functors) alongside the scalar one. See Core/functors/BinaryFunctors.h for some examples. Ideally, this functor would simply be a wrapper around a patan2 function specialized for the different SIMD engines.