SSE approximate inverse square root?

mikaellund (Registered Member) wrote on Wed Jan 05, 2011 9:10 am:

Hi,

I'm doing particle simulations and my current bottlenecks are the 1/sqrt(x) and 1/x operations. Is there a way to make Eigen use the SSE approximation instructions "rsqrtps" and "rcpps" for these two operations?

Thanks,
Mikael
ggael (Moderator) wrote on Wed Jan 05, 2011 5:11 pm:

Be aware that these two instructions are very imprecise (relative error up to about 1.5 * 2^-12), so they require one or two Newton iterations to reach full single-precision accuracy. Nevertheless, it is very easy to add support for such instructions in Eigen. The first thing to do is to implement the appropriate functor. Here is an example:

Code:

    template<typename Scalar>
    struct fastinvsqrt_func {
      EIGEN_EMPTY_STRUCT_CTOR(fastinvsqrt_func)
      // Scalar path: exact 1/sqrt(a)
      inline const Scalar operator() (const Scalar& a) const { return Scalar(1)/sqrt(a); }
      typedef typename packet_traits<Scalar>::type Packet;
      // Packet path: SSE estimate, valid for float packets only
      inline Packet packetOp(const Packet& a) const { return _mm_rsqrt_ps(a); }
    };

    // The traits must be specialized for the new functor itself
    template<typename Scalar>
    struct functor_traits<fastinvsqrt_func<Scalar> > {
      enum {
        Cost = 2 * NumTraits<Scalar>::MulCost,
        PacketAccess =
    #ifdef EIGEN_VECTORIZE_SSE
          true
    #else
          false
    #endif
      };
    };

and then you can do:

    (a+b).unaryExpr(fastinvsqrt_func<float>())

and/or add a global shortcut such that you can do:

    fastinvsqrt(a+b);

and/or add a fastinvsqrt member function to ArrayBase (or even DenseBase if you wish) using our plugin mechanism, see: http://eigen.tuxfamily.org/dox-devel/To ... MatrixBase
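The Newton refinement mentioned in this reply can be sketched with plain SSE intrinsics, independent of Eigen. This is only an illustration under stated assumptions: the helper names `rsqrt_nr`, `rcp_nr`, `worst_rsqrt_error` and `worst_rcp_error` are made up for this example, and the code assumes an x86 target with SSE enabled and strictly positive, finite inputs.

```cpp
#include <cmath>
#include <xmmintrin.h>  // SSE intrinsics: _mm_rsqrt_ps, _mm_rcp_ps

// Hypothetical helper: approximate 1/sqrt(x) for four floats.
// Start from the rsqrtps estimate y0 and apply one Newton step
//   y1 = y0 * (1.5 - 0.5 * x * y0 * y0),
// which roughly squares the relative error of the estimate.
// Note: x == 0 yields NaN after the refinement (0 * inf), so keep x > 0.
static inline __m128 rsqrt_nr(__m128 x) {
    const __m128 half = _mm_set1_ps(0.5f);
    const __m128 three_halves = _mm_set1_ps(1.5f);
    __m128 y = _mm_rsqrt_ps(x);
    __m128 xyy = _mm_mul_ps(x, _mm_mul_ps(y, y));
    return _mm_mul_ps(y, _mm_sub_ps(three_halves, _mm_mul_ps(half, xyy)));
}

// Hypothetical helper: approximate 1/x for four floats.
// One Newton step on the rcpps estimate: y1 = y0 * (2 - x * y0).
static inline __m128 rcp_nr(__m128 x) {
    __m128 y = _mm_rcp_ps(x);
    return _mm_mul_ps(y, _mm_sub_ps(_mm_set1_ps(2.0f), _mm_mul_ps(x, y)));
}

// Worst relative error of rsqrt_nr against 1/std::sqrt for x = 1 .. 4*blocks.
float worst_rsqrt_error(int blocks) {
    float worst = 0.0f;
    for (int b = 0; b < blocks; ++b) {
        float in[4], out[4];
        for (int i = 0; i < 4; ++i) in[i] = float(4 * b + i + 1);
        _mm_storeu_ps(out, rsqrt_nr(_mm_loadu_ps(in)));
        for (int i = 0; i < 4; ++i) {
            float exact = 1.0f / std::sqrt(in[i]);
            worst = std::fmax(worst, std::fabs(out[i] - exact) / exact);
        }
    }
    return worst;
}

// Worst relative error of rcp_nr against 1/x for x = 1 .. 4*blocks.
float worst_rcp_error(int blocks) {
    float worst = 0.0f;
    for (int b = 0; b < blocks; ++b) {
        float in[4], out[4];
        for (int i = 0; i < 4; ++i) in[i] = float(4 * b + i + 1);
        _mm_storeu_ps(out, rcp_nr(_mm_loadu_ps(in)));
        for (int i = 0; i < 4; ++i) {
            float exact = 1.0f / in[i];
            worst = std::fmax(worst, std::fabs(out[i] - exact) / exact);
        }
    }
    return worst;
}
```

If the extra accuracy is needed inside Eigen, the body of `rsqrt_nr` could in principle replace the bare intrinsic in the functor's `packetOp`; whether the additional multiplications still pay off compared with an exact division and square root is something to measure on the target machine.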
mikaellund (Registered Member) wrote on Sat Jan 08, 2011 10:19 am:

Thanks a lot - I'll try it out!
mikaellund (Registered Member) wrote on Sat Jan 08, 2011 11:36 am:

Hi again,

I tried the above suggestion, but (see the code snippet below) I obviously didn't quite get it. The program compiles but prints 1, 2, ..., 10. Any help?

Code:

    #include <iostream>
    #include <Eigen/Dense>

    namespace Eigen {
      template<typename Scalar>
      struct fastinvsqrt_func {
        EIGEN_EMPTY_STRUCT_CTOR(fastinvsqrt_func)
        inline const Scalar operator() (const Scalar& a) const { return Scalar(1)/sqrt(a); }
        typedef typename ei_packet_traits<Scalar>::type Packet;
        inline Packet packetOp(const Packet& a) const { return rsqrtps(a); }
      };

      template<typename Scalar>
      struct ei_functor_traits<fastinvsqrt_func<Scalar> > {
        enum {
          Cost = 2 * NumTraits<Scalar>::MulCost,
          PacketAccess =
    #ifdef EIGEN_VECTORIZE_SSE
            true
    #else
            false
    #endif
        };
      };
    }

    int main() {
      int n=10;
      Eigen::ArrayXf v(n);
      for (int i=0; i<n; i++)
        v[i]=i+1;
      v.unaryExpr( Eigen::fastinvsqrt_func<float>() );
      for (int i=0; i<n; i++)
        std::cout << v[i] << std::endl;
    }
ggael (Moderator) wrote on Sat Jan 08, 2011 9:31 pm:

unaryExpr returns an expression and does not modify v, so you have to assign the result:

    v = v.unaryExpr( Eigen::fastinvsqrt_func<float>() );
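Why the unassigned call has no effect can be illustrated with a toy lazy expression in plain C++. This is a simplified sketch and not Eigen's actual implementation; `LazyUnary`, `unary_expr`, `assign` and `Halve` are hypothetical names introduced only for this example.

```cpp
#include <cstddef>
#include <vector>

// Toy lazy expression: it only remembers the input and the functor.
// No element is computed until somebody reads from it.
template <typename Func>
struct LazyUnary {
    const std::vector<float>& src;
    Func func;
    float operator[](std::size_t i) const { return func(src[i]); }
    std::size_t size() const { return src.size(); }
};

// Builds the expression object; the input vector is left untouched.
template <typename Func>
LazyUnary<Func> unary_expr(const std::vector<float>& v, Func f) {
    return LazyUnary<Func>{v, f};
}

// The actual evaluation happens here, element by element, on assignment.
// dst is assumed to have the same size as the expression.
template <typename Func>
void assign(std::vector<float>& dst, const LazyUnary<Func>& e) {
    for (std::size_t i = 0; i < e.size(); ++i) dst[i] = e[i];
}

// Example functor standing in for the inverse square root functor.
struct Halve {
    float operator()(float x) const { return 0.5f * x; }
};
```

Calling `unary_expr(v, Halve{})` on its own creates and discards a temporary, which mirrors the unassigned `v.unaryExpr(...)` call; only `assign(v, unary_expr(v, Halve{}))` changes `v`.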
mikaellund (Registered Member) wrote on Mon Jan 10, 2011 7:20 pm:

Ahh, silly me. It's working now - thanks a lot. I attach the final snippet, should anyone be interested.

Code:

    #include <iostream>
    #include <Eigen/Dense>

    namespace Eigen {
      template<typename Scalar>
      struct invsqrt_op {
        EIGEN_EMPTY_STRUCT_CTOR( invsqrt_op )
        // Scalar path: exact 1/sqrt(a)
        inline const Scalar operator() (const Scalar& a) const { return Scalar(1)/ei_sqrt(a); }
        typedef typename ei_packet_traits<Scalar>::type Packet;
        // Packet path: SSE estimate, valid for float packets only
        inline Packet packetOp(const Packet& a) const { return _mm_rsqrt_ps(a); }
      };

      template<typename Scalar>
      struct ei_functor_traits<invsqrt_op<Scalar> > {
        enum {
          Cost = 2 * NumTraits<Scalar>::MulCost,
          PacketAccess =
    #ifdef EIGEN_VECTORIZE_SSE
            true
    #else
            false
    #endif
        };
      };
    }

    int main() {
      int n=10;
      Eigen::ArrayXf v(n);
      for (int i=0; i<n; i++)
        v[i]=i+1;
      // unaryExpr returns an expression, so assign it back to v
      v = v.unaryExpr( Eigen::invsqrt_op<float>() );
      std::cout << v << std::endl;
    }