Registered Member
|
I need to apply an intrinsic function (for rounding) to a VectorXd, and I was hoping someone had an example of loading its data into a __m256d.
Basically, is the VectorXd 32-byte aligned? If so, is it as simple as something like the following (shown here with add)?

    void avx_add(double *src, double *res) {
        __m256d *iptr = (__m256d*)res;
        __m256d t = _mm256_load_pd(src);   // aligned load of 4 doubles
        *iptr = _mm256_add_pd(t, t);
    }

    // Assumes the length is a multiple of 4.
    double *add_self(VectorXd &EigenVec) {
        int length = EigenVec.size(); // is aligned correctly?
        // Allocate aligned memory for result (release with _mm_free)
        double *res = (double*)_mm_malloc(length * sizeof(double), 32);
        for (int i = 0; i < length; i += 4) {
            avx_add(EigenVec.data() + i, res + i);
        }
        return res;
    }
Moderator
|
Eigen's VectorXd storage is indeed aligned on a 32-byte boundary, provided the code is compiled with AVX enabled (Eigen 3.3 or later).
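For the rounding case asked about, a minimal sketch might look like the one below. It is not Eigen's own implementation. The function name avx_round is hypothetical, and it takes a raw pointer and a length, which would come from data() and size() on the VectorXd. It uses unaligned loads and stores so it stays correct even if a buffer turns out not to be 32-byte aligned, and it handles a length that is not a multiple of 4 with a scalar tail. The target attribute is a GCC/Clang extension, assumed here so the function compiles without a global -mavx flag.

```cpp
#include <immintrin.h>
#include <cmath>
#include <cstddef>

// Hypothetical helper: round n doubles from src into dst, four at a time.
// Ties are rounded to the nearest even integer in both the AVX path and
// the scalar tail (std::nearbyint under the default rounding mode).
__attribute__((target("avx")))
void avx_round(const double* src, double* dst, std::size_t n) {
    std::size_t i = 0;
    // Vector part: process full groups of 4 doubles.
    for (; i + 4 <= n; i += 4) {
        __m256d x = _mm256_loadu_pd(src + i);  // unaligned-safe load
        __m256d r = _mm256_round_pd(
            x, _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
        _mm256_storeu_pd(dst + i, r);          // unaligned-safe store
    }
    // Scalar tail: the remaining 0 to 3 elements.
    for (; i < n; ++i) {
        dst[i] = std::nearbyint(src[i]);
    }
}
```

With two VectorXd objects v and out of the same size, the call would be avx_round(v.data(), out.data(), v.size()). If the buffers are known to be 32-byte aligned, _mm256_load_pd and _mm256_store_pd could replace the unaligned variants.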