
Vectorization of sum of exponentials

ibell
Sun Jan 12, 2014 11:57 pm
I am trying to vectorize a sum of exponential function evaluations for a project I am working on. The code below shows the basic idea. I had hoped that Eigen's sum of the exponentials would be vectorized, and therefore faster, but the output is

Time A: 1.192
Time B: 0.347

which says that the Eigen version (Time A) is quite a bit slower than the plain loop (Time B). This is with MSVC with SSE2 enabled, plus OpenMP for good measure, although OpenMP doesn't seem to make a difference to the results.

Anyone have any ideas? Am I missing something obvious?

Code:
#include <iostream>
#include "Eigen/Dense"
#include "time.h"

int main()
{
   Eigen::ArrayXd vv;
   unsigned int N = 1000000;
   vv.resize(N);
   vv.fill(1);

   clock_t t1, t2;

   t1 = clock();
   double sexp1 = exp(vv).sum();
   t2 = clock();
   double tA = (double)(t2-t1)/CLOCKS_PER_SEC;
   std::cout << "Time A: " << tA << std::endl;

   t1 = clock();
   double sexp2 = 0;
   for( unsigned int i = 0; i < N; i++){ sexp2 += exp(vv[i]); }
   t2 = clock();
   double tB = (double)(t2-t1)/CLOCKS_PER_SEC;
   std::cout << "Time B: " << tB << std::endl;
}
ggael
First of all, Eigen vectorizes exp for float only, not double. Second, your benchmark is biased: sexp2 is never used, so the compiler is free to optimize the loop away entirely. Using gcc and a modified benchmark that prints both sums, I get:

Time A: 0.005229
Time B: 0.006288

using:
Code:
#include <iostream>
#include "Eigen/Dense"
#include "time.h"

int main()
{
   Eigen::ArrayXd vv;
   unsigned int N = 1000000;
   vv.resize(N);
   vv.fill(1);

   clock_t t1, t2;

   t1 = clock();
   double sexp1 = exp(vv).sum();
   t2 = clock();
   double tA = (double)(t2-t1)/CLOCKS_PER_SEC;
   std::cout << "Time A: " << tA << " " << sexp1 << std::endl;

   t1 = clock();
   double sexp2 = 0;
   for( unsigned int i = 0; i < N; i++){ sexp2 += exp(vv[i]); }
   t2 = clock();
   double tB = (double)(t2-t1)/CLOCKS_PER_SEC;
   std::cout << "Time B: " << tB << " " << sexp2  << std::endl;
}


and using floats we see the effect of vectorization:

Time A: 0.002032
Time B: 0.007636
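
On the unused-result problem mentioned above: printing the sum works, but the same guard can be built into the timing code itself. Below is a minimal sketch using only the standard library, so it runs without Eigen. The names sum_exp, Timed and time_call are hypothetical helpers made up for this illustration, and the volatile store is one common way to keep the compiler from discarding the work, not the only one.

```cpp
#include <chrono>
#include <cmath>
#include <vector>

// Hypothetical helper: plain scalar loop summing exp(x) over the input,
// equivalent to the hand-written loop ("Time B") in the benchmark.
double sum_exp(const std::vector<double>& v) {
    double s = 0.0;
    for (double x : v) {
        s += std::exp(x);
    }
    return s;
}

// Hypothetical result type: elapsed wall-clock seconds plus the computed value.
struct Timed {
    double seconds;
    double value;
};

// Times a callable that returns a double. The result is written to a
// volatile variable, so the compiler must actually produce it and cannot
// eliminate the computation as dead code.
template <typename F>
Timed time_call(F f) {
    auto t0 = std::chrono::steady_clock::now();
    volatile double sink = f();
    auto t1 = std::chrono::steady_clock::now();
    double elapsed = std::chrono::duration<double>(t1 - t0).count();
    return Timed{elapsed, sink};
}
```

Calling time_call([&]{ return sum_exp(v); }) returns both the elapsed time and the sum, so the measurement no longer depends on remembering to print the result. The same wrapper could take a lambda returning exp(vv).sum() for the Eigen variant, assuming Eigen is available.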

