
Performance of GEMV

matthiasw

Fri Jun 05, 2015 1:10 pm
Hello again,

I have a question about a performance issue with a gemv operation:
Code:
y = H*x

where y and x are complex vectors and H is a general complex (in my case Hermitian) matrix.

Here is a snippet from performance testing:
Code:
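#include <cstdio>
#include <Eigen/Dense>
// BenchTimer and BENCH come from the Eigen source tree;
// this assumes the root of that tree is on the include path.
#include <bench/BenchTimer.h>

using Eigen::BenchTimer;

#define MATRIX_TYPE Eigen::MatrixXcd
#define VECTOR_TYPE Eigen::VectorXcd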

void test_eigen(const MATRIX_TYPE& H, const VECTOR_TYPE& x, VECTOR_TYPE& y) {
  y.noalias() = H*x;
}

void test_loop(const MATRIX_TYPE& H, const VECTOR_TYPE& x, VECTOR_TYPE& y) {
  //y.setZero();
  for(int i = 0; i < H.rows(); ++i) {
    for(int j = 0; j < H.cols(); ++j) {
      y[i] += H(i,j)*x[j];
    }
  }
}

int main(int, const char**) {
  BenchTimer t1, t2;

  VECTOR_TYPE x, y;
  MATRIX_TYPE H;
  x.setZero(2);
  y.setZero(2);
  H.setZero(2,2);

  // assign the whole element: std::complex::real() returns a value since C++11
  x(0) = std::complex<double>(1., 0.);
  // only a test matrix
  H.imag() << 0, 1.,
              -1, 0;

  BENCH(t1, 1000,1000, test_eigen(H,x,y))
  BENCH(t2, 1000,1000, test_loop(H,x,y))

  printf("Eigen GEMV: best: %lf  total: %lf\n", t1.best(), t1.total());
  printf("Loop GEMV: best: %lf  total: %lf\n", t2.best(), t2.total());
}


Compiled with:
$ g++ -O3 -msse2 -DNDEBUG -DEIGEN_NO_DEBUG test.cpp
it prints:
Code:
Eigen GEMV: best: 0.000031  total: 0.034157
Loop GEMV: best: 0.000013  total: 0.013308


So, if I am comparing properly, the Eigen "vectorized" code runs about 2.5 times slower than the loop.
I suspect that the assignment triggers memory reallocations, which slows the code down. Is there a way to optimize the matrix-vector product so that it is as fast as the loop approach? I have to perform this multiplication about 1-2 million times, for dimensions up to 100, to solve a differential equation, so performance matters.


Thanks!

Matthias
ggael

Re: Performance of GEMV

Fri Jun 05, 2015 1:46 pm
I see that you are using extremely small matrices, 2x2. In that case, using the fixed-size types Matrix2cd and Vector2cd should lead to a substantial speedup. noalias() already bypasses memory reallocation. To make the comparison fairer, use .setZero() in the loop-based version, or += in the expression-based version.
matthiasw

Re: Performance of GEMV

Fri Jun 05, 2015 2:15 pm
With setZero() enabled I get:

Code:
Eigen GEMV: best: 0.000031  total: 0.031608
Loop GEMV: best: 0.000020  total: 0.020782

so Eigen is still about 50% slower.

However, with a bigger matrix (100x100), Eigen performs much better (with 100 instead of 1k repetitions):
Code:
Eigen GEMV: best: 0.000830  total: 0.092078
Loop GEMV: best: 0.002916  total: 0.292182


I compared the timings as a function of the matrix size, and it seems that the vectorized product is faster for sizes > 4. I think that solves my issue, thanks!

