
Row vector distance to row vector is slow

vuakko
Registered Member
Posts
4
Karma
0
The original problem comes from calculating distances between some (not all) rows of row-major matrices. I've noticed that the following code is fast:

Code: Select all
VectorXd A(100);
VectorXd X(100);
(A-X).squaredNorm();

But if both vectors are row vectors, the same code is slow:

Code: Select all
RowVectorXd A(100);
RowVectorXd X(100);
(A-X).squaredNorm();


How is this possible? It seems that the row-row case is the only slow one; changing at least one of the vectors to a column vector makes it fast again.

Therefore when I calculate row distances of two row-major matrices I get this bad case.

Using Eigen 2.0.12 on 64-bit Linux with GCC.
bjacob
Registered Member
Posts
658
Karma
3
Seems like you hit a strange performance bug with Eigen 2.0.

Can you try with Eigen 3 (the development branch)?

At this stage I'm not sure we'll still fix purely performance issues in 2.0.


Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list!
vuakko
bjacob wrote: Seems like you hit a strange performance bug with Eigen 2.0.

Can you try with Eigen 3 (the development branch)?

It seems Eigen 3 also has a strange performance bug. The difference isn't actually huge, around 20%, but it's still there. Here's a simple test case:

Code: Select all
#include <cstdlib>
#include <Eigen/Core>

using namespace Eigen;

int main() {
   srand(1);
   const int nTests = 10000;
   const int nDim = 100;
   
   double result = 0; //Compiler can't throw the loop away now
   for(int k = 0; k < nTests; ++k) {
      VectorXd A = VectorXd::Random(nDim);      
      VectorXd B = VectorXd::Random(nDim);      
      result += (A-B).squaredNorm();
   }
   return (int)result;
}

I need to check my other code, because my whole program is hugely slower when using Eigen instead of STL.
This was just the first problem I could isolate...
bjacob
Wait, this is not a good benchmark for a few reasons. First, you have to worry about the compiler throwing away the computation; the proper approach is to put it in a non-inlinable function. For that we have EIGEN_DONT_INLINE. Second, you are mostly measuring the time taken by random number generation. Third, time measurement is best done from within the program itself, using performance timers. For that we have the bench/BenchTimer.h helper file. Fourth, the number of repetitions is so small, and the program finishes so fast, that many other factors (e.g. startup time) could skew the result. So I rewrote the benchmark as follows:

Code: Select all
#include <Eigen/Dense>
#include <bench/BenchTimer.h>
#include <iostream>

using namespace Eigen;

template<typename VectorType>
EIGEN_DONT_INLINE // so the compiler can't throw it away
typename VectorType::RealScalar foo(const VectorType& A, const VectorType& B)
{
  return (A-B).squaredNorm();
}

template<typename VectorType>
void bench(int nDim, int nTests)
{
   VectorType A = VectorType::Random(nDim);
   VectorType B = VectorType::Random(nDim);
   BenchTimer t; // careful: timer taking best time across runs
   t.start();
   for(int k = 0; k < nTests; ++k) {
     foo(A, B);
   }
   t.stop();
   if(A.rows() == 1) std::cout << "row vectors: ";
   else std::cout <<"column vectors: ";
   std::cout << t.value() << " s" << std::endl;
}

int main() {
   bench<VectorXd>   (100, 10000000);
   bench<RowVectorXd>(100, 10000000);
}


Here are my compilation options and results:
Code: Select all
##### 07:58:52 ~/cuisine$ g++ a.cpp -o a -O2 -I ../eigen2.0 -lrt
##### 07:58:53 ~/cuisine$ ./a
column vectors: 0.603313 s
row vectors: 0.598018 s
##### 07:58:56 ~/cuisine$ ./a
column vectors: 0.58084 s
row vectors: 0.58059 s
##### 07:58:58 ~/cuisine$ g++ a.cpp -o a -O2 -I ../eigen -lrt
##### 07:59:02 ~/cuisine$ ./a
column vectors: 0.577421 s
row vectors: 0.558549 s
##### 07:59:04 ~/cuisine$ ./a
column vectors: 0.57724 s
row vectors: 0.557048 s


As you can see, I can't reproduce the problem...


vuakko
Well, I know it wasn't really a good benchmark, but it still gave me clear differences between STL and Eigen.
However, I have now also found out that at least part of the problem was in my function declarations. I had a function

Code: Select all
double sqDistance(const VectorXd& A, const VectorXd& B) {
   return (A-B).squaredNorm();
}

and I called it with two MatrixXd::row objects. It seems that because MatrixXd::row actually returns a Block object,
the VectorXd arguments of the function are implicitly copy-constructed from the Blocks on every call.
Just switching to the templated VectorType from your example made up the performance difference.
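
The copying described above can be illustrated without Eigen. The following is only a minimal sketch using toy stand-in types: RowView, OwningVec, sqDistanceFixed and sqDistanceGeneric are hypothetical names invented for this illustration, not Eigen classes or functions. It assumes that the real Block-to-VectorXd conversion behaves like the converting constructor here, i.e. that it allocates and copies on every call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for the Block returned by MatrixXd::row(): a non-owning view.
struct RowView {
    const double* ptr;
    std::size_t len;
    RowView(const double* p, std::size_t n) : ptr(p), len(n) {}
    std::size_t size() const { return len; }
    double operator[](std::size_t i) const { return ptr[i]; }
};

// Toy stand-in for VectorXd: owns its storage.
struct OwningVec {
    std::vector<double> values;
    static int conversions;  // how many times a view was copied into a new vector
    // Implicit converting constructor: allocates and copies the viewed data.
    OwningVec(const RowView& v) : values(v.ptr, v.ptr + v.len) { ++conversions; }
    std::size_t size() const { return values.size(); }
    double operator[](std::size_t i) const { return values[i]; }
};
int OwningVec::conversions = 0;

// Shared loop: squared Euclidean distance between two indexable objects.
template <typename V1, typename V2>
double sqDistanceImpl(const V1& a, const V2& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Fixed parameter type: a RowView argument is first converted (copied)
// into a temporary OwningVec, which is then bound to the const reference.
double sqDistanceFixed(const OwningVec& a, const OwningVec& b) {
    return sqDistanceImpl(a, b);
}

// Templated parameter types: each argument is bound by reference as the
// type it already has, so no temporary is created.
template <typename V1, typename V2>
double sqDistanceGeneric(const V1& a, const V2& b) {
    return sqDistanceImpl(a, b);
}
```

Calling sqDistanceFixed with two RowView arguments increments OwningVec::conversions twice per call, while sqDistanceGeneric leaves the counter unchanged. In a tight loop over matrix rows, those per-call allocations and copies can easily dominate the cost of the distance computation itself.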

So now I'm back to debugging other performance issues in my code...

P.S. Does Eigen have anything similar to the project function of uBLAS? It works like this:
Code: Select all
//B is submatrix of A given by index sets r1 and r2.
B = project(A, r1, r2);
vuakko
Another question on the same thing. The foo function you defined above can't handle the case
where one argument is a plain VectorXd and the other one is a Block (e.g. MatrixXd::row).

Shouldn't there be a common super-class of both that the compiler sets as VectorType in the template?
Is this a problem where we should have dynamic binding instead of static given by templates?
bjacob
vuakko wrote: The foo function you defined above can't handle the case
where one argument is a plain VectorXd and the other one is a Block (e.g. MatrixXd::row).


Just make it a template with two different type parameters:
Code: Select all
template<typename VecType1, typename VecType2> void foo(const VecType1& v1, const VecType2& v2);


vuakko wrote: Shouldn't there be a common super-class of both that the compiler sets as VectorType in the template?


There is. It's called MatrixBase, and it is templated on the actual derived type (so it's the curiously recurring template pattern, CRTP).

Code: Select all
template<typename Derived> void foo(const MatrixBase<Derived>& m);


For example, Matrix4f inherits MatrixBase<Matrix4f>, etc. This is static polymorphism: the base class knows at compile time the derived type, since it's passed to it as a template parameter.
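
To make the static polymorphism idea concrete, here is a minimal sketch of the same pattern using toy types. It does not use Eigen: VecBase, PlainVec, RowView and sqDistance are hypothetical names made up for this illustration, with VecBase playing the role that MatrixBase plays in Eigen. The function takes two independently deduced derived types, so an owning vector and a row view can be mixed freely.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy CRTP base: it knows the derived type at compile time and forwards
// to it with a static_cast, so no virtual calls are involved.
template <typename Derived>
struct VecBase {
    const Derived& derived() const { return static_cast<const Derived&>(*this); }
    std::size_t size() const { return derived().sizeImpl(); }
    double coeff(std::size_t i) const { return derived().coeffImpl(i); }
};

// Owning vector (loosely analogous to a plain VectorXd).
struct PlainVec : VecBase<PlainVec> {
    std::vector<double> values;
    explicit PlainVec(std::vector<double> v) : values(std::move(v)) {}
    std::size_t sizeImpl() const { return values.size(); }
    double coeffImpl(std::size_t i) const { return values[i]; }
};

// Non-owning view of one row of a row-major buffer (loosely analogous to a Block).
struct RowView : VecBase<RowView> {
    const double* first;
    std::size_t cols;
    RowView(const double* rowMajor, std::size_t nCols, std::size_t row)
        : first(rowMajor + row * nCols), cols(nCols) {}
    std::size_t sizeImpl() const { return cols; }
    double coeffImpl(std::size_t i) const { return first[i]; }
};

// Accepts any two types derived from VecBase; D1 and D2 are deduced
// separately, so the arguments are taken by reference without conversion.
template <typename D1, typename D2>
double sqDistance(const VecBase<D1>& a, const VecBase<D2>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a.coeff(i) - b.coeff(i);
        sum += d * d;
    }
    return sum;
}
```

With these definitions, sqDistance(plainVec, rowView) and sqDistance(rowViewA, rowViewB) both compile to direct, inlinable calls. The trade-off compared with dynamic binding is that the function must be a template, so its definition has to be visible wherever it is called.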



