## Colwise on Vec3 type is slow

Tue Oct 03, 2017 8:34 pm
Hi,

First of all thank you for your amazing library.

I'm trying to understand what's going on in the following code.

- I'm applying a 3D transformation (R|C) to some 3D point data.
- I'm using `operator()` to apply the transformation to either a vector or a matrix.
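
For reference, here is the transform written out. This assumes the convention implied by the code further down, with $R$ standing for `rotation_` and $C$ for `center_`. Each point $p$ is mapped to:

$$
p' = R\,(p - C)
$$

Applied to a $3 \times N$ matrix $P$ whose columns are points, the same operation acts on every column:

$$
P' = R\,\bigl(P - C\,\mathbf{1}_N^{\top}\bigr)
$$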

You will find below the code I used in order to reproduce the issue.
Calling `operator()` on Vec3 data in a loop is roughly 14x slower than calling `operator()` once on the whole matrix (259 ms vs. 18 ms in the run below).
I did not expect such a timing difference.

\$ ./main
For loop (Pose with Vec3): 259ms
(Pose with Mat3X): 18ms

If I add the following function to the Pose class, the loop runs just as fast, but I would prefer to have a single transform operator rather than implementing two.

```cpp
inline Vec3 operator () ( const Vec3& p ) const
{
  return rotation_ * ( p - center_ );
}
```

Can you help me understand what I did wrong?

```cpp
#include "Eigen/Dense"
#include <chrono>
#include <cstdlib>   // EXIT_SUCCESS
#include <iostream>
#include <ratio>
#include <type_traits>
#include <utility>   // std::move

using Vec3 = Eigen::Vector3d;
using Mat3 = Eigen::Matrix<double, 3, 3>;
using Mat3X = Eigen::Matrix<double, 3, Eigen::Dynamic>;

struct Pose
{
  Mat3 rotation_;
  Vec3 center_;

  Pose(const Mat3& r = std::move(Mat3::Identity()), const Vec3& c = std::move(Vec3::Zero()))
  : rotation_( r ), center_( c ) {}

  inline Mat3X operator () ( const Mat3X& p ) const
  {
    return rotation_ * ( p.colwise() - center_ );
  }
};

int main()
{
  const int nb_elem = 900000;
  Mat3X test = Mat3X::Random(3, nb_elem);
  Mat3 rot = Mat3::Identity();
  Vec3 center;
  Pose pose;

  // warm up
  for (int i : {0,1,2,3,4})
  {
    Mat3X testout = test;
    for (int i = 0; i < nb_elem; ++i)
    {
      testout.col(i) = pose(test.col(i));
    }
  }

  std::chrono::high_resolution_clock::time_point start;

  // Column-by-column: one operator() call per point
  {
    start = std::chrono::high_resolution_clock::now();
    Mat3X testout = test;
    for (int i = 0; i < nb_elem; ++i)
    {
      testout.col(i) = pose(test.col(i));
    }
    const auto end = std::chrono::high_resolution_clock::now();
    std::cout << "For loop (Pose with Vec3): "
      << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count() << "ms" << std::endl;
  }

  // Whole matrix: a single operator() call
  {
    start = std::chrono::high_resolution_clock::now();
    const Mat3X testout = pose(test);
    const auto end = std::chrono::high_resolution_clock::now();
    std::cout << "(Pose with Mat3X): "
      << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count() << "ms" << std::endl;
  }

  return EXIT_SUCCESS;
}
```

### Re: Colwise on Vec3 type is slow

Wed Oct 04, 2017 1:09 pm
This is because for every call to pose(p.col(i)) the compiler has to create a temporary Mat3X object (which is allocated on the heap, so there is a costly malloc hidden there) and copy p.col(i) into it. The same happens for the returned value. The fix is as simple as writing a single generic template function; in C++14:

```cpp
template<typename T>
auto operator() (const T& p) const {
  return rotation_ * ( p.colwise() - center_ );
}
```

This will return an expression, so be careful with:

```cpp
auto res = pose(p);
```

See: https://eigen.tuxfamily.org/dox/TopicPitfalls.html
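
To illustrate why `auto` needs care here, below is a toy sketch of a lazy expression in plain C++. `LazyDiff`, `lazy_diff`, `Observed`, and `observe_after_modification` are hypothetical names made up for this illustration and are not Eigen types. The sketch only assumes that, like an Eigen expression, the object stores references to its operands and computes values when they are accessed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy lazy expression (NOT Eigen's implementation): it keeps references to
// its operands and computes a coefficient only when operator[] is called.
struct LazyDiff {
    const std::vector<double>& lhs;
    const std::vector<double>& rhs;

    double operator[](std::size_t i) const { return lhs[i] - rhs[i]; }
    std::size_t size() const { return lhs.size(); }

    // Materialise the expression into an owning container.
    std::vector<double> eval() const {
        std::vector<double> out(size());
        for (std::size_t i = 0; i < size(); ++i) out[i] = (*this)[i];
        return out;
    }
};

LazyDiff lazy_diff(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    return LazyDiff{a, b};
}

// Values read through the lazy expression and through the evaluated copy.
struct Observed {
    double lazy;
    double evaluated;
};

Observed observe_after_modification() {
    std::vector<double> a{5.0, 7.0};
    std::vector<double> b{1.0, 2.0};

    auto expr = lazy_diff(a, b);                         // deduced as LazyDiff: nothing computed yet
    std::vector<double> value = lazy_diff(a, b).eval();  // computed now, owns its data

    a[0] = 100.0;  // modify an operand afterwards

    // expr still refers to a and b, so it sees the new value; value does not.
    return Observed{expr[0], value[0]};
}
```

In this toy, the `auto` variable tracks later changes to its operands (and would dangle if they went out of scope), while the evaluated copy does not. In Eigen terms, the analogous safe forms are assigning to a concrete type, e.g. `Mat3X res = pose(p);`, or calling `.eval()` on the expression.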

If you want to return the computed result (i.e., a Matrix), then write:

```cpp
template<typename T>
typename T::PlainObject operator() (const T& p) const {
  return rotation_ * ( p.colwise() - center_ );
}
```
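
The hidden temporaries described at the top of this reply can be reproduced without Eigen. The following is a minimal sketch using toy stand-ins: `Col3` plays the role of a fixed-size column and `DynMat` the role of a heap-allocated Mat3X. All names in it (`DynMat`, `shift_dynamic`, `shift_generic`, `g_heap_allocs`, and the two counting helpers) are hypothetical and chosen for illustration only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Col3 = std::array<double, 3>;  // toy fixed-size column, lives on the stack

// Counts DynMat constructions; each one allocates a std::vector buffer.
static int g_heap_allocs = 0;

// Toy dynamic matrix (NOT an Eigen type), implicitly constructible from a column.
struct DynMat {
    std::vector<double> data;  // 3 x n coefficients
    explicit DynMat(std::size_t cols) : data(3 * cols) { ++g_heap_allocs; }
    DynMat(const Col3& c) : data(c.begin(), c.end()) { ++g_heap_allocs; }
};

// Single overload on the dynamic type: passing a Col3 forces a conversion temporary.
DynMat shift_dynamic(const DynMat& p, double offset) {
    assert(p.data.size() % 3 == 0);
    DynMat out(p.data.size() / 3);  // the result is heap-allocated as well
    for (std::size_t i = 0; i < p.data.size(); ++i) out.data[i] = p.data[i] - offset;
    return out;
}

// Generic version: works on the argument's own type, so no conversion happens.
template <typename T>
T shift_generic(const T& p, double offset) {
    T out = p;
    for (auto& x : out) x -= offset;
    return out;
}

int allocs_with_dynamic_overload(int calls) {
    g_heap_allocs = 0;
    const Col3 c{{1.0, 2.0, 3.0}};
    for (int i = 0; i < calls; ++i) {
        DynMat r = shift_dynamic(c, 0.5);  // Col3 -> temporary DynMat, plus the result
        (void)r;
    }
    return g_heap_allocs;
}

int allocs_with_generic_function(int calls) {
    g_heap_allocs = 0;
    const Col3 c{{1.0, 2.0, 3.0}};
    for (int i = 0; i < calls; ++i) {
        Col3 r = shift_generic(c, 0.5);  // stays a Col3 throughout
        (void)r;
    }
    return g_heap_allocs;
}
```

In this toy, ten column calls through `shift_dynamic` construct twenty `DynMat` objects (one conversion temporary plus one result per call), while the same calls through `shift_generic` construct none.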
### Re: Colwise on Vec3 type is slow

Wed Oct 04, 2017 5:41 pm
Thank you very much.

I will choose the template solution for the moment and will try to learn how to better use Eigen from the TopicPitfalls doc webpage!

Before:
\$ ./main
For loop (Pose with Vec3): 292ms
(Pose with Mat3X): 18ms

After:
\$ ./main
For loop (Pose with Vec3): 5ms // Clearly better!
(Pose with Mat3X): 18ms

Is there any tool you would recommend, for either runtime or static analysis, that can detect this kind of suboptimal parameter or return type?

