This forum has been archived. All content is frozen. Please use KDE Discuss instead.

Optimization Suggestions

Tags: None
(comma "," separated)
RLovelett
Registered Member
Posts
11
Karma
0

Optimization Suggestions

Thu Jul 14, 2011 7:32 pm
I am trying to compare the Eigen product of two matrices with my implementation of the product of two matrices.

In my outer loop I am calling:

Code: Select all
Spline l(/*correct stuff here*/);
l.cubic(/*correct stuff here*/);
/* and */
l.cubicEigen(/*correct stuff here*/);


I'm calling each about 10M times to get a fair benchmark. It's taking the method using my product 2.71s and taking the Eigen version 4.587s. The answers are exactly the same for all of my test cases so I'm assuming they are working similarly.

Is this expected or am I doing something wrong?

PC Vitals:
Win XP SP3
VS 2010
Core i7

Code: Select all
class Spline
{
//Program interpolates a set of n-dimensional data for a given set of n interpolation values.
//Multiple sets of interpolation values may be used  by calling the interpolation function each time.
public:
  Spline(size_t xsSize, double* xs, size_t ysSize, double* ys);
  ~Spline();
  double cubic(double x, int closestIndex);
  double cubicEigen(double x, int closestIndex);
protected:
  size_t _xsSize;
  size_t _ysSize;
  size_t _hsSize;
  double* _xs;
  double* _ys;
  double* _hs;
  double* _A;
  Eigen::Vector4d _B;
  Eigen::Vector4d _C;
  double* _D;
  double* _S;
  Eigen::Matrix4d oneDEigen;
private:
  double* geths(size_t nSize, double* n);
  static double* product(double* A, size_t aRows, size_t aCols, double* B, size_t bRows, size_t bCols);
  static const double oneD[16];
};

Spline::Spline(size_t xsSize, double* xs, size_t ysSize, double* ys)
{
  _xsSize = xsSize;
  _ysSize = ysSize;
  _xs = new double[_xsSize];
  _ys = new double[_ysSize];
  memcpy(_xs, xs, xsSize * sizeof(double));
  memcpy(_ys, ys, ysSize * sizeof(double));
  _A = new double[xsSize];
  _B.resize(4, 1);
  _C.resize(4, 1);
  _D = new double[xsSize];
  _S = new double[xsSize];
  _hsSize = _xsSize - 1;
  _hs = geths(_hsSize, xs);
  oneDEigen <<  2, -2,  1,  1,
               -3,  3, -2, -1,
                0,  0,  1,  0,
                1,  0,  0,  0;
}

// Returns the interpolated value for X given a set of xs and ys in a 1D spline.
double Spline::cubic(double x, int closestIndex)
{
  double* B = new double[4];
  memcpy(B, &_ys[closestIndex], 2*sizeof(double));
  B[2] = (_ys[closestIndex + 1] - _ys[closestIndex - 1]) / 2;
  B[3] = (_ys[closestIndex + 2] - _ys[closestIndex]) / 2;
  double* C = product((double*)oneD, 4, 4, B, 4, 1);

  //Get interpolated value
  double h = _xs[closestIndex];
  double h1 = _xs[closestIndex + 1];
  double xh = (x - h) / (h1 - h);
  double xhsq = xh * xh;
  double xhcb = xhsq * xh;
  return (C[0] * xhcb) + (C[1] * xhsq) + (C[2] * xh) + C[3];
}

double Spline::cubicEigen(double x, int closestIndex)
{
  double fx_1 = _ys[closestIndex - 1];
  double fx = _ys[closestIndex];
  double fx1 = _ys[closestIndex + 1];
  double fx2 = _ys[closestIndex + 2];
  double dx = (fx1 - fx_1) / 2;
  double dx1 = (fx2 - fx) / 2;

  _B << fx, fx1, dx, dx1;
  _C = oneDEigen * _B;

  //Get interpolated value
  double h = _xs[closestIndex];
  double h1 = _xs[closestIndex + 1];
  double xh = (x - h) / (h1 - h);
  double xhsq = xh * xh;
  double xhcb = xhsq * xh;
  return (_C(0,0) * xhcb) + (_C(1,0) * xhsq) + (_C(2,0) * xh) + _C(3,0);
}

/**
 * Performs matrix multiplication of A * B = C.
 *
 * A matrix is expected to be defined as a standard C array. Therefore, if your
 * matrix looked like:
 * | a b c |
 * | d e f |
 *
 * It would be stored in a C array like this:
 * double matrix[2*3] = {a,b,c,d,e,f};
 *
 * Therefore, if you wanted to multiply these two matrices:
 * | a b c |   | x |
 * | d e f | * | y | =
 * | g h i |   | z |
 *
 * Example:
 * size_t aRows = 3, aCols = 3;
 * double A[aRows * aCols] = {a, b, c, d, e, f, g, h, i};
 * size_t bRows = 3, aCols = 1;
 * double B[bRows * bCols] = {x, y, z};
 * double* C = product(A, aRows, aCols, B, bRows, bCols);
 */
double* Spline::product(double* A, size_t aRows, size_t aCols, double* B, size_t bRows, size_t bCols)
{
  // Two matrices can be multiplied only when the number of columns in the
  // first equals the number of rows in the second.
  assert(aCols == bRows);

  // If A is an m-by-n matrix and B is an n-by-p matrix,
  // then their matrix product C is a m-by-p matrix.
  size_t elementsInC = aRows * bCols;
  double* C = new double[elementsInC];

  size_t offset = bRows - 1;
  // Perform the looping multiplication
  for(size_t ii = 0; ii < aRows; ii++)      // Current row index of: A
  {
    size_t aOffset = ii * aCols;
    size_t cOffset = ii * bCols;
    for(size_t jj = 0; jj < bCols; jj++)    // Current col index of: B
    {
      double sum = 0;
      for(size_t kk = 0; kk < aCols; kk++)
        sum += A[aOffset + kk] * B[(kk * bCols) + jj];

      C[cOffset + jj] = sum;
    }
  }

  return C; 
};
User avatar
ggael
Moderator
Posts
3447
Karma
19
OS

Re: Optimization Suggestions

Fri Jul 15, 2011 11:14 am
that's strange, make sure you benched in "release" mode, i.e., with the optimization enabled. If you have a small self contained test (i.e., with the /*correct stuff here*/ replaced by true code), I could try it myself with GCC to see if that comes from MSVC.
RLovelett
Registered Member
Posts
11
Karma
0

Re: Optimization Suggestions

Fri Jul 15, 2011 2:24 pm
Ok I have posted the exact code used to create this output:
https://gist.github.com/1084757

Using Spline#cubic
Code: Select all
f(5)=1 calculated in: 2.685s.
Fin.


Using Spline#cubicEigen
Code: Select all
f(5)=1 calculated in: 0.373s.
Fin.


@ggael,
So Eigen is faster, WAAAAY FASTER, as I was hoping. I'm sorry for wasting your time. My solution method was to delete everything, checkout my code, and recompile. Once I did that the speed difference became apparent.

Just thought I would post the "final" Eigen code so that if anyone needed it they could use it for reference.


Bookmarks



Who is online

Registered users: bartoloni, Bing [Bot], Google [Bot], Yahoo [Bot]