This forum has been archived. All content is frozen. Please use KDE Discuss instead.

Code optimisation, Kronecker matrix-vector product

Tags: None
(comma "," separated)
ErlendA
Registered Member
Posts
12
Karma
0
OS
So I have this code, which mulitplies the Kronecker product $Q_1 \otimes Q_2 \otimes \cdots \otimes Q_n$ with a vector, $v$, so that $b=(Q_1 \otimes Q_2 \otimes \cdots \otimes Q_n)v$. The input is a std::vector of sparse matrices , an input vector and an output vector.

In the code below, the innermost loops
Code: Select all
for(int lll=index1;lll<=index2;lll=lll+nright)
{
    innerInVec[llk-1]=outVec[lll-1];
    llk=llk+1;
}


for(int lll=index1;lll<=index2;lll=lll+nright)
{
    outVec[lll-1]=innerOutVec[llk-1];
    llk=llk+1;
}

copies parts of a vector into a smaller one, and a smaller one into a bigger one, and I suspect this implementation hampers performance a bit.

What I would like, is something similar to a block, but with variable jumps between the values it takes out. I thinking about using a map with strides, but I didn't get it to work properly. Does anyone have a good idea on how to optimise this?


Code: Select all
#ifndef KRON_MATVEC_H
#define KRON_MATVEC_H


#include <iostream>
#include <Eigen/Eigen>
#include <Eigen/Sparse>
#include <Eigen/StdVector>


using namespace Eigen;


template<class Derived1, class Derived2>
void kronMatVec(std::vector<Derived1 , aligned_allocator<Derived1> > &Qs, MatrixBase<Derived2> &inVec,
      MatrixBase<Derived2> &outVec)
{
 
  typedef typename Derived2::Scalar Scalar;
 
  int N = Qs.size();
 
  int totDim=1;
  for(int iii=0;iii<N;++iii)
  {
    totDim=totDim*Qs[iii].cols();
  }
 
  assert(inVec.rows() == totDim);
 
  VectorXi n=VectorXi::Zero(N);
  int nright=1;
  int nleft=1;
  for(int iii=1;iii<=N-1;++iii)
  {
    n[iii-1]=Qs[iii-1].rows();
    nleft=nleft*n[iii-1];
  }
 
  outVec=inVec;
 
  n[N-1]=Qs[N-1].rows();
 
  // std::cout << n << std::endl;

 
  int base;
  int jump;
  int index1;
  int index2;
  int sizeInner;
  int llk;
 
  Matrix<Scalar,Dynamic,1> innerInVec;
  Matrix<Scalar,Dynamic,1> innerOutVec;
 
 
  for(int iii=N;iii>=1;--iii)
  {
    base=0;
    jump=n[iii-1]*nright;
    for(int kkk=1;kkk<=nleft;++kkk)
    {
      for(int jjj=1;jjj<=nright;++jjj)
      {
   index1=base+jjj;
   index2=base+jjj+nright*(n[iii-1]-1);
   
   sizeInner=(index2-index1)/nright+1;
   innerInVec.resize(sizeInner);
   innerOutVec.resize(sizeInner);
   // std::cout << "Rows: " << innerInVec.rows() << "\n" << std::endl;
   
   llk=1;
   for(int lll=index1;lll<=index2;lll=lll+nright)
   {
     innerInVec[llk-1]=outVec[lll-1];
     llk=llk+1;
   }
   
   //std::cout << innerInVec << "\n" << std::endl;
   
   //std::cout << "Qs cols: " << Qs[iii-1].cols() << "\nVector rows: " << innerInVec.rows() << std::endl;
   
   innerOutVec.noalias()=Qs[iii-1]*innerInVec;
   
   llk=1;
   for(int lll=index1;lll<=index2;lll=lll+nright)
   {
     outVec[lll-1]=innerOutVec[llk-1];
     llk=llk+1;
   }

      }
      base=base+jump;
     
    }
   
    nleft=nleft/n[std::max(iii-2,1)];
    nright=nright*n[iii-1];
  }
 
 
}


#endif
User avatar
ggael
Moderator
Posts
3447
Karma
19
OS
you could indeed use a:

Map<Matrix<Scalar,Dynamic,1>,0,InnerStride<> > sub(&outVec.coeffRef(index1), sizeInner, InnerStride<>(nright));

but I'm not sure you will get much speedup.
ErlendA
Registered Member
Posts
12
Karma
0
OS
Thanks for the tips. I managed to get the code a bit more expressive, using this trick. It is about 5% faster.

One thing that bothers me though, is how to get the proper const_casting to work (as in http://eigen.tuxfamily.org/dox/TopicFun ... Types.html).

The following const_cast does not work
Code: Select all
const_cast< Map<Matrix<Scalar,Dynamic,1>,0,InnerStride<> >& >(outMap)=Qs[iii-1]*outMap;


Code: Select all
template<class Derived1, class Derived2>
void kronMatVec(const std::vector<Derived1 , aligned_allocator<Derived1> > &Qs, const MatrixBase<Derived2> &inVec,
      MatrixBase<Derived2> &outVec)
{
 
  typedef typename Derived2::Scalar Scalar;
 
  int N = Qs.size();
 
  int totDim=1;
  for(int iii=0;iii<N;++iii)
  {
    totDim=totDim*Qs[iii].cols();
  }
 
  assert(inVec.rows() == totDim);
 
  VectorXi n=VectorXi::Zero(N);
  int nright=1;
  int nleft=1;
  for(int iii=1;iii<=N-1;++iii)
  {
    n[iii-1]=Qs[iii-1].rows();
    nleft=nleft*n[iii-1];
  }
 
  outVec=inVec;

  n[N-1]=Qs[N-1].rows();

  int base;
  int jump;
  int index1;
  int index2;
  int sizeInner;
  int llk;
 
  for(int iii=N;iii>=1;--iii)
  {
    base=0;
    jump=n[iii-1]*nright;
    for(int kkk=1;kkk<=nleft;++kkk)
    {
      for(int jjj=1;jjj<=nright;++jjj)
      {
   index1=base+jjj;
   index2=base+jjj+nright*(n[iii-1]-1);
   
   sizeInner=(index2-index1)/nright+1;
   
   Map<Matrix<Scalar,Dynamic,1>,0,InnerStride<> > outMap(&outVec.coeffRef(index1-1),sizeInner, InnerStride<>(nright));
   // std::cout << outMap << "\n" << std::endl;
   outMap=Qs[iii-1]*outMap;
   // const_cast< Map<Matrix<Scalar,Dynamic,1>,0,InnerStride<> >& >(outMap)=Qs[iii-1]*outMap;

      }
      base=base+jump;
     
    }
   
    nleft=nleft/n[std::max(iii-2,1)];
    nright=nright*n[iii-1];
  }
 
}
User avatar
ggael
Moderator
Posts
3447
Karma
19
OS
outMap is not const so why would you want to const_cast it?
ErlendA
Registered Member
Posts
12
Karma
0
OS
I don't need it to be const, but I try to conform to the guidelines at http://eigen.tuxfamily.org/dox/TopicFunctionTakingEigenTypes.html when I write code, in order to avoid possible caveats in later development. I also found a way to have constness at the same address.

Please regard this post as solved.

I include the code, for completeness, in case other users need it.

Code: Select all
template<class Derived1, class Derived2>
void kronMatVec(const std::vector<Derived1 , aligned_allocator<Derived1> > &Qs, const MatrixBase<Derived2> &inVec,
      const MatrixBase<Derived2> &outVec)
{
 
  typedef typename Derived2::Scalar Scalar;
 
  int N = Qs.size();
 
  int totDim=1;
  for(int iii=0;iii<N;++iii)
  {
    totDim=totDim*Qs[iii].cols();
  }
 
  assert(inVec.rows() == totDim);
 
  VectorXi n=VectorXi::Zero(N);
  int nright=1;
  int nleft=1;
  for(int iii=1;iii<=N-1;++iii)
  {
    n[iii-1]=Qs[iii-1].rows();
    nleft=nleft*n[iii-1];
  }
 
  const_cast< MatrixBase<Derived2>& >(outVec)=inVec;

  n[N-1]=Qs[N-1].rows();

  int base;
  int jump;
  int index1;
  int index2;
  int sizeInner;
  MatrixBase<Derived2>& outVecNonConst = const_cast< MatrixBase<Derived2>& >(outVec);
 
  for(int iii=N;iii>=1;--iii)
  {
    base=0;
    jump=n[iii-1]*nright;
    for(int kkk=1;kkk<=nleft;++kkk)
    {
      for(int jjj=1;jjj<=nright;++jjj)
      {
   index1=base+jjj;
   index2=base+jjj+nright*(n[iii-1]-1);
   
   sizeInner=(index2-index1)/nright+1;
   
   Map<Matrix<Scalar,Dynamic,1>,0,InnerStride<> > outMap(&outVecNonConst.coeffRef(index1-1),sizeInner, InnerStride<>(nright));
   outMap=Qs[iii-1]*outMap;

      }
      base=base+jump;
     
    }

    nleft=nleft/(n[std::max(iii-2,1)-1]);
    nright=nright*n[iii-1];
  }
 
}


Bookmarks



Who is online

Registered users: Bing [Bot], claydoh, Google [Bot], rblackwell, Yahoo [Bot]