This forum has been archived. All content is frozen. Please use KDE Discuss instead.

Convolution between two matrices (images)

Tags: None
(comma "," separated)
linello
Registered Member
Posts
56
Karma
0
OS
Hi you all, I'm interested if there is a (fast) method to compute, a 2D convolution between two matrices, possibly also defining a kernel, like in Matlab using the conv2 function.

Thanks again!
linello
Registered Member
Posts
56
Karma
0
OS
A first temptative solution can be the following...


Code: Select all
template < typename Scalar, int SizeX, int SizeY , int KSizeX, int KSizeY>
Eigen::Matrix< Scalar, SizeX, SizeY >  conv2(  Eigen::Matrix<Scalar, SizeX, SizeY> &I,  Eigen::Matrix< Scalar, KSizeX, KSizeY > &kernel )
{
    Eigen::Matrix<Scalar, SizeX, SizeY> O;
    int col=0,row=0,i=0,j=0;
   
    for ( row = KSizeX; row < SizeX-KSizeX; row++ )
    {
      for ( col = KSizeY; col < SizeY-KSizeY; col++ )
      {
      Scalar accumulation = 0;
      Scalar weightsum = 0;
      for ( i = 0; i < KSizeX; i++ )
      {
        for ( j = 0; j < KSizeY; j++ )
        {
          Scalar k = I(row+i, col+j);
          accumulation += k * kernel(i,j);
          weightsum += kernel(i,j);
        }
       
      }
     
      O(row,col) = Scalar (accumulation/weightsum);
    }
    }
    return O;
}


Tried with the following lines of code seems to produce a adequate smoothing ( the kernel was taken from wikipedia http://en.wikipedia.org/wiki/Gaussian_blur)

Code: Select all
    Matrix<float,100,100> I;
    I.setRandom();
   
    Matrix<float,7,7> K;
    K << 0.00000067, 0.00002292, 0.00019117, 0.00038771, 0.00019117, 0.00002292, 0.00000067,
    0.00002292, 0.00078633, 0.00655965, 0.01330373, 0.00655965, 0.00078633, 0.00002292,
    0.00019117, 0.00655965, 0.05472157, 0.11098164, 0.05472157, 0.00655965, 0.00019117,
    0.00038771, 0.01330373, 0.11098164, 0.22508352, 0.11098164, 0.01330373, 0.00038771,
    0.00019117, 0.00655965, 0.05472157, 0.11098164, 0.05472157, 0.00655965, 0.00019117,
    0.00002292, 0.00078633, 0.00655965, 0.01330373, 0.00655965, 0.00078633, 0.00002292,
    0.00000067, 0.00002292, 0.00019117, 0.00038771, 0.00019117, 0.00002292, 0.00000067 ;
   
    Matrix<float,100,100> O = conv2(I,K);
    cout << O << endl;


I would like to try some better vectorization...in fact there is no vectorization here ;)

P.S. Boundaries are not handled correctly...


I came up with a better solution...

Code: Select all
template <typename Derived, typename Derived2 >
Derived conv2d(const MatrixBase<Derived>& I, const MatrixBase<Derived2> &kernel )
{
    Derived O = Derived::Zero(I.rows(),I.cols());
   
   
    typedef typename Derived::Scalar Scalar;
    typedef typename Derived2::Scalar Scalar2;
   
    int col=0,row=0;
    int KSizeX = kernel.rows();
    int KSizeY = kernel.cols();
   
    int limitRow = I.rows()-KSizeX;
    int limitCol = I.cols()-KSizeY;
   
    Derived2 block ;
    Scalar normalization = kernel.sum();
    if ( normalization < 1E-6 )
    {
        normalization=1;
    }
    for ( row = KSizeX; row < limitRow; row++ )
    {
     
      for ( col = KSizeY; col < limitCol; col++ )
      {   
      Scalar b=(static_cast<Derived2>( I.block(row,col,KSizeX,KSizeY ) ).cwiseProduct(kernel)).sum();
      O.coeffRef(row,col) = b;
      }
    }
   
    return O/normalization;
}
User avatar
ggael
Moderator
Posts
3447
Karma
19
OS
thanks, this will probably be useful to other users though the proper handling of boundaries would be very nice!
matthiasleimeister
Registered Member
Posts
3
Karma
0
Hi all,

I hope it's ok to reopen this very old thread :). I was wondering how the above code example will likely be optimized by Eigen. Especially, can the component-wise product and sum be vectorized?

As I understand it, I.block(row,col,KSizeX,KSizeY ) will only be properly aligned in memory for specific values of row and col. Or is there a temporary object created each time a block is extracted from the full matrix? I checked the assembly output for this bit of code and could only see scalar simd instructions (mulss/addss). Could missing alignment be the cause of this or is there probably some optimization flag missing in my project? In the first case, would it be worth putting the block in the inner loop into a newly created matrix to allow for alignment?

Thanks a lot and best regards,
Matthias
User avatar
ggael
Moderator
Posts
3447
Karma
19
OS
The line:

I.block(row,col,KSizeX,KSizeY ).cwiseProduct(kernel)).sum()

should indeed be vectorized, unless KSizeX/KSizeY are not suitable for vectorization. Copying the block into an aligned buffer is point-less as this operation is memory bounded anyway.

To guarantee optimal vectorization, it would be better to process the input image per packet, like:
Code: Select all
const int S = half_filter_size;
const int K = 8; // packet size
for each column j
  for(i=S; i+K+S<rows; i+=K)
  {
     Matrix<Scalar,K,1> res = Matrix<Scalar,K,1>::Zero();
     for(h=-S;h<=S;++h)
       for(g=-S;g<=S;++g)
          res += kernel(g+S,h+S) * input.col(j+h).segment<K>(i+g);
    output.col(j).segment<K>(i) = K;
  }


Bookmarks



Who is online

Registered users: Bing [Bot], Google [Bot], Yahoo [Bot]