optimizing matrix-vector multiplies • KDE Community Forums

This forum has been archived. All content is frozen. Please use KDE Discuss instead.

Board index

optimizing matrix-vector multiplies

Page 1 of 1 (8 posts)

Tags:

konolige Registered Member Posts 9 Karma 0 OS	optimizing matrix-vector multiplies Sun Aug 29, 2010 4:14 am Hi - I have the following function, that performs a specialized block-matrix to vector multiply, where the block matrices are distributed into a diagonal vector and a sparse off-diagonal mapping. The vector is a variable-size, but it is always a multiple of 6 doubles. I'm looking for advice on how to optimize this. Are loops being unrolled? Is there alignment so that SSE can be used? I have no real experience with the segment<>() operator, and whether the compiler can understand that it preserves alignment. Finally, I need to multiply a vector by the transpose of a matrix, and was wondering is M.transpose()v is the fastest way. Cheers --Kurt void mMV(vector< Matrix<double,6,6>, aligned_allocator<Matrix<double,6,6> > > &diag, vector< map<int,Matrix<double,6,6>, less<int>, aligned_allocator<Matrix<double,6,6> > > > &cols, VectorXd &vin, VectorXd &vout) { // loop over diag entries for (int i=0; i<(int)diag.size(); i++) vout.segment<6>(i6) = diag[i]vin.segment<6>(i6); // loop over off-diag entries if (cols.size() > 0) for (int i=0; i<(int)cols.size(); i++) { map<int,Matrix<double,6,6>, less<int>, aligned_allocator<Matrix<double,6,6> > > &col = cols[i]; if (col.size() > 0) { map<int,Matrix<double,6,6>, less<int>, aligned_allocator<Matrix<double,6,6> > >::iterator it; for (it = col.begin(); it != col.end(); it++) { int ri = (it).first; // get row index Matrix<double,6,6> &M = (it).second; // matrix const Matrix<double,6,1> &vvin = vin.segment<6>(ri6); vout.segment<6>(i6) += M.transpose()vin.segment<6>(ri6); vout.segment<6>(ri6) += Mvin.segment<6>(i*6); } } } }
konolige Registered Member Posts 9 Karma 0 OS	Re: optimizing matrix-vector multiplies Sun Aug 29, 2010 4:18 am Apologies for the formatting, here's a better one...kk Code: Select all mMV(vector< Matrix<double,6,6>, aligned_allocator<Matrix<double,6,6> > > &diag, vector< map<int,Matrix<double,6,6>, less<int>, aligned_allocator<Matrix<double,6,6> > > > &cols, VectorXd &vin, VectorXd &vout) { // loop over diag entries for (int i=0; i<(int)diag.size(); i++) vout.segment<6>(i6) = diag[i]vin.segment<6>(i6); // loop over off-diag entries if (cols.size() > 0) for (int i=0; i<(int)cols.size(); i++) { map<int,Matrix<double,6,6>, less<int>, aligned_allocator<Matrix<double,6,6> > > &col = cols[i]; if (col.size() > 0) { map<int,Matrix<double,6,6>, less<int>, aligned_allocator<Matrix<double,6,6> > >::iterator it; for (it = col.begin(); it != col.end(); it++) { int ri = (it).first; // get row index Matrix<double,6,6> &M = (it).second; // matrix const Matrix<double,6,1> &vvin = vin.segment<6>(ri6); vout.segment<6>(i6) += M.transpose()vin.segment<6>(ri6); vout.segment<6>(ri6) += Mvin.segment<6>(i6); } } } }
bjacob Registered Member Posts 658 Karma 3	Re: optimizing matrix-vector multiplies Sun Aug 29, 2010 1:11 pm Hi, let me first answer your general questions and then I'll look at your code more closely. I'm answering with Eigen3 in mind, answers mostly applicable to Eigen2. Are loops being unrolled? You've definitely done all what was needed to preserve the compile-time knowledge about sizes. However, there is another precondition for loops to be unrolled: the estimated 'complexity' of the operation must be below a certain threshold. I am almost certain that your operations here will get unrolled, since 6 is a small enough size, but you can always adjust the threshold if they aren't, by defining EIGEN_UNROLLING_LIMIT to a larger value (the default is 100). Notice that in the special case of matrix products like here, there is potentially a second constant to adjust: EIGEN_CACHEFRIENDLY_PRODUCT_THRESHOLD is the matrix size where we decide to switch to the cache-friendly approach to matrix products. The default value is 8, so in your case (6) this won't disturb you. But this is also something that would prevent unrolling, since the cache-friendly code, which is mean for large matrices, isn't unrolled. Is there alignment so that SSE can be used? I have no real experience with the segment<>() operator, and whether the compiler can understand that it preserves alignment. Very good question: in your code, this information is lost. Indeed, the 'start' parameter is unknown at compile time; since sizeof(Scalar)==8 here, and SSE requires 16 byte alignment, there only will be alignment if 'start' is even. Which is the case in your program. Now here's a big difference between Eigen 2 and 3. In Eigen 2, there's no way at all that you can specify that your segment is aligned. In Eigen 3, we have working 'Aligned Map', which is something you can use here. Just add these typedefs to your function: Code: Select all `typedef Matrix<double,6,1> Vector6d; typedef Vector6d::AlignedMapType AlignedMapVector6d;` Now you can replace your code Code: Select all `vout.segment<6>(i6) += ...` by this: Code: Select all `AlignedMapVector6d(&vout(i6)) += ...` Finally, I need to multiply a vector by the transpose of a matrix, and was wondering is M.transpose()*v is the fastest way. Yes it is: the transpose() call gets inlined and has zero cost. Join us on Eigen's IRC channel: #eigen on irc.freenode.net Have a serious interest in Eigen? Then join the mailing list!
bjacob Registered Member Posts 658 Karma 3	Re: optimizing matrix-vector multiplies Sun Aug 29, 2010 1:31 pm Looking quickly at your code, I can see only 2 things to say: Don't forget to make your input parameters as 'const', for example vin. In your inner loop code: Code: Select all `for (it = col.begin(); it != col.end(); it++) { int ri = (it).first; // get row index Matrix<double,6,6> &M = (it).second; // matrix const Matrix<double,6,1> &vvin = vin.segment<6>(ri6); vout.segment<6>(i6) += M.transpose()vin.segment<6>(ri6); vout.segment<6>(ri6) += Mvin.segment<6>(i6); }` a couple of things. In Eigen3, the indexing type is no longer 'int', it is instead, for dense matrices, the signed integer type having the size of a pointer, aka std::ptrdiff_t. You don't have to know that, just use the typedef Index from your matrix type, e.g. MyMatrixType::Index. This avoids some integer conversions (granted, these are negligible in your case). More importantly, the two last lines might be faster (or not --- do measurements here) if merged into a single line, and marked as lazy. Here's how you would do it with Eigen3: Code: Select all `vout.segment<6>(i6) += M.transpose().lazyProduct(vin.segment<6>(ri6)) + M.lazyProduct(vin.segment<6>(i6));` But since this combined expression has greater complexity, you might have to play again with EIGEN_UNROLLING_LIMIT. Join us on Eigen's IRC channel: #eigen on irc.freenode.net Have a serious interest in Eigen? Then join the mailing list!
konolige Registered Member Posts 9 Karma 0 OS	Re: optimizing matrix-vector multiplies Sun Aug 29, 2010 2:31 pm Thanks, I'll try some of the Eigen3 alignments to see if that helps. Cheers --Kurt
konolige Registered Member Posts 9 Karma 0 OS	Re: optimizing matrix-vector multiplies Mon Sep 06, 2010 4:37 pm Ok, I switched to Eigen3 and am making some progress. But meanwhile I converted my code to be templated on the size of the vector, using N as the template parameter. I can no longer use: Code: Select all typedef Matrix<double,6,1> Vector6d; typedef Vector6d::AlignedMapType AlignedMapVector6d; Now you can replace your code Code: Select all `vout.segment<6>(i6) += ...` by this: Code: Select all `AlignedMapVector6d(&vout(i6)) += ...` I tried something like: Code: Select all `#define AV6d Matrix<double,N,1>::AlignedMapType` But this gives errors on Code: Select all `AV6d(&vout(i*N)) += ...` Any hints?
bjacob Registered Member Posts 658 Karma 3	Re: optimizing matrix-vector multiplies Mon Sep 06, 2010 4:49 pm What's your error? Did you just forget a 'typename' keyword? When T is a template parameter, doing sometemplate<T>::someMemberTypedef is illegal, you need to write typename sometemplate<T>::someMemberTypedef Join us on Eigen's IRC channel: #eigen on irc.freenode.net Have a serious interest in Eigen? Then join the mailing list!
konolige Registered Member Posts 9 Karma 0 OS	Re: optimizing matrix-vector multiplies Mon Sep 06, 2010 7:40 pm Yup, that was it. I'm not that great with templates...kk

Page 1 of 1 (8 posts)

Bookmarks

Who is online

Registered users: Bing [Bot], blue_bullet, Google [Bot], rockscient, Yahoo [Bot]