Best memory strategy: large product of fixed size matrices

Board index

Page 1 of 1 (6 posts)

Tags:

jbromley Registered Member Posts 4 Karma 0	Best memory strategy: large product of fixed size matrices Sat Oct 05, 2013 2:02 pm Hi, I have a large number of small fixed size matrices (e.g. 4x4 or 16x16) that I want to multiply consecutively and only consecutively. Specifically I have N small matrices: W_i where i = 1 ... N and I wish to calculate every possible left and right partial product, i.e L(j) = product(i=1 to j) W_i and R(j) = product(i=j to N) W_i for all j. Then I want to modify the W_i in place and repeat, again and again and again. So I don't expect the initial overhead of memory allocation to be my bottleneck but I do want the matrix multiplications to be as efficient as possible. Obviously this means ensuring alignment for vectorization. But am I also correct in assuming that since matrices are only ever multiplied consecutively it would be most efficient to have the matrix data stored consecutively in memory? If this is the case, what is the best way to achieve this? Is it possible to just create a C-style array of Eigen Matrix objects: Matrix4d W[N]; while still preserving alignment for vectorization? And if so would this guarantee the actual matrix data being consecutive in memory? Or would it be better to do: double * Wmemory = aligned_allocator<double>().allocate(N); followed by replacing every reference to W[i] with a reference to Map<Matrix4d>(Wmemory+i16sizeof(double))? Although I'd prefer not to have to write something so cumbersome(!). I'm also prepared to sacrifice portability and assume that the architecture is intel 64-bit (and so malloc should already return aligned pointers) if this would grant me a simpler syntax in my code. (e.g. in this case could I just create a valarray<Matrix4d>(N) and preserve alignment? And would it have the matrix data consecutive or would it be interspersed with internal variables from the Matrix4d class?). Any ideas greatly appreciated...
ggael Moderator Posts 3447 Karma 19 OS	Re: Best memory strategy: large product of fixed size matric Sat Oct 05, 2013 7:11 pm Code: Select all `Matrix4d W[N];` is the simplest and best way to achieve what you want (alignment + sequential storage).
Hauke Registered Member Posts 109 Karma 3 OS	Re: Best memory strategy: large product of fixed size matric Mon Oct 07, 2013 8:39 am Just out of curiosity. You said that you have a large number of small matrices. How big is N? I am just asking since Code: Select all `Matrix4d W[N];` will be located on the stack. Regards, Hauke
jbromley Registered Member Posts 4 Karma 0	Re: Best memory strategy: large product of fixed size matric Mon Oct 07, 2013 1:31 pm The exact value of N will be allowed to vary (optimization problem). However it is anticipated that it will be in the region of 40. I will definitely also want to be doing the problem with 16x16 matrices though, not only 4x4 ones. And I don't yet know whether I'll need double precision. The L's and R's need storing too, but are in fact post and pre multiplied by a row/column vector so they are actually lists of vectors. However I will need two variants of each, and I'll also maybe need to make an entire copy of the data for 'trial functions' when choosing how to update the W's. So we're looking at around maybe 40x16x16xsizeof(double) bytes of storage for the W's (80KB) 40x16xsizeof(double) for each of the L's and R's (gives 5KB * 4) + a full copy of the data => 200KB total Modern linuxes have a default stack of 8MB, I think(?). So hopefully should be okay. But if it turns out that I do want to explore much larger N, what would be the equivalent way to achieve alignment and sequential storage on the heap?
jbromley Registered Member Posts 4 Karma 0	Re: Best memory strategy: large product of fixed size matric Mon Oct 07, 2013 1:42 pm Oh, and on the topic of stacks. If I wrap my data inside a class/struct, e.g. Code: Select all `class Sequence { public: ... EIGEN_MAKE_ALIGNED_OPERATOR_NEW private: Matrix4d W[40]; ... };` will W still be allocated on the stack if I instatiate an object of type Sequence on the stack? And what if I instantiate my object on the heap, presumably then the Eigen data will also be on the heap? Lastly, should I be worried: is there a performance penalty (other than at allocation time) for performing matrix calculations on heap data vs stack data?
Hauke Registered Member Posts 109 Karma 3 OS	Re: Best memory strategy: large product of fixed size matric Mon Oct 07, 2013 2:18 pm Regarding performance, you need to profile. There is only so much we can guess. I would start with a Code: Select all `Matrix4d W[40];` or Code: Select all `std::array<Matrix4d,40> W;` . If that's enough space for your elements good, if not, you may need to switch to Code: Select all `std::vector<Matrix4d,aligned_allocator<Matrix4d>> W;` . Your wrapper won't help, the type still resides on the stack if you use it as is, unless you allocate on the heap and then you have the same result as with a regular std::vector. - Hauke