Registered Member
The general setup is that I have a row-major matrix X (the number of rows of X can be in the millions and the number of columns on the order of hundreds), and I want to compute X^T X = X_1^T X_1 + ... + X_nfolds^T X_nfolds, in addition to computing and storing each of X_1^T X_1, ..., X_nfolds^T X_nfolds, where each X_k is a submatrix of X that is not contiguous in memory (i.e. it consists of rows of X that are generally not adjacent to one another).
I have code which accomplishes this. However, it uses a huge amount of memory, much more than X itself occupies. The code which does this (note that this is inside a class definition, so X is a row-major MatrixXd class member) is the following:
Does anyone have any recommendations for reducing or eliminating the extra memory usage? Thanks!
Moderator
Except for computing the A_i = X_i^T X_i on demand, there is not much you can do. Compressed triangular storage would also halve the memory usage, but that's not supported yet. You could emulate it by storing each pair A_i and A_{i+1} in an (nbvars+1) x nbvars matrix T_{i/2} such that:

A_i     <=> T_{i/2}.topRows(nbvars).selfadjointView<Upper>()
A_{i+1} <=> T_{i/2}.bottomRows(nbvars).selfadjointView<Lower>()

Again, this will "only" halve the memory usage.
Registered Member
Thanks, ggael. In my usage of the code, X^T X will in general be no larger than about 500x500, so memory isn't much of an issue on that front. What I'm *really* worried about is unwanted copies of X, which can be many gigabytes.
Moderator
Where do you see copies of X? I don't see any in that code, except the numelem x nvars chunks, but I expect those to be quite small?

Looking at your code, you are storing nfolds matrices of size nvars x nvars, so if nvars==500 this represents about nfolds*2MB, which can be large if nfolds is large... If nfolds is small but numelem is too large, then you can still compute, through smaller chunks, the A_i for which numelem is too large.