This forum has been archived. All content is frozen. Please use KDE Discuss instead.

How to allocate and fill sparse matrix column wise efficient

Tags: None
(comma "," separated)
becker
Registered Member
Posts
2
Karma
0
First of all I would like to explain you my problem. Usually my sparse matrices can have dimensions up to 100mio x 50k. The exact number of rows and columns is assessed on run time. Moreover, I don’t know the number of non-zero elements during initialization – estimating the exact number of non-zero elements is not possible as it can vary considerable.
I have to fill the sparse matrix column wise at random. In addition when filling a certain column, row values are also inserted at random (but these row values can be sorted beforehand).
My question is how I can fill the matrix best with respect to performance?
So far my approach (pseudo code) is the following – please let me know if I am doing something wrong:

Code: Select all
typedef Eigen::SparseMatrix<float, Eigen::ColMajor, std::ptrdiff_t> SpMatCol;
std::ptrdiff_t row = 35000000;
std::ptrdiff_t col = 10000;
SpMatCol mySP(row, col);
// estimate - is a very conservative estimate better than no estimate ?
int minNNZperCol = 1;
int maxNNZperCol = 20000;
mySP.reserve(maxNNZperCol * col);

// for now let's assume a ordered column wise insertion
   for (long j = 0; j < col; ++j)
   { 
      mySP.startVec(j);
      // get random nnz for current column
      int NNZperColumn = minNNZperCol + (rand() % (int)(maxNNZperCol - minNNZperCol + 1));
      //unsorted index vector
      Eigen::VectorXi vIndex = Eigen::VectorXi::LinSpaced(NNZperColumn, row, 1);
      //sorted index vector
      std::sort(vIndex.data(), vIndex.data()+ vIndex.size());
      // create random values
      Eigen::VectorXf vValue = Eigen::VectorXf::Random(NNZperColumn);

      for (int i = 0; i < vValue.size(); ++i)
      {
         mySP.insertBack(vIndex[i], j) = vValue[i];
      }

   }
   mySP.finalize();


Also do you know why I get std::bad alloc error when increasing the number of columns to 20000. Thus allocating memory for 400mio floats. I'm looking forward to your input. Many thanks in advance!
User avatar
ggael
Moderator
Posts
3447
Karma
19
OS
With 20k columns, this represents 20k * 20k * (sizeof(float) + sizeof(ptrdiff_t)) = 4,8 GB.

Regarding speed, your approach sounds fine. Sorting could be done implicitly trough transposing but this would double the peak memory consumption....
becker
Registered Member
Posts
2
Karma
0
Thanks for your quick reply. I appreciate it ;)
One last thing about the 4.8GB. I don't fully understand why I get a bad alloc error when trying to reserve memory for 4.8GB. Shouldn't this be possible on a 16GB ram system or am I missing something. Please note that I am having around 12GB of ram available on run time. In more detail (I'm pretty sure you all know this), the reserve function is calling the reallocate function. Within the reallocate function firstly memory for the values(floats) are reserved and then secondly for the indices (ptrdiff_t). The bad allac error is thrown in internal::scoped_array<StorageIndex> newIndices(size) in CompressedStorage.h. when trying to allocate memory for the index vector.
Maybe the problem is caused by the fact that I am running a 32 bit application (on 64 bit OS-namely Win7). I guess there is a limitation of the address space (~2GB) you can get as a 32 bit application.
Thanks again for your help.
User avatar
ggael
Moderator
Posts
3447
Karma
19
OS
Indeed, if you are running a 32bits application, then you cannot allocate the required amount of memory.
tienhung
Registered Member
Posts
29
Karma
0
Hi ggael,

Is there any possible way to make the insert()/coeffRef() functions be thread-safe? Is it safe to use openMP to parallelize the filling process of sparse/dense matrices? Sorry my question is not really related with the topic.

The code could be something like this small example:
Code: Select all
......
Eigen::SparseMatrix<double> matrixA(n,n);
Eigen::MatrixXd matricB(n,n);
#pragma omp parallel for
for (int i = 0; i < n; ++i)
{
    for (int j = 0; j < n; ++j)
    {
        if (i < j)
        {
            matrixA.coeffRef(i, j) = i + j;
            matrixB.coeffRef(i, j) = i + j;
        }
    }
}
User avatar
ggael
Moderator
Posts
3447
Karma
19
OS
To this end you have to make sure that enough memory is reserved for each column and that each column is filled by only one thread.


Bookmarks



Who is online

Registered users: bartoloni, Bing [Bot], Evergrowing, Google [Bot]