Performance issues DynamicSparseMatrix vs. SparseMatrix

Board index

Page 1 of 1 (6 posts)

Tags:

katakombi Registered Member Posts 18 Karma 0	Performance issues DynamicSparseMatrix vs. SparseMatrix Thu Feb 14, 2013 8:32 am Hello folks, I've been using Eigen 3.0 to implement EM using a huge sparse matrix. At that time of being I was using DynamicSparseMatrix as it was the only (unofficial) sparse matrix available in Eigen. Now I see it got deprecated in Eigen 3.1. But if I replace it by SparseMatrix my old code runs like 50-500 times slower. I boiled it down to the coeff() and coeffRef() calls I am using in this loop. I will post a snippet of my code here (tDist and count are sparse matrices, there is one more outer loop over i): Code: Select all 82 // iterate over bag of words of the two aligned sentences 83 // - collect total counts 84 double v; 85 for (int d=0;d<dstSent[i].size();d++) { 86 for (int s=0;s<srcSent[i].size();s++) { 87 const int& dst=dstSent[i][d]; 88 const int& src=srcSent[i][s]; 89 v=firstIter?(1.0/(double)tDist.cols()):(tDist.coeff(src,dst)); 90 total_s[dst]+=v; 91 } 92 } 93 94 // iterate over bag of words of the two aligned sentences 95 // - normalize counts 96 double q; 97 for (int d=0;d<dstSent[i].size();d++) { 98 for (int s=0;s<srcSent[i].size();s++) { 99 const int& dst=dstSent[i][d]; 100 const int& src=srcSent[i][s]; 101 q=(firstIter?(1.0/(double)tDist.cols()):(tDist.coeff(src,dst)))/total_s[dst]; 102 count.coeffRef(src,dst)+=q; 103 total[src]+=q; 104 } 105 } It should be possible to rewrite the code in order to NOT use coeff/coeffRef but I am not sure what would be most promising (Triplets?) By the way, I am reading and writing the sparse matrices to disk using this code: Code: Select all `38 for (int k=0;k<t.outerSize();k++) 39 { 40 for (Eigen::SparseMatrix<double,Eigen::RowMajor>::InnerIterator it(t,k); it; ++it) 41 { 42 out<<it.value()<<" "<<it.row()<<" "<<it.col()<<std::endl; 43 } 44 }` It is horribly slow, in fact it takes almost as long as one iteration of my EM training. Is there a better way to dump sparse matrices to disk/load them? cheers & thank you katakombi
ggael Moderator Posts 3447 Karma 19 OS	Re: Performance issues DynamicSparseMatrix vs. SparseMatrix Thu Feb 14, 2013 6:08 pm Using .coeffRef() to assemble a SparseMatrix is fine as soon as you can predict the upper bound of the number of non-zeros per column, e.g. if it is constant: mat.reserve( VectorXi(nnzPerCol, mat.outerSize()) ); If it is not constant at all, you can fill a VectorXi to send to reserve. In doubt, the easiest way to be sure to get nearly optimal performance is to use a triplet list as recommended in the documentation. One more hint: if the number of calls to coeffRef(i,j) += ... with the same i,j pair is quite important then using coeffRef is probably the best. For IO, we have in unsupported/Eigen/SparseExtra saveMarket and loadMarket functions.
katakombi Registered Member Posts 18 Karma 0	Re: Performance issues DynamicSparseMatrix vs. SparseMatrix Fri Feb 15, 2013 10:35 am Thanks a lot! I am going to try it as soon as possible...
katakombi Registered Member Posts 18 Karma 0	Re: Performance issues DynamicSparseMatrix vs. SparseMatrix Sat Feb 23, 2013 8:00 pm Hi there, Ive found that slow writing of matrices was basically not Eigen-related at all. I wasnt aware of std::endl - quite contrarly to '\n' - flushes the output buffer. this results in like 6-10x slowdown when I write large matrices greets Stefan
katakombi Registered Member Posts 18 Karma 0	Re: Performance issues DynamicSparseMatrix vs. SparseMatrix Thu Mar 07, 2013 5:17 am Dear ggael, thanks for your help! After fiddling around for some time I have abandoned my plans to rewrite my code from DynamicSparseMatrix to SparseMatrix. The thing is I construct a sparse matrix and really need random access (I can't precompute triplets since I accumulate counts in no predefined order). Quite sadly so, I also don't have a guess about how many entries per column will be needed as it may vary between zero and e.g. 5% of all existing values. However, my code runs considerably fast using the DynamicSparseMatrix. So, is there any chance that the officially supported SparseMatrix will support these much faster coeff/coeffRef accessors in future? Would it be technically feasable? cheers Stefan PS: could you please explain what you meant here? One more hint: if the number of calls to coeffRef(i,j) += ... with the same i,j pair is quite important then using coeffRef is probably the best.
katakombi Registered Member Posts 18 Karma 0	Re: Performance issues DynamicSparseMatrix vs. SparseMatrix Thu Mar 07, 2013 6:15 am I just realized that I can accumulate counts also using setFromTriplets! Great, that way I should be able to rewrite my code...