Registered Member
|
Hi All, I really need to improve the velocity of my code and I am a Newbie in EIGEN, then I have a bench of basic questions which are still unclear to me. Any input would be very appreciate
1) All my sparses matrices contains INTEGER data but at the very end of the code I need to multiply it with "DOUBLE" data Vector. Then right now I changed all the matrix of the code to "DOUBLE" becaus learned that we cannot mix Double and INTEGER. However is the fact of using "DOUBLE" slowing down the process? If this is the case, should I convert the sparse matrix from integer to double at the end only? Is that possible? 2) I read in the tutorial and forum that the "Mat.reserve" is key to speed up things, but I am not using it here, do you think this will slow down (performance) the execution of logic. Is just that I don't really understood how to setup it so I ignore it, I am afraid of the effect not using it. Is that crucial? 3) Is a dense vector is just a standard Eigen vector with a lot of None Zero or something else? I read that Eigen was good for Sparse Matrix * dense vector product. 4) If I understand well, the only way to loop through a sparse matrix for the None Zero values to speed up the execution is using "InneriTerator it". is it correct? Then please correct me if I am wrong in my understanding : a) The Inneriterator is an Iterator for the non Zero Values in the matrix? ( so if there are 500 none zero, then it will go from 0 to 500) b) it.value() returns the value of the element in current iteration. c) it.column() returns the column index of the element in current iteration. d) it. row() returns the row index of the element in current iteration. e) it.Index() returns index of non-zero element in current iteration. Then is this kind of setup correct? (part of my code) for (int k=0; k<SpNE.outerSize(); ++k) for (SparseMatrix<double>::InnerIterator it(SpNE,k); it; ++it){ if (SpNE.coeffRef(it.row(),k)!= 2) // Give the value of 0 to NE(or SpNE) if NE != 2 SpNE.coeffRef(it.row(),k)=(0); // Iteration sur it.row (where the data is only) } DOC: SparseMatrix<double> mat(rows,cols); for (int k=0; k<mat.outerSize(); ++k) for (SparseMatrix<double>::InnerIterator it(mat,k); it; ++it) { it.value(); it.row(); // row index it.col(); // col index (here it is equal to k) it.index(); // inner index, here it is equal to it.row() } 4) I cannot understand what the "OUTERSARTS" is, I am looking this in the tutorial (http://eigen.tuxfamily.org/dox/TutorialGeometry.html) What is it exactly? 5) Should I compress the matrix to make the code running faster? I saw there is a compress option 6) Is EIGEN working with 64 bits setup? I am doing all in 32 because I found probleme when I try to compile with Vstudio in 64 bits. I know they are a lot of questions but again any input would be fantastic! Thanks a lot, Robin PS: If somebody want I can provide him the complete code for his checking PPS: Thanks for Eigen!! |
Registered Member
|
I can't say what would be faster because it depends on a lot of factors. You need to measure the time for your specific problem. You can use integerMatrix.cast<double>() to convert from integer to double. Again, this depends on your particular problem. Typically, the computation goes in two phases: first you construct your sparse matrix, and then you compute with it. The first phase can be sped up by using reserve(). So if you're finding that the first phase costs a lot of time, it's worthwhile to look into reserve(). If it is the second phase that costs most time, then it's probably not worth bothering about reserve(). In Eigen, objects of type VectorXd, Vector3i etc. are dense vectors, and objects of type SparseVector are sparse vectors. Mathematically, they represent the same thing (a vector), but implemented in a different way. Sparse vectors are optimized for the case where there are many zeros in the vector. It's possible to use a sparse vector where all the elements are zero, but that's not efficient. This would work, but I think it is not so efficient. As the tutorial says, coeffRef() is rather expensive, so you want to avoid it. Instead, try
In the first example in the tutorial, OuterStarts is 0, 3, 5, 8, 10, 12. That means that the non-zero entries of first column of the matrix are stored in the arrays Values and InnerIndices starting from index 0, the second column is stored starting from index 3, the third column is stored starting from index 5, and so on. Probably, but it depends on your application. Typically, you compress after the first phase (constructing your matrix). Try it, measure the time, and see whether it makes a difference. Yes, it should work. Most of us are using 64 bits. It might be an issue with Visual Studio (not that many are using it). |
Registered Member
|
Once again fantastic!
Thanks a lot for those very complete answers. I had not time yet to test all your suggestions but only with the answer of question 4, it seems that the code run more than 100 time faster! I have to double check to be sure but is impressive! Maybe is because I need to work with matrix of 1 Mo Elements!? Also is there a "chronometer" that we can setup at different part of the code in Eigen? for (int k=0; k<SpNE.outerSize(); ++k) for (SparseMatrix<double>::InnerIterator it(SpNE,k); it; ++it){ if (it.value() != 2) // Give the value of 0 to NE(or SpNE) if NE != 2 it.valueRef() = 0; // Iteration sur it.row (where the data is only) } Here the last part in my code where I am half sure if I can optimize something. However since in this case I need to change the values of complete Row ( not only the None Zeros value) I don't think we can do much more... //Fixing the Elements with the K constant RowVector2i lcn(1,3); // Create the vector containing the node which we want fixe. for (int k=0; k<SpNE.outerSize(); ++k) for (int h = 0; h < 2; h++){ SpNE.coeffRef(lcn(h),k) = 0; SpNE.coeffRef(lcn(h),lcn(h))=1; } Now I will check the effect of: compressing the matrix and also try to setup again the 64 version of Visual studio 2008 (maybe Open a topic about it). If visual studio is not that good what is Eigen people using then? Are the other option " Friendly" as well? Any way, thank again for all, you really help me a lot and save my day (or Week!) . Robin |
Registered Member
|
Actually I found the issue with the 64 bit version. Again it was a Newbie problem, I did not realized that I need to include again Eigen but below the 64 platform...
So all good!! Cheers, Robin |
Moderator
|
You can measure time with ou bench/BenchTimer.h timer:
BenchTimer t; t.start(); ... t.stop(); cout << t.value(); t.reset(); // if you not reset it will keep the best measure t.start(); ... t.stop(); cout << t.value(); |
Registered Member
|
Thanks so much, I will try this.
By the way even if my code is already incredibly faster compare to the beginning (thanks to the help of all of you), is it still slow when I need to apply it to a very big model, I tried with a last one and is it running for 8 hours now..., I have 1Mo Elements. Then I will would try to compress the matrix and see. But I have one more question, the tutorial says they Sparse Matrix are compressed anyway after the first operation right? so can this really help to compress it earlier? I copy the complete code here below just in case, I just would like to know if I am not missing something very obvious to make it faster... ps: In Red is when I am doing thing with Sparse Matrix Cheers, Robin ---------------------------------------------------------------------------------------------------------------------- //Define basic Variable { int ne =GetNumbeE(pDoc); int nn =GetNumbeN(pDoc); #define ROWS ne #define COLUMNS 3 #define COLUMNN nn double eps (0.5); SparseMatrix<double> SpNM(ne,nn),SpNE(ne,ne),SpNM2(ne,nn); //Definition of the Sparse VectorXd Ke1 (ne); VectorXd Ke2 (ne); VectorXd Ke3 (ne); VectorXd Netsum1 (ne); VectorXd Netsum2 (ne); VectorXd Netsum3 (ne); VectorXd NEKE1 (ne); VectorXd NEKE2 (ne); VectorXd NEKE3 (ne); VectorXd Nekediv1 (ne); VectorXd Nekediv2 (ne); VectorXd Nekediv3 (ne); VectorXd diri1 (ne); VectorXd diri2 (ne); VectorXd diri3 (ne); VectorXd v(ne); //%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% // Import the data for (int i = 0; i < ne; i++) { //Ke(i)= GetdataKe(pDoc,i); Ke1 (i) = Getdata1(pDoc, i); Ke2 (i) = Getdata2(pDoc, i); Ke3 (i) = Getdata3(pDoc, i); for( int j=0;j<3; j++){ SpNM.coeffRef(i, GetNode(pDoc,i,j) ) = 1; } } //First operation using Sparse matrix SpNE = SpNM * SpNM.transpose() ; //%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% for (int k=0; k<SpNE.outerSize(); ++k) for (SparseMatrix<double>::InnerIterator it(SpNE,k); it; ++it){ if (it.value() != 2) // Give the value of 0 to NE(or SpNE) if NE != 2 it.valueRef() = 0; // Iteration sur it.row (where the data is only) } //%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% //Doing some staff in the Sparse Matrix RowVector2i lcn(1,3); // Create the vector containing the node which we want fixe. for (int k=0; k<SpNE.outerSize(); ++k) for (int h = 0; h < 2; h++){ SpNE.coeffRef(lcn(h),k) = 0; SpNE.coeffRef(lcn(h),lcn(h))=1; } //%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% //Starting some operations for (int i = 0;i < ne; i++){ v(i)=1; } Netsum1 = SpNE * v; NEKE1 = SpNE * Ke1; Nekediv1 = NEKE1.array() / Netsum1.array(); %% Not Matrixial Calculation! diri1 =(Nekediv1-Ke1)*eps; Ke1=Ke1+diri1; ///%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% etc.... |
Moderator
|
1- make sure you compiled in release mode (e.g., -O2).
2- which operation takes time? 3- Why do you explicitly insert zeros? (SpNE.coeffRef(lcn(h),k) = 0;) 4- You can prune explicite zeros with mat.prune(0); Typically after setting the values !=2 to 0. You can also do:
5- Before a sparse operation that takes time, check the number of non zeros: mat.nonZeros(); Maybe it is very huge |
Registered Member
|
Hi Gael thanks a lot for you input once again!
Here my answer in the text, 1- make sure you compiled in release mode (e.g., -O2). I was missing this point, thanks ! It seems 20% faster in release mode, make sense? 2- Which operation takes time? I just run the test now with the code (time counter) you told me on the previous answer. I have a problem because my code is linked to another software for the debug, then I cannot see the DOS window in VC. I cannot see the “cout” directly but I can see the value directly in the DEBUG window of VC however it's kind of crazy... When I click on the “t” I can see the next choice ( Start, Time, Worst, Total,etc.). I selected “Time”, need to click like 5-6 time more and finally arrived to “data" array and the first value seems to be the time. If I believe this then the largest operation (70%) is at the start during for (int i = 0; i < ne; i++) { //Ke(i)= IfmGetMatConductivityValue2D(pDoc,i); Ke1 (i) = IfmGetMatYConductivityValue3D(pDoc, i); Ke2 (i) = IfmGetMatYConductivityValue3D(pDoc, i); Ke3 (i) = IfmGetMatZConductivityValue3D(pDoc, i); for( int j=0;j<3; j++){ SpNM.coeffRef(i, IfmGetNode(pDoc,i,j) ) = 1; } } And at the end after the 3 blocks: Netsum3 = SpNE * v; NEKE3 = SpNE * Ke3; Nekediv3 = NEKE3.array() / Netsum3.array(); // Not Matrixial diri3 =(Nekediv3-Ke3)*eps; Ke3=Ke3+diri3; 3- Why do you explicitly insert zeros? (SpNE.coeffRef(lcn(h),k) = 0;) I need all the raw (lcn(h) to be converted to 0 ( is part of the logic of the code) 4- You can prune explicite zeros with mat.prune(0); Typically after setting the values !=2 to 0. You can also do: Code: Select all struct keep_twos { keep_twos() {} inline bool operator() (const Index&, const Index&, const Scalar& value) const { return value==2; } }; mat.prune(keep_twos()); Great I will try this! 5- Before a sparse operation that takes time, check the number of non zeros: mat.nonZeros(); Maybe it is very huge I am not sure to understand this point, how checking the number of 0 can makes the code faster? Thanks a again!! You help me a lot! Robin |
Moderator
|
1- 20% only, I would expect more, but OK.
2- the insertion and sparse matrix - vector products => makes sense. For the insertion you might use .insert(i,j) if you are sure that the entry does not already exist. You should also reserve space: SpNM.reserve(VectorXi::Constant(ne, 3)); 5- it's just to check how big is the data. If, e.g., the matrix contains 30% of nonzeros there might be something wrong with your algorithm, or better use a dense matrix, or... It's just with sparse matrices, knowing the number of nonzeros is even more important than the matrix dimensions. |
Registered Member
|
Thanks a lot Gael,
Maybe I did something wrong with the Release mode, I will check again. What do you mean by -O2? Anyway I agree with you, it seems that the problem is at the moment to create the Sparse Matrix itself ( if I understood well). I am not sure what do you mean by if the entries already exist but when I write SpNM.coeffRef(i, IfmGetNode(pDoc,i,j) ) = 1; is the first time I create the sparse matrix in the code ( this is sure), so I guess this is fine. Regarding the none zero, I am pretty sure that the matrix should contain about 10% (90% zeros) then is the typical case for using Sparse matrix. Thanks a lot ! I will try insert and send the results. Cheers, Robin |
Registered Member
|
Dear Jitse, Thanks for this great step by step answers. I had made use of a sparse iterator as you described. It seems it works faster than an index based access using .coefRef(i,j). However, there seems to be a problem when I use two iterators on a matrix one after the other like below -- giving me all zeros in the A11 values.
However, the code below works perfectly fine. In all cases I have checked there is no problem with my functions somefnc and someotherfunc.
I will appreciate any possible explanation. Thanks E. |
Moderator
|
Your two versions are completely different since in the first case you call valueRef() = someotherfunc() on all non-zeros instead of on the diagonal entries only. When you only want to access to the diagonal entries, using coeff()/coeffRef() is perfectly fine.
|
Registered Member
|
Thanks ggael. The it.value() in the second code is actually a typo when copying and pasting -- sorry. I have now rewritten the correct code as below
and
Are you saying that for a sparse matrix with only diagonal non-zeros, accessing via coeffRef() costs the same as using the iterators? My experiments seem to show that using the iterator is faster. It is just that it gives me zeros whenever I use the two iterators one after the other as in the first example. I have even tried putting an if (it.row==j) tomake sure somehow updates to the non-diagonal elements are not happening . But no luck; I just cannot explain it. |
Moderator
|
Ah, if the sparse matrix is actually diagonal, then yes both versions should be OK. Please, provide a self contained example allowing to reproduce the issue.
|
Registered Member
|
Attached is a working example. I just don't see why the two examples are returning different values. Is there some error that I am not noticing? Please look at how the way the iterators are implemented in myfunction() and myOtherfunction() is affecting the results.
|
Registered users: Bing [Bot], Google [Bot], Yahoo [Bot]