Registered Member
I've been using Eigen to compute products of many sparse matrices and was very pleased with the performance. As half of the matrices in the product are essentially exponentials of another matrix, it's not efficient to exponentiate directly (sizes grow up to 2^10-2^14); instead I use Krylov-like methods to compute exp(t*H)*v, so that the product slims down to many sparse matrix * sparse vector products over selected columns. However, the performance of the sparse matrix * sparse vector product is at least 10 times slower than that of the direct matrix-matrix product. One would expect similar performance (plus some overhead) for both algorithms, as is the case with gmm.
I wrote a small test case to benchmark both algorithms with both libraries (its output is above). I do use NDEBUG and native optimisations, but the results are disastrous.
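For reference, a minimal sketch of the kind of benchmark in question (the sizes, the identity fill-in, and names like `prod2` are illustrative stand-ins, not the original test case): the same sparse product is computed once as a matrix-matrix product and once column by column through sparse vectors.

```cpp
#include <Eigen/Sparse>
#include <chrono>
#include <iostream>

typedef Eigen::SparseMatrix<double> SpMat;
typedef Eigen::SparseVector<double> SpVec;

int main()
{
    const int n = 1024;

    // Stand-in matrices; the real test uses a sparse Hamiltonian and operators.
    SpMat H(n, n), op(n, n);
    H.setIdentity();
    op.setIdentity();

    // 1) Direct sparse matrix-matrix product.
    auto t0 = std::chrono::steady_clock::now();
    SpMat prod1 = op * H;
    auto t1 = std::chrono::steady_clock::now();

    // 2) The same product, one sparse column at a time.
    SpMat prod2(n, n);
    for (int j = 0; j < n; ++j) {
        SpVec col = H.col(j);   // extract column j as a sparse vector
        SpVec res = op * col;   // sparse matrix * sparse vector product
        prod2.col(j) = res;     // write it back as column j
    }
    auto t2 = std::chrono::steady_clock::now();

    std::cout << "mat-mat:    " << std::chrono::duration<double>(t1 - t0).count() << " s\n"
              << "col-by-col: " << std::chrono::duration<double>(t2 - t1).count() << " s\n";
}
```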
Moderator
I guess that your example is not representative, as using a sparse representation for such small matrices is not recommended. What are the typical sizes in your real-world use case? Are you sure that the result remains sparse?
Registered Member
The matrices are generally very sparse due to their nature: H is an interaction Hamiltonian which is mostly diagonal in the occupation basis, and the op matrices are fermionic operators. The matrix product is always sparse by the physical nature of the problem. Here I ran the test case with 1024x1024 matrices (limited to 1000 products):
In the direct matrix-matrix product Eigen, to my pleasure, overtook gmm, but the same product computed column by column yields an abysmal time for Eigen. To make matters worse, in the real case I have to compute a product of the following form millions of times per iteration, where n can go up to 1000:

op_n * exp(t_n*H) * ... * op_1 * exp(t_1*H) * v

Since the exponential argument varies, I have to compute the exponential every time, and Taylor or Padé approximations are not optimal for such large matrices. Thus I compute exp(t*H)*v instead via the Newton-Leja algorithm, but it has dozens of matrix-vector products inside, and this is where I noticed the major slowdown for the first time. I first tried the matrix-vector product in 3.2.1 or 3.2.2, and my first thought was that this was a regression in Eigen. Now I'm faced with a choice: either introduce yet another matrix library into the project (gmm), optimize the matrix-vector product myself if I did something wrong here, or simply wait for a fix if this is a regression. That's why I brought the matter here.
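The Newton-Leja iteration itself is not shown in the thread; as a stand-in, here is a minimal sketch of the general shape of any such polynomial method, using a plain truncated Taylor series instead. It makes clear why the hot loop is dozens of sparse matrix * sparse vector products.

```cpp
#include <Eigen/Sparse>

typedef Eigen::SparseMatrix<double> SpMat;
typedef Eigen::SparseVector<double> SpVec;

// Approximates exp(t*H)*v with a truncated Taylor series. This is a stand-in
// for the Newton-Leja interpolation mentioned above: both reduce to a loop of
// sparse matrix * sparse vector products. (Plain Taylor is numerically naive
// for large ||t*H||; it is shown here for shape only.)
SpVec expmv_taylor(const SpMat& H, const SpVec& v, double t, int terms = 30)
{
    SpVec term = v;  // k-th Taylor term, (t^k / k!) * H^k * v
    SpVec y    = v;  // running sum
    for (int k = 1; k < terms; ++k) {
        term = (H * term) * (t / k);  // one sparse mat * sparse vec product
        y += term;
    }
    return y;
}
```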
Moderator
Can you try with the devel branch? I've just committed a fix.
Moderator
Moreover, it might also help to call prod2.reserve(prod2.cols() * approximate_nnz_per_column) before starting the vector-by-vector computation, to prevent multiple memory reallocations and copies when inserting the column vectors.
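For the record, a sketch of that reservation (`approximate_nnz_per_column` is from the post above; the value 8 is an arbitrary guess, to be tuned to the expected fill-in):

```cpp
#include <Eigen/Sparse>

typedef Eigen::SparseMatrix<double> SpMat;

void reserve_example(int n)
{
    SpMat prod2(n, n);

    // One global reservation, as suggested above.
    const int approximate_nnz_per_column = 8;  // arbitrary guess
    prod2.reserve(prod2.cols() * approximate_nnz_per_column);

    // Alternatively, Eigen's reserve also accepts a per-column size vector:
    // prod2.reserve(Eigen::VectorXi::Constant(prod2.cols(),
    //                                         approximate_nnz_per_column));

    // ... then fill prod2 column by column as before ...
}
```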
Registered Member
Thank you for the fix!
It is not 15 times slower as before, but only 4 times now for the matrix-vector product! Moreover, reservations for vec1, vec2 and prod2 shaved off 3 seconds. However, it is still very suspicious: even when I make H fully diagonal, so that all vectors and columns in the intermediate products have only one element, things are still off:
On a side note: the matrix-matrix product is 15% slower in the devel version.
Moderator
Yes, the slowdown in this extreme case is due to the fact that by default we assume aliasing, and thus each product creates a temporary. In general this is not an issue because the cost of this dynamically allocated temporary is small compared to the cost of the product, but in your extreme case it is killing the performance. This will be fixed soon through the use of noalias (as with dense products). In the meantime you can update your copy of the devel branch and directly call some internal functions to bypass this temporary:
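The post does not spell out which internal function is meant, so the following is a guess based on the sources of the time: the kernel behind the sparse-sparse product lives in `ConservativeSparseSparseProduct.h` and can be called directly. Internal APIs change without notice, so treat this as a sketch only:

```cpp
#include <Eigen/Sparse>

typedef Eigen::SparseMatrix<double> SpMat;

// Computes res = lhs * rhs directly into res, skipping the aliasing temporary
// that the public operator* creates. conservative_sparse_sparse_product_impl
// is an Eigen internal; its location and signature may differ in your copy of
// the devel branch.
void mul_no_temp(const SpMat& lhs, const SpMat& rhs, SpMat& res)
{
    res.resize(lhs.rows(), rhs.cols());
    Eigen::internal::conservative_sparse_sparse_product_impl(lhs, rhs, res);
}
```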
and for larger matrices you may also need to increase the stack allocation limit by compiling with, e.g.: -DEIGEN_STACK_ALLOCATION_LIMIT=1000000
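Putting that flag together with the NDEBUG and native-optimisation settings mentioned earlier, a possible compile line (file and path names are illustrative):

```sh
g++ -O3 -DNDEBUG -march=native -DEIGEN_STACK_ALLOCATION_LIMIT=1000000 \
    -I /path/to/eigen bench.cpp -o bench
```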