Registered Member
Hello,

I have a program that performs a lot of matrix multiplications and matrix inversions. The matrices are non-symmetric and the size is 400x400. I was wondering if there is any way to speed up the calculations. I have enabled the O2 optimizations and included the following definitions:

    EIGEN_VECTORIZE
    #define NDEBUG
    #define EIGEN_RUNTIME_NO_MALLOC

Here is part of my code to give you an idea:

    MatrixXcd S12[NumberOfSlices];
    S12[0] = MatrixXcd::Zero(2 * rows, 2 * cols);
    for (int j = 0; j < NumberOfSlices - 1; j++)
    {
        InverseT11Matrix[j] = T11[j].inverse();
        MatrixXcd U = I - (InverseT11Matrix[j] * Gamma[j] * S12[j] * Gamma[j] * T12[j]);
        MatrixXcd InvU = U.inverse();
        S12[j + 1] = InvU*(InverseT11Matrix[j] * Gamma[j] * S12[j] * Gamma[j] * T11[j] - InverseT11Matrix[j] * T12[j]);
        InverseUMatrix[j] = InvU;
    }

    MatrixXcd S11[NumberOfSlices];
    S11[0] = I;
    for (int j = 0; j < NumberOfSlices - 1; j++)
    {
        S11[j + 1] = InverseUMatrix[j] * InverseT11Matrix[j] * Gamma[j] * S11[j];
    }

All the arrays (S11[], S12[], Gamma[], etc.) hold 3 matrices. I am not sure how to write the code so that it benefits from vectorization, and I would really appreciate any suggestions on how to improve it.

Also, until now I have been running the code on my laptop, but I can now use a supercomputer with many cores and a large amount of memory. I noticed that my code does not run any faster there, probably because I need to write parallel code to take advantage of multi-threading. Do you have any comments on that, or anything else that could help?

As you can tell, I am relatively new to programming, so please forgive me if my questions are a bit ambiguous or not well expressed. Any help will be very much appreciated. Thanks in advance for your time.

Moderator
What are the sizes of the matrices? Are some matrices diagonal or triangular?
Are they all square? If not, adding parentheses in the right place might reduce the number of operations. For example, A*(B*v) is much faster than A*B*v if A and B are matrices and v is a vector, because A*B*v is evaluated left to right as (A*B)*v. You can enable multi-threading within Eigen by enabling OpenMP (e.g., -fopenmp with gcc). Set the environment variable OMP_NUM_THREADS to the number of physical cores to avoid hyperthreading.
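
To make the parenthesization point concrete, here is a rough operation count, written as a minimal sketch in plain C++ rather than Eigen. The function names are hypothetical, and the counts assume a textbook dense product (one multiply-add per inner-loop step) for n x n matrices A and B and a length-n vector v. Real timings also depend on cache behaviour and vectorization.

```cpp
#include <cassert>
#include <cstdint>

// Multiply-adds for (A*B)*v: an n x n by n x n product (n^3),
// followed by one matrix-vector product (n^2).
std::uint64_t multiplyAddsLeftToRight(std::uint64_t n) {
    assert(n <= 2000000);  // keeps n^3 within 64 bits
    return n * n * n + n * n;
}

// Multiply-adds for A*(B*v): two matrix-vector products (n^2 each).
std::uint64_t multiplyAddsRightToLeft(std::uint64_t n) {
    assert(n <= 2000000);
    return 2 * n * n;
}
```

For n = 1600 this gives 4,098,560,000 multiply-adds for (A*B)*v against 5,120,000 for A*(B*v), roughly a factor of 800. The gain only appears when one factor is a vector or a much narrower matrix. For a chain of square matrices of equal size, every parenthesization costs the same.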

Registered Member
The matrices are all square and the size is 1600x1600. Only the Gamma matrix is diagonal.
I am already compiling with -fopenmp because I am using gcc, but I haven't set the environment variable OMP_NUM_THREADS to the number of physical cores. I will try that!

Moderator
Setting OMP_NUM_THREADS or disabling hyper-threading is very important. Regarding Gamma, you should store it as a VectorXd and use A * Gamma.asDiagonal() * B. If some matrices are real rather than complex, store them as MatrixXd and try to combine the real matrices first, before combining them with the complex ones (using appropriate parentheses).
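
The reason the diagonal form helps can be sketched without Eigen. Multiplying A on the right by a diagonal matrix only scales column j of A by the j-th diagonal entry, which takes about n^2 multiplications instead of the n^3 multiply-adds of a dense product. The code below is an illustration in plain C++, not Eigen's implementation. The `Matrix` alias and both function names are hypothetical, and Gamma is assumed to be real, as the VectorXd suggestion implies.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cd = std::complex<double>;
using Matrix = std::vector<std::vector<cd>>;  // row-major, for illustration only

// A * diag(gamma): scale column j of A by gamma[j] (about n^2 multiplications).
Matrix timesDiagonal(const Matrix& A, const std::vector<double>& gamma) {
    Matrix R = A;
    for (auto& row : R) {
        assert(row.size() == gamma.size());
        for (std::size_t j = 0; j < row.size(); ++j) row[j] *= gamma[j];
    }
    return R;
}

// Reference: the same product with gamma treated as a dense matrix
// that is zero off the diagonal (about n^3 multiply-adds).
Matrix timesDense(const Matrix& A, const std::vector<double>& gamma) {
    const std::size_t n = gamma.size();
    Matrix R(A.size(), std::vector<cd>(n, cd(0.0, 0.0)));
    for (std::size_t i = 0; i < A.size(); ++i) {
        assert(A[i].size() == n);
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k)
                R[i][j] += A[i][k] * (k == j ? gamma[k] : 0.0);
    }
    return R;
}
```

Both functions return the same matrix, but only the first avoids the wasted multiplications by zero. In Eigen, writing the product with Gamma.asDiagonal() is how you tell the library that this cheaper structure is available.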