Registered Member
|
Hello everyone,
I am getting bad performance when computing A.transpose() * A.
After checking in the forum I found this thread, where this issue is discussed. There someone proposed a better way of doing this operation by using
Is this correct? I ask because this code does not produce the same matrix as A.transpose() * A, and I am a bit lost right now. If this is not the correct way, what is the best way to compute A^T * A?
Thank you and best regards,
G. Ros |
Registered Member
|
Ok, I have realized that the correct expression should be:
However, the transpose of A seems to make the operation very slow: when A is used instead of A.transpose(), the time goes down to 1.5 ms. Is there an efficient way of doing this?
Best regards,
G. Ros |
Registered Member
|
|
Moderator
|
What was the error?
Also, regarding your first post, 13 ms for this computation corresponds to 23 GFLOPS for a full product. Of course, using selfadjointView reduces the number of computations, but for such small outputs I don't expect much gain from this trick. |