Registered Member
Hi,
First of all, I have to say: awesome job. You have one heck of a fast matrix-matrix multiplication. I even tried an SSE implementation (floats), exploiting the fact that you can transpose the second matrix and then access its elements sequentially in memory (to improve memory read performance). Yours was still faster.

My question is this: I want to compute A*B where B is a symmetric matrix, so A*B = A*BT (where BT = B transpose). Considering the symmetry of B, how can I get even faster code?

Explanation: for example, element (1,1) of the result should be A.row(1).dot(B.col(1)); however, because B is symmetric, we can instead compute A.row(1).dot(B.row(1)). As you know, this helps with memory access and caching. (Let's also assume that all row and column counts of A and B are multiples of 4.)

My own SSE code, which relies on the 16-byte alignment of every row and column, turned out slower than Eigen. So now I want to know whether Eigen can be made even faster given that B is symmetric. Thanks
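To make the access-pattern argument concrete, here is a minimal plain-C++ sketch (no SSE, no Eigen) of the trick described above: since B is symmetric, C(i,j) = A.row(i)·B.col(j) = A.row(i)·B.row(j), so in a row-major layout both operands are read sequentially. The function name and flat-vector storage are my own illustration, not code from the thread:

```cpp
#include <vector>
#include <cstddef>

// Row-major n x n matrices stored as flat vectors.
// Computes C = A * B for a SYMMETRIC B, reading B by rows:
// C(i,j) = dot(A.row(i), B.col(j)) = dot(A.row(i), B.row(j)),
// so both A and B are traversed sequentially in memory.
std::vector<float> multiplyBySymmetric(const std::vector<float>& A,
                                       const std::vector<float>& B,
                                       std::size_t n) {
    std::vector<float> C(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < n; ++k)
                acc += A[i * n + k] * B[j * n + k];  // B.row(j), not B.col(j)
            C[i * n + j] = acc;
        }
    }
    return C;
}
```

This naive triple loop only illustrates the memory-access idea; as the reply below notes, a competitive kernel also has to block for the cache hierarchy.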
Moderator
Yes, because that approach neglects the multiple levels of caching...
This is already in Eigen (devel branch), e.g.: C = A * B.selfadjointView<Lower>(); (or Upper if you computed/stored the upper triangular part).