This forum has been archived. All content is frozen. Please use KDE Discuss instead.

OpenMP: Vector-Matrix multiplication

Tags: None
(comma "," separated)
manuels
Registered Member
Posts
47
Karma
0

OpenMP: Vector-Matrix multiplication

Sat Nov 13, 2010 12:39 pm
Hi,

I didn't want to take this thread over, so I started a new one.

I'm having a piece of code like this:
Code: Select all
Matrix1.col(i+1) = Matrix2*(Matrix1.col(i) - Vector) + Matrix1.col(i)

And OpenMP is set as CXXFLAG.
My operation should not differ from a matrix-matrix multiplication (from eigen's internal point of view). Am I right?

But nevertheless only one cpu is used to perform this operation.
Do you have an idea why?

Cheers,

Manuel


PS: Currently, I have no access to the source code/Makefile. But I can post it at Monday.

EDIT: Matrix indices
User avatar
ggael
Moderator
Posts
3447
Karma
19
OS
The internal implementation of matrix-matrix and matrix-vector products are very very different, and so far only the former is multithreaded. Before looking for a multithreaded mattrix-vector product, you should rather see if your algorithm cannot be implemented in term of matrix-matrix products via a blocking strategy. This is more work, but the performance gain will be much higher.

Now regarding your use case, two temporaries are created. One for (Matrix1.col(i) - Vector) and one to hold the result of the product. The later can be avoided as follow:

mat1.Col(i+1) += mat1.Col(i);
mat1.Col(i+1).noalias() -= mat2*(Matrix1.col(i) - Vector);
manuels
Registered Member
Posts
47
Karma
0
Thanks for your hint to split the operation into two parts!

I tried to find out what you meant by replacing my matrix-vector multiplication with a matrix-matrix multiplication using a blocking strategy but I could not find any good information about it.

Do you mean I should slice the vector into columns of a matrix and after the matrix-matrix multiplication I should sum up the rows of the result?


EDIT: And if I should do that, what's the best method to "reinterpret" the size of the vector.
Something like
Code: Select all
Map<Matrix<double, BLOCKSIZE, SIZE/BLOCKSIZE> > VectorAsMatrix = vector.data()

?
User avatar
ggael
Moderator
Posts
3447
Karma
19
OS
no no, actually I've no idea about your algorithm, perhaps what I'm saying is not possible in your case. But I see you have a loop on the column of a matrix, so perhaps it is possible to treat N columns at once + some extra work just like it is done for all block decompositions: Cholesky, LU, QR, tridiagonalisation, etc. Just an idea....
manuels
Registered Member
Posts
47
Karma
0
Ah, ok. No, I cannot do this because the values of columns of Matrix1 depend recursively on each other.

Actually, I have this equation:
Code: Select all
vec_new = mat*(vec_old - other_vec) - vec_old

I just use a matrix instead of vec_old, vec_new respectively, to keep a history of the values of this vector.

But mat is very sparse, the vectors are dense. So do you think I would be a good idea to do what I said:
Slice the vec_old into a matrix (just a mapping, not really copying) and then do a matrix-matrix multiplication. Finally sum up the rows of the result. (If I swap the matrices of the matrix-matrix multiplication: sum up colwise).
Something like
Code: Select all
Map<Matrix<BLOCKSIZE, vec.size()/BLOCKSIZE> > VectorAsMatrix = vec_old - other_vec;
Matrix<...> mat = ...;

Matrix<...> intermediate_result = VectorAsMatrix*mat;
vec_new = -vec_old;
vec_new += intermidiate_result.colwise().sum();


Eventually this could have two advantages:
1) the matrix-matrix product could be calculated using OpenMP.
2) if I choose the correct BLOCKSIZE accordingly to my cache size, the dense rows of VectorAsMatrix would be kept in the cache, and only sparse columns of mat must be requested from the memory.

Would this be a good idea or is that overingeneering?


Cheers and thanks for your great hints!

Manuel


Bookmarks



Who is online

Registered users: Bing [Bot], Evergrowing, Google [Bot], rockscient