
Is there a GPU DeviceMatrix and DeviceVector anywhere?

bravegag
Registered Member
Hi,

I'm working on a large project that uses Eigen and has many CPU algorithms where MatrixXd and VectorXd are ubiquitous in the public interfaces. Now I'd like to add complete GPU variants of all the algorithms (not just invoking the MAGMA backend with a costly host-to/from-device memory copy strategy) without having to change my algorithm interfaces everywhere, i.e. I'd like a DeviceMatrixXd and DeviceVectorXd that can be used interchangeably wherever MatrixXd and VectorXd can.
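To make this concrete, here is a minimal sketch of the kind of interface I mean (DeviceMatrixXd is hypothetical here, and I am assuming the device types would expose the same expression API as their host counterparts):
Code:
#include <Eigen/Dense>

// Algorithms are templated on the matrix/vector types, so a hypothetical
// DeviceMatrixXd could be dropped in wherever MatrixXd is used today.
template <typename Matrix, typename Vector>
Vector solve_normal_equations(const Matrix& A, const Vector& b)
{
    // Identical call sites for host (MatrixXd) or device (DeviceMatrixXd)
    // types, provided both implement the same expression interface.
    return (A.transpose() * A).ldlt().solve(A.transpose() * b);
}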

To this end I'm tempted to define my own, but after looking into https://bitbucket.org/ggael/eigen-nvcc I was wondering whether a DeviceMatrix and DeviceVector already exist anywhere?

TIA,
Best regards,
Giovanni
ggael
Moderator
No DeviceMatrix yet. The eigen-nvcc branch has now been merged into the main Eigen development branch. Currently it only allows calling Eigen from CUDA. However, this is a necessary step toward the evaluation of Eigen's expressions on CUDA. That second step will become straightforward once the refactoring of our expression template engine is finished (I have made good progress recently).
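For instance (a simplified sketch; whether every member function is already decorated for device code in the merged branch is something to verify), you can write kernels that use fixed-size Eigen objects directly:
Code:
#include <Eigen/Dense>

// Each thread maps a 3D vector living in device memory and normalizes it.
__global__ void normalize3(double* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        Eigen::Map<Eigen::Vector3d> v(data + 3 * i);
        v.normalize();  // Eigen expression evaluated inside the kernel
    }
}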
bravegag
Registered Member
Hi ggael,

Thank you for your prompt answer and always excellent support :)

From your answer I gather that what you did there could help me: to be able to use Eigen, namely MatrixXd, from a CUDA kernel, the MatrixXd memory has to be device memory, right? I actually looked into this eigen-nvcc branch but could not find any CUDA memory allocation within Memory.h, which would be the place to look for any form of allocation, correct?

I have a lot of pressure and interest riding on this branch; please let me know whether I can help in any way.

Best regards,
Giovanni
ggael
Moderator
The current port mostly allows declaring small fixed-size matrices and vectors (or similar Map objects) within a CUDA kernel, so memory management is not our business yet. This is related to your goal because the evaluation of, for instance, A = 2 * B + C boils down to something like:
Code:
// Build evaluators for the destination and for the expression...
evaluator<decltype(A)> dst(A);
evaluator<decltype(2*B+C)> src(2*B+C);
// ...then assign coefficient by coefficient.
for(int i = 0; i < A.size(); ++i)
  dst.write(i, src.coeff(i));

This is very simplified, but that's the idea. A real-world version of this for loop is here:
https://bitbucket.org/ggael/eigen-evalu ... ult#cl-264
and the creation of the evaluator:
https://bitbucket.org/ggael/eigen-evalu ... ult#cl-616

Now if A were a DeviceMatrix, we would simply have to do something like this (again, over-simplified, just to give the idea):
Code:
template<class Dst, class Src> struct Kernel {
  Dst dst;
  Src src;
  // Each invocation evaluates one coefficient of the expression.
  __device__ void operator()(int i) { dst.write(i, src.coeff(i)); }
};
// Generic CUDA kernel: one thread per coefficient.
template<class Kernel>
__global__ void meta_kernel(Kernel kernel) {
  kernel(threadIdx.x);
}
// Launch one thread per coefficient (launch configuration simplified):
meta_kernel<<<1, A.size()>>>( Kernel<decltype(A), decltype(2*B+C)>{A, 2*B+C} );

This means nvcc must be able to compile our expression code properly; that is what the current port achieves.

If you want to help, a first step would be to write a DeviceMatrix class with device storage and adapt your MAGMA support to this class...
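Something along these lines (an over-simplified sketch, no error handling, names purely illustrative):
Code:
#include <cuda_runtime.h>

// Minimal DeviceMatrix with device storage; a real version would mirror
// Eigen's Matrix interface (resizing, strides, expression support, ...).
struct DeviceMatrixXd {
    double* data;
    int rows, cols;
    DeviceMatrixXd(int r, int c) : data(0), rows(r), cols(c) {
        cudaMalloc(&data, sizeof(double) * r * c);
    }
    ~DeviceMatrixXd() { cudaFree(data); }
    // MAGMA routines could then consume 'data' directly, avoiding
    // host<->device copies in the algorithm's hot path.
};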

BTW, have you seen that with the upcoming CUDA 6 it will be possible to allocate "managed" buffers that are automatically synchronized between GPU and host memory? I don't know the memory overhead yet, but if it's not too high, that might simplify our job a lot!!
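A managed buffer might be used like this (a sketch assuming the announced CUDA 6 cudaMallocManaged API; details may change before release):
Code:
#include <cuda_runtime.h>
#include <Eigen/Dense>

int main()
{
    const int rows = 1024, cols = 1024;
    double* buf = 0;
    // One allocation visible from both host and device.
    cudaMallocManaged(&buf, sizeof(double) * rows * cols);
    Eigen::Map<Eigen::MatrixXd> A(buf, rows, cols);
    A.setRandom();                // host-side initialization
    // ... launch kernels on 'buf'; the runtime migrates pages as needed ...
    cudaDeviceSynchronize();      // wait for the device before host access
    cudaFree(buf);
    return 0;
}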
idg101
Registered Member
I am a bit confused by this discussion and not sure whether it answers my question, so let me rephrase.

I would like to be able to use functions like colwise() and rowwise() but with matrices on the GPU. Will Eigen have support for this soon?

Thanks in advance.

