Registered Member
|
Hi,
I've been porting my neural network code to Eigen2 and so far it's been great. Here are some observations and wishlist items:

1. NVIDIA CUDA and CUBLAS automatic integration and abstraction. Right now there's a "gold rush" by scientists for this cheap supercomputing resource. If Eigen2 makes this easy, its adoption rate will increase tremendously.
2. sigmoid and tanh functions for cwise(). Better yet, have a generalized method like cwise().apply(tanh) or cwise().apply(myactivationfunction).
3. Kronecker product and direct sum. Yes, they are easy to implement, but you might have more optimized implementations.
4. Better support for serialization, so I could save/read neural networks more easily.

Thanks a lot and more power! |
Registered Member
|
We have an issue tracker; filing your feature requests there gives a much better chance that we won't forget about them.
About CUDA/CUBLAS: I guess so, but I'm incompetent here. Gael would know better, but I guess that contributions are welcome too.
For "cwise.apply(func)" : we have this, it's called unaryExpr: http://eigen.tuxfamily.org/dox/classEig ... 045c8a2e9e It requires you to define a functor, rather than taking a function pointer, there are many advantages in this, it allows you to implement vectorization (copy what we do for exp()), etc.
OK indeed, this would be a good addition. The Kronecker product was discussed there: http://forum.kde.org/how-to-compute-the ... 50952.html For the direct sum, you can use comma-initialization, as in the sketch below.
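A minimal sketch (the 2x2 matrices are just placeholders):

```cpp
#include <Eigen/Core>
#include <iostream>

int main() {
  Eigen::Matrix2d A, B;
  A << 1, 2,
       3, 4;
  B << 5, 6,
       7, 8;

  // Direct sum of A and B: put them on the diagonal, zeros elsewhere.
  Eigen::Matrix4d C;
  C << A,                       Eigen::Matrix2d::Zero(),
       Eigen::Matrix2d::Zero(), B;
  std::cout << C << std::endl;
}
```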
About serialization, there was a discussion here: http://forum.kde.org/boost-serializatio ... 33007.html In any case, you're more than welcome to clone eigen2 and start working on your own additions... then we can consider merging them.
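In the meantime, a quick-and-dirty fallback is to write the raw coefficients of a dense matrix yourself. A minimal sketch (plain binary streams, not Boost.Serialization, and not part of Eigen; saveMatrix/loadMatrix are just illustrative names):

```cpp
#include <Eigen/Core>
#include <fstream>
#include <string>

// Illustrative only: write rows, cols, then the raw coefficients.
void saveMatrix(const Eigen::MatrixXd& m, const std::string& path) {
  std::ofstream out(path.c_str(), std::ios::binary);
  int rows = m.rows(), cols = m.cols();
  out.write(reinterpret_cast<const char*>(&rows), sizeof(rows));
  out.write(reinterpret_cast<const char*>(&cols), sizeof(cols));
  out.write(reinterpret_cast<const char*>(m.data()),
            rows * cols * sizeof(double));
}

Eigen::MatrixXd loadMatrix(const std::string& path) {
  std::ifstream in(path.c_str(), std::ios::binary);
  int rows = 0, cols = 0;
  in.read(reinterpret_cast<char*>(&rows), sizeof(rows));
  in.read(reinterpret_cast<char*>(&cols), sizeof(cols));
  Eigen::MatrixXd m(rows, cols);
  in.read(reinterpret_cast<char*>(m.data()),
          rows * cols * sizeof(double));
  return m;
}
```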
Last edited by bjacob on Mon Jun 22, 2009 3:59 am, edited 1 time in total.
Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list! |
Registered Member
|
Thanks for the speedy reply. When waiting for the neural network training becomes really painful, I might just get my hands dirty with CUDA. Now that I've mentioned it, Eigen2 does make neural network programming less dirty.
|
Registered Member
|
I don't work with CUDA, but a colleague of mine does, and he complains about the compilers being pretty **** at the moment. Of course you can try to compile Eigen with the CUDA C++ compiler, but somehow I doubt that it will work out of the box or with only minor modifications. But yes, it would be great :-) |
Registered Member
|
From the name, I infer that CUBLAS is a BLAS implementation using CUDA? I'd say, then, that the best way we can leverage it is to have an optional BLAS backend for level 3 operations. That way we could take advantage of any BLAS: CUBLAS or others. So far it has not really been worth it because we have very good performance on the CPU, but if the GPU allows us to go much faster, it becomes interesting!
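Just to illustrate the idea, here is a rough, hypothetical sketch of routing a level 3 product through the legacy CUBLAS API (gpuProduct is an invented helper, not an Eigen backend, and error checking is omitted):

```cpp
#include <Eigen/Core>
#include <cublas.h>  // legacy CUBLAS API; call cublasInit() once at startup

// Hypothetical helper, not an Eigen backend: computes C = A * B on the GPU.
// Eigen's default storage is column-major, which matches what BLAS expects.
Eigen::MatrixXd gpuProduct(const Eigen::MatrixXd& A, const Eigen::MatrixXd& B) {
  const int m = A.rows(), k = A.cols(), n = B.cols();
  Eigen::MatrixXd C(m, n);

  double *dA, *dB, *dC;
  cublasAlloc(m * k, sizeof(double), (void**)&dA);
  cublasAlloc(k * n, sizeof(double), (void**)&dB);
  cublasAlloc(m * n, sizeof(double), (void**)&dC);

  // Copy operands to GPU memory.
  cublasSetMatrix(m, k, sizeof(double), A.data(), m, dA, m);
  cublasSetMatrix(k, n, sizeof(double), B.data(), k, dB, k);

  // dgemm: C = 1.0 * A * B + 0.0 * C
  cublasDgemm('N', 'N', m, n, k, 1.0, dA, m, dB, k, 0.0, dC, m);

  // Copy the result back and release GPU memory.
  cublasGetMatrix(m, n, sizeof(double), dC, m, C.data(), m);
  cublasFree(dA); cublasFree(dB); cublasFree(dC);
  return C;
}
```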
Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list! |
Registered Member
|
With CUDA, the speedup over optimized CPU code is not 2x or 3x but one or two orders of magnitude: 25x to 100x speedups are possible. Of course, it takes a lot of techniques to achieve this, like minimizing data transfers from main memory to GPU memory. http://www.ddj.com/hpc-high-performance ... /210602684 http://www.khronos.org/opencl/ On http://www.nvidia.com/object/cuda_home.html, type "matrix" in "Search" and choose "Sort by Speed Up". |