
Eigen MAGMA backend implementation project

bravegag (Registered Member, Posts: 52, Karma: 0)
Hello,

I have "created a fork" of Eigen 3.2 and incorporated some (small) progress I have preparing a MAGMA backend to best exploit GPU & CPU. This is an alternative to using MKL which indirectly uses MKL because MAGMA does use MKL in the back. I have been testing it using MAGMA 1.4.0-beta2 and so far all my project tests pass without having to change our Eigen-based code base which is great! Anyone who wants to contribute please contact me to bravegag@hotmail.com or via the GitHub account below.

The code base is available here:
https://github.com/bravegag/eigen-magma
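
Usage is meant to mirror Eigen's MKL backend: define a switch before including Eigen and leave the rest of your code untouched. Roughly like this (a sketch only; the EIGEN_USE_MAGMA_ALL name here is illustrative, check the repository README for the exact define):

Code: Select all
// Minimal usage sketch. Assumption: the fork mirrors the MKL backend's
// EIGEN_USE_MKL_ALL convention with an EIGEN_USE_MAGMA_ALL switch;
// see the README for the authoritative name.
#define EIGEN_USE_MAGMA_ALL
#include <Eigen/Dense>

int main() {
    Eigen::MatrixXd A = Eigen::MatrixXd::Random(1024, 1024);
    Eigen::MatrixXd B = Eigen::MatrixXd::Random(1024, 1024);
    // Plain Eigen code; with the define active the product is
    // dispatched to the MAGMA/CUBLAS backend instead of Eigen's kernels.
    Eigen::MatrixXd C = A * B;
    return 0;
}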

The first working port is GeneralMatrixMatrix_MAGMA.h; in reality it goes through the MAGMA API but ends up invoking CUBLAS, which is slightly faster:
https://github.com/bravegag/eigen-magma ... ix_MAGMA.h

Another partial implementation (currently a work in progress) is ColPivHouseholderQR_MAGMA.h; it is still missing the macro that enables the float and complex types:
https://github.com/bravegag/eigen-magma ... QR_MAGMA.h

I have been adding implementations prioritizing the functions we use as part of our project.

The remaining *_MAGMA.h implementations are simply mock copies of their MKL counterparts with some basic preprocessing changes, i.e. an MKL -> MAGMA rename.

Best regards,
Giovanni
bravegag (Registered Member, Posts: 52, Karma: 0)
I have created a simple benchmark project to check the implementation:
https://github.com/bravegag/eigen-magma-benchmark

I have added documentation including the GFlop/s results for DGEMM and DGEQP3, the functions implemented so far.
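
For reference, the GFlop/s figures are derived the obvious way: time the call and divide the theoretical flop count by the elapsed time. A simplified sketch (not the exact benchmark code, which adds warm-ups and accumulators):

Code: Select all
// Simplified sketch of how a GFlop/s figure is derived for DGEMM.
#include <Eigen/Dense>
#include <sys/time.h>
#include <cstdio>

static double wall_time() {
    timeval tv;
    gettimeofday(&tv, 0);
    return tv.tv_sec + tv.tv_usec * 1e-6;
}

int main() {
    const long N = 2048;
    Eigen::MatrixXd A = Eigen::MatrixXd::Random(N, N);
    Eigen::MatrixXd B = Eigen::MatrixXd::Random(N, N);
    Eigen::MatrixXd C(N, N);

    double t0 = wall_time();
    C.noalias() = A * B;                 // dgemm through the active backend
    double elapsed = wall_time() - t0;

    double flops = 2.0 * N * N * N;      // lawn41.pdf, page 120
    printf("DGEMM: %.1f GFlop/s\n", flops / elapsed / 1e9);
    return 0;
}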
ggael (Moderator, Posts: 3447, Karma: 19)
Great work. I've updated the respective bugzilla entry: http://eigen.tuxfamily.org/bz/show_bug.cgi?id=461.
bravegag (Registered Member, Posts: 52, Karma: 0)
Hi ggael,

Thank you. I am currently on holiday, but next week I will continue expanding the project to other products and factorizations.

If you give me your GitHub account I can give you commit access to the project.

Best regards,
Giovanni
bravegag (Registered Member, Posts: 52, Karma: 0)
Quick update: I have additionally implemented:

- dgemv (matrix-vector multiplication)
- dtrsm (triangular solve)
- dpotrf (Cholesky decomposition)

The results are very disappointing: unless I have bugs (e.g. copying more memory Host <-> Device than needed), MAGMA, once memory transfers are taken into account, underperforms in these three cases; see:
https://github.com/bravegag/eigen-magma-benchmark
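
A back-of-the-envelope transfer model already predicts the dgemv result: the bytes moved grow as fast as the flops, so PCIe bandwidth dominates no matter how fast the device kernel is. A rough sketch with illustrative (assumed, not measured) numbers:

Code: Select all
// Rough Host <-> Device cost model for offloading a single dgemv.
// The 6 GB/s PCIe and 40 GFlop/s device figures are assumptions
// for illustration only.
#include <cstdio>

int main() {
    const double N = 8192;
    const double pcie_bw  = 6e9;             // bytes/s over PCIe (assumed)
    const double gpu_gemv = 40e9;            // device dgemv flops/s (assumed)

    double bytes = 8.0 * (N * N + 2.0 * N);  // matrix in, vectors in and out
    double flops = 2.0 * N * N - N;

    printf("copy: %.1f ms, compute: %.1f ms\n",
           1e3 * bytes / pcie_bw, 1e3 * flops / gpu_gemv);
    // Copy time comes out roughly 25x the compute time, so the transfer
    // alone erases any GPU advantage for an isolated dgemv.
    return 0;
}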

The Cholesky decomposition result is the most surprising, because Eigen beats both MKL and MAGMA (see the GFlop/s):
https://raw.github.com/bravegag/eigen-m ... gflops.png

If anyone is willing to donate a code review, I will be more than happy ;)

Best regards,
Giovanni
bravegag (Registered Member, Posts: 52, Karma: 0)
Quick update:

I have made some bugfixes and included ?gesvd as well. The benchmark results have been updated to also reflect when the backend is using MAGMA and when it is using CUBLAS. The benchmark results are now embedded here: https://github.com/bravegag/eigen-magma-benchmark

So far the MAGMA backend pays off for dgemm (matrix-matrix product), dgeqp3 (Householder QR decomposition with column pivoting), and dgesvd (SVD decomposition). Of course, results may differ with a different setup, e.g. a Tesla K20 card would improve the MAGMA results in all benchmarks, but currently I use the cheaper nVidia GTX Titan (where I can get three cards for the price of one Tesla K20).

Best regards,
Giovanni
ggael (Moderator, Posts: 3447, Karma: 19)
Hi, here is a fix for dpotrf (the input matrix was not SPD; A.adjoint()*A is symmetric positive definite for full-rank A, so the LLT now operates on a valid input):

Code: Select all
diff --git a/src/main/cpp/benchmark_main.cc b/src/main/cpp/benchmark_main.cc
index 4cdb5b5..1ab053e 100644
--- a/src/main/cpp/benchmark_main.cc
+++ b/src/main/cpp/benchmark_main.cc
@@ -104,34 +104,42 @@ static void run_benchmark(long N, int warm_ups, int num_runs, workload_type work
    }
 }
 
+EIGEN_DONT_INLINE
 static double dgemm(long N) {
    C = A*B;
    // flops see http://www.netlib.org/lapack/lawnspdf/lawn41.pdf page 120
    return 2*N*N*N;
 }
 
+EIGEN_DONT_INLINE
 static double dgeqp3(long N) {
    Eigen::ColPivHouseholderQR<MatrixXd> qr = A.colPivHouseholderQr();
    // flops see http://www.netlib.org/lapack/lawnspdf/lawn41.pdf page 121
   return N*N*N - (2.0/3.0)*N*N*N + N*N + N*N + (14.0/3.0)*N;
 }
 
+EIGEN_DONT_INLINE
 static double dgemv(long N) {
    C = A*b;
    return 2*N*N - N;
 }
 
+EIGEN_DONT_INLINE
 static double dtrsm(long N) {
    C = Aqr.solve(B);
    return N*N*N;
 }
 
+MatrixXd L;
+
+EIGEN_DONT_INLINE
 static double dpotrf(long N) {
    Eigen::LLT<MatrixXd> lltOfA(A);
-   MatrixXd L = lltOfA.matrixL();
+   L = lltOfA.matrixL();
    return N*N*N/3.0 + N*N/2.0 + N/6.0;
 }
 
+EIGEN_DONT_INLINE
 static double dgesvd(long N) {
    Eigen::JacobiSVD<MatrixXd> svd(A, Eigen::ComputeThinU | Eigen::ComputeThinV);
    return 22*N*N*N;
@@ -211,6 +219,10 @@ int main(int argc, char** argv) {
                // input data specific only to this function
                                    Aqr = A.colPivHouseholderQr();
             }
+
+            if (function == "dpotrf") {
+                A = A.adjoint()*A;
+            }
 
             real_time_acc = bench_accumulator();
             gflops_acc = bench_accumulator();
bravegag (Registered Member, Posts: 52, Karma: 0)
Thank you! I have applied the patch and am benchmarking now; when the matrix is SPD, they are all slower ... I will upload the results as soon as they are ready. By the way, if you have suggestions on how to improve the performance, let me know. A few thoughts on how to improve:

1) When MAGMA is enabled, use pinned host memory (see the sketch after this list).
2) Switch some methods to their mgpu, i.e. multi-GPU, variants; e.g. dpotrf has an mgpu version. Maybe this could be configured via another define.
3) Cache the data copied to the device, but that is kind of hard to do at this level.
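
For (1), this is roughly what I have in mind; a sketch using plain CUDA page-locked allocation (cudaMallocHost), though MAGMA's own pinned allocators could equally be used:

Code: Select all
// Sketch of idea (1): keep the benchmark matrices in pinned (page-locked)
// host memory so Host <-> Device copies can run at full DMA speed.
#include <cuda_runtime.h>
#include <Eigen/Dense>

int main() {
    const long N = 4096;
    double* data = 0;
    cudaMallocHost((void**)&data, sizeof(double) * N * N);  // pinned

    // Wrap the pinned buffer in an Eigen view; no copy is made, so the
    // MAGMA backend would transfer straight out of page-locked memory.
    Eigen::Map<Eigen::MatrixXd> A(data, N, N);
    A.setRandom();

    // ... run the MAGMA-backed products/decompositions on A here ...

    cudaFreeHost(data);
    return 0;
}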

Best regards,
Giovanni

UPDATE: results uploaded and MAGMA now shines!
https://github.com/bravegag/eigen-magma-benchmark
bravegag (Registered Member, Posts: 52, Karma: 0)
Major performance improvements in many of the factorizations, e.g.:

* DGEMM now tops out at ~800 GFlop/s, up from ~200 GFlop/s
* DPOTRF now tops out at ~250 GFlop/s, up from ~125 GFlop/s

The improvements are due to two aspects:

* Enabling CUDA double precision in the nVidia driver. The GTX Titan has this peculiarity: double-precision performance defaults to 1/24th of the single-precision performance, and this ratio can be raised to 1/3 of the SP performance through the nvidia driver (see the command sketch after this list).
* Upgrading to MAGMA official 1.4.0 release.
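
For anyone reproducing this, the switch can be flipped from the command line; the attribute name below is an assumption that may vary by driver version, so check nvidia-settings -q all first:

Code: Select all
# Enable full double-precision mode on the GTX Titan (the attribute
# name is an assumption; it may differ between driver versions):
nvidia-settings -a [gpu:0]/GPUDoublePrecisionBoostImmediate=1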

PS: still benchmarking
bravegag (Registered Member, Posts: 52, Karma: 0)
Benchmarks updated after setting nvidia.NVreg_EnablePCIeGen3=1, which enables PCIe v3.0 and raises the maximum link speed from 5 GT/s to 8 GT/s, i.e. it speeds up Host <-> Device memory transfers:
https://devtalk.nvidia.com/default/topi ... -on-titan/
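
In case it helps others, the parameter goes on the kernel command line; the snippet below assumes a GRUB-based distro:

Code: Select all
# /etc/default/grub (then run update-grub and reboot):
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nvidia.NVreg_EnablePCIeGen3=1"

# Verify the negotiated link speed afterwards:
sudo lspci -vv | grep -i 'LnkSta:'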

This gave the Titan cards a performance edge worth updating the benchmarks for.
Royi (Registered Member, Posts: 34, Karma: 0)
Any progress on that?

By the way, why isn't Eigen on GitHub?
NeoPhix (Registered Member, Posts: 1, Karma: 0)
Joining in on the question: is there any progress here?

I am now working on a project which needs MAGMA (or any GPU BLAS and LAPACK implementation). MKL does not deliver the necessary performance (plus I need to build the project for ARM with an nVidia GPU, where I cannot use MKL).

The Eigen documentation tells me (https://eigen.tuxfamily.org/dox/TopicUs ... apack.html):

Since Eigen version 3.3 and later, any F77 compatible BLAS or LAPACK libraries can be used as backends for dense matrix products and dense matrix decompositions. For instance, one can use Intel® MKL, Apple's Accelerate framework on OSX, OpenBLAS, Netlib LAPACK, etc.


It works for me when I compile the project with OpenBLAS or MKL, but when I try to link the project against the MAGMA libraries (libmagma.a for example) with the flags EIGEN_USE_BLAS and EIGEN_USE_LAPACKE, I get many linker errors about undefined symbols in the LAPACK and BLAS calls.
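
For comparison, a build line of this shape links fine for me with OpenBLAS (paths are placeholders):

Code: Select all
# Build line that works with OpenBLAS (paths are placeholders):
g++ -O2 -DEIGEN_USE_BLAS -DEIGEN_USE_LAPACKE \
    -I/path/to/eigen main.cpp -lopenblas -llapacke -o main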

In the Eigen 3 sources I see something like this:

Code: Select all
info = LAPACKE_##LAPACKE_PREFIX##getrf( matrix_order, m, n, (LAPACKE_TYPE*)a, lda, ipiv );


and in the library's symbols I see names like "magma_sgetrf_" instead, i.e. MAGMA exports magma_-prefixed entry points rather than the LAPACKE_ names Eigen calls.
Does that mean I have an invalid ##LAPACKE_PREFIX##?

Can you give me any tips for linking with MAGMA? Is this option supported?

