Registered Member
|
Hello everybody
I have to say I have been pleasantly surprised by the eigen library. I'm developing real-time audio plug-ins, and I had been looking for efficient matrix operations code for a long time. Using eigen, I have been able to significantly reduce my plug-ins' CPU consumption. Thanks a lot for your hard work!
I'm writing this message to ask you several questions about the optimization of my code. I only need matrix multiplications, additions, and inversions. The matrix sizes are constant during code execution, between 2x2 and 20x20, but these constants may change depending on user interaction. Consequently, I cannot use fixed-size matrices only, unless I write several implementations of my code, one for each combination of sizes...
However, right now I'm comparing the performance of fixed-size and dynamic-size implementations for a special case (2x2 everywhere only), and I have tried to reduce the gap between them.
1) I use Visual C++ 2008 with the MSVC and Intel compilers, and I have tried to select the best code generation flags for both, to use SSE3 instructions and vectorization in general. With the fixed-size implementation, the MS compiler is faster (2.2% CPU consumption vs 3.1%). However, the Intel compiler is faster with the dynamic-size implementation (MS 19.8% vs Intel 9.6%). Do you know why the Intel compiler (with auto-vectorization and best speed optimizations) is slower in the first case?
2) Moreover, when I use the fixed-size implementation with either compiler, the performance is better with code like "A.noalias() = B*X + C*Y;" than with "A.noalias() = B*X; A.noalias() += C*Y;". However, the topic about "Writing Efficient Product Expression" says the opposite...
3) Does specifying MaxRows and MaxCols in the Matrix declaration affect the speed of the generated code?
4) When I compile my dynamic-size implementation with the Intel C++ compiler, the compiler reports many times that it performs "auto-vectorization" of some eigen code, but it never does so with the fixed-size implementation. Does that mean eigen code is not vectorized efficiently for dynamic-size matrices? Moreover, I get a lot of warnings about useless "const" qualifiers...
5) Do you have general tips for reducing the gap between my two implementations (tips that are not already in the wiki)?
Thanks in advance!
|
Moderator
|
Well, a speed-up factor of 3 or 4 between very small fixed-size and dynamic-size matrices is to be expected. There is nothing that you, we, or the compiler can do about it. So I would first identify the time-critical part of the algorithm and implement a templated version of it that accepts both fixed and Dynamic sizes. Then you could use a big switch/case construct to map the runtime sizes to compile-time calls. To balance compilation time, binary size, and performance, you could use fixed sizes only for the very small cases (2, 3, 4) and fall back to "Eigen::Dynamic" for the larger ones.
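The switch/case idea can be sketched roughly as follows. This is only an illustration of the dispatch pattern, written with the standard library so that it compiles on its own: the names kDynamic, mulAddKernel and mulAddDispatch are hypothetical, the kernel is a plain triple loop rather than Eigen code, and matrices are assumed square and stored row-major. In real code the kernel would presumably take Eigen matrices whose size template parameter is either a small constant or Eigen::Dynamic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Local stand-in for Eigen::Dynamic (an assumption of this sketch).
constexpr int kDynamic = -1;

// Time-critical kernel: A = B*X + C*Y for square, row-major matrices.
// If N is a positive constant, the loop bounds are known at compile time
// and the compiler is free to unroll; if N == kDynamic, the size passed
// at run time is used instead. A must not overlap any of the inputs.
template <int N>
void mulAddKernel(const float* B, const float* X,
                  const float* C, const float* Y,
                  float* A, int runtimeSize) {
    const int n = (N == kDynamic) ? runtimeSize : N;
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (int k = 0; k < n; ++k) {
                acc += B[i * n + k] * X[k * n + j]
                     + C[i * n + k] * Y[k * n + j];
            }
            A[i * n + j] = acc;
        }
    }
}

// Maps the runtime size to a compile-time instantiation. Only the very
// small sizes get their own instantiation; everything else shares the
// dynamic version, which limits compile time and binary size.
// A must be a different vector from B, X, C and Y.
void mulAddDispatch(const std::vector<float>& B, const std::vector<float>& X,
                    const std::vector<float>& C, const std::vector<float>& Y,
                    std::vector<float>& A, int n) {
    assert(n > 0);
    const std::size_t count = static_cast<std::size_t>(n) * n;
    assert(B.size() == count && X.size() == count);
    assert(C.size() == count && Y.size() == count);
    A.resize(count);
    switch (n) {
        case 2: mulAddKernel<2>(B.data(), X.data(), C.data(), Y.data(), A.data(), n); break;
        case 3: mulAddKernel<3>(B.data(), X.data(), C.data(), Y.data(), A.data(), n); break;
        case 4: mulAddKernel<4>(B.data(), X.data(), C.data(), Y.data(), A.data(), n); break;
        default: mulAddKernel<kDynamic>(B.data(), X.data(), C.data(), Y.data(), A.data(), n); break;
    }
}
```

The switch is evaluated once per call, so it is best placed around the whole processing block rather than around each individual product. How much the fixed-size branches actually gain would have to be measured in the plug-in itself.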
|