Registered Member
|
Hi
I'm currently writing an image processing application. In my application I need to perform a lot of matrix operations (like subtraction, multiplication, etc.). Originally I started by writing my own matrix class in which I implemented the basic operations. However, after profiling the code it turned out that the most time-consuming operations are matrix multiplication and matrix addition. So I decided to use a specialized library for linear algebra. Thanks to Stack Overflow users I decided to use Eigen. Unfortunately, after switching to Eigen matrices it turned out that my application runs even slower than before. The functions that take the majority of the time look like this:
I use Visual Studio 2010 and I enabled "Streaming SIMD Extensions 2 (/arch:SSE2)". Frankly speaking, I don't know how to improve the performance of these functions. After changing my MatrixClass to the Eigen::Matrix class my application needs 87 seconds to accomplish the task; before the change, it was 55 s. Could somebody explain why I get worse performance when I use Eigen::Matrix instead of myMatrixClass? I thought it would be the other way around. Maybe I'm using the Eigen classes in the wrong way?
Last edited by nocturn on Thu Jun 02, 2011 11:47 pm, edited 1 time in total.
|
Registered Member
|
Why are you using dynamic matrices to store the numerator and denominator? These are scalars, and IMHO should just be of type double.
|
Registered Member
|
Because if I do something like this
I can't compile the code. I get the error "no operator += matches these operands", so I decided to use a matrix instead of a double.
|
Registered Member
|
It looks like the result of the RHS is a Vector2d. Is that correct? It seems that you're ignoring the second component of that vector, so you could just access the [0] element.
Additionally, you have a common sub-expression which might not be optimized by the compiler. You could try extracting it (the expression ending in transpose()) and reusing the result. Have you profiled the code? Looking at it again, it could be the pow() function. What is the type of m?
|
Registered Member
|
I got a bit confused. What do you mean by RHS?
m is a double, so I can't use m << powValue in order to speed up the pow function. However, notice that the same code structure is used with my own type myMatrixClass, and it is faster than with Eigen::Matrix.
|
Registered Member
|
RHS == right hand side. The lines
Both have the common sub-expression
The compiler may be overwhelmed by the heavy use of templates, and might not recognize that this sub-expression only needs to be computed once. You're also computing pow() twice per element if both UpdateTranslationMatrix and UpdateAngleCoefficient are called for a given index. However, none of this matters if these aren't hot spots in your code. Have you managed to profile your code? A good profiler should be able to point you to your bottleneck.
|
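The two points above (a shared sub-expression and a repeated pow() call) can be sketched without any library. This is only an illustration: the original function bodies are not shown in the thread, so `Vec2`, `Sums`, `accumulateSums`, and the formulas below are hypothetical stand-ins, not the poster's code. The idea is to compute the shared difference and the power once per element, and to accumulate the numerator and denominator in plain doubles rather than in 1x1 matrices.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 2D point type standing in for a fixed-size 2-vector.
struct Vec2 {
    double x, y;
};

inline double dot(const Vec2& a, const Vec2& b) { return a.x * b.x + a.y * b.y; }

// Scalar accumulators: plain doubles, no matrix type involved.
struct Sums {
    double numerator;
    double denominator;
};

// Assumed shape of the computation: for each element, a weight pow(u[i], m)
// multiplies two terms that both depend on the same difference d.
Sums accumulateSums(const std::vector<Vec2>& p, const std::vector<Vec2>& q,
                    const std::vector<double>& u, double m) {
    Sums s = {0.0, 0.0};
    for (std::size_t i = 0; i < p.size(); ++i) {
        // Shared sub-expression, computed once and reused below.
        const Vec2 d = {p[i].x - q[i].x, p[i].y - q[i].y};
        // pow() evaluated once per element instead of once per term.
        const double w = std::pow(u[i], m);
        s.numerator += w * dot(d, p[i]);
        s.denominator += w * dot(d, d);
    }
    return s;
}
```

Whether this makes a measurable difference depends on whether the compiler was already doing the hoisting on its own, which is why profiling an optimized build comes first.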
Moderator
|
Hi, here is a slightly optimized version. Check this page to see how to write inner products and get a scalar instead of a 1x1 matrix:
http://eigen.tuxfamily.org/dox/QuickRef ... cOperators
I also changed some pass-by-value parameters to const references and factored out duplicated expressions. I believe the bottleneck of your code is the random memory accesses. Since you are looping over the same pattern multiple times, perhaps it would be worth packing the data once into a Matrix<double,2,Dynamic> and then leveraging big matrix operations. The same goes for the powers, which could be precomputed once into a VectorXd. If it is still slow, you could check without SSE, and also check the assembly to see whether MSVC is messing things up.
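A library-free sketch of the packing idea, under stated assumptions: the scattered points are gathered once into contiguous storage (the role a Matrix<double,2,Dynamic> would play), the powers are computed once (the role of the VectorXd), and the hot loops then read memory sequentially. The names `Packed`, `pack`, and `weightedSumX`, as well as the input layout, are hypothetical and not taken from the poster's code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Contiguous copy of the data the hot loops touch. xs/ys play the role of
// the two rows of a 2 x N matrix; w holds the precomputed powers.
struct Packed {
    std::vector<double> xs, ys, w;
};

// Gather the scattered points once. 'indices' is the access pattern that
// would otherwise be followed on every pass; 'u' and 'm' are the assumed
// inputs to pow().
Packed pack(const std::vector<double>& allX, const std::vector<double>& allY,
            const std::vector<double>& u,
            const std::vector<std::size_t>& indices, double m) {
    Packed out;
    out.xs.reserve(indices.size());
    out.ys.reserve(indices.size());
    out.w.reserve(indices.size());
    for (std::size_t k = 0; k < indices.size(); ++k) {
        const std::size_t i = indices[k];
        out.xs.push_back(allX[i]);
        out.ys.push_back(allY[i]);
        out.w.push_back(std::pow(u[i], m));  // computed once, reused by every pass
    }
    return out;
}

// Example pass over the packed data: weighted sum of the x coordinates.
// Reads are sequential and there is no pow() call inside the loop.
double weightedSumX(const Packed& d) {
    double s = 0.0;
    for (std::size_t k = 0; k < d.xs.size(); ++k) {
        s += d.w[k] * d.xs[k];
    }
    return s;
}
```

Packing costs one extra pass and extra memory, so it only pays off when the same index pattern is reused by several passes, as is the case when both update functions walk the same elements.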
|
Registered Member
|
Thanks for the answer. Now it works a bit faster, but the version with my own matrix class is still much faster. Here is a screenshot from the profiler.
Without SSE it is very slow: 108 s.
Could you tell me how to check the assembly? Regarding this
Could you tell me something more about packing the data into Matrix<double,2,Dynamic>? What would I gain? Note that the functions UpdateTranslationMatrix and UpdateAngleCoefficient are called many times.
|
Moderator
|
Hm... when I see that your profiler is able to see functions such as coeff(), operator(), rows(), and so on, I'm wondering whether you are benchmarking in debug or release mode, i.e., without or with optimizations enabled.
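One quick way to check which configuration is being benchmarked is to report the relevant preprocessor macros from inside the program. The sketch below relies on two conventions: MSVC defines `_DEBUG` when compiling against a debug runtime (/MDd, /MTd), and `NDEBUG` is normally defined by release project configurations (it comes from the project settings, not from the optimizer). The function name `buildKind` is hypothetical.

```cpp
#include <string>

// Reports which build-configuration macros were visible at compile time.
std::string buildKind() {
#if defined(_DEBUG)
    return "debug runtime (_DEBUG defined)";
#elif defined(NDEBUG)
    return "release-style build (NDEBUG defined)";
#else
    return "unknown (neither _DEBUG nor NDEBUG defined)";
#endif
}
```

Printing the result of `buildKind()` at startup of the benchmarked executable shows which case applies. This does not prove that optimizations such as /O2 are on, so the compiler options in the project settings are still worth checking directly.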
|