Eigen - performance issues (compared to my matrix class) • KDE Community Forums

This forum has been archived. All content is frozen. Please use KDE Discuss instead.

Board index

Eigen - performance issues (compared to my matrix class)

Page 1 of 1 (9 posts)

Tags:

nocturn Registered Member Posts 4 Karma 0	Eigen - performance issues (compared to my matrix class) Thu Jun 02, 2011 8:19 pm Hi I'm currently writting an image processing application. In my application I need to perform a lot of matrix operations (like subtraction,multiplication etc). Originaly I started with writting my own matrix class in which I implemented basic operations. However after profiling the code it turned out that the most time consuming operations are matrix multiplicaton and matrix addition operations. So I decied to use some specialistic library for linear algebra. Thanks to stackoverflow's users I decied to use Eigen. Unforunatelly after using Eigen matrix it turned out that my application works even slower then before. Function which take majority of time looks like that Code: Select all TemplateClusterBase* TemplateClusterBase::TransformTemplateOne( vector<Eigen::Vector2d>& pointVector, Eigen::Matrix2d& rotation ,Eigen::Matrix2d& scale,Eigen::Vector2d& translation ) { Eigen::Matrix2d temp = (rotationscale); for (int i=0;i<pointVector.size();i++ ) { //Eigen::MatrixXd outcome = Eigen::Vector2d outcome = temp * (pointVector[i]) + translation; MatrixHelper::SetX(prototypePointVector[i],MatrixHelper::GetX(outcome)); MatrixHelper::SetY(prototypePointVector[i],MatrixHelper::GetY(outcome)); } return this; } Eigen::Vector2d AlgorithmPointBased::UpdateTranslationMatrix( int clusterIndex ) { double membershipSum = 0,outcome = 0; double currentPower = 0; Eigen::Vector2d outcomePoint = Eigen::Vector2d(); outcomePoint << 0,0; Eigen::Vector2d templatePoint; Eigen::Matrix2d temp = prototypeVector[clusterIndex]->rotationMatrixprototypeVector[clusterIndex]->scalingMatrix; for (int i=0;i< imageDataVector.size();i++) { currentPower =0; membershipSum += currentPower = pow(membershipMatrix[clusterIndex][i],m); outcomePoint.noalias() += (imageDataVector[i] - (temp ( templateCluster->templatePointVector[prototypeVector[clusterIndex]->assosiatedPointIndexVector[i]]) ))currentPower ; } outcomePoint.noalias() = outcomePoint/=membershipSum; return outcomePoint; //.ConvertToMatrix(); } double AlgorithmPointBased::UpdateAngleCoefficient( int clusterIndex ) { double membershipSum = 0; double currentPower = 0; vector<Eigen::Vector2d> prototypePointVecotr = prototypeVector[clusterIndex]->prototypePointVector; vector<Eigen::Vector2d> templatePointVector = templateCluster->templatePointVector; Eigen::MatrixXd numerator = Eigen::MatrixXd(1,1); numerator << 0; Eigen::MatrixXd denominator = Eigen::MatrixXd(1,1); denominator << 0; Eigen::Vector2d templatePoint; for (int i=0;i< imageDataVector.size();i++) { membershipSum += currentPower = pow(membershipMatrix[clusterIndex][i],m); templatePoint = templatePointVector[prototypeVector[clusterIndex]->assosiatedPointIndexVector[i]]; //templatePointVector[matchingPointVector[i]->associatedTemplateIndex]; numerator += (imageDataVector[i] - prototypeVector[clusterIndex]->translationMatrix).transpose() this->zeroOneMatrix * templatePoint * currentPower; denominator +=(imageDataVector[i] - prototypeVector[clusterIndex]->translationMatrix).transpose() templatePoint * currentPower; } double tangent = (((numerator) /= (denominator)(0,0))(0,0) ); return atan(tangent); } I use Visual Studio 2010 and I enabled "Streaming SIMD Extensions 2 (/arch:SSE2) (/arch:SSE2)". Frankly speaking I don't konw how to enhance performance of these functions. After changing my MatrixClass into Eigen::Matrix class my application need 87 seconds to acomplish task - before change, it was 55 s. Could somebody explain me why if I use Eigen::Matrix instead of myMatrixClass I get worse pefrormance ?? I thought it would be the other way around. Maybe I use Eigen class in a wrong way ?? Last edited by nocturn on Thu Jun 02, 2011 11:47 pm, edited 1 time in total.
graphicsMan Registered Member Posts 16 Karma 0 OS	Re: Eigen - performance issues (compared to my matrix class) Thu Jun 02, 2011 8:25 pm Why are you using dynamic matrices to store the numerator and denominator? These are scalar, and IMHO should just be of type double.
nocturn Registered Member Posts 4 Karma 0	Re: Eigen - performance issues (compared to my matrix class) Thu Jun 02, 2011 10:47 pm graphicsMan wrote:Why are you using dynamic matrices to store the numerator and denominator? These are scalar, and IMHO should just be of type double. Because if I did sth like that Code: Select all `double numerator = 0; numerator += (imageDataVector[i] - prototypeVector[clusterIndex]->translationMatrix).transpose() this->zeroOneMatrix * templatePoint * currentPower;` I can't compile the code - I get an error "no operator += matches these operands", so I decided to use matrix instead of double
graphicsMan Registered Member Posts 16 Karma 0 OS	Re: Eigen - performance issues (compared to my matrix class) Thu Jun 02, 2011 10:59 pm It looks like the result of the RHS is a Vector2d. Is that correct? It seems that you're ignoring the second component of that vector, so you could just access the [0] element. Additionally, you have a common sub-expression which might not be optimized by the compiler. You could try extracting that (the expression ending in transpose()), and reuse the result. Have you profiled the code? Looking at it again, it could be the pow() function. What is the type of m?
nocturn Registered Member Posts 4 Karma 0	Re: Eigen - performance issues (compared to my matrix class) Thu Jun 02, 2011 11:53 pm I got a bit confused What do You mean by RHS ?? m is a double so I can't use m << powValue in order to speed up pow function However notice that the same code structure is used with my type myMatrixClass and it is faster than using Eigen::Matrix
graphicsMan Registered Member Posts 16 Karma 0 OS	Re: Eigen - performance issues (compared to my matrix class) Fri Jun 03, 2011 12:28 am RHS == right hand side. The lines Code: Select all `numerator += (imageDataVector[i] - prototypeVector[clusterIndex]->translationMatrix).transpose() this->zeroOneMatrix * templatePoint * currentPower; denominator +=(imageDataVector[i] - prototypeVector[clusterIndex]->translationMatrix).transpose() templatePoint * currentPower;` Both have the common sub-expression Code: Select all `(*imageDataVector[i] - prototypeVector[clusterIndex]->translationMatrix).transpose()` The compiler may be overwhelmed by the large amount of templateness, and might not recognize that this sub-expression only needs to be computed once. You're also computing the pow() twice per element if both UpdateTranslationMatrix and UpdateAngleCoefficient are called for a given index. However, none of this matters if these aren't hot-spots in your code. Have you managed to profile your code? A good profiler should be able to point you to your bottleneck.
ggael Moderator Posts 3447 Karma 19 OS	Re: Eigen - performance issues (compared to my matrix class) Fri Jun 03, 2011 7:48 am hi, here is slightly optimized version. Check this page to see how to write inner products and get a scalar instead of the 1x1 matrix: http://eigen.tuxfamily.org/dox/QuickRef ... cOperators I also changed some copy per values to const references and factorized duplicated expressions. I believe the bottleneck of your code are the random memory accesses. Since you are looping over the same pattern multiple times, perhaps it would be worth it to pack the data once into a Matrix<double,2,Dynamic> and then leverage big matrix operations. Same for the powers that could be pre-computed once into a VectorXd. If it is still slow, you could check without SSE, and alos check the assembly to see if MSVC is not messing up. Code: Select all Eigen::Vector2d AlgorithmPointBased::UpdateTranslationMatrix( int clusterIndex ) { double membershipSum = 0,outcome = 0; double currentPower = 0; Eigen::Vector2d outcomePoint(0,0); Eigen::Vector2d templatePoint; Eigen::Matrix2d temp = prototypeVector[clusterIndex]->rotationMatrixprototypeVector[clusterIndex]->scalingMatrix; for (int i=0;i< imageDataVector.size();i++) { currentPower =0; membershipSum += currentPower = pow(membershipMatrix[clusterIndex][i],m); const Eigen::Vector2d& vec = (templateCluster->templatePointVector[prototypeVector[clusterIndex]->assosiatedPointIndexVector[i]]); outcomePoint += (imageDataVector[i] - temp vec)currentPower ; } outcomePoint /= membershipSum; return outcomePoint; } double AlgorithmPointBased::UpdateAngleCoefficient( int clusterIndex ) { double membershipSum = 0; double currentPower = 0; const vector<Eigen::Vector2d>& prototypePointVecotr = prototypeVector[clusterIndex]->prototypePointVector; const vector<Eigen::Vector2d>& templatePointVector = templateCluster->templatePointVector; double numerator = 0 double denominator = 0; for (int i=0;i< imageDataVector.size();i++) { membershipSum += currentPower = pow(membershipMatrix[clusterIndex][i],m); const Eigen::Vector2d& templatePoint = templatePointVector[prototypeVector[clusterIndex]->assosiatedPointIndexVector[i]]; const Eigen::Vector2d diff = (imageDataVector[i] - prototypeVector[clusterIndex]->translationMatrix) currentPower; numerator += (diff.transpose() * this->zeroOneMatrix * templatePoint).value(); denominator += diff.dot(templatePoint); } return atan(numerator / denominator); }
nocturn Registered Member Posts 4 Karma 0	Re: Eigen - performance issues (compared to my matrix class) Sat Jun 04, 2011 12:03 pm Thanks for answer.Now it works a bit faster, however version with my own matrix class works a way faster. Here is a screen from profiler. Without sse is very slow - 108 s. If it is still slow, you could check without SSE, and alos check the assembly to see if MSVC is not messing up. Could You tell me how to check assembly. Regarding this I believe the bottleneck of your code are the random memory accesses. Since you are looping over the same pattern multiple times, perhaps it would be worth it to pack the data once into a Matrix<double,2,Dynamic> and then leverage big matrix operations. Same for the powers that could be pre-computed once into a VectorXd. Could You tell me sth more about packing data into Matrix<double,2,Dynamic> what I would gain ?? Notice that function UpdateTranslationMatrix and UpdateAngleCoefficient are called a lot of times.
ggael Moderator Posts 3447 Karma 19 OS	Re: Eigen - performance issues (compared to my matrix class) Sun Jun 05, 2011 10:30 am hm... when I see your profiler is able to see functions as coeff(), operator(), rows(), and so on I'm wondering whether you are benchmarking in debug or release mode, i.e., without or with optimizations enabled.

Page 1 of 1 (9 posts)

Bookmarks

Who is online

Registered users: Bing [Bot], Google [Bot], q.ignora, watchstar