
Cost of min/max/cast/array

michaldobrogost

Sat Feb 22, 2014 12:10 am
Hi All,

I have observed some performance slowdowns using Eigen. I would have expected the library to be able to optimize away the overhead here (maybe with the exception of aliasing between in/out). Compiled with MSVC 2012 x64, the Eigen version is about twice as slow as manually coding the matrix multiplication / max / min and cast. Are my expectations unreasonable here? What am I missing?

Code:
void eigen1(const Matrix3f & mat, T* in, size_t size, T* out) {
    for (size_t i = 0; i < size*4; i += 4) {
        Map<Vector3T> src(in + i);
        Map<Vector3T> dest(out + i);
        dest = ((mat * src.cast<float>()).array() + offset).max(0.0).min(clamp).cast<T>();
    }
}


The full code can be viewed at http://pastebin.com/8Ssq1uwB
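
For readers without access to the pastebin, here is a minimal sketch of what a hand-written counterpart to eigen1 might look like. It is not the original "manual" function from the post. The name manual_transform is hypothetical, the matrix is assumed to be a plain column-major float[9] (Eigen's default storage order), and offset and the clamp value (clamp_max here) are assumed to be scalars, passed explicitly because their definitions are not shown in the excerpt.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical hand-written counterpart of eigen1 (not the poster's code).
// For every pixel stored with a stride of 4 elements, computes
//   out = cast<T>( min( max( mat * in + offset, 0 ), clamp_max ) )
// `mat` is a 3x3 matrix in column-major order; `offset` and `clamp_max`
// are assumed to be scalars.
template <typename T>
void manual_transform(const float mat[9], const T* in, std::size_t size, T* out,
                      float offset, float clamp_max) {
    assert(clamp_max >= 0.0f);  // otherwise min(max(v, 0), clamp_max) is ill-defined
    for (std::size_t i = 0; i < size * 4; i += 4) {
        // Read the whole source vector first so that in == out is safe.
        const float x = static_cast<float>(in[i]);
        const float y = static_cast<float>(in[i + 1]);
        const float z = static_cast<float>(in[i + 2]);
        for (int r = 0; r < 3; ++r) {
            // Row r of the column-major matrix: elements r, r + 3, r + 6.
            float v = mat[r] * x + mat[r + 3] * y + mat[r + 6] * z + offset;
            v = std::min(std::max(v, 0.0f), clamp_max);
            out[i + r] = static_cast<T>(v);  // truncating cast, like static_cast
        }
    }
}
```

The fourth element of each 4-element group is left untouched, matching the Vector3T maps in eigen1.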

Cheers,

Michal
ggael

Sun Feb 23, 2014 9:43 pm
Make sure you benchmarked with optimization ON (release mode). Also, the "manual" function is broken and does not perform the right computation. After fixing it, I get the following with gcc 4.8 -O3 -DNDEBUG:

eigen1 0.0616467
eigen2 0.0552015
eigen3 0.0575281
eigen4 0.0738412
manual 0.0728137

So perhaps MSVC fails to inline some functions. Have a look at the generated assembly for the eigen1 function and search for "call".
michaldobrogost

Mon Feb 24, 2014 10:27 pm
Hi Gael,

Thanks for taking a look! The code was compiled in release mode with /Ox (full optimization) and NDEBUG was defined. Even with the 'favor speed over size' option (/Ot) there is indeed a 'call' in the assembly listing (to something like ArrayWrapper::CoeffBasedProduct).

I am guessing there's not much that the library can do to help the compiler inline here. Have you had any success getting the MSVC team to incorporate optimizations that Eigen needs?

Cheers,

Michal
ggael

Mon Feb 24, 2014 10:58 pm
__forceinline can help MSVC. What is the precise call?
michaldobrogost

Tue Feb 25, 2014 10:43 pm
This is what I get in the assembly:

call ??0?$ArrayWrapper@$$CBV?$CoeffBasedProduct@AEBV?$Matrix@M$02$02$0A@$02$02@Eigen@@AEBV?$Matrix@M$02$00$0A@$02$00@2@$05@Eigen@@@Eigen@@QEAA@AEBV?$CoeffBasedProduct@AEBV?$Matrix@M$02$02$0A@$02$02@Eigen@@AEBV?$Matrix@M$02$00$0A@$02$00@2@$05@1@@Z

; Eigen::ArrayWrapper<Eigen::CoeffBasedProduct<Eigen::Matrix<float,3,3,0,3,3> const & __ptr64,Eigen::Matrix<float,3,1,0,3,1> const & __ptr64,6> const >::ArrayWrapper<Eigen::CoeffBasedProduct<Eigen::Matrix<float,3,3,0,3,3> const & __ptr64,Eigen::Matrix<float,3,1,0,3,1> const & __ptr64,6> const >
michaldobrogost

Thu Feb 27, 2014 10:54 pm
Is the information I provided helpful?
ggael

Fri Feb 28, 2014 9:15 am
Yes. In src/Core/ArrayWrapper.h, line 52, you might try adding EIGEN_STRONG_INLINE:

EIGEN_STRONG_INLINE ArrayWrapper(ExpressionType& matrix) : m_expression(matrix) {}

This will likely just move the non-inlining issue to the next function in the chain, so you might have to repeat this step.
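
As an illustration of the suggestion above, here is a minimal sketch of a forced-inline macro applied to a thin reference wrapper. It is not Eigen's code: MY_STRONG_INLINE, ThinWrapper and as_wrapper are hypothetical names that only mimic the shape of EIGEN_STRONG_INLINE, ArrayWrapper and array(). The sketch assumes the macro maps to __forceinline on MSVC and to plain inline elsewhere.

```cpp
#include <cassert>

// Hypothetical macro modelled on EIGEN_STRONG_INLINE (an assumption, not
// Eigen's definition): force inlining on MSVC, plain `inline` elsewhere.
#if defined(_MSC_VER)
#define MY_STRONG_INLINE __forceinline
#else
#define MY_STRONG_INLINE inline
#endif

// Hypothetical thin wrapper: like ArrayWrapper, it only stores a reference
// to the expression it wraps, so its constructor is trivial to inline.
template <typename ExpressionType>
class ThinWrapper {
public:
    explicit MY_STRONG_INLINE ThinWrapper(const ExpressionType& expr)
        : m_expression(expr) {}

    MY_STRONG_INLINE const ExpressionType& nested() const { return m_expression; }

private:
    const ExpressionType& m_expression;
};

// Hypothetical helper playing the role of array(): returns the wrapper by value.
template <typename ExpressionType>
MY_STRONG_INLINE ThinWrapper<ExpressionType> as_wrapper(const ExpressionType& expr) {
    return ThinWrapper<ExpressionType>(expr);
}

// Usage check: the wrapper refers to the original object, nothing is copied.
inline void wrapper_usage_example() {
    int value = 42;
    ThinWrapper<int> w = as_wrapper(value);
    assert(&w.nested() == &value);
}
```

Both the constructor and the helper that returns the wrapper carry the macro here, since either one can be the function that ends up as a call.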
michaldobrogost

Sat Mar 01, 2014 2:06 am
Adding EIGEN_STRONG_INLINE at:
1) src/Core/MatrixBase.h (line 322) and the other array() method
2) src/Core/MatrixBase.h (line 503) and all other constructors there (did not help)
3) src/Core/products/CoeffBasedProduct.h (line 148) and the other constructor there (did not help)

((2) and (3) may not be needed)

results in no call in the generated code and a substantial improvement. The code is still slower than the manual version when using Map.

eigen1 - 180 ms (Map for input and output)
eigen2 - 180 ms (no input Map)
eigen3 - 160 ms (no output Map)
eigen4 - 150 ms (manual implementation of min(), max(), array() and +)
manual - 160 ms
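
The thread does not show how these timings were taken. As a minimal sketch only, one way to time such variants is shown below. The helper name best_time_ms is hypothetical, and taking the best of several runs is an assumption, not the method used for the numbers above.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper: runs `fn` `repeats` times and returns the fastest
// single run in milliseconds, measured with a monotonic clock.
template <typename Fn>
double best_time_ms(Fn&& fn, std::size_t repeats) {
    assert(repeats > 0);  // at least one run is needed for a measurement
    using clock = std::chrono::steady_clock;
    double best = -1.0;
    for (std::size_t r = 0; r < repeats; ++r) {
        const auto start = clock::now();
        fn();
        const auto stop = clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(stop - start).count();
        if (best < 0.0 || ms < best) best = ms;
    }
    return best;
}
```

A call would look like best_time_ms([&] { eigen1(mat, in, size, out); }, 10), assuming the function and buffers from the first post are in scope.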
ggael

Sat Mar 01, 2014 5:20 pm
So the one in src/Core/ArrayWrapper.h is not needed? Or do you need this one plus (1)?
michaldobrogost

Tue Mar 04, 2014 12:44 am
To clarify, I need both (0) and (1) but not the others to prevent the call:

0) src/Core/ArrayWrapper.h, line 52,
1) src/Core/MatrixBase.h (line 322) and the other array() method
ggael

Tue Mar 04, 2014 4:24 pm
Thank you.

https://bitbucket.org/eigen/eigen/commits/5b2464f21d02/
Changeset: 5b2464f21d02
User: ggael
Date: 2014-03-04 17:24:00
Summary: Help MSVC to inline some trivial functions

