![]() Registered Member ![]()
|
I'm using the new Ref feature and have found an unexpected bottleneck. Here's the snippet:
From some tests with my profiler, it looks like the (x-y) step causes a temporary to be allocated. Since it's passed directly into an expression, I didn't think a temporary would be needed and is certainly not desirable here. This is an inner loop that gets called often, so the alloc/dealloc is a serious performance issue in my case. I'm computing a covariance matrix that is a function of all the pairwise distances between points. The inputs x and y are columns of a big matrix, and Ref allows them to be passed in without a copy, which profiling confirms. An outer loop not shown iterates over pairs of columns. This function is virtual, so the template approach is not feasible, and I don't know if it would fix the issue. I can avoid the memory allocation by forcing it to compute the vector difference using a static temporary, which is not ideal:
Of course, I could also put the implementation of the norm here by hand, which is simple enough for now. Thanks! |
![]() Moderator ![]()
|
That's rather strange temporaries are created there, I'll check. Meanwhile, I would suggest you to compute the covariance matrix as a matrix product C = D.transpose() * D and then apply DistKernel to C. This will be much faster!
|
![]() Moderator ![]()
|
The performance issue you observed is fixed by this changeset:
https://bitbucket.org/eigen/eigen/commits/2d62a0ee5095/ Changeset: 2d62a0ee5095 User: ggael Date: 2013-08-11 17:52:43 Summary: Ref<> objects must be nested by reference because they potentially store a temporary object |
![]() Registered Member ![]()
|
Thanks very much, on both counts! I'll have to look at whether I can use a product/apply strategy within the class hierarchy I'm using.
|
Registered users: Bing [Bot], Google [Bot], Sogou [Bot], Yahoo [Bot]