Registered Member
|
Hi all, I'd like to know whether there is a (fast) method to compute a 2D convolution between two matrices, possibly also specifying a kernel, as with Matlab's conv2 function.
Thanks again! |
Registered Member
|
I came up with a better solution...
|
Moderator
|
Thanks, this will probably be useful to other users, though proper handling of the boundaries would be very nice!
|
Registered Member
|
Hi all,
I hope it's OK to reopen this very old thread. I was wondering how the above code example is likely to be optimized by Eigen. In particular, can the component-wise product and sum be vectorized? As I understand it, I.block(row,col,KSizeX,KSizeY) will only be properly aligned in memory for specific values of row and col. Or is a temporary object created each time a block is extracted from the full matrix?
I checked the assembly output for this bit of code and could only see scalar SSE instructions (mulss/addss). Could missing alignment be the cause of this, or is there an optimization flag missing in my project? In the first case, would it be worth copying the block in the inner loop into a newly created matrix to allow for alignment?
Thanks a lot and best regards,
Matthias |
Moderator
|
The line:
I.block(row,col,KSizeX,KSizeY).cwiseProduct(kernel).sum() should indeed be vectorized, unless KSizeX/KSizeY are not suitable for vectorization. Copying the block into an aligned buffer is pointless, as this operation is memory-bound anyway. To guarantee optimal vectorization, it would be better to process the input image per packet, like:
|