Vectorization of strided data • KDE Community Forums


corybrinck (Registered Member, Posts: 2, Karma: 0)
Vectorization of strided data
Mon Aug 22, 2016 9:09 pm

I noticed that vectorization is lost for strided maps and blocks. Has adding packet load/store functions for strided data been considered? They of course wouldn't be as fast as unstrided loads, but they could still offer a significant speed improvement. I have an application that does a lot of sub-sampling and sub-blocking of complex<float> data using the Map class with a dynamic stride. If I switch to complex<double>, it actually runs significantly faster because the Packet1cd vectorization kicks in. Especially with the new AVX support, I would expect strided loads/stores to bring significant improvements for strided operations.
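
To make the access pattern in the question concrete, here is a minimal sketch in plain C++ (deliberately not using Eigen, so nothing here is Eigen's API). It shows one possible workaround under the assumption that copying is affordable: gather the strided complex<float> samples into a contiguous buffer, then run the arithmetic over that buffer, which is the layout a compiler or library can process with packed loads. The names `gather_strided` and `scale_contiguous` are hypothetical, and whether the extra copy pays off depends on the expression and would need measuring.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cf = std::complex<float>;

// Hypothetical helper: copy every `stride`-th element of `src`
// (starting at index 0) into a new contiguous buffer.
std::vector<cf> gather_strided(const std::vector<cf>& src, std::size_t stride) {
    assert(stride > 0 && "stride must be positive");
    std::vector<cf> out;
    out.reserve((src.size() + stride - 1) / stride);
    for (std::size_t i = 0; i < src.size(); i += stride) {
        out.push_back(src[i]);
    }
    return out;
}

// Hypothetical helper: multiply every element of a contiguous buffer
// by `factor`, in place. Adjacent elements are adjacent in memory here,
// unlike in the strided view.
void scale_contiguous(std::vector<cf>& v, cf factor) {
    for (cf& x : v) {
        x *= factor;
    }
}
```

For example, `gather_strided(data, 2)` keeps elements 0, 2, 4, ... of `data`, and `scale_contiguous` can then be applied to the result without touching the skipped samples.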

ggael (Moderator, Posts: 3447, Karma: 19)
Re: Vectorization of strided data
Tue Aug 23, 2016 5:22 pm

If you are using GCC, compiling with -fcx-limited-range might help a lot, because non-vectorized complex scalar products will then be inlined by the compiler. I won't repeat myself, so please see the following page for more details on std::complex speed issues: http://stackoverflow.com/questions/3765 ... erformance

Regarding strided loads/stores, I would guess that you need a fairly complicated expression to get a visible speed-up. Since a single complex<> multiplication is already quite involved, this could indeed be worth a try.
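
As an illustration of why the flag matters, the sketch below writes out the textbook complex product by hand. This is, roughly, the arithmetic a compiler can emit inline once it no longer has to check for and repair NaN/infinity results, which is what -fcx-limited-range permits in GCC. The function names `mul_limited_range` and `sum_of_products` are hypothetical and are not part of GCC, the standard library, or Eigen; the code only shows the formula, not what any particular compiler generates.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cf = std::complex<float>;

// Hypothetical helper: textbook complex multiplication,
// (a + bi)(c + di) = (ac - bd) + (ad + bc)i,
// with no special handling of NaN or infinite intermediate results.
inline cf mul_limited_range(cf x, cf y) {
    return cf(x.real() * y.real() - x.imag() * y.imag(),
              x.real() * y.imag() + x.imag() * y.real());
}

// Hypothetical helper: accumulate the element-wise products of two
// equally sized vectors (no conjugation), using the formula above.
cf sum_of_products(const std::vector<cf>& a, const std::vector<cf>& b) {
    assert(a.size() == b.size());
    cf acc(0.0f, 0.0f);
    for (std::size_t i = 0; i < a.size(); ++i) {
        acc += mul_limited_range(a[i], b[i]);
    }
    return acc;
}
```

The trade-off is that the hand-written formula can return NaN for some inputs involving infinities or overflow, where the fully checked multiplication would recover a meaningful result.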

corybrinck (Registered Member, Posts: 2, Karma: 0)
Re: Vectorization of strided data
Wed Aug 24, 2016 3:32 pm

Thanks for the link. I was wondering why operator* wasn't already vectorized for std::complex by the compiler.
