This forum has been archived. All content is frozen. Please use KDE Discuss instead.

ARM NEON support buggy?

Tags: None
(comma "," separated)
kmhofmann
Registered Member
Posts
6
Karma
0

ARM NEON support buggy?

Thu Dec 09, 2010 1:43 pm
Hi,

I am trying to enable ARM NEON vectorization support for development on an Android phone using the Android NDK and the current Eigen trunk (or 3.0-beta2). It seems to me that ARM NEON support is still quite rough around the edges, i.e. not really functional yet.

After enabling vectorization support by adding -mfpu=neon -mfloat-abi=softfp to the compiler options, I immediately ran into several compiler errors.
Some of them I was able to fix by patching my compiler while others required modifications to the Eigen source code.

Out of the box, compiling a translation unit of my program using the Android toolchain gave me errors related to casting a 'float*' to a 'float32_t*'. I figured that these errors are due to the issue discussed here:
viewtopic.php?f=74&t=91183
so I applied the described patch to the toolchain and rebuilt it. As a consequence, these errors were gone but others remained, as listed here:

Code: Select all
In file included from Eigen/Core:251:
Eigen/src/Core/arch/NEON/PacketMath.h:179: error: template-id 'pload<float>' for '__builtin_neon_sf __vector__ Eigen::internal::pload(const float*)' does not match any template declaration
Eigen/src/Core/arch/NEON/PacketMath.h:180: error: template-id 'pload<int>' for '__builtin_neon_si __vector__ Eigen::internal::pload(const int*)' does not match any template declaration
Eigen/src/Core/arch/NEON/PacketMath.h: In function 'Packet Eigen::internal::ploaddup(const typename Eigen::internal::unpacket_traits<Packet>::type*) [with Packet = __builtin_neon_sf __vector__]':
Eigen/src/Core/arch/NEON/PacketMath.h:189: error: 'hi' was not declared in this scope
Eigen/src/Core/arch/NEON/PacketMath.h: At global scope:
Eigen/src/Core/arch/NEON/PacketMath.h:192: error: template-id 'ploaddup<__builtin_neon_si __vector__>' for '__builtin_neon_si __vector__ Eigen::internal::ploaddup(const float*)' does not match any template declaration
In file included from Eigen/Core:252:
Eigen/src/Core/arch/NEON/Complex.h:119: error: template-id 'pload<std::complex<float> >' for 'Eigen::internal::Packet2cf Eigen::internal::pload(const std::complex<float>*)' does not match any template declaration
Eigen/src/Core/arch/NEON/Complex.h:120: error: template-id 'ploadu<std::complex<float> >' for 'Eigen::internal::Packet2cf Eigen::internal::ploadu(const std::complex<float>*)' does not match any template declaration
Eigen/src/Core/arch/NEON/Complex.h: In function 'Eigen::internal::Packet2cf Eigen::internal::pcplxflip(const Eigen::internal::Packet2cf&)':
Eigen/src/Core/arch/NEON/Complex.h:148: error: 'a' was not declared in this scope


In order to remove these errors, I made the following changes to the code:

> diff Complex.h Complex.h.FIX

Code: Select all
119,120c119,120
< template<> EIGEN_STRONG_INLINE Packet2cf pload <std::complex<float> >(const std::complex<float>* from) { EIGEN_DEBUG_ALIGNED_LOAD return Packet2cf(pload((const float*)from)); }
< template<> EIGEN_STRONG_INLINE Packet2cf ploadu<std::complex<float> >(const std::complex<float>* from) { EIGEN_DEBUG_UNALIGNED_LOAD return Packet2cf(ploadu((const float*)from)); }
---
> template<> EIGEN_STRONG_INLINE Packet2cf pload <Packet2cf>(const std::complex<float>* from) { EIGEN_DEBUG_ALIGNED_LOAD return Packet2cf(pload<Packet4f>((const float*)from)); }
> template<> EIGEN_STRONG_INLINE Packet2cf ploadu<Packet2cf>(const std::complex<float>* from) { EIGEN_DEBUG_UNALIGNED_LOAD return Packet2cf(ploadu<Packet4f>((const float*)from)); }
146c146
< EIGEN_STRONG_INLINE Packet2cf pcplxflip/*<Packet2cf>*/(const Packet2cf& x)
---
> EIGEN_STRONG_INLINE Packet2cf pcplxflip/*<Packet2cf>*/(const Packet2cf& a)


> diff PacketMath.h PacketMath.h.FIX

Code: Select all
179,180c179,180
< template<> EIGEN_STRONG_INLINE Packet4f pload<float>(const float* from) { EIGEN_DEBUG_ALIGNED_LOAD return vld1q_f32(from); }
< template<> EIGEN_STRONG_INLINE Packet4i pload<int>(const int*     from) { EIGEN_DEBUG_ALIGNED_LOAD return vld1q_s32(from); }
---
> template<> EIGEN_STRONG_INLINE Packet4f pload<Packet4f>(const float* from) { EIGEN_DEBUG_ALIGNED_LOAD return vld1q_f32(from); }
> template<> EIGEN_STRONG_INLINE Packet4i pload<Packet4i>(const int*     from) { EIGEN_DEBUG_ALIGNED_LOAD return vld1q_s32(from); }
187c187
<   float32x2_t lo, ho;
---
>   float32x2_t lo, hi;
192c192
< template<> EIGEN_STRONG_INLINE Packet4i ploaddup<Packet4i>(const float* from)
---
> template<> EIGEN_STRONG_INLINE Packet4i ploaddup<Packet4i>(const int* from)
194c194
<   int32x2_t lo, ho;
---
>   int32x2_t lo, hi;



I am not sure how well the ARM NEON related code has been put under scrutiny, but the several typos in variable name declarations should be easily caught in some unit tests.
And while I also was able to remove all template-related syntax errors, I am not 100% certain that my changes are semantically correct, i.e. I did not attempt to understand the internals of the code but merely followed the template logic.
Can one of the maintainers please have a look at the original code vs. my suggested patch and comment?

The remaining problem is that my program (which uses Eigen for matrix addition, multiplication, inversion and SVD) now compiles without problems using vectorization support, but I only get bogus results. A non-vectorized build executes correctly.
What are the best ways to find out where exactly the problems occur w.r.t. ARM NEON code? Are there any helper macros or functions for debugging this?

Thanks for any help!
Cheers,

Michael
User avatar
bjacob
Registered Member
Posts
658
Karma
3

Re: ARM NEON support buggy?

Thu Dec 09, 2010 2:41 pm
I am trying to enable ARM NEON vectorization support for development on an Android phone using the Android NDK and the current Eigen trunk (or 3.0-beta2). It seems to me that ARM NEON support is still quite rough around the edges, i.e. not really functional yet.


Can you please run the test suite on NEON,
http://eigen.tuxfamily.org/index.php?title=Tests
you may have to do cmake -DEIGEN_TEST_NEON if it's not enabled by default.

Please report any failure on bugzilla.

It used to be working, perhaps there have been regression as it's not tested as often as x86 is.

Code: Select all
    119,120c119,120
    < template<> EIGEN_STRONG_INLINE Packet2cf pload <std::complex<float> >(const std::complex<float>* from) { EIGEN_DEBUG_ALIGNED_LOAD return Packet2cf(pload((const float*)from)); }
    < template<> EIGEN_STRONG_INLINE Packet2cf ploadu<std::complex<float> >(const std::complex<float>* from) { EIGEN_DEBUG_UNALIGNED_LOAD return Packet2cf(ploadu((const float*)from)); }
    ---
    > template<> EIGEN_STRONG_INLINE Packet2cf pload <Packet2cf>(const std::complex<float>* from) { EIGEN_DEBUG_ALIGNED_LOAD return Packet2cf(pload<Packet4f>((const float*)from)); }
    > template<> EIGEN_STRONG_INLINE Packet2cf ploadu<Packet2cf>(const std::complex<float>* from) { EIGEN_DEBUG_UNALIGNED_LOAD return Packet2cf(ploadu<Packet4f>((const float*)from)); }
    146c146



Indeed, this pload<> wants the packet type here, so your patch is needed. Please file a bug, attach your patch (please generate with diff -u or hg diff) and send mail to Konstantinos (find his address in Core/arch/NEON/PacketMath.h)

We must fix this.


Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list!
kmhofmann
Registered Member
Posts
6
Karma
0

Re: ARM NEON support buggy?

Thu Dec 09, 2010 4:31 pm
bjacob wrote:Can you please run the test suite on NEON,
http://eigen.tuxfamily.org/index.php?title=Tests
you may have to do cmake -DEIGEN_TEST_NEON if it's not enabled by default.

This is not going to be that trivial, as the only NEON device I have access to is an Android phone, for which I can only generate native shared libraries through the Google-supplied NDK build system, which are then linked to an executable to be run under the Dalvik virtual machine.
No idea if I can somehow use cmake to build or run the tests or even how to easily remap stdout/stderr. Will still give it a try either this or next week.

bjacob wrote:Indeed, this pload<> wants the packet type here, so your patch is needed. Please file a bug, attach your patch (please generate with diff -u or hg diff) and send mail to Konstantinos (find his address in Core/arch/NEON/PacketMath.h)

Ok, will do asap. This stuff really needs to be fixed.

Cheers,
Michael
kmhofmann
Registered Member
Posts
6
Karma
0

Re: ARM NEON support buggy?

Tue Dec 14, 2010 9:42 pm
As a quick update, I haven't run the test suite yet (not completely sure how, plus lack of time).
That said, Konstantinos's fix following my bug report (http://eigen.tuxfamily.org/bz/show_bug.cgi?id=129) removed my issues in enabling ARM NEON vectorization support in my program. Everything compiles and works for me now.

Cheers,
Michael


Bookmarks



Who is online

Registered users: bartoloni, Bing [Bot], Evergrowing, Google [Bot], ourcraft