
Eigen+Xcode+AltiVec ??

mpmcl
Registered Member
Posts
19
Karma
0

Eigen+Xcode+AltiVec ??

Tue Jan 19, 2010 8:11 pm
I am stuck with a G5 Mac at work and would *love* to find an Xcode+Eigen project somewhere that has AltiVec enabled successfully. So far, I have not been able to get everything right, although without AltiVec I have no problems; in that case it is straightforward Objective-C++ (Cocoa).

Does anyone know of any sample code online that I could peruse?

Thanks in advance.

N.B. I am using Eigen v2.0.11 if that matters.
bjacob
Registered Member
Posts
658
Karma
3

Re: Eigen+Xcode+AltiVec ??

Tue Jan 19, 2010 11:22 pm
What CPU and compiler (version) are you using?

What specific problems do you have, e.g. compilation errors? Crashes?


Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list!
mpmcl
Registered Member
Posts
19
Karma
0

Re: Eigen+Xcode+AltiVec ??

Wed Jan 20, 2010 1:10 pm
I am using GCC 4.2.

All I was looking for was an Xcode example so that I could see the various build settings in that context. I am not getting crashes. Also, there was no mention of the Accelerate framework which I found confusing.
bjacob
Registered Member
Posts
658
Karma
3

Re: Eigen+Xcode+AltiVec ??

Wed Jan 20, 2010 3:18 pm
GCC 4.2 sounds good; it's the minimum GCC version for vectorization.

Can you run this test program?

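#include <iostream>
#include <Eigen/Core>

int main()
{
  // __ALTIVEC__ is defined by the compiler when AltiVec code generation is on.
#ifdef __ALTIVEC__
  std::cout << "altivec detected" << std::endl;
#else
  std::cout << "altivec not detected" << std::endl;
#endif
  // EIGEN_VECTORIZE_ALTIVEC is defined by Eigen when it selects its AltiVec code path.
#ifdef EIGEN_VECTORIZE_ALTIVEC
  std::cout << "altivec enabled" << std::endl;
#else
  std::cout << "altivec disabled" << std::endl;
#endif
}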


If you get "not detected", can you pass the -maltivec option to GCC and retry? There should be an option in Xcode for doing that.


mpmcl
Registered Member
Posts
19
Karma
0

Re: Eigen+Xcode+AltiVec ??

Wed Jan 20, 2010 6:50 pm
I tried two Xcode projects: a C++ tool and a Cocoa document app. In both cases, I used GCC 4.2 (not the default) and built Universal binaries targeting Leopard with the following headers (since I never actually *installed* Eigen).

#include <iostream>
#include "../eigen-2.0.11/Eigen/Core"

I got

altivec detected
altivec enabled

both times without any additional compiler flags needed. I guess what I'm missing is some example specifying all the alignment stuff with a simple matrix computation. Then I can follow that pattern. [Sorry to be such an Eigen newbie.]
bjacob
Registered Member
Posts
658
Karma
3

Re: Eigen+Xcode+AltiVec ??

Wed Jan 20, 2010 6:54 pm
Eigen takes care of all the alignment stuff for you.

It looks as if AltiVec is properly enabled for you right now. To check the generated assembly yourself, see this:
http://eigen.tuxfamily.org/index.php?ti ... #Using_GCC
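
To make "the alignment stuff" a little more concrete: aligned AltiVec loads and stores work on 16-byte boundaries, so a fixed-size vector of four floats has to start on one. The sketch below shows roughly what that requirement looks like when done by hand. It is only an illustration: AlignedFloat4 and is_16_byte_aligned are hypothetical names, not Eigen types, and alignas is C++11, which is newer than the GCC 4.2 discussed in this thread.

```cpp
#include <cstdint>

// Hypothetical stand-in for a fixed-size vector of four floats.
// This is NOT Eigen's type; it only illustrates the 16-byte requirement.
struct alignas(16) AlignedFloat4 {
    float data[4];
};

// Returns true when p sits on a 16-byte boundary, which is what
// aligned 128-bit vector loads and stores expect.
bool is_16_byte_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

With a declaration such as "AlignedFloat4 a;" on the stack, is_16_byte_aligned(&a) should hold. Eigen's fixed-size vectorizable types are meant to give you this guarantee without any such annotation in your own code.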


mpmcl
Registered Member
Posts
19
Karma
0

Re: Eigen+Xcode+AltiVec ??

Wed Jan 20, 2010 8:53 pm
Since I got the aforementioned "AltiVec enabled" message, I wrote a function that should use AltiVec. It has one line, viz.,

u = v + 3*w;

where all three variables are Vector4d.

The assembly code does not show any evidence of AltiVec being used. If I enable AltiVec extensions in Xcode, the assembly does not change. If I try NOT to vectorize by defining EIGEN_DONT_VECTORIZE, the only things that change in the disassembly are some offsets, which suggests that only alignment is being affected.

Xcode does not have a -maltivec option (it is ignored) but does have a -faltivec option. However, it has no effect in this case.

In addition, the debugger window showing the vector registers does not change when the line above executes.

Here is the disassembly of my test function. It looks like just PPC assembly to me.

0x00001d18 <+0000> nop
0x00001d1c <+0004> nop
0x00001d20 <+0008> nop
0x00001d24 <+0012> nop
0x00001d28 <+0016> nop
0x00001d2c <+0020> mflr r0
0x00001d30 <+0024> stmw r29,-12(r1)
0x00001d34 <+0028> stw r0,8(r1)
0x00001d38 <+0032> stwu r1,-112(r1)
0x00001d3c <+0036> mr r30,r1
0x00001d40 <+0040> bcl- 20,4*cr7+so,0x1d44 <_Z3fooRN5Eigen6MatrixIdLi4ELi1ELi2ELi4ELi1EEES2_S2_+44>
0x00001d44 <+0044> mflr r31
0x00001d48 <+0048> stw r3,136(r30)
0x00001d4c <+0052> stw r4,140(r30)
0x00001d50 <+0056> stw r5,144(r30)
0x00001d54 <+0060> lwz r29,140(r30)
0x00001d58 <+0064> addis r2,r31,0
0x00001d5c <+0068> addi r2,r2,13540
0x00001d60 <+0072> lfd f0,0(r2)
0x00001d64 <+0076> stfd f0,56(r30)
0x00001d68 <+0080> lwz r2,144(r30)
0x00001d6c <+0084> addi r0,r30,64
0x00001d70 <+0088> mr r3,r0
0x00001d74 <+0092> addi r0,r30,56
0x00001d78 <+0096> mr r4,r0
0x00001d7c <+0100> mr r5,r2
0x00001d80 <+0104> bl 0x49bc <dyld_stub__ZN5EigenmlERKdRKNS_10MatrixBaseINS_6MatrixIdLi4ELi1ELi2ELi4ELi1EEEEE>
0x00001d84 <+0108> addi r2,r30,64
0x00001d88 <+0112> addi r0,r30,76
0x00001d8c <+0116> mr r3,r0
0x00001d90 <+0120> mr r4,r29
0x00001d94 <+0124> mr r5,r2
0x00001d98 <+0128> bl 0x39f4 <_ZNK5Eigen10MatrixBaseINS_6MatrixIdLi4ELi1ELi2ELi4ELi1EEEEplINS_12CwiseUnaryOpINS_21ei_scalar_multiple_opIdEES2_EEEEKNS_13CwiseBinaryOpINS_16ei_scalar_sum_opIdEES2_T_EERKNS0_ISC_EE>
0x00001d9c <+0132> addi r0,r30,76
0x00001da0 <+0136> lwz r3,136(r30)
0x00001da4 <+0140> mr r4,r0
0x00001da8 <+0144> bl 0x3ea0 <_ZN5Eigen6MatrixIdLi4ELi1ELi2ELi4ELi1EEaSINS_13CwiseBinaryOpINS_16ei_scalar_sum_opIdEES1_NS_12CwiseUnaryOpINS_21ei_scalar_multiple_opIdEES1_EEEEEERS1_RKNS_10MatrixBaseIT_EE>
0x00001dac <+0148> lwz r0,136(r30)
0x00001db0 <+0152> addis r2,r31,0
0x00001db4 <+0156> lwz r3,17956(r2)
0x00001db8 <+0160> mr r4,r0
0x00001dbc <+0164> bl 0x49ac <dyld_stub__ZN5EigenlsINS_6MatrixIdLi4ELi1ELi2ELi4ELi1EEEEERSoS3_RKNS_10MatrixBaseIT_EE>
0x00001dc0 <+0168> lwz r1,0(r1)
0x00001dc4 <+0172> lwz r0,8(r1)
0x00001dc8 <+0176> mtlr r0
0x00001dcc <+0180> lmw r29,-12(r1)
0x00001dd0 <+0184> blr
bjacob
Registered Member
Posts
658
Karma
3

Re: Eigen+Xcode+AltiVec ??

Wed Jan 20, 2010 9:06 pm
OK, just 2 things:
EIGEN_DONT_VECTORIZE doesn't affect alignment at all; we align in both cases. EIGEN_DONT_ALIGN would disable alignment.

Can you enable optimization (-O2)? This will make the assembly much shorter and easier to read.


mpmcl
Registered Member
Posts
19
Karma
0

Re: Eigen+Xcode+AltiVec ??

Wed Jan 20, 2010 9:38 pm
Here is the function at source code level

void foo(Vector4d &u, Vector4d &v, Vector4d &w)
{
  u = v + 3*w;
  cout << u;
}

Here is the disassembly with -O2 (there is also -O3). I see only regular registers and floating-point registers.

0x000025b0 <+0000> mflr r0
0x000025b4 <+0004> lfd f0,0(r4)
0x000025b8 <+0008> lfd f13,0(r5)
0x000025bc <+0012> bcl- 20,4*cr7+so,0x25c0 <_Z3fooRN5Eigen6MatrixIdLi4ELi1ELi2ELi4ELi1EEES2_S2_+16>
0x000025c0 <+0016> mr r9,r4
0x000025c4 <+0020> mr r4,r3
0x000025c8 <+0024> mflr r10
0x000025cc <+0028> nop
0x000025d0 <+0032> nop
0x000025d4 <+0036> mtlr r0
0x000025d8 <+0040> nop
0x000025dc <+0044> nop
0x000025e0 <+0048> nop
0x000025e4 <+0052> addis r2,r10,0
0x000025e8 <+0056> lfs f12,6116(r2)
0x000025ec <+0060> mr r2,r3
0x000025f0 <+0064> addis r3,r10,0
0x000025f4 <+0068> lwz r3,7624(r3)
0x000025f8 <+0072> fmadd f13,f13,f12,f0
0x000025fc <+0076> stfd f13,0(r2)
0x00002600 <+0080> lfd f13,8(r9)
0x00002604 <+0084> lfd f0,8(r5)
0x00002608 <+0088> fmadd f0,f0,f12,f13
0x0000260c <+0092> stfd f0,8(r2)
0x00002610 <+0096> lfd f0,16(r9)
0x00002614 <+0100> lfd f13,16(r5)
0x00002618 <+0104> fmadd f13,f13,f12,f0
0x0000261c <+0108> stfd f13,16(r2)
0x00002620 <+0112> lfd f0,24(r5)
0x00002624 <+0116> lfd f13,24(r9)
0x00002628 <+0120> fmadd f0,f0,f12,f13
0x0000262c <+0124> stfd f0,24(r2)
0x00002630 <+0128> b 0x376c <dyld_stub__ZN5EigenlsINS_6MatrixIdLi4ELi1ELi2ELi4ELi1EEEEERSoS3_RKNS_10MatrixBaseIT_EE>
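
For readers less used to PowerPC assembly, the -O2 listing above appears to boil down to four scalar fused multiply-adds (fmadd), one per element, using only the floating-point registers f0, f12 and f13. A minimal sketch of roughly equivalent C++ follows. Double4 and scaled_sum_scalar are hypothetical names used only for illustration and are not part of Eigen.

```cpp
#include <array>
#include <cmath>    // std::fma
#include <cstddef>  // std::size_t

// Hypothetical stand-in for Vector4d: four doubles and nothing more.
using Double4 = std::array<double, 4>;

// Scalar equivalent of the -O2 listing: one fused multiply-add per
// element, u[i] = w[i] * 3 + v[i], with no vector registers involved.
Double4 scaled_sum_scalar(const Double4& v, const Double4& w) {
    Double4 u = {{0.0, 0.0, 0.0, 0.0}};
    for (std::size_t i = 0; i < u.size(); ++i) {
        u[i] = std::fma(w[i], 3.0, v[i]);
    }
    return u;
}
```

The four lfd/lfd/fmadd/stfd groups in the listing correspond to the four iterations of this loop, fully unrolled.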
bjacob
Registered Member
Posts
658
Karma
3

Re: Eigen+Xcode+AltiVec ??

Wed Jan 20, 2010 10:07 pm
Indeed, this code is not vectorized :/

I'm not an AltiVec expert, but this is very strange. Can you try just "u=v+w" and see if that works?


mpmcl
Registered Member
Posts
19
Karma
0

Re: Eigen+Xcode+AltiVec ??

Thu Jan 21, 2010 1:33 pm
Deleting the matrix output just deletes the last line of the disassembly above and changes the penultimate line from b to blr. Also, as noted, manually enabling -faltivec, in the Xcode build options, has no effect.

I do not know how Eigen interacts with the extended GCC 4.2 that ships with Macs these days. However, I dumped the following from the Apple-supplied man page for gcc.

*****
-faltivec
This flag is provided for compatibility with Metrowerks CodeWarrior
and MrC compilers as well as previous Apple versions of GCC. It
causes the -mpim-altivec option to be turned on.

-maltivec
-mno-altivec
Generate code that uses (does not use) AltiVec instructions, and
also enable the use of built-in functions that allow more direct
access to the AltiVec instruction set. You may also need to set
-mabi=altivec to adjust the current ABI with AltiVec ABI
enhancements.

-mpim-altivec
-mno-pim-altivec
Enable (or disable) built-in compiler support for the syntactic
extensions as well as operations and predicates defined in the
Motorola AltiVec Technology Programming Interface Manual (PIM).
This includes the recognition of "vector" and "pixel" as (context-
dependent) keywords, the definition of built-in functions such as
"vec_add", and the use of parenthesized comma expression as AltiVec
literals. Note that unlike the option -maltivec, the extension
does not require the inclusion of any special header files; if
"<altivec.h>" is included, a warning will be issued and the
contents of the header will be ignored. The preprocessor shall
provide an "__APPLE_ALTIVEC__" manifest constant when -mpim-altivec
is specified. (APPLE ONLY)

In addition, the -mpim-altivec option disables the inlining of
functions containing AltiVec instructions into functions that do
not make use of the vector unit. Certain other optimizations, such
as inline vectorization of "memset" and "memcpy" calls, are also
disabled. These adjustments make it possible to compile programs
whose use of AltiVec instructions is preceded by a run-time check
for the presence of AltiVec functionality, and that can therefore
be made to run on G3 processors. Note that all of these
optimizations may be re-enabled by supplying the -maltivec option,
or an -mcpu option specifying a processor that supports AltiVec
instructions.
*****

I tried turning off -faltivec and using -maltivec instead. The result was the same: no AltiVec code.

In all of this, I started with a clean project template. There is nothing special that I did to it.
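
The man page excerpt above mentions a second preprocessor symbol, __APPLE_ALTIVEC__, alongside the __ALTIVEC__ symbol tested earlier in the thread. A small sketch that reports both may help when comparing -faltivec and -maltivec builds. The function name altivec_macro_report is hypothetical, and the exact combinations a given Apple GCC defines are not verified here.

```cpp
#include <string>

// Reports which AltiVec-related macros the compiler defined for this
// translation unit. __ALTIVEC__ indicates AltiVec code generation;
// __APPLE_ALTIVEC__ is the symbol the Apple man page ties to -mpim-altivec.
std::string altivec_macro_report() {
    std::string report;
#ifdef __ALTIVEC__
    report += "__ALTIVEC__=1 ";
#else
    report += "__ALTIVEC__=0 ";
#endif
#ifdef __APPLE_ALTIVEC__
    report += "__APPLE_ALTIVEC__=1";
#else
    report += "__APPLE_ALTIVEC__=0";
#endif
    return report;
}
```

Printing the returned string once per build configuration shows whether a flag change actually reached the compiler.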
mpmcl
Registered Member
Posts
19
Karma
0

Re: Eigen+Xcode+AltiVec ??

Thu Jan 21, 2010 1:55 pm
Just discovered that, if I change my Vector4d variables to Vector4f, then I *do* get AltiVec to work. Here is the disassembly:

*****
0x00002650 <+0000> mfvrsave r0
0x00002654 <+0004> nop
0x00002658 <+0008> nop
0x0000265c <+0012> nop
0x00002660 <+0016> stw r0,-8(r1)
0x00002664 <+0020> oris r0,r0,49164
0x00002668 <+0024> nop
0x0000266c <+0028> nop
0x00002670 <+0032> mtvrsave r0
0x00002674 <+0036> mflr r0
0x00002678 <+0040> lvx v12,r0,r5
0x0000267c <+0044> lvx v13,r0,r4
0x00002680 <+0048> bcl- 20,4*cr7+so,0x2684 <_Z3fooRN5Eigen6MatrixIfLi4ELi1ELi2ELi4ELi1EEES2_S2_+52>
0x00002684 <+0052> lwz r12,-8(r1)
0x00002688 <+0056> mflr r10
0x0000268c <+0060> mtlr r0
0x00002690 <+0064> addis r2,r10,0
0x00002694 <+0068> lfs f0,2152(r2)
0x00002698 <+0072> addi r2,r1,-48
0x0000269c <+0076> stfs f0,-48(r1)
0x000026a0 <+0080> lvx v0,r0,r2
0x000026a4 <+0084> vspltw v1,v0,0
0x000026a8 <+0088> vspltisw v0,0
0x000026ac <+0092> vmaddfp v0,v12,v1,v0
0x000026b0 <+0096> vaddfp v0,v0,v13
0x000026b4 <+0100> stvx v0,r0,r3
0x000026b8 <+0104> mtvrsave r12
0x000026bc <+0108> blr
*****

I see the vector registers changing as well. Sadly, I really need doubles for my app.
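
Read lane by lane, the Vector4f listing above seems to do the following: load w and v with lvx, splat the constant 3.0f into all four lanes (vspltw), compute 3*w as a multiply-add with a zero addend (vmaddfp), add v (vaddfp), and store the result (stvx). The portable sketch below models that sequence in plain C++. It does not use real AltiVec intrinsics, and Float4, splat, madd, add and scaled_sum_vector_model are hypothetical names.

```cpp
#include <array>
#include <cstddef>

// Hypothetical model of one 128-bit AltiVec register: four float lanes.
using Float4 = std::array<float, 4>;

// Models vspltw: copy one scalar into all four lanes.
Float4 splat(float x) {
    Float4 r = {{x, x, x, x}};
    return r;
}

// Models vmaddfp: per-lane a * b + c.
Float4 madd(const Float4& a, const Float4& b, const Float4& c) {
    Float4 r = splat(0.0f);
    for (std::size_t i = 0; i < 4; ++i) r[i] = a[i] * b[i] + c[i];
    return r;
}

// Models vaddfp: per-lane a + b.
Float4 add(const Float4& a, const Float4& b) {
    Float4 r = splat(0.0f);
    for (std::size_t i = 0; i < 4; ++i) r[i] = a[i] + b[i];
    return r;
}

// Same order of operations as the listing: 3*w via a multiply-add
// with a zero addend, then + v.
Float4 scaled_sum_vector_model(const Float4& v, const Float4& w) {
    const Float4 three = splat(3.0f);
    const Float4 zero = splat(0.0f);
    return add(madd(w, three, zero), v);
}
```

On the real hardware each of these helpers is a single instruction that handles all four lanes at once, which is why the vectorized function is so short.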
bjacob
Registered Member
Posts
658
Karma
3

Re: Eigen+Xcode+AltiVec ??

Thu Jan 21, 2010 2:26 pm
Oh, right!

AltiVec supports only single-precision floats, not doubles, if I remember correctly.

At the very least I can tell you that _our_ AltiVec support is only for floats, as you can see in the file Eigen/src/Core/arch/AltiVec/PacketMath.h
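
A rough way to see the size side of this: an AltiVec register is 128 bits wide, so it holds four floats, and a Vector4d (four doubles, 32 bytes) would not fit in one register even if double arithmetic were available. The sketch below just does that arithmetic. The names are hypothetical, and it assumes the usual 4-byte float and 8-byte double.

```cpp
#include <cstddef>

// An AltiVec register is 128 bits (16 bytes) wide.
constexpr std::size_t kAltiVecRegisterBytes = 16;

// Number of values of type Scalar that fit in one 128-bit register.
// This is only a size calculation: for double it yields 2, but classic
// AltiVec still has no double-precision arithmetic to apply to them.
template <typename Scalar>
constexpr std::size_t lanes_per_register() {
    return kAltiVecRegisterBytes / sizeof(Scalar);
}
```

Under those assumptions lanes_per_register<float>() is 4, which matches the single lvx per Vector4f operand in the float listing earlier in the thread.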


phr
Registered Member
Posts
7
Karma
1

Re: Eigen+Xcode+AltiVec ??

Thu Jan 21, 2010 2:39 pm
Eigen doesn't use the Accelerate Framework (as far as I can see). Wouldn't this be a problem when targeting a universal binary?
mpmcl
Registered Member
Posts
19
Karma
0

Re: Eigen+Xcode+AltiVec ??

Thu Jan 21, 2010 2:50 pm
FWIW, the Apple docs state that their vectorized BLAS library, optimized using ATLAS, *does* fully implement BLAS so (presumably) it can handle doubles and complex numbers as well. I have not tried to confirm this since I do not know how to penetrate their dylib in the Xcode debugger (gdb).

This could be something to keep in mind when you are doing your comparison tests for Eigen vs. other libraries.

