gcc 4.2.1 and noalias on Mac OSX

MarkusS
Tue Jan 11, 2011 9:51 pm
Hello,

I have a strange problem with Eigen 3 (tried both beta1 and beta2) on my Mac (OS X 10.6.6 with gcc 4.2.1). Even though I use .noalias(), a simple matrix multiplication triggers three mallocs?!

Here is my test program:
Code:
#include <iostream>
#include <stdlib.h>
#include <Eigen/LU>

int main(int argc, char *argv[])
{
    if (argc != 3) {
        std::cerr << "ERROR: Usage " << argv[0] << " n loops" << std::endl;
        return -1;
    }

    int n = atoi(argv[1]);
    int loops = atoi(argv[2]);
   
    Eigen::MatrixXd A = Eigen::MatrixXd::Random(n, n);
    Eigen::MatrixXd B = Eigen::MatrixXd::Random(n, n);
    Eigen::MatrixXd AB(n, n);

    for (int i=0; i < loops; ++i)
        AB.noalias() = A * B;

    return 0;
}


compiled with:
Code:
g++ -g aliasing.cpp -o aliasing


valgrind reports 4+loops*3 mallocs (i.e. 4 mallocs for loops=0 and 34 mallocs for loops=10). Without .noalias(), valgrind reports 4+loops*4 mallocs, so noalias does something, but apparently not everything?

Under Linux with gcc 4.4.3, valgrind reports only 3 mallocs with noalias, even for loops > 0, as expected.

Switching to a newer gcc is not an (easy) option for us right now.
Any ideas what's going on here?


Thanks,

Markus
ggael
Hi,

This is perfectly normal. The matrix product has to allocate 3 small workspace buffers. On Linux, and for matrices that are not too large, they are allocated on the stack. If you specify the matrix size at compile time, then stack allocation is guaranteed. In the future we could add a special product function that lets the user pass a pre-allocated workspace object.
MarkusS
Ggael,

Thanks a lot for your quick reply. Indeed, for square matrices > 50x50 I also get mallocs on Linux, but only 3+loops*2 (?).

Is there a way to force stack allocation on Macs as well for dynamic matrix products? We would like to use Eigen in a real-time control application and have to avoid mallocs, but our dynamic matrices are < 50x50, so the 'Linux behavior' would be fine for us.

BTW: A special product function with a user provided pre-allocated workspace would be nice.

Thanks again,

Markus
MarkusS
To answer my own question:
OSX also has alloca (at least since 10.3), so after changing line 431 in Core/util/Memory.h from:
Code:
#if (defined __linux__)

to:
Code:
#if (defined __linux__) || (defined __APPLE__)

I get the same behavior on my Mac as on my Linux box. Could that be included in Eigen?
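
For readers unfamiliar with the pattern behind that #if, the following is a rough sketch of "stack for small buffers, heap for large ones" using alloca. It is not Eigen's actual code: the threshold kStackLimitBytes and the function sum_of_squares are hypothetical and only serve to show the two paths.

```cpp
#include <alloca.h>   // alloca: non-standard, but provided on Linux and Mac OS X
#include <cstddef>
#include <cstdlib>

// Hypothetical threshold for this sketch only.
static const std::size_t kStackLimitBytes = 16 * 1024;

// Fills a scratch buffer with 0..n-1 and returns the sum of squares.
// *used_heap reports which allocation path was taken.
double sum_of_squares(std::size_t n, bool* used_heap)
{
    const std::size_t bytes = n * sizeof(double);
    *used_heap = bytes > kStackLimitBytes;

    // alloca has to be called in the function that uses the buffer:
    // the memory is released automatically when this function returns.
    double* work = *used_heap ? static_cast<double*>(std::malloc(bytes))
                              : static_cast<double*>(alloca(bytes));
    if (work == 0)
        return -1.0;  // allocation failed

    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        work[i] = static_cast<double>(i);
        sum += work[i] * work[i];
    }

    if (*used_heap)
        std::free(work);  // only the heap path needs an explicit free
    return sum;
}
```

Because the stack buffer disappears when the function returns, this only suits short-lived workspaces such as the product buffers discussed above; it cannot be wrapped in a helper that hands the pointer back to its caller.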

Sadly, this only fixes the mallocs for matrix multiplications, not those for other temporaries created in methods like inverse, llt, etc.

Has anybody looked into writing a private allocator for Eigen? That looks easier to me than creating versions of all methods that take user-supplied temporaries in order to avoid memory fragmentation and the performance hit of repeated mallocs/frees.

Thanks,

Markus
ggael
For LU or Cholesky solving, you can already preallocate a PartialPivLU or LLT object with the appropriate sizes and reuse it everywhere (you need one per thread). Of course, this only works if the sizes do not change.

