
trouble with heap-allocated temporaries

xaffeine
Working on a real-time audio-processing application, I find that Eigen sometimes creates temporaries on the heap, which slows things down too much. I know this because the VTune profiler tells me so.

Examples of triggering code include the following. Assume A, B, C, D, and E are all dynamically sized Array<float> objects that always have the same size (12, 1). Inp is a float scalar.

A = (B * A) + (C * D);

A = Inp + (B * C) - (D * E);

I want to eliminate all unnecessary heap allocations. Is there a way to rewrite these lines without creating a non-SIMD loop and without incurring additional memory writes?

Failing that, is there a way to at least get something like set_is_malloc_allowed() to tell me where these things are happening? I tried calling set_is_malloc_allowed( false ), but this doesn't work: it never seems to call check_that_malloc_is_allowed(). I do #define EIGEN_RUNTIME_NO_MALLOC before including anything, and I call Eigen::internal::set_is_malloc_allowed( false ) early in the program.
ggael
Hm, if A already has the right dimensions, and the objects really are Array<> and not Matrix<>, then no heap allocation will occur for the two examples you showed. This agrees with EIGEN_RUNTIME_NO_MALLOC not reporting any error. Your heap allocations probably come from elsewhere.
ggael
Here is a self-contained example:
Code:
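#include <iostream>
// Must be defined before any Eigen header is included.
#define EIGEN_RUNTIME_NO_MALLOC
#include <Eigen/Dense>
using namespace Eigen;

int main()
{
  int N = 12;
  // All allocations happen here, while malloc is still allowed.
  ArrayXf A(N), B(N), C(N), D(N), E(N);
  A.setOnes(); B.setOnes(); C.setOnes(); D.setOnes(); E.setOnes();
  float Inp = 2.2f;
  internal::set_is_malloc_allowed(false);
  // eval() forces a heap-allocated temporary, so this line trips the check:
  //A = (B * A) + (C * D).eval();
  // Coefficient-wise expressions, evaluated directly into A without temporaries:
  A = (B * A) + (C * D);
  A = Inp + (B * C) - (D * E);
  internal::set_is_malloc_allowed(true);
  return 0;
}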

Uncomment the line with eval() to see that the detection does work.
xaffeine
Aha! I had two problems. One was that #define EIGEN_RUNTIME_NO_MALLOC didn't work. I had to predefine it on the command line. I guess we'll blame Microsoft for that.

Second, I had some lurking code. Somebody had defined global operator* and global operator+ templates. Those implementations were invoking copy constructors on some of my arrays!

Thanks for the help.
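
For anyone chasing a similar problem, the second cause can be reproduced without Eigen. The following is only a minimal sketch, not the code from the project discussed above: Buffer, g_copy_count, add_into and the two helper functions are hypothetical names, and the operator is placed in a namespace purely to keep the sketch contained (the lurking templates in this thread were at global scope). It shows how an overly generic operator+ template that takes its operands by value copies both of them on every call, while a version that writes into existing storage copies nothing.

```cpp
#include <cstddef>
#include <vector>

namespace sketch {

// Hypothetical counter: incremented on every Buffer copy construction.
int g_copy_count = 0;

// Hypothetical stand-in for a dynamically sized array type.
struct Buffer {
  std::vector<float> data;
  explicit Buffer(std::size_t n, float v = 0.0f) : data(n, v) {}
  Buffer(const Buffer& other) : data(other.data) { ++g_copy_count; }
  Buffer(Buffer&&) noexcept = default;
};

// Overly generic operator template: both operands are passed by value,
// so calling it with two lvalues copy-constructs (and heap-allocates) twice.
template <typename T>
T operator+(T lhs, T rhs) {
  for (std::size_t i = 0; i < lhs.data.size(); ++i) lhs.data[i] += rhs.data[i];
  return lhs;  // returned by move, not by copy
}

// Alternative that writes into preallocated storage and copies nothing.
void add_into(Buffer& out, const Buffer& a, const Buffer& b) {
  for (std::size_t i = 0; i < out.data.size(); ++i)
    out.data[i] = a.data[i] + b.data[i];
}

// Number of Buffer copies made by one use of the by-value operator.
int copies_from_by_value_operator() {
  Buffer a(12, 1.0f), b(12, 2.0f);
  const int before = g_copy_count;
  Buffer c = a + b;
  (void)c;
  return g_copy_count - before;
}

// Number of Buffer copies made by one in-place addition.
int copies_from_in_place_add() {
  Buffer a(12, 1.0f), b(12, 2.0f), out(12);
  const int before = g_copy_count;
  add_into(out, a, b);
  return g_copy_count - before;
}

}  // namespace sketch
```

Comparing the two helper results gives a quick way to confirm whether a suspicious operator is the one making copies; in a real code base, temporarily adding a counter or a breakpoint in the copy constructor serves the same purpose.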

