Registered Member
|
Hi,
I have a code where most of the computational time is spent on integrals like: I(x)= int_0^1 f(u,x) du where u is a double and x and f(u,x) are typically a Matrix<double,dim,1> and a Matrix<double,dim,dim>, respectively. I'm computing these integrals by quadrature so what I really do is: I(x)= f(u(0),x) *w(0) + f(u(1),x) *w(1) + ... + f(u(N),x) *w(N) where u(0).. u(N) and w(0)...w(N) are quadrature positions and weights (eg. Gauss-Legendre), known at compile time. Right now I have a simple for loop that adds the partial sums but I was thinking that maybe I can recursively assemble the final expression for I(x) at compile time and then exploit Eigen internal expression templates mechanism to evaluate the expression for some x. Is this something that makes sense and can be done? I'm not very familiar with expression templates and Eigen internals. The following is a totally disastrous attempt that I tried (forget quadrature for now, I just do sums). Works but is much slower than the for loop. Any idea is appreciated. Thanks for you help!
|
Moderator
|
hm, if the dimension in which you work is small, as is our example(R^3), then there is no point is using expression templates. Such small objects are like scalars and very optimized by the compiler.
Also in your example you are not at all building an expression. What you did is "simply" explicitly unrolling the loop, but with N=350 you are killing the instruction cache. What you could do is to partially unroll the loop, i.e., nest your meta unroller inside a true for loop: for(inti=0;i+8<N;i+=8) y+= Integrator<8,foo>::eval(x); // another for loop to process the remaining evaluations... This only makes sense if foo1(u,x) is cheap enough. Another option, would be to use large VectorXd of length N, and template foo such that it can take as input vectors (or arrays): static Eigen::Matrix<double,3,Dynamic> eval(const VectorXd & u, const Eigen::Matrix<double,3,Dynamic> & x){ return x*u.asDiagonal() + x + (x.transpose() * x) * x + ...; } this way, the evaluation will be fully vectorized by Eigen. You might also consider to return the value using a non const reference argument in order to avoid the cost of the return by value. |
Registered Member
|
Hi Gael, thanks for your explanation.
my dimension is small (dim=3) and my quadrature order is typically in the range 8-32. But I can't judge if my foo is cheap enough, maybe you can help me. My actual integral is some flavor of Green's function:
Here the PreviouslyStored matrixes are quantities that depend on the quadrature position k. So what I was originally (and erroneously) trying to achieve was to put together the expression for the total sum (integral) so that Eigen could evaluate that efficiently. But, if I understand you correctly, there is no need to do any of this, because the compiler is already optimizing the loop (for small objects) right? Also, I didn't quite get the use of the VectorXd, is that something that applies only to the (oversimplified) code that I wrote in the previous post? Thanks again. |
Registered users: abc72656, Bing [Bot], daret, Google [Bot], Sogou [Bot], Yahoo [Bot]