Registered Member
|
Hi,
I just started using Eigen recently and am really impressed by it, thanks a lot for this great piece of software! I use Eigen for a machine learning project, in which I need the log-sum-exp function quite often. Before coming to the question, please let me give a short introduction: In Bayesian statistics, one often has to deal with very low probabilities. In order to avoid numerical underflow, the usual trick is to perform all calculations in the log domain. Let's say we've just calculated some discretized probability density and want to normalize it in order to make it sum up to 1:
This is one of the situations where the logsumexp function appears. Unfortunately, because of the small coefficients in logp, the exp() of which would be 0, the calculation can not be performed directly, one rather uses the following trick:
Using this algorithm, the smallest coefficient in the sum will be exp(0)=1 and we get meaningful results. So far so good. Now for the question: what if we are interested in logsumexp(x+y), where x and y are huge vectors. Could Eigen's cost model determine which one of the following to possibilities is better:
Maybe it would even make sense to take a chunk of the data that is slightly smaller than the cache, find and remember m1, perform s1=sum(exp()), do the next chunk and if the new m2 is greater, correct s1, and so on. Just thinking.... |
Registered Member
|
No, Eigen won't do this transformation for you. Actually, all reductions (like taking the sum or the max of all coeffs of a vector) are done immediately, they are not lazy.
On the plus side, Eigen makes it easier for you to do precisely what you want, for example if you want to do this cache-friendly optimization you have in mind. Indeed, once you have coded your logsumexp() function, making it work only on a segment of a vector is just a matter of doing .segment(start, size). You could even write a logsumexp() function that takes an Eigen expression, instead of just a plain vector, and pass .segment(start,size) directly to it. and finally, if you take a small enough size, you could even make it a compile-time constant and use .segment(start), and Eigen would automatically unroll the loop for you.
Join us on Eigen's IRC channel: #eigen on irc.freenode.net
Have a serious interest in Eigen? Then join the mailing list! |
Registered users: Bing [Bot], Evergrowing, Google [Bot], rblackwell