This forum has been archived. All content is frozen. Please use KDE Discuss instead.

Sample from multivariate normal

Tags: None
(comma "," separated)
Swdshchf
Registered Member
Posts
5
Karma
0

Sample from multivariate normal

Fri May 27, 2011 4:02 pm
I'm trying to generate samples from a real-valued, multivariate normal distribution. I have a function for generating a sample from a zero-mean, unit-variance univariate normal distribution `randN()`. So, following the process in Wikipedia, I first find the eigendecomposition of the covariance:
Code: Select all
Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> eigenSolver(covarMat);
(covariance matrices are symmetric; so they're always self-adjoint, I think). I then get
Code: Select all
Eigen::MatrixXd rot = eigenSolver.eigenvectors();
Eigen::VectorXd scl = eigenSolver.eigenvalues();
for {int ii=0;ii<size;++ii) {
  scl(ii,0) = sqrt(scl(ii,0);
}

For each sample, I fill a vector with independent samples and then scale, rotate, and translate to get the properly biased sample:
Code: Select all
Eigen::VectorXd samples;
for (int ii=0;ii<size;++ii) {
  samples(ii,0) = randN()*scl(ii,0);
}
Eigen::VectorXd returnVal = meanVec + rot*samples;

This seems to work. However, I wonder if it couldn't be made a lot more efficient. In particular, according to Wikipedia, if the covariance matrix is positive-definite (most of the time it is) then I should be able to use a Cholesky decomposition. But sometimes there will be dimensions in the data with 0 variance. Is there a fast way to check the covariance matrix and see which decomposition I should use? If it's relevant, `size` can be in the hundreds for my purposes.
User avatar
ggael
Moderator
Posts
3447
Karma
19
OS

Re: Sample from multivariate normal

Fri May 27, 2011 7:46 pm
hm... I think that for this purpose you should really use an eigen decomposition. Indeed, Cholesky does not give you an orthogonal decomposition and then I think the resulting samples won't be normal but radially compressed and dilated unless you pay for some additional efforts...

note that you can write your code as:

VectorXf samples = mean + eigenSolver.eigenvectors() * eigenSolver.eigenvalues().asDiagonal() * VectorXf::NullaryExpr(std::ptr_fun(randN), size)
Swdshchf
Registered Member
Posts
5
Karma
0

Re: Sample from multivariate normal

Fri May 27, 2011 8:39 pm
It's very slick to have pushed it down into a single line like that. I don't think it's quite valid anymore though because you need to take the square-root of the eigenvalues in order for the scaling to be correct.

The non-orthogonality of the Cholesky decomposition shouldn't be a problem. The compressions and dilations should represent the correlations: the same scaling that happens when you multiply by the square-root of the eigenvalues.
User avatar
ggael
Moderator
Posts
3447
Karma
19
OS

Re: Sample from multivariate normal

Sat May 28, 2011 6:57 am
oops indeed I forgot the square root:

VectorXf samples = mean + eigenSolver.eigenvectors() * eigenSolver.eigenvalues().cwiseSqrt().asDiagonal() * VectorXf::NullaryExpr(std::ptr_fun(randN), size)

the attached figure illustrate what I have in mind regarding the non uniformity of the sampling: in this example, with Cholesky the half space x*y>0 is much smaller than the other one (x*y<0) but, unless I'm overseeing something, the same number of samples will be generated for each of these two half spaces...
LLT_vs_eigen.png
Swdshchf
Registered Member
Posts
5
Karma
0
It seems to work acceptably with either method unless the matrix isn't full rank. If I draw several thousand samples from a covariance matrix, using either method and plot the results, both look correct:

Generating covariance:
5.0 .75
.75 1.0

Cholesky method sample covariance:
5.08222 0.770755
0.770755 0.980655

Eigen method sample covariance:
4.86459 0.73129
0.73129 1.01234

If the matrix isn't full rank, things sometimes fall apart under the cholesky method, but the eigen method correctly generates samples along the diagonal x-y axis. So it may still be worthwhile to have a quick method to verify that a matrix is positive-definite.

Generating covariance:
1 1
1 1

Cholesky method sample covariance:
1.01644 1.02044
1.02044 1.99771

Eigen method sample covariance:
0.973271 0.973271
0.973271 0.973271
User avatar
ggael
Moderator
Posts
3447
Karma
19
OS

Re: Sample from multivariate normal

Tue May 31, 2011 7:36 pm
there is the .info() method of LLT that can tell you if something went really wrong (i.e. a diagonal entry is <= 0). If you want a more conservative test you could compare the max and min of the diagonal of the L factor and if their ratio is too large then fallback to eigenvalues...
Swdshchf
Registered Member
Posts
5
Karma
0

Re: Sample from multivariate normal

Tue May 31, 2011 10:01 pm
Great. Thanks a bunch for both the info and the improved ways of structuring the code.


Bookmarks



Who is online

Registered users: Bing [Bot], Google [Bot], q.ignora, watchstar