This forum has been archived. All content is frozen. Please use KDE Discuss instead.

Is there any Library (Eigen) to calculate PCA?

nihad
Registered Member
Posts
1
Karma
0
I have a text file containing 907 objects, each with a 1000-dimensional feature vector. Is there any library (such as Eigen) for computing Principal Component Analysis (PCA)?
ggael
Moderator
Posts
3447
Karma
19
PCA is essentially an eigenvalue problem, so assuming your data have been loaded into the matrix mat, you can compute the centered covariance matrix like this:

MatrixXd centered = mat.rowwise() - mat.colwise().mean();
MatrixXd cov = centered.adjoint() * centered;

and then perform the eigendecomposition:

SelfAdjointEigenSolver<MatrixXd> eig(cov);

From eig, you have access to the eigenvalues sorted in increasing order (eig.eigenvalues()) and the corresponding eigenvectors (eig.eigenvectors()). For instance, eig.eigenvectors().rightCols(N) gives you the best N-dimensional basis.
zhanxw
Registered Member
Posts
17
Karma
0
Is there a method to calculate only the largest K eigenvalues and their corresponding eigenvectors?
The small eigenvalues and their eigenvectors are not useful in my application.
ggael
Moderator
Posts
3447
Karma
19
If your matrices are large and you need only a few eigenvalues/eigenvectors, you can use the ARPACK wrapper in unsupported/.
gregborenstein
Registered Member
Posts
1
Karma
0
I think you mean to apply the eigenvalue solver to the covariance matrix, not the original matrix:

SelfAdjointEigenSolver<MatrixXd> eig(cov);
ggael
Moderator
Posts
3447
Karma
19
Corrected.
inspirit
Registered Member
Posts
15
Karma
0
Hello,

I'm having the same problem figuring out the correct way to compute PCA.
But instead of using an eigendecomposition, I want to use the SVD for PCA.

The input is an MxN matrix X where
M -- number of observations
N -- dimension of each observation

According to the definition, after mean-centering the data:
Code: Select all
Eigen::MatrixXd aligned = X.rowwise() - X.colwise().mean();

// we can directly take the SVD
Eigen::JacobiSVD<Eigen::MatrixXd> svd(aligned, Eigen::ComputeThinV);

// and here is the question: what is the basis matrix and how can I reduce it?
// in my understanding it should be:
Eigen::MatrixXd W = svd.matrixV().leftCols(num_components);

// then to project the data:
Eigen::MatrixXd projected = aligned * W; // or should we take the transpose() of W?
// but in our case it is a matrix where each row is a sample
// what if I have just a single feature vector, which will be a column vector Eigen::VectorXd? how do we apply the projection?


Can anyone please explain the correct way of doing this?
ggael
Moderator
Posts
3447
Karma
19
This looks correct to me. For a single feature f do: f_projected = W.transpose() * f
inspirit
Registered Member
Posts
15
Karma
0
Thanks for the answer.
To sum up:
Code: Select all
// this is our basis
Eigen::MatrixXd W = svd.matrixV().leftCols(num_components);

// to project the data that was used to build the basis (rows -- samples; cols -- variables)
Eigen::MatrixXd projected = aligned_data * W;

// to project a single feature vector
Eigen::VectorXd feature; // it should be centred first, of course
Eigen::VectorXd feature_projected = W.transpose() * feature;

// for a row vector, do we need to transpose it before doing the above?
Eigen::RowVectorXd row_feature;
Eigen::VectorXd row_feature_projected = W.transpose() * row_feature.transpose();
// or can we do it as we did with the main data above?
// (renamed so that both variants can be declared in the same scope)
Eigen::VectorXd row_feature_projected2 = row_feature * W;


Please correct me if I'm wrong about the above.
Thanks for your time.
ggael
Moderator
Posts
3447
Karma
19
Recall that A^T * B^T = (B*A)^T, so the last two lines are the same. To be precise, one returns a column vector while the other returns a row vector, but operator= automatically transposes row vectors to column vectors (and the other way round).
inspirit
Registered Member
Posts
15
Karma
0
yup thats clear now!
thanks!
houssem
Registered Member
Posts
1
Karma
0
ggael wrote:
PCA is essentially an eigenvalue problem, so assuming your data have been loaded into the matrix mat, you can compute the centered covariance matrix like this:

MatrixXd centered = mat.rowwise() - mat.colwise().mean();
MatrixXd cov = centered.adjoint() * centered;

and then perform the eigendecomposition:

SelfAdjointEigenSolver<MatrixXd> eig(cov);

From eig, you have access to the eigenvalues sorted in increasing order (eig.eigenvalues()) and the corresponding eigenvectors (eig.eigenvectors()). For instance, eig.eigenvectors().rightCols(N) gives you the best N-dimensional basis.


Hi, I would like to know why you use the adjoint instead of the inverse?