
Least Squares with Singularities

BigDaddyDrew (Registered Member)


Fri Jun 01, 2012 7:55 pm
Hi all,

I use the following method from the docs to do linear regression:
Code:
X.jacobiSvd( ComputeThinU | ComputeThinV ).solve( y )


It works great, except when my pesky X matrix has singularities (linearly dependent columns). Is there a way to get a least-squares solution that handles this gracefully? When I run the same regression in R, for example, it simply squashes one of the variables causing the singularity.

Thanks in advance!
ggael (Moderator)
It's already supposed to work that way. What do you get in such a case? What's the value of (X*res - y).borm()/y.norm(), where res = X.jacobiSvd(ComputeThinU | ComputeThinV).solve(y)?
BigDaddyDrew
Thanks for the reply Gael.

I'm expecting coefficients on the order of 0.0001. I have two columns that are actually identical. R squishes one of those columns and reports a NaN coefficient, with sensible values (on the correct order) for the other coefficients. Eigen gives me a coefficient of +6.3e16 for one offending column and -6.3e16 for the other. The non-offending variables are also affected: their coefficients come out on the order of 0.1.

I ran the following code (assuming you meant 'norm' and not 'borm' :) ) and got a value of +58.3652. Not sure how to interpret this, sorry. R gives me a value of 0 for this expression, because the numerator portion is 0.

Code:
(X*res-y).norm()/y.norm()
ggael (Moderator)
This might be an overflow issue. Are you using float or double?
BigDaddyDrew
doubles

More random information:

My X matrix is full of 'observation'-type variables (all values 0 or 1), so it's easy on occasion to end up with equal columns.

In this example I have 30 variables, and 330 observations.

Excel even gives me sensible results (surprise!). It gives me 0.0 for one of the two offending columns.
ggael (Moderator)
Hm, ok it seems I can reproduce the issue. In the meantime you can use QR:

res = X.householderQr().solve(y);

or

res = X.colPivHouseholderQr().solve(y);

or even full piv LU:

res = X.fullPivLu().solve(y);
BigDaddyDrew


Sat Jun 02, 2012 10:32 pm
Thanks for the suggestions Gael,

Of those, here are my observations for my particular regression:

HouseholderQr matches R, except that it still returns nonsense for the offending variables (coefficients of +5.46e12 and -5.46e12).

ColPivHouseholderQr matches R exactly (well, ColPiv chose a different variable to squash, and it squashed it with a 0 rather than NaN/NA).

FullPivLU was way off: my coefficients only correlated 29% with those from R.

I think I'll go with HouseholderQr, since I already have logic to drop bizarre coefficients, so I'll interpret those as NaN anyway (whereas ColPivHouseholderQr returns a 0 for an offending column, and 0 is an acceptable value in my domain, so I can't detect it as 'bad').
dim_tz (Registered Member)
BigDaddyDrew wrote:
doubles […]

Do these methods directly give a least-squares solution, though?

 