Support methods to calculate AIC and BIC values of biglasso fit #12
Comments
Because the
|
Many thanks - yes, that will do it, perfect! It could still be nice, though, to formally define AIC and BIC methods. Cheers and thanks again! Also, would you know whether positivity and/or box constraints on the coefficients would be hard to implement in biglasso (see the other issue I posted a while ago)? |
Note that I'm not sure at all of the AIC formula. You can search for the "real" ones in many posts:
None of those works well on my example. It would be easy to make an AIC method for the class biglasso, but we need to agree on some formula (for both linear and logistic regression). Also note that I'm not sure AIC/BIC can be used for this kind of model: https://en.wikipedia.org/wiki/Bayesian_information_criterion#Limitations |
For the gaussian case (together with glmnet) I was using the formulae used here:
For the binomial case the recipe would be similar, but would use the log-likelihood directly. The tricky part, which I am not sure of myself, is how to calculate the effective degrees of freedom for the elastic net in general (ic.glmnet uses the effective degrees of freedom of the LASSO for everything) - I think the correct formula is given here
For ridge regression, an effective AIC and effective degrees of freedom can be calculated using the rms package, see
If biglasso returned a slot $df with a correct estimate of the effective degrees of freedom for LASSO, ridge or elastic net, then the formulae for AIC or BIC would of course be easy enough. The main thing would be to get the effective number of degrees of freedom correct. |
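As a rough illustration of the gaussian recipe discussed above, here is a minimal sketch of one common form of the criteria, computed from the residual sum of squares and dropping additive constants. The function name `ic_gaussian` and its arguments are hypothetical, not part of biglasso, glmnet or ncvreg, and the degrees of freedom are taken as an input precisely because how to estimate them is the open question in this thread.

```r
# Gaussian AIC/BIC from the residual sum of squares (additive constants dropped).
# ic_gaussian is a hypothetical helper, not an existing package function.
# rss: residual sum of squares, one value per lambda
# n:   number of observations
# df:  (effective) degrees of freedom per lambda, supplied by the caller
ic_gaussian <- function(rss, n, df) {
  stopifnot(length(rss) == length(df), all(rss > 0))
  dev <- n * log(rss / n)          # -2 * log-likelihood up to a constant
  list(AIC = dev + 2 * df,
       BIC = dev + log(n) * df)
}

# Made-up numbers, for illustration only
ic <- ic_gaussian(rss = c(50, 40), n = 100, df = c(1, 3))
```

Because the dropped constants are the same for every lambda, they do not affect which lambda minimizes the criterion, only the absolute values.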
The slot And do you know the formula used by the ncvreg package?
|
Ha yes sorry that's what I meant!
and the degrees of freedom are calculated as in glmnet, i.e. as the total number of nonzero coefficients, which in |
For me
To get rid of the downward bias you could probably do adaptive LASSO instead, or just re-estimate the selected covariates using regular least squares... And using BIC with |
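The second suggestion above, re-estimating the selected covariates with ordinary least squares, could be sketched as follows. This uses simulated data and a made-up penalized coefficient vector (`beta_pen`) standing in for the output of a lasso fit; none of the names come from biglasso.

```r
set.seed(1)
n <- 100; p <- 10
X <- matrix(rnorm(n * p), n, p)
y <- drop(2 * X[, 1] - 1.5 * X[, 3] + rnorm(n))

# Placeholder for a penalized estimate: the true signals, shrunk toward zero.
# In practice this vector would come from the fitted lasso path.
beta_pen <- c(1.4, 0, -0.9, rep(0, p - 3))

# Keep only the selected covariates and refit them without any penalty
selected <- which(beta_pen != 0)
refit    <- lm(y ~ X[, selected, drop = FALSE])
beta_ols <- unname(coef(refit)[-1])   # drop the intercept
```

The refitted coefficients are no longer shrunk, but the refit ignores the fact that the covariates were chosen using the same data, so the usual standard errors from `lm` should be read with caution.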
Can you format the code? It is difficult to read your answers right now. |
Done! |
This is already supported by
library(biglasso)
data(colon)
X <- as.big.matrix(colon$X)
fit <- biglasso(X, colon$y)
head(AIC(fit))
#   0.3022   0.2932   0.2844    0.276   0.2677   0.2598
# 88.53887 89.06932 87.65349 86.29061 84.97980 83.72012
head(BIC(fit))
#   0.3022   0.2932   0.2844    0.276   0.2677   0.2598
# 92.79314 95.45072 94.03489 92.67201 91.36120 90.10152
@YaohuiZeng, it seems to me you can close this issue. |
The problem is that the path of
|
Not sure what you mean by "doesn't always have a minimum" (some value is guaranteed to be the smallest, right?). As for it not being smooth, that is certainly true, but I would argue that this is more of a fundamental limitation of AIC/BIC than a problem with the
In my experience, this requires looking at a plot of the AIC/BIC results and using your judgment (see below). I've tried to come up with automated ways of choosing the best AIC/BIC/OtherIC, but it's hard to come up with something foolproof.
data(colon)
X <- as.big.matrix(colon$X)
fit <- biglasso(X, colon$y)
ll <- log(fit$lambda)
plot(ll, BIC(fit), xlim=rev(range(ll)), pch=19, las=1)
Or for something smoother:
ss <- smooth.spline(ll, BIC(fit))
plot(ss, xlim=rev(range(ll)), pch=19, las=1) |
What I meant is that for the examples I tested, AIC kept decreasing (so the minimum was the last value). I'll try to test it again. |
OK, that's kind of what I thought. But that's a fundamental flaw of AIC -- it completely breaks down in p > n situations, and its use as a model selection criterion for penalized regression is not justified. |
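For readers who still want an automated pick, a minimal sketch of the selection step is below. `pick_lambda` is a hypothetical helper, not a biglasso function; it returns the lambda with the smallest criterion value and flags the case raised in this thread, where the minimum sits at the end of the path and should not be trusted without looking at the plot.

```r
# Pick the lambda with the smallest information criterion value.
# pick_lambda is a hypothetical helper, not part of biglasso.
# lambda: the lambda sequence of the fit
# ic:     criterion values (e.g. AIC or BIC), one per lambda
pick_lambda <- function(lambda, ic) {
  stopifnot(length(lambda) == length(ic))
  i <- which.min(ic)
  list(index       = i,
       lambda      = lambda[i],
       # TRUE when the minimum is the first or last point of the path
       at_boundary = i %in% c(1L, length(ic)))
}

# Made-up criterion values, for illustration only
interior <- pick_lambda(c(1, 0.5, 0.25, 0.125), c(10, 8, 9, 11))
monotone <- pick_lambda(c(1, 0.5, 0.25, 0.125), c(10, 9, 8, 7))
```

With the `fit` object from the examples in this thread, the call would be `pick_lambda(fit$lambda, BIC(fit))`. An `at_boundary` value of `TRUE` is a hint that the criterion kept decreasing along the path, which is the situation described above for AIC.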
I was wondering if it would be possible to also support the calculation of AIC and BIC values for a biglasso fit, similar to the way the ncvreg package provides this (returning a vector of AIC and BIC values, one for each lambda value used). This would enable selection of the best lambda based on AIC or BIC, which is faster than selection based on cross-validation.