A possible bug in group coefficients for group Lasso #54

Open
SzymonNowakowski opened this issue Jun 24, 2022 · 0 comments
Hi,

I believe I have found a possible bug in the group coefficient constraint for penalty="grLasso" in grpreg. The penalty vignette states that "the coefficients within a group will either all equal zero or none will equal zero". This constraint appears to be violated in a data example I came across.

To reproduce this potential bug, please run:

# load the data example from GitHub
library(Rfssa)
load_github_data("https://github.com/SzymonNowakowski/DMRnet/blob/testing_branch/data/promoX.RData")
load_github_data("https://github.com/SzymonNowakowski/DMRnet/blob/testing_branch/data/promoy.RData")

# prepare the data in the format grpreg accepts
y <- ifelse(y == levels(y)[2], 1, 0)
X <- stats::model.matrix(y ~ ., data = data.frame(y = y, X, check.names = TRUE))[, -1, drop = FALSE]
group <- rep(1:57, each = 3)

# run grpreg
library(grpreg)
fit <- grpreg(X, y, group = group, penalty = "grLasso", family = "binomial")

# examine the coefficients of group 15 for the first 9 lambda values
coef(fit)[44:46, 1:9]
#      0.2199        0.2133       0.2069       0.2008       0.1948     0.189       0.1834       0.1779    0.1726
# X15c      0 -2.139903e-17 8.559611e-17 1.069951e-16 2.139903e-16 0.0000000 5.991728e-16 3.423845e-16 0.0000000
# X15g      0  3.998654e-02 7.887908e-02 1.167650e-01 1.537286e-01 0.1898477 2.251939e-01 2.598331e-01 0.2938264
# X15t      0  9.878343e-02 1.948222e-01 2.882890e-01 3.793487e-01 0.4681499 5.548270e-01 6.395019e-01 0.7222856

It is probably a numerical problem. In the first row (X15c) of the output, some coefficients are very close to 0 but not exactly 0. For the lambdas 0.189 and 0.1726, however, the X15c coefficient is exactly 0 while X15g and X15t in the same group are clearly non-zero, which breaks the constraint quoted above.
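To look for other groups showing the same pattern, a check along the following lines might help. This is only a minimal sketch in base R: `find_mixed_groups` is a hypothetical helper and not part of grpreg, and it assumes a coefficient matrix with one row per coefficient (intercept removed) and one column per lambda. The toy matrix reuses three columns of the output above.

```r
# Hypothetical helper (not part of grpreg): report every (group, lambda column)
# pair in which some, but not all, coefficients of the group are exactly zero.
find_mixed_groups <- function(beta, group) {
  # beta:  numeric matrix, one row per coefficient (no intercept row),
  #        one column per lambda value
  # group: vector of group labels, one per row of beta
  stopifnot(is.matrix(beta), nrow(beta) == length(group))

  is_zero <- beta == 0                      # exact comparison, on purpose
  n_zero  <- rowsum(is_zero * 1L, group)    # exact zeros per group and lambda
  size    <- as.vector(table(group)[rownames(n_zero)])  # coefficients per group

  mixed <- n_zero > 0 & n_zero < size       # partly zero, partly non-zero
  idx   <- which(mixed, arr.ind = TRUE)
  data.frame(group  = rownames(n_zero)[idx[, "row"]],
             column = unname(idx[, "col"]))
}

# Toy input: values copied from the output above for lambda = 0.2199, 0.2133, 0.189
beta <- rbind(
  X15c = c(0, -2.139903e-17, 0),
  X15g = c(0,  3.998654e-02, 0.1898477),
  X15t = c(0,  9.878343e-02, 0.4681499)
)
colnames(beta) <- c("0.2199", "0.2133", "0.189")

result <- find_mixed_groups(beta, group = c(15, 15, 15))
print(result)
```

On the fitted model this could presumably be called as `find_mixed_groups(coef(fit)[-1, ], group)`, assuming the first row of `coef(fit)` is the intercept, as the row indices 44:46 used above suggest.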

Thank you in advance for looking into it,
Szymon
PS. The data I used is a subset of the Promoter dataset, isolated for the purpose of reproducing this behavior.
