GLM tests of scikit-learn #723
Comments
Thanks a lot! For future reference, these are the failing tests:
I can get the 12 failing L-BFGS related tests to pass by not standardizing the design matrix here. 64 failing tests to go.
All the failing tests seem to be for unpenalized regression with a singular design matrix (either the wide problem with p=12, n=4, or the stacked problem where we duplicate all columns). Is that correct? Maybe this is a dumb question, but what is the expected result in this case? I'm not surprised to see these tests fail for glum, but if we want to support this case, the tests are great!
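To make the two cases concrete, here is a minimal sketch with NumPy. The random data, the base shape `(20, 3)`, and the variable names are assumptions for illustration only; they are not taken from the scikit-learn test file. It only shows that both kinds of design matrix have rank lower than the number of columns.

```python
import numpy as np

rng = np.random.default_rng(0)

# Wide problem: more features (p=12) than samples (n=4).
X_wide = rng.normal(size=(4, 12))

# Stacked problem (assumed here to mean horizontal stacking):
# every column of a full-rank base matrix appears twice.
X_base = rng.normal(size=(20, 3))
X_stacked = np.hstack([X_base, X_base])

for name, X in [("wide", X_wide), ("stacked", X_stacked)]:
    n_samples, n_features = X.shape
    rank = np.linalg.matrix_rank(X)
    # rank < n_features means the columns are linearly dependent,
    # so the unpenalized problem has no unique coefficient vector.
    print(f"{name}: shape={X.shape}, rank={rank}, singular={rank < n_features}")
```

In both cases the rank is below the number of features, so an unpenalized fit cannot identify a unique set of coefficients.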
It is often said that singular design matrices don't allow for a solution, but this is wrong: there are just infinitely many solutions. For OLS, there is a particularly nice one called the minimum norm solution, i.e. the coefficient vector with the smallest L2 norm among all solutions. I have at least one PR for the line search in mind that could help with at least a few of those test failures.
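As a rough illustration of the minimum norm solution for OLS, the sketch below uses plain NumPy on a wide problem with n=4, p=12. The random data and variable names are assumptions, and this is not how glum or scikit-learn solve the problem internally. `np.linalg.lstsq` returns the solution with the smallest 2-norm when several solutions exist; adding any null space vector of `X` gives another exact solution with a larger norm.

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 4, 12
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

# For an underdetermined system, lstsq returns the minimum-norm solution.
coef_min_norm, *_ = np.linalg.lstsq(X, y, rcond=None)

# Build a second exact solution by adding a null space direction of X.
# With full_matrices=True (the default), the last p - rank rows of Vt
# span the null space of X.
_, _, Vt = np.linalg.svd(X)
null_vec = Vt[-1]
coef_other = coef_min_norm + 3.0 * null_vec

# Both coefficient vectors reproduce y exactly ...
print(np.allclose(X @ coef_min_norm, y), np.allclose(X @ coef_other, y))
# ... but the lstsq solution has the smaller L2 norm.
print(np.linalg.norm(coef_min_norm) < np.linalg.norm(coef_other))
```

The same coefficients can be obtained with the pseudoinverse, `np.linalg.pinv(X) @ y`.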
Scikit-learn has some very strict tests for GLMs in https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/linear_model/_glm/tests/test_glm.py. I modified the file to test `glum.GeneralizedLinearRegressor` instead, see https://gist.github.com/lorentzenchr/2e319bcfd4aadfbea64c6330e5b33521. Running `pytest test_glm.py` results in 76 failed, 212 passed, 104 warnings.

It might be interesting to include those tests in glum.