Automatic step sizes for SVRG #207

Open
wants to merge 37 commits into
base: development

Conversation

bagibence
Collaborator

Attempt to automatically determine the batch- and step sizes for SVRG when fitting a GLM with Poisson observations and a softplus inverse link function.
Based on this paper.

@bagibence
Collaborator Author

This needs more exhaustive testing to make sure it works on real datasets.
I have even encountered toy examples where my current implementation didn't work.

@bagibence
Collaborator Author

The regularization strength might have to be added to the computed L and L_max constants.
It could also be useful as a lower bound on the strong-convexity constant when using ridge.
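
To make the point about regularization concrete, here is a minimal NumPy sketch of how a ridge penalty would shift both constants. It assumes each per-sample loss has the form phi_i(x_i · w) with a known curvature bound c_i >= phi_i''. The function name, the `curvature_bounds` argument, and the (reg_strength / 2) * ||w||^2 penalty convention are hypothetical and are not taken from this PR.

```python
import numpy as np


def smoothness_constants(X, curvature_bounds, reg_strength=0.0):
    """Upper bounds on L (mean loss) and L_max (worst single sample).

    Hypothetical helper: assumes per-sample losses phi_i(x_i @ w) with
    phi_i'' <= curvature_bounds[i], plus a penalty (reg_strength / 2) * ||w||^2.
    """
    X = np.asarray(X, dtype=float)
    c = np.asarray(curvature_bounds, dtype=float)
    n_samples = X.shape[0]

    # The Hessian of the mean loss is bounded by X^T diag(c) X / n.
    hessian_bound = X.T @ (c[:, None] * X) / n_samples
    # eigvalsh returns eigenvalues in ascending order, so the last one is the largest.
    l_smooth = np.linalg.eigvalsh(hessian_bound)[-1]

    # Each sample's Hessian is bounded by c_i * x_i x_i^T,
    # whose largest eigenvalue is c_i * ||x_i||^2.
    l_smooth_max = np.max(c * np.sum(X**2, axis=1))

    # The ridge term adds reg_strength * I to every Hessian.
    return float(l_smooth + reg_strength), float(l_smooth_max + reg_strength)
```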

@BalzaniEdoardo BalzaniEdoardo marked this pull request as ready for review August 26, 2024 12:19
@BalzaniEdoardo
Collaborator

This PR provides the infrastructure for computing an optimal stepsize and batch_size for SVRG based on the GLM configurations.

The optimal hyperparameters depend on the L-smoothness of the loss function. This means that for each model configuration (observation noise, link function, regularization), one may need to compute a different estimate of the smoothness parameters.

Here, I implemented a look-up table that should be easy to extend whenever new estimates become available (for example, if we derive the L-smoothness for Gamma + softplus observations).
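
As a rough illustration of the look-up idea, the sketch below keys a registry on (observation model, inverse link) and returns None when no estimate is registered. Every name in it (`SMOOTHNESS_LOOKUP`, `register_smoothness`, `find_smoothness_fn`, the string keys) is hypothetical; the actual table in this PR may be organized differently.

```python
from typing import Callable, Dict, Optional, Tuple

# Maps (observation model, inverse link) to a function returning (L, L_max).
# All keys and names here are illustrative assumptions.
SmoothnessFn = Callable[..., Tuple[float, float]]
SMOOTHNESS_LOOKUP: Dict[Tuple[str, str], SmoothnessFn] = {}


def register_smoothness(obs_model: str, inverse_link: str):
    """Decorator that adds a smoothness estimator to the table."""

    def decorator(fn: SmoothnessFn) -> SmoothnessFn:
        SMOOTHNESS_LOOKUP[(obs_model, inverse_link)] = fn
        return fn

    return decorator


def find_smoothness_fn(obs_model: str, inverse_link: str) -> Optional[SmoothnessFn]:
    """Return the registered estimator, or None if no defaults are available."""
    return SMOOTHNESS_LOOKUP.get((obs_model, inverse_link))


@register_smoothness("poisson", "softplus")
def poisson_softplus_smoothness(X, y):
    # Placeholder: the actual estimate is outside the scope of this sketch.
    raise NotImplementedError
```

Supporting a new configuration then amounts to registering one more function; unsupported combinations fall through to None, so the caller can keep the solver's own defaults.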

@BalzaniEdoardo BalzaniEdoardo marked this pull request as draft August 26, 2024 12:44
@BalzaniEdoardo BalzaniEdoardo marked this pull request as ready for review August 26, 2024 15:00
@BalzaniEdoardo
Collaborator

BalzaniEdoardo commented Oct 8, 2024

With my edits, I added:

  • A new private module, solvers (not exported in nemos.__init__), which includes:
    • _svrg.py: the SVRG implementation;
    • _svrg_defaults.py: functions that compute the default SVRG parameters (batch size and step size);
    • _compute_defaults.py: the lookup table, which receives a model as input and checks whether defaults are available;
    • an abstract method on BaseRegressor responsible for selecting the configuration used to optimize the solver parameters. This should be easy to extend.
  • A number of tests that check the behavior of the GLM defaults over all possible configurations of regularizer, observation model, and link function.
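
One way the abstract-method hook described above could look is sketched here. The class and method names (`RegressorSketch`, `_default_solver_params`, `resolve_solver_kwargs`) are hypothetical stand-ins and not the nemos API; the sketch only assumes that user-provided solver arguments should take precedence over computed defaults.

```python
from abc import ABC, abstractmethod


class RegressorSketch(ABC):
    """Hypothetical stand-in for a base regressor; not the nemos class."""

    def __init__(self, solver_kwargs=None):
        self.solver_kwargs = dict(solver_kwargs or {})

    @abstractmethod
    def _default_solver_params(self, X, y):
        """Return a dict of computed defaults, or {} if none are available."""

    def resolve_solver_kwargs(self, X, y):
        # Values supplied by the user always win over computed defaults.
        return {**self._default_solver_params(X, y), **self.solver_kwargs}


class ToyGLM(RegressorSketch):
    def _default_solver_params(self, X, y):
        # Placeholder values; a real model would derive these from the data.
        return {"batch_size": 1, "stepsize": 0.1}
```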

else:
    assert opt_state.stepsize > 0
    assert isinstance(opt_state.stepsize, float)
    model.fit(X, y)
Collaborator

I would consider removing the model.fit() calls in this test (as well as in test_glm_optimal_config_set_initial_state_pytree, and in both functions with the same names in the TestPopulationGLM class), since it's not called consistently across all cases and no checks happen after the model is fit.

else:
    assert (
        "stepsize" in result and result["stepsize"] > 0
    ), "Stepsize should be computed since it was not provided."
Collaborator

I don't know if I missed it in one of the previous tests, but I didn't notice any test that explicitly tests the result of a computed stepsize. They only check that stepsize > 0. Is it worth it to have a test check the result explicitly, similar to test_calculate_optimal_batch_size_svrg_all_config at the end?
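
A sketch of what such an explicit check could look like is given below. The step-size formula is a stand-in assumed for illustration only, since this thread does not state the formula the implementation uses; `stepsize_sketch` and `check_stepsize_values` are hypothetical names. The point is the pattern: compare against hand-computed values at limiting batch sizes rather than only asserting positivity.

```python
import math


def stepsize_sketch(batch_size, num_samples, l_smooth_max, l_smooth):
    """Stand-in mini-batch SVRG step size (an assumption, not the PR's code).

    Requires num_samples > 1. Reduces to 1 / (6 * L_max) at batch_size == 1
    and to 1 / (2 * L) at batch_size == num_samples.
    """
    numerator = 0.5 * batch_size * (num_samples - 1)
    denominator = (
        3 * (num_samples - batch_size) * l_smooth_max
        + num_samples * (batch_size - 1) * l_smooth
    )
    return numerator / denominator


def check_stepsize_values():
    # Single-sample batches: only L_max matters.
    assert math.isclose(stepsize_sketch(1, 10, 2.0, 1.0), 1 / 12)
    # Full batch: only L matters.
    assert math.isclose(stepsize_sketch(10, 10, 2.0, 1.0), 0.5)
```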

5 participants