Solver research direction #254
Comments
So you want to implement the approach from this publication? Or are you open to any ideas that might improve solver gains? |
Any ideas. Especially research into why Adam doesn't perform as well as SGD for some (important) problems. |
I believe the choice of optimizer depends on the class of problem - it's not an across-the-board 'this one is best' - so this is not at all surprising. I am assuming this is to be done via the Cortex layer and not in CUDA or TensorFlow? Downloaded the paper to read. |
It surprised a lot of very experienced machine learning practitioners at NIPS. For a long time we were all trying to get rid of hyperparameters, and there is a large set of problems where Adam and friends do provably converge faster - just not overparameterized machine learning problems. Here, I think, is the paper that was quite interesting: |
Oh, and if you can figure out concretely why this is and fix it for hyperparameter-less optimizers, then you have your Ph.D., I think :-). So if I were you I wouldn't worry about Cortex vs. TF. |
https://arxiv.org/abs/1710.09278
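For anyone following along, here is a minimal numpy sketch of the two update rules being compared in this thread, run on a toy least-squares problem. This is just an illustration of the mechanics (plain SGD vs. Adam's moment-based, per-parameter step sizes), not the setting from the paper; the learning rates and problem sizes are arbitrary choices for the demo.

```python
import numpy as np

def sgd_step(w, grad, lr=0.05):
    # Plain gradient descent: fixed step against the gradient.
    return w - lr * grad

def adam_step(w, grad, state, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    # Adam: running estimates of the first (m) and second (v) gradient
    # moments give each parameter its own effective step size.
    m, v, t = state
    t += 1
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)  # bias correction for the zero-initialized moments
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), (m, v, t)

# Toy full-batch least-squares problem: loss(w) = mean((Xw - y)^2)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
w_true = rng.normal(size=5)
y = X @ w_true

def grad(w):
    return 2 * X.T @ (X @ w - y) / len(y)

def loss(w):
    return np.mean((X @ w - y) ** 2)

w_sgd = np.zeros(5)
w_adam = np.zeros(5)
adam_state = (np.zeros(5), np.zeros(5), 0)
for _ in range(2000):
    w_sgd = sgd_step(w_sgd, grad(w_sgd))
    w_adam, adam_state = adam_step(w_adam, grad(w_adam), adam_state)

print("SGD loss:", loss(w_sgd))
print("Adam loss:", loss(w_adam))
```

On a convex problem like this both reach a low loss; the interesting question raised above is why Adam's faster per-step progress does not translate into the generalization SGD gets on overparameterized networks.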