New benchmark #339
Conversation
Codecov Report

@@ Coverage Diff @@
##           master     #339      +/-   ##
==========================================
+ Coverage   99.45%   99.57%   +0.12%
==========================================
  Files          69       69
  Lines        6418     6402      -16
==========================================
- Hits         6383     6375       -8
+ Misses         35       27       -8
Instead of
I am not sure why strong Wolfe errors; that would be interesting to know. We could also remove the two existing benchmarks, since I never ran them. In the long run we could set up a benchmark CI, but I have not yet fully understood how these work and which packages one would use for that.
ext/ManoptLineSearchesExt.jl
Outdated
@@ -59,6 +59,7 @@ function (cs::Manopt.LineSearchesStepsize)(
        return α
    catch ex
        if isa(ex, LineSearches.LineSearchException)
            println(ex)
I think I would prefer a re-throw of the error here; and maybe we should introduce such errors in our line searches as well.
OK, then the easiest solution is to just not catch the error. We could have that in our line searches too.
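To make the two options discussed here concrete, here is a minimal sketch of the "catch, report, and re-throw" variant, using a hypothetical `LineSearchFailure` type as a stand-in for the actual `LineSearches.LineSearchException` (these names are illustrative, not the real Manopt.jl code):

```julia
# Hypothetical stand-in for LineSearches.LineSearchException.
struct LineSearchFailure <: Exception
    msg::String
end

# Option discussed above: catch, report, and re-throw, so the caller still
# sees the real failure instead of a silently printed error.
function step_rethrowing(perform_linesearch)
    try
        return perform_linesearch()
    catch ex
        isa(ex, LineSearchFailure) && @warn "line search failed" exception = ex
        rethrow()
    end
end
```

The alternative discussed (simply not catching the exception) is equivalent to removing the `try`/`catch` entirely; the error then propagates unchanged.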
Yes.
The test failure seems unrelated to my changes (it's the ALM solver).
Well, ALM has a subsolver that is often L-BFGS. So if ALM breaks after a change to Quasi-Newton, that may indicate an error, though sometimes ALM is also just a bit unstable. Either way it is an effect of changing L-BFGS.
OK, then I will pay attention to it.
I also think it is not the optimal solution, but I do agree that thoroughly working through that code, (1) making it more Manopt.jl-like and (2) documenting it thoroughly, might indeed take more time.
@@ -345,6 +345,11 @@ function step_solver!(mp::AbstractManoptProblem, qns::QuasiNewtonState, iter)
    M = get_manifold(mp)
    get_gradient!(mp, qns.X, qns.p)
    qns.direction_update(qns.η, mp, qns)
    if real(inner(M, qns.p, qns.η, qns.X)) > 0
So, one of the issues that LineSearches.jl caught in some cases but Manopt didn't is that L-BFGS selected a non-descent direction. IIRC I had a similar issue with CG at some point. This fixes the problem, but I don't know if this is the right way to solve it.
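For readers unfamiliar with the check in the diff above: a direction η is a descent direction for the cost iff its inner product with the gradient X is negative, so a positive inner product means the quasi-Newton update produced a bad direction. A flat-Euclidean sketch of the reset being discussed (plain `dot` standing in for the Riemannian `inner`; this is an illustration, not the Manopt.jl implementation):

```julia
using LinearAlgebra

# Euclidean sketch of the descent check: η is the search direction, X the
# gradient. If ⟨η, X⟩ > 0, η points uphill, so fall back to steepest descent.
function ensure_descent_direction(η::Vector{Float64}, X::Vector{Float64})
    if dot(η, X) > 0
        return -X  # the "hard reset" criticized below
    end
    return η
end
```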
How far negative is it? Rounding errors? To me this fix looks more like a hard reset on error :/ I would prefer an actual fix if this is a bug.
Much more than a rounding error. You can try running the test_case_manopt() function from the PR to observe the issue. The direction goes wrong by the third iteration. I'd prefer a better fix too, but it's fairly difficult to figure out.
It never appears in my tests on flat Euclidean space, where Manopt's iterations are very close to Optim.jl's. So, I don't know, but maybe it has something to do with vector transports. Both projection transport and parallel transport give this error, though.
With this fix, both Manopt and Optim converge well on Rosenbrock with spherical constraints. Sometimes Manopt is faster, and sometimes Optim. For Euclidean Rosenbrock, Manopt is consistently a bit faster.
Well, this “fix” is still of the kind “if it is wrong, reset and restart” instead of finding out what is wrong. For projection transport I think this can happen, since it is only an approximation; it should not happen for parallel transport. But with now 3 master students, I will not have much time beyond my own small PRs I have planned (or want to finish).
But I would still prefer not to have such duck-typing fixes in the code. If it is wrong, let's find the actual error rather than just saying “Error appeared? Restart!”
I'm trying to figure it out, but a bug in the line search doesn't sound like a plausible explanation to me.
VerificationState sounds like something that would be nightmarishly difficult to implement. I believe such optional sanity checks could be useful in the future, even if they are off by default.
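As a rough illustration of what such an opt-in sanity check could look like, here is a hypothetical sketch; none of these names (`DescentCheck`, `verify_step`) exist in Manopt.jl, and a real `VerificationState` would wrap a solver state rather than a single vector pair:

```julia
using LinearAlgebra

# Hypothetical opt-in sanity check, off by default as suggested above.
struct DescentCheck
    enabled::Bool
end
DescentCheck() = DescentCheck(false)

# Returns true if the step passes (or checking is disabled); warns otherwise.
function verify_step(check::DescentCheck, η, grad)
    check.enabled || return true
    ok = dot(η, grad) < 0
    ok || @warn "search direction is not a descent direction"
    return ok
end
```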
Ha, I've found an independent reproducer: #346.
Oh, I did not want to imply that all of the line search is wrong, but maybe your improvements had a slight flaw in one specific case. But great that you found an example, and good that it even reproduces with parallel transport.
Then we could indeed keep the warning here, and keep that (and the duct-tape fix) until we fix the bug.
I think we discussed this enough in Zulip, and addressed this in tutorial mode. I hope I even added it here? We could add a tutorial debug for this if you feel that helps.
Hm, good that you reminded me. I thought we had already merged this check in Manopt.jl? This is definitely not a tutorial-only issue, because some combinations of line search/manifold/vector transport will lead to non-descent directions here, which we need to catch (or the user gets a very hard-to-debug problem). Those combinations are still useful because they sometimes converge faster than "always descent direction" alternatives.
I'm trying to slowly wrap this up. How would you organize the hyperparameter tuner? It's a mix of somewhat generic code and example-specific code.
Hi,
This looks like a great start to a thorough benchmark.
My maybe most central remark or question is: how would we add this in a consistent way?

- The benchmark_comparison.jl is for now a script; should that become something we can run every now and then on CI?
- Should the results be part of the docs? They could then be updated on a branch whenever the benchmark is run (either on CI and committed, or when run manually); then one could, for example, turn the current benchmark into a Quarto notebook.
- Does the benchmark run several solvers and several examples? This could maybe be modularized.
    return storage
end

optimize(f_rosenbrock, g_rosenbrock!, [0.0, 0.0], LBFGS())
This is the Optim.jl run? One could mention that in a comment.
Yes; this will be removed or commented out when preparing the final version.
optimize(f_rosenbrock, g_rosenbrock!, [0.0, 0.0], LBFGS())

function g_rosenbrock!(M::AbstractManifold, storage, x)
This could maybe be called gradient_Rosenbrock? Or is it the Optim.jl convention to use g_ for the gradient? Then that could be mentioned here.
x0 = zeros(N)
x0[1] = 0
optim_state = optimize(f_rosenbrock, g_rosenbrock!, x0, method_optim, options_optim)
Similar to the previous one, could this also be just the one-line optimize call, whose return value is returned?
using PythonCall
include("benchmark_comparison.jl")

# This script requires optuna to be available through PythonCall
I have no experience with optuna; could this be documented a bit, i.e. what the goal of this is?
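For context: Optuna is a Python hyperparameter-optimization framework, which the script drives from Julia via PythonCall; it repeatedly samples hyperparameter values, evaluates a loss, and keeps the best trial (optionally pruning unpromising ones early). As a rough, self-contained illustration of that loop, here is a simplified random-search stand-in in plain Julia, not the actual Optuna API:

```julia
# Simplified random-search stand-in for what an Optuna study does:
# sample a hyperparameter uniformly, evaluate the loss, keep the best trial.
function random_search(loss; n_trials=1000, lo=0.0, hi=2.0)
    best_x = lo
    best_loss = loss(lo)
    for _ in 1:n_trials
        x = lo + (hi - lo) * rand()
        l = loss(x)
        if l < best_loss
            best_x, best_loss = x, l
        end
    end
    return best_x, best_loss
end
```

Optuna adds smarter samplers (e.g. TPE) and pruning on top of this basic loop.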
"""
mutable struct ObjectiveData{TObj,TGrad}
This is still undocumented, and the functor below is quite technical. Could we maybe document it a bit more, to also make it understandable for me (or even for you) at some future time ;) ?
I will document it; I'm just not sure what style of description to use here. I will add tutorial-style commentary.
or too much pruning (if values here are too low);
regenerate using `lbfgs_compute_pruning_losses`
"""
function lbfgs_study(; pruning_coeff::Float64=0.95)
This also looks quite technical, so I am not so sure what it does and what its purpose in the end is? Should this later become a tutorial?
@@ -5,6 +5,12 @@ All notable Changes to the Julia package `Manopt.jl` will be documented in this
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.4.53] unreleased
This number has to be adapted.
If you would prefer having scripts that can be run more easily (than Quarto notebooks), we could also look into the new Quarto scripts, which I wanted to try on something anyway.
Currently I'm leaning towards turning it into a "how to choose the right solver (and its options) for your problem" tutorial. I'm not sure how (and for which problems) to run it on CI. Note that the optimization script is quite demanding computationally, despite all that advanced machinery.
I've experimented with several examples, but I haven't decided which one to use as the example. Most likely not Rosenbrock on the sphere, but I will decide when we select the right format.
I really liked Quarto notebooks until updating Julia broke my settings and I could not get it back to work for a couple of hours (it still doesn't work; I just gave up: it builds the Jupyter notebook, but the Julia kernel refuses to run). That might be the main problem with turning it into a tutorial.
Remember to recompile. I would also be fine with these being benchmarks and a bit less tutorial-focused.
I did recompile.
I just tried again, activating the Conda.jl environment.
Oh, Python dependencies – I nearly never manage to get those right. But from Julia with
I've moved the interesting parts of this PR to separate PRs, so I think this one can be closed.
BTW, I tried to address some of your comments in the version submitted to ManoptExamples. |
To compare performance of Manopt.jl and Optim.jl. TODO: StrongWolfe errors for Manopt.StopWhenGradientInfNormLess?