
New benchmark #339

Closed · wants to merge 42 commits into from

Conversation

@mateuszbaran (Member) commented Dec 28, 2023

To compare the performance of Manopt.jl and Optim.jl. TODO:

  • Why does Manopt perform more iterations? I'm trying to match the parameters of Optim.jl and Manopt.jl as closely as possible (a sketch of matched settings follows this list). The line searches differ, but for now StrongWolfe errors for Manopt.
  • Are we fine with StopWhenGradientInfNormLess?
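For reference, a minimal sketch of what "matching parameters" could look like; the concrete values (memory size 10, HagerZhang) and the `LineSearchesStepsize` constructor signature are assumptions based on the Manopt 0.4 docs, not the PR's actual benchmark script:

```julia
using Optim, LineSearches
using Manopt, Manifolds

# classic 2D Rosenbrock and its Euclidean gradient
f(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
function g!(storage, x)
    storage[1] = -2.0 * (1.0 - x[1]) - 400.0 * (x[2] - x[1]^2) * x[1]
    storage[2] = 200.0 * (x[2] - x[1]^2)
    return storage
end

# Optim.jl side: L-BFGS with memory 10 and the HagerZhang line search
res = optimize(f, g!, [0.0, 0.0], LBFGS(; m=10, linesearch=LineSearches.HagerZhang()))

# Manopt.jl side on a Euclidean manifold, so both solve the same problem;
# the same line search is wrapped via Manopt.LineSearchesStepsize
M = Euclidean(2)
q = quasi_Newton(
    M, (M, p) -> f(p), (M, p) -> g!(similar(p), p), [0.0, 0.0];
    memory_size=10,
    stepsize=Manopt.LineSearchesStepsize(LineSearches.HagerZhang()),
)
```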

@mateuszbaran added the “WIP: Work in Progress (for a pull request)” label Dec 28, 2023
codecov bot commented Dec 28, 2023

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (8851619) 99.45% compared to head (649bd55) 99.57%.
Report is 2 commits behind head on master.

❗ Current head 649bd55 differs from pull request most recent head 92db97e. Consider uploading reports for the commit 92db97e to get more accurate results

| Files | Patch % | Lines |
|---|---|---|
| src/solvers/quasi_Newton.jl | 25.00% | 3 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #339      +/-   ##
==========================================
+ Coverage   99.45%   99.57%   +0.12%     
==========================================
  Files          69       69              
  Lines        6418     6402      -16     
==========================================
- Hits         6383     6375       -8     
+ Misses         35       27       -8     


@kellertuer (Member)

Instead of StopWhenGradientInfNormLess one could also consider a norm= keyword for StopWhenGradientNormLess (sketched below) – but that should still default to the 2-norm, since

  • a basis (needed for coordinate-wise norms like the ∞-norm) would introduce more complexity for users with individual manifolds, and
  • we would change the default, which I would consider breaking.
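A minimal sketch of what such a keyword could look like; the struct and function names here are hypothetical, not the actual Manopt.jl stopping criterion:

```julia
using ManifoldsBase: inner

# `norm` is any callable (M, p, X) -> ℝ and defaults to the Riemannian
# 2-norm, so the current default behavior would be preserved.
struct StopWhenGradNormLessSketch{F,R<:Real}
    norm::F
    threshold::R
end
function StopWhenGradNormLessSketch(
    threshold::Real; norm=(M, p, X) -> sqrt(real(inner(M, p, X, X)))
)
    return StopWhenGradNormLessSketch(norm, threshold)
end

# the check a solver would perform, given the point p and gradient X:
is_fulfilled(c::StopWhenGradNormLessSketch, M, p, X) = c.norm(M, p, X) < c.threshold

# An ∞-norm variant would need a basis, e.g.
#   norm = (M, p, X) -> maximum(abs, get_coordinates(M, p, X, DefaultOrthonormalBasis()))
# which is exactly the extra complexity mentioned in the first bullet above.
```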

I am not sure why strong Wolfe errors; that would be interesting to know.

We could also remove the two existing benchmarks, since I never ran them. In the long run we could set up a benchmark CI, but I have not yet fully understood how those work and which packages one would use for that (one common setup is sketched below).
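One common setup, as an assumption rather than anything this repo already has: a benchmark/benchmarks.jl file defining a BenchmarkTools suite, which PkgBenchmark.jl or BenchmarkCI.jl can then run and compare across commits:

```julia
# benchmark/benchmarks.jl, following the PkgBenchmark.jl convention of a
# top-level SUITE constant; the workload below is only a placeholder.
using BenchmarkTools
using Manopt, Manifolds

const SUITE = BenchmarkGroup()

# placeholder workload: minimize f(p) = 1 - p[1] on the 2-sphere
M = Sphere(2)
f(M, p) = 1.0 - p[1]
grad_f(M, p) = project(M, p, [-1.0, 0.0, 0.0])  # Riemannian gradient by projection

SUITE["gradient_descent"] =
    @benchmarkable gradient_descent($M, $f, $grad_f, $([0.0, 0.0, 1.0]))
```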

@@ -59,6 +59,7 @@ function (cs::Manopt.LineSearchesStepsize)(
return α
catch ex
if isa(ex, LineSearches.LineSearchException)
println(ex)
Member:
I think I would prefer a re-throw of the error here; and maybe we should introduce such errors in our line searches as well.

Member Author:
OK, then the easiest solution is to just not catch the error. We could have that in our line searches too.
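A small stand-in illustrating the three options discussed; the function is hypothetical, not the actual Manopt.LineSearchesStepsize code:

```julia
using LineSearches

function try_linesearch(ls_call; on_error=:rethrow)
    try
        return ls_call()
    catch ex
        if isa(ex, LineSearches.LineSearchException) && on_error === :print
            println(ex)  # behavior in the diff above: log and carry on
            return NaN   # the caller then has to cope with this step size
        end
        rethrow()        # suggested behavior: surface the failure
        # (the third option, not catching at all, means dropping try/catch)
    end
end
```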

Member:
Yes.

@mateuszbaran (Member Author)

The test failure seems unrelated to my changes (it's the ALM solver).

@kellertuer (Member)

Well, ALM has a subsolver, which is often L-BFGS. So if ALM breaks when quasi-Newton changed, that may indicate an error (though sometimes ALM is also a bit unstable); either way, it is an effect of changing L-BFGS.

@mateuszbaran (Member Author)

OK, then I will pay attention to it.

@mateuszbaran (Member Author)

Ref. JuliaNLSolvers/LineSearches.jl#173.

@mateuszbaran mentioned this pull request Jan 1, 2024
@kellertuer (Member)

I also think it is not the optimal solution, but I do agree that thoroughly working through that code, (a) making it more Manopt.jl-like and (b) documenting it thoroughly, might indeed take more time.

@@ -345,6 +345,11 @@ function step_solver!(mp::AbstractManoptProblem, qns::QuasiNewtonState, iter)
M = get_manifold(mp)
get_gradient!(mp, qns.X, qns.p)
qns.direction_update(qns.η, mp, qns)
if real(inner(M, qns.p, qns.η, qns.X)) > 0
Member Author:
So, one of the issues that LineSearches.jl caught in some cases but Manopt didn't is that L-BFGS selected a non-descent direction. IIRC I had a similar issue with CG at some point. This fixes the problem, but I don't know if it is the right way to solve it.
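To make the discussion concrete, this is roughly what the safeguard amounts to; the hunk above only shows the `if`, so the reset body here is an assumed sketch (falling back to the negative gradient), not the PR's exact code:

```julia
using ManifoldsBase: inner

function ensure_descent_direction!(M, p, η, X)
    # X = grad f(p); ⟨η, X⟩ > 0 means η points uphill, i.e. no descent
    if real(inner(M, p, η, X)) > 0
        η .= -X  # hard reset to the steepest-descent direction
    end
    return η
end
```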

Member:
How far is that from negative? Rounding errors? To me this fix looks more like a hard reset on error :/ I would prefer an actual fix if this is a bug.

Member Author:
Much more than a rounding error. You can run the test_case_manopt() function from the PR to observe the issue; the direction goes wrong by the third iteration. I'd prefer a better fix too, but it's fairly difficult to figure out.

It never appears in my tests in the flat Euclidean case, where Manopt's iterates are very close to Optim.jl's. So I don't know, but maybe it has something to do with vector transports, although both projection transport and parallel transport give this error.

Member Author:
With this fix, both Manopt and Optim converge well on Rosenbrock with spherical constraints. Sometimes Manopt is faster, and sometimes Optim. For Euclidean Rosenbrock, Manopt is consistently a bit faster.

Member:
Well, this “fix” is still of the kind “if it is wrong, reset and restart” instead of finding out what is wrong. For projection transport I think this can happen, since that is only an approximation; it should not happen for parallel transport. But with three master students now, I will not have much time beyond the small PRs of my own that I have planned (or want to finish).

I would still prefer not to have such duct-tape fixes in the code. If it is wrong, let's rather find the error than just say “Error appeared? Restart!”

Member Author:
I'm trying to figure it out, but a bug in the line search doesn't sound like a plausible explanation to me.

A VerificationState sounds like something that would be nightmarishly difficult to implement. Still, I believe such optional sanity checks could be useful in the future, even if they are off by default.

Member Author:
Ha, I've found an independent reproducer: #346.

Member:
Oh, I did not want to imply the whole line search is wrong, only that maybe your improvements had a slight flaw in one specific case. But great that you found an example, and good that it even reproduces with parallel transport.
Then we could indeed keep the warning here, and keep it (and the duct-tape fix) until we fix the bug.

Member:
I think we discussed this enough on Zulip and addressed it in tutorial mode. I hope I even added that here? We could add a tutorial debug for this if you feel that helps.

Member Author:
Hm, good that you reminded me. I thought we already had this check merged in Manopt.jl? This is definitely not a tutorial-only issue, because some combinations of line search, manifold, and vector transport will lead to non-descent directions here, which we need to catch (otherwise the user gets a very hard-to-debug problem). Those combinations are still useful because they sometimes converge faster than “always a descent direction” alternatives.

@mateuszbaran (Member Author)

I'm slowly trying to wrap this up. How would you organize the hyperparameter tuner? It's a mix of somewhat generic code and example-specific code. ObjectiveData is technically generic, but I'd imagine most people using it would have to tweak it anyway; there are simply too many fine details to cover every possible use case through an API. Probably ManoptExamples.jl would be the right place, at least for now?

@kellertuer (Member)

Hi,
I would first have to check what that function does and will try to find time for that in the next few days; but yeah, ManoptExamples would probably be a good place. Note that we already have Rosenbrock in there.

@kellertuer (Member) left a review:
This looks like a great start to a thorough benchmark.

My perhaps most central remark or question is:

How would we add this in a consistent way?
benchmark_comparison.jl is a script for now; should that become something we can run every now and then on CI?
Should the results be part of the docs? They could then be updated on a branch whenever the benchmark is run (either on CI and committed, or when run manually).

Then one could, for example, turn the current benchmark into a Quarto notebook.

Does the benchmark run several solvers and several examples? This could maybe be modularized.

return storage
end

optimize(f_rosenbrock, g_rosenbrock!, [0.0, 0.0], LBFGS())
Member:
This is the Optim.jl run? One could mention that in a comment.

Member Author:
Yes; this will be removed or commented out when preparing the final version.


optimize(f_rosenbrock, g_rosenbrock!, [0.0, 0.0], LBFGS())

function g_rosenbrock!(M::AbstractManifold, storage, x)
Member:
This could maybe be called gradient_Rosenbrock? Or is the Optim.jl convention to use g_ for the gradient? If so, that could be mentioned here.
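For context, the two signatures being contrasted: Optim.jl's in-place convention is g!(storage, x), and the benchmark adds a manifold method on top. A sketch under the assumption that the Riemannian gradient is obtained by projecting the Euclidean one (valid for isometrically embedded manifolds such as the sphere); this is not necessarily the PR's exact implementation:

```julia
using Manifolds, ManifoldsBase

# Optim.jl convention: g!(storage, x) mutates storage with the gradient.
function g_rosenbrock!(storage, x)
    storage[1] = -2.0 * (1.0 - x[1]) - 400.0 * (x[2] - x[1]^2) * x[1]
    storage[2] = 200.0 * (x[2] - x[1]^2)
    return storage
end

# Manifold method added on top for Manopt.jl:
function g_rosenbrock!(M::AbstractManifold, storage, x)
    g_rosenbrock!(storage, x)               # Euclidean gradient
    project!(M, storage, x, copy(storage))  # project to the tangent space at x
    return storage
end
```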

benchmarks/benchmark_comparison.jl (resolved)

x0 = zeros(N)
x0[1] = 0
optim_state = optimize(f_rosenbrock, g_rosenbrock!, x0, method_optim, options_optim)
Member:
Similar to the previous one, this could also be one line: just the optimize call, whose return value is returned?

using PythonCall
include("benchmark_comparison.jl")

# This script requires optuna to be available through PythonCall
Member:
I have no experience with optuna; could this be documented a bit, i.e., what the goal of this is?
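For readers without optuna background, this is roughly how a study is driven from Julia via PythonCall; the objective and the tuned parameter below are made-up placeholders, not the tuner in this PR:

```julia
using PythonCall
optuna = pyimport("optuna")

# An optuna "study" minimizes an objective over suggested hyperparameters;
# here a single hypothetical parameter, an L-BFGS memory size.
function objective(trial)
    mem = pyconvert(Int, trial.suggest_int("memory_size", 2, 30))
    return (mem - 8)^2  # placeholder loss: run the solver here instead
end

study = optuna.create_study(; direction="minimize")
study.optimize(objective; n_trials=50)
println(study.best_params)
```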



"""
mutable struct ObjectiveData{TObj,TGrad}
Member:
This is still undocumented, and the functor below is quite technical. Could we maybe document it a bit more, to make it understandable for me, or even for you, at some future time ;)?

Member Author:
I will document it; I'm just not sure what style of description to use here. I will add tutorial-style commentary.

or too much pruning (if values here are too low)
regenerate using `lbfgs_compute_pruning_losses`
"""
function lbfgs_study(; pruning_coeff::Float64=0.95)
Member:
This also looks quite technical, so I am not so sure what it does and what its purpose is in the end.
Should this later become a tutorial?

@@ -5,6 +5,12 @@ All notable Changes to the Julia package `Manopt.jl` will be documented in this
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.4.53] unreleased
Member:
This number has to be adapted.

@kellertuer (Member)

If you would prefer scripts that are easier to run than Quarto notebooks, we could also look into the new Quarto scripts, which I wanted to try on something anyway.

@mateuszbaran (Member Author)

> This looks like a great start to a thorough benchmark.
>
> My perhaps most central remark or question is: How would we add this in a consistent way? benchmark_comparison.jl is a script for now; should that become something we can run every now and then on CI? Should the results be part of the docs? They could then be updated on a branch whenever the benchmark is run (either on CI and committed, or when run manually).

Currently I'm leaning towards turning it into a “how to choose the right solver (and its options) for your problem” tutorial. I'm not sure how (and for which problems) to run it on CI. Note that the optimization script is computationally quite demanding despite all the advanced machinery.

> Does the benchmark run several solvers and several examples? This could maybe be modularized.

I've experimented with several examples, but I haven't decided which one to use. Most likely not Rosenbrock on the sphere, but I will decide once we settle on the right format.

> If you would prefer scripts that are easier to run than Quarto notebooks, we could also look into the new Quarto scripts, which I wanted to try on something anyway.

I really liked Quarto notebooks until updating Julia broke my setup and I could not get it back to work for a couple of hours (it still doesn't work, I just gave up: it builds the Jupyter notebook, but the Julia kernel refuses to run). That might be the main problem with turning this into a tutorial.

@kellertuer (Member)

> I really liked Quarto notebooks until updating Julia broke my setup and I could not get it back to work for a couple of hours (it still doesn't work, I just gave up: it builds the Jupyter notebook, but the Julia kernel refuses to run).

Remember to recompile IJulia. That is one of the main reasons I did so much Pkg.activate()... stuff in the documentation: if a new Julia version comes along, at least recompiling IJulia is crucial.
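That is, after switching Julia versions (standard Pkg commands, nothing PR-specific):

```julia
using Pkg
Pkg.build("IJulia")  # re-links the Jupyter kernel to the current Julia
```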

I would also be fine with these being benchmarks and a bit less tutorial-focussed.

@mateuszbaran (Member Author)

> Remember to recompile IJulia. That is one of the main reasons I did so much Pkg.activate()... stuff in the documentation: if a new Julia version comes along, at least recompiling IJulia is crucial.

I did recompile IJulia. This script actually needs to run with the Conda.jl Python, but I'm not sure how to run the notebook in its Jupyter; or I did something wrong when trying.

@mateuszbaran (Member Author)

I just tried again after activating the Conda.jl environment. quarto render fails with "No module named yaml", even though I can import it when I run the Python REPL. Directly running the notebook in the browser complains that IJulia is not installed, even though I ran build IJulia a moment ago.

@kellertuer (Member)

Oh, Python dependencies – I almost never manage to get those right. But from Julia with CondaPkg.jl (and its config file) I got it consistent; a minimal sketch below.
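For reference, the CondaPkg.jl route; the package names are the ones this thread mentions (pyyaml, optuna), and adding them writes a CondaPkg.toml that pins a project-local Python environment:

```julia
using CondaPkg
CondaPkg.add("pyyaml")   # provides the `yaml` module quarto complained about
CondaPkg.add("optuna")   # for the hyperparameter-tuning script
```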

@mateuszbaran (Member Author)

I've moved the interesting parts of this PR to separate PRs, so I think this one can be closed.

@mateuszbaran (Member Author)

BTW, I tried to address some of your comments in the version submitted to ManoptExamples.

@kellertuer deleted the mbaran/new-benchmark branch May 4, 2024 17:32