CMA-ES #373
Codecov Report: All modified and coverable lines are covered by tests ✅
Coverage diff (master vs. #373):
Coverage: 99.73% → 99.74% (+0.01%)
Files: 73 → 74 (+1)
Lines: 6876 → 7167 (+291)
Hits: 6858 → 7149 (+291)
Misses: 18 → 18
So far I have just one minor remark: for all algorithms I tried to find descriptive names. CPPA is cyclic_proximal_point; ALM and EPM also enjoy their long names, ...
Yes, I see. Unfortunately the five-word name seems a bit too long and I don't have a better idea.
I totally understand this. That is also why I tried to phrase this as carefully as possible.
I think those
Hm, could some of the stopping criteria maybe be combined into one? Also, their names are currently a bit cryptic, but I can check them out later when they have a bit more description.
For example, EqualFunValuesCondition without a reference to CMA-ES (that is, out of context) sounds super random and might even be mistaken as fitting other solvers. I tried to use StopWhenX names for stopping criteria; then exporting all of them is also not that bad, and the code stays relatively readable.
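As a rough illustration of the two suggestions above (descriptive StopWhen-style names, and folding several checks into one criterion), here is a minimal Python sketch. It is not Manopt.jl code: the state dictionary, `stop_when_cost_range_small`, and `stop_after_iterations` are hypothetical names invented for this example, and the cost-range check is only a simplified stand-in for the criteria in Hansen's paper.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# A criterion inspects the solver state (here just a dict) and says whether to stop.
Criterion = Callable[[dict], bool]


@dataclass(frozen=True)
class StopWhenAny:
    """Combined criterion: stop as soon as any sub-criterion fires."""

    criteria: Sequence[Criterion]

    def __call__(self, state: dict) -> bool:
        return any(c(state) for c in self.criteria)


def stop_when_cost_range_small(tol: float) -> Criterion:
    """Hypothetical check: the recent cost values span less than `tol`."""

    def check(state: dict) -> bool:
        costs = state["recent_costs"]
        return max(costs) - min(costs) < tol

    return check


def stop_after_iterations(max_iter: int) -> Criterion:
    """Hypothetical check: the iteration counter reached `max_iter`."""

    def check(state: dict) -> bool:
        return state["iteration"] >= max_iter

    return check


stop = StopWhenAny([stop_when_cost_range_small(1e-12), stop_after_iterations(1000)])
print(stop({"iteration": 10, "recent_costs": [1.0, 0.5, 0.25]}))  # False
print(stop({"iteration": 10, "recent_costs": [0.5, 0.5, 0.5]}))   # True
```

With this shape, each small check keeps a name that reads on its own, while the solver only ever sees one combined criterion.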
do you mean stopping criteria like
I used names from appendix B.3 of Hansen's paper. We could surely have better names but they are fairly complex criteria that don't have nice
No, I mean those from Hansen's paper.
Sure, that makes sense.
No. That is too short, we would need something a bit longer I think ;) We could maybe look for an interpretable name? What does it mean that sigma times the largest eigenvalue of the covariance matrix exceeds some value?
Ah, ok. We should then check that paper for more details and (as above) for interpretable and not-too-long names.
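To make the quantity in that question concrete, here is a small Python sketch, assuming a symmetric positive definite covariance matrix given as nested lists. The function names and the threshold are hypothetical, and power iteration is used only to keep the example dependency-free. Since CMA-ES samples from N(m, sigma^2 C), sigma times the square root of the largest eigenvalue is the standard deviation along the longest axis of the search distribution; whether a given criterion uses the eigenvalue or its square root should be checked against Hansen's paper.

```python
import math


def largest_eigenvalue(C, iterations=200):
    """Estimate the largest eigenvalue of a symmetric positive definite matrix.

    Uses plain power iteration, which assumes the start vector is not
    orthogonal to the dominant eigenvector.
    """
    n = len(C)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(C[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v' C v for the unit vector v.
    return sum(v[i] * C[i][j] * v[j] for i in range(n) for j in range(n))


def longest_axis_std(sigma, C):
    """Standard deviation of the search distribution along its longest axis."""
    return sigma * math.sqrt(largest_eigenvalue(C))


def scale_exceeds(sigma, C, threshold):
    """The check as phrased in the comment: sigma * lambda_max > threshold."""
    return sigma * largest_eigenvalue(C) > threshold


C = [[4.0, 0.0], [0.0, 1.0]]
print(scale_exceeds(0.5, C, 1e4))   # False: the distribution has not blown up
print(scale_exceeds(1e4, C, 1e4))   # True: the search scale is far too large
```

Read this way, the condition fires when the search distribution has grown far beyond a sensible scale, which may help in finding an interpretable name.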
Here are a few ideas for naming
The overview paper, though Euclidean, is great, but we could also mention https://ieeexplore.ieee.org/document/5299260
I'd skip those two for this PR actually. It seems that they are just for terminating a bit earlier than other conditions would when the search stagnates.
Yes, this is correct.
I think we usually say that the algorithm stagnates, not the cost. Maybe
That's a good idea.
I like
Thanks for the link to Colutto's paper, I must have missed it. They seem to skip the parallel transport of the covariance matrix and base their work on an older variant of the Euclidean CMA-ES, but otherwise it's more or less the same thing. Dreisigmeyer's paper doesn't mention CMA-ES but it's also about direct optimization on manifolds, so maybe it could be mentioned somewhere.
That is also fine with me. I just thought, personally, that it could also be a very flat area, where the evolution continues but the cost stagnates. So both
For the Colutto paper, I did not yet check that too closely, but it might still be fair to mention them.
I've checked performance. For non-trivial examples it's bound by either objective calculation or eigendecomposition, so I've reworked the code a bit to make sure only one eigendecomposition is computed per iteration. A standard trick in Euclidean CMA-ES is updating the decomposition every few iterations rather than every single one, but to make it work here we'd still need fast decomposition transport. It could be a fun follow-up project but for this PR I think it's fast enough.
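The Euclidean trick mentioned here can be sketched as a small cache that reruns an expensive factorization only every few iterations. This is a generic Python illustration with hypothetical names (`LazyDecomposition`, `period`), not the PR's code. It deliberately ignores the Riemannian complication raised in the comment: on a manifold the cached decomposition belongs to the tangent space at an earlier mean and would have to be transported before reuse.

```python
class LazyDecomposition:
    """Recompute an expensive factorization only every `period` iterations."""

    def __init__(self, decompose, period):
        self.decompose = decompose  # expensive function, e.g. an eigendecomposition
        self.period = period
        self.cached = None
        self.calls = 0              # how often `decompose` actually ran

    def get(self, iteration, matrix):
        # Refresh on the first call and then on every `period`-th iteration;
        # in between, return the (slightly stale) cached result.
        if self.cached is None or iteration % self.period == 0:
            self.cached = self.decompose(matrix)
            self.calls += 1
        return self.cached


# Stand-in for a costly decomposition: here just the trace of the matrix.
lazy = LazyDecomposition(lambda C: sum(C[i][i] for i in range(len(C))), period=5)
for iteration in range(20):
    lazy.get(iteration, [[2.0, 0.0], [0.0, 3.0]])
print(lazy.calls)  # 4 (iterations 0, 5, 10, 15)
```

The trade-off is that sampling between refreshes uses a slightly outdated factorization, which the Euclidean algorithm tolerates because the covariance matrix changes slowly.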
@kellertuer the latest failure is due to the convex bundle method, maybe something you'd like to take a look at?
Increase the tolerance; we are currently reworking that algorithm a bit anyway, since those warnings and errors appeared more often than we thought.
Sure, that's good to know.
CMA-ES seems to handle this problem: JuliaManifolds/ManoptExamples.jl#13 fairly well, and the performance looks competitive compared to Evolutionary.jl so I'd say this can be reviewed now.
Thanks for this nice contribution, here are a few comments – I have not yet looked at the details of the stopping criteria, but I think they could probably also have longer doc strings.
Co-authored-by: Ronny Bergmann <[email protected]>
I think I've addressed all your points.
I would still prefer a nicer constructor for the state that (a) takes the manifold as its first argument (to fill defaults properly) and (b) initialises all internal / copy things. The idea is that The current constructor does have M first, but retraction and vector transports, for example, can be set to nice defaults (and become kwargs), and so can the basis and rng, maybe even stop. The rest looks fine so far, I think.
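The constructor shape requested here (manifold first, everything else a keyword with a default derived from it, internal buffers allocated inside) could look roughly like the following Python sketch. `Sphere` and `EvolutionState` are hypothetical stand-ins, not the PR's types; the default population size 4 + floor(3 ln n) is the usual choice from Hansen's Euclidean CMA-ES.

```python
import math
import random


class Sphere:
    """Hypothetical stand-in for a manifold type; only its dimension is used here."""

    def __init__(self, dimension):
        self.dimension = dimension


class EvolutionState:
    """Hypothetical solver state: manifold first, all other settings keywords."""

    def __init__(self, M, *, population_size=None, sigma=1.0, rng=None):
        n = M.dimension
        self.manifold = M
        # Default from Hansen's Euclidean CMA-ES: lambda = 4 + floor(3 ln n).
        self.population_size = (
            population_size if population_size is not None else 4 + int(3 * math.log(n))
        )
        self.sigma = sigma
        self.rng = rng if rng is not None else random.Random()
        # Internal buffers are allocated here so callers never have to build them.
        self.covariance = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
        self.evolution_path = [0.0] * n


state = EvolutionState(Sphere(10))                      # all defaults
custom = EvolutionState(Sphere(10), population_size=50, rng=random.Random(42))
print(state.population_size, custom.population_size)   # 10 50
```

Anyone who wants to experiment can still override individual keywords, while the common case needs only the manifold.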
I've added defaults for some arguments but many of them could potentially be changed if someone wants to experiment. There is quite a lot of logic in
Yes I saw that part of the logic.
Thanks for the thorough work. I approve this for now; a tutorial for each solver would maybe be nice in the long run.
A tutorial could definitely be useful for the more involved solvers; this one is actually fairly straightforward to use. I think we can wait with registering a new Manopt version until #376 is merged.
Still, a small tutorial for most solvers might be nice – maybe also just one tutorial for all “derivative-free” ones, since their calls are really similar. And sure, with registration let's wait for that other PR.
The Covariance Matrix Adaptation Evolution Strategy (CMA-ES) algorithm adapted to the Riemannian setting.
TODO:
NoEffectAxis, NoEffectCoord, EqualFunValues, Stagnation, TolXUp, TolFun, TolX).
TODO beyond this PR:
spd_matrix_transport_to is useful more generally.