Commit 7f8d7c6

CMA-ES (#373)
* initial work on CMA-ES
* new reference
* transport p vectors, better printing and documentation
* cond stopping criterion and some docs
* add test file to runtests
* even more stopping criteria
* Add TolFunCondition
* poorly conditioned test case and some docs
* renaming
* a bit of docs
* tests for stopping criteria
* Performance improvements
* bump tolerance
* Renaming, new test
* some docs
* Apply suggestions from code review

  Co-authored-by: Ronny Bergmann <[email protected]>
* address some review comments
* addressing review
* coverage
* add defaults, more asserts
* Update docs
* minor docs improvements

---------

Co-authored-by: Ronny Bergmann <[email protected]>
1 parent 1ea2ac6 commit 7f8d7c6

12 files changed, +1128 -7 lines changed

Changelog.md

+6
Original file line number | Diff line number | Diff line change
@@ -5,6 +5,12 @@ All notable Changes to the Julia package `Manopt.jl` will be documented in this
55
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
66
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
77

8+
## [0.4.59] - unreleased
9+
10+
### Added
11+
12+
* A Riemannian variant of the CMA-ES (Covariance Matrix Adaptation Evolutionary Strategy) algorithm, `cma_es`.
13+
814
## [0.4.58] - March 18, 2024
915

1016
### Added

docs/make.jl

+1
@@ -170,6 +170,7 @@ makedocs(;
170170
"Alternating Gradient Descent" => "solvers/alternating_gradient_descent.md",
171171
"Augmented Lagrangian Method" => "solvers/augmented_Lagrangian_method.md",
172172
"Chambolle-Pock" => "solvers/ChambollePock.md",
173+
"CMA-ES" => "solvers/cma_es.md",
173174
"Conjugate gradient descent" => "solvers/conjugate_gradient_descent.md",
174175
"Convex bundle method" => "solvers/convex_bundle_method.md",
175176
"Cyclic Proximal Point" => "solvers/cyclic_proximal_point.md",

docs/src/references.bib

+24
@@ -264,6 +264,21 @@ @article{ChambollePock:2011
264264
VOLUME = {40}
265265
}
266266

267+
268+
@article{ColuttoFruhaufFuchsScherzer:2010,
269+
title = {The {CMA}-{ES} on {Riemannian} {Manifolds} to {Reconstruct} {Shapes} in 3-{D} {Voxel} {Images}},
270+
volume = {14},
271+
issn = {1941-0026, 1089-778X},
272+
url = {http://ieeexplore.ieee.org/document/5299260/},
273+
doi = {10.1109/TEVC.2009.2029567},
274+
number = {2},
275+
journal = {IEEE Transactions on Evolutionary Computation},
276+
author = {Colutto, S. and Fruhauf, F. and Fuchs, M. and Scherzer, O.},
277+
month = apr,
278+
year = {2010},
279+
pages = {227--245},
280+
}
281+
267282
@book{ConnGouldToint:2000,
268283
DOI = {10.1137/1.9780898719857},
269284
YEAR = {2000},
@@ -412,6 +427,15 @@ @article{HagerZhang:2006
412427
VOLUME = {2},
413428
YEAR = {2006}
414429
}
430+
431+
@article{Hansen:2023,
432+
AUTHOR = {Hansen, Nikolaus},
433+
TITLE = {The {CMA} {Evolution} {Strategy}: {A} {Tutorial}},
434+
JOURNAL = {ArXiv Preprint},
435+
URL = {http://arxiv.org/abs/1604.00772},
436+
NUMBER = {1604.00772},
437+
YEAR = {2023},
438+
}
415439
@article{HestenesStiefel:1952,
416440
AUTHOR = {M.R. Hestenes and E. Stiefel},
417441
DOI = {10.6028/jres.049.044},

docs/src/solvers/cma_es.md

+57
@@ -0,0 +1,57 @@
1+
# Covariance matrix adaptation evolutionary strategy
2+
3+
```@meta
4+
CurrentModule = Manopt
5+
```
6+
7+
The CMA-ES algorithm has been implemented based on [Hansen:2023](@cite) with basic Riemannian adaptations related to the transport of the covariance matrix and its update vectors. Other attempts at adapting CMA-ES to Riemannian optimization include [ColuttoFruhaufFuchsScherzer:2010](@cite).
8+
The algorithm is suitable for global optimization.
9+
10+
Covariance matrix transport between consecutive mean points is handled by the `eigenvector_transport!` function, which is based on the idea of transporting the eigenvectors of the matrix.
11+
12+
```@docs
13+
cma_es
14+
```
15+
16+
## State
17+
18+
```@docs
19+
CMAESState
20+
```
21+
22+
## Stopping Criteria
23+
24+
```@docs
25+
StopWhenBestCostInGenerationConstant
26+
StopWhenCovarianceIllConditioned
27+
StopWhenEvolutionStagnates
28+
StopWhenPopulationCostConcentrated
29+
StopWhenPopulationDiverges
30+
StopWhenPopulationStronglyConcentrated
31+
```
32+
33+
## [Technical details](@id sec-cma-es-technical-details)
34+
35+
The [`cma_es`](@ref) solver requires the following functions of a manifold to be available
36+
37+
* A [`retract!`](@extref ManifoldsBase :doc:`retractions`)`(M, q, p, X)`; it is recommended to set the [`default_retraction_method`](@extref `ManifoldsBase.default_retraction_method-Tuple{AbstractManifold}`) to a favourite retraction. If this default is set, a `retraction_method=` does not have to be specified.
38+
* A [`vector_transport_to!`](@extref ManifoldsBase :doc:`vector_transports`)`(M, Y, p, X, q)`; it is recommended to set the [`default_vector_transport_method`](@extref `ManifoldsBase.default_vector_transport_method-Tuple{AbstractManifold}`) to a favourite vector transport. If this default is set, a `vector_transport_method=` does not have to be specified.
39+
* A [`copyto!`](@extref `Base.copyto!-Tuple{AbstractManifold, Any, Any}`)`(M, q, p)` and [`copy`](@extref `Base.copy-Tuple{AbstractManifold, Any}`)`(M,p)` for points and similarly `copy(M, p, X)` for tangent vectors.
40+
* [`get_coordinates!`](@extref `ManifoldsBase.get_coordinates-Tuple{AbstractManifold, Any, Any, ManifoldsBase.AbstractBasis}`)`(M, Y, p, X, b)` and [`get_vector!`](@extref `ManifoldsBase.get_vector-Tuple{AbstractManifold, Any, Any, ManifoldsBase.AbstractBasis}`)`(M, X, p, c, b)` with respect to the [`AbstractBasis`](@extref `ManifoldsBase.AbstractBasis`) `b` provided, which is [`DefaultOrthonormalBasis`](@extref `ManifoldsBase.DefaultOrthonormalBasis`) by default from the `basis=` keyword.
41+
* An [`is_flat`](@extref `ManifoldsBase.is_flat-Tuple{AbstractManifold}`)`(M)`.
42+
43+
## Internal helpers
44+
45+
You may add new methods to `eigenvector_transport!` if you know a more optimized implementation
46+
for your manifold.
47+
48+
```@docs
49+
Manopt.eigenvector_transport!
50+
```
51+
52+
## Literature
53+
54+
```@bibliography
55+
Pages = ["cma_es.md"]
56+
Canonical=false
57+
```
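
The idea of transporting a covariance matrix through its eigenvectors can be illustrated without any manifold library. The following is a minimal sketch and not the implementation of `eigenvector_transport!`: the name `transport_covariance`, the plane rotation used as a stand-in for a vector transport, and the QR re-orthonormalization step are all assumptions made for illustration.

```julia
using LinearAlgebra

# Hypothetical stand-in for a vector transport between two tangent spaces:
# here simply a rotation of the plane by the angle θ.
rotate(v, θ) = [cos(θ) -sin(θ); sin(θ) cos(θ)] * v

# Transport a symmetric matrix C by moving each of its eigenvectors with
# `transport` and reassembling the matrix from the unchanged eigenvalues.
function transport_covariance(C::AbstractMatrix, transport)
    E = eigen(Symmetric(C))
    moved = [transport(E.vectors[:, i]) for i in axes(E.vectors, 2)]
    V = reduce(hcat, moved)
    # Re-orthonormalize, since a general transport need not preserve angles.
    Q = Matrix(qr(V).Q)
    return Symmetric(Q * Diagonal(E.values) * Q')
end

C = [4.0 1.0; 1.0 2.0]
C_new = transport_covariance(C, v -> rotate(v, π / 6))
```

In this sketch the eigenvalues, and hence the shape of the sampling distribution, are left untouched; only the principal directions are moved to the new tangent space.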

docs/src/solvers/quasi_Newton.md

+1-1
@@ -125,7 +125,7 @@ The [`quasi_Newton`](@ref) solver requires the following functions of a manifold
125125
* A [`vector_transport_to!`](@extref ManifoldsBase :doc:`vector_transports`)`(M, Y, p, X, q)`; it is recommended to set the [`default_vector_transport_method`](@extref `ManifoldsBase.default_vector_transport_method-Tuple{AbstractManifold}`) to a favourite vector transport. If this default is set, a `vector_transport_method=` or `vector_transport_method_dual=` (for ``\mathcal N``) does not have to be specified.
126126
* By default quasi Newton uses [`ArmijoLinesearch`](@ref) which requires [`max_stepsize`](@ref)`(M)` to be set and an implementation of [`inner`](@extref `ManifoldsBase.inner-Tuple{AbstractManifold, Any, Any, Any}`)`(M, p, X)`.
127127
* the [`norm`](@extref `LinearAlgebra.norm-Tuple{AbstractManifold, Any, Any}`) as well, to stop when the norm of the gradient is small, but if you implemented `inner`, the norm is provided already.
128-
* A [`copyto!](@extref `Base.copyto!-Tuple{AbstractManifold, Any, Any}`)`(M, q, p)` and [`copy`](@extref `Base.copy-Tuple{AbstractManifold, Any}`)`(M,p)` for points and similarly `copy(M, p, X)` for tangent vectors.
128+
* A [`copyto!`](@extref `Base.copyto!-Tuple{AbstractManifold, Any, Any}`)`(M, q, p)` and [`copy`](@extref `Base.copy-Tuple{AbstractManifold, Any}`)`(M,p)` for points and similarly `copy(M, p, X)` for tangent vectors.
129129
* By default the tangent vector storing the gradient is initialized calling [`zero_vector`](@extref `ManifoldsBase.zero_vector-Tuple{AbstractManifold, Any}`)`(M,p)`.
130130

131131
Most Hessian approximations further require [`get_coordinates`](@extref `ManifoldsBase.get_coordinates-Tuple{AbstractManifold, Any, Any, ManifoldsBase.AbstractBasis}`)`(M, p, X, b)` with respect to the [`AbstractBasis`](@extref `ManifoldsBase.AbstractBasis`) `b` provided, which is [`DefaultOrthonormalBasis`](@extref `ManifoldsBase.DefaultOrthonormalBasis`) by default from the `basis=` keyword.

src/Manopt.jl

+27-3
@@ -18,7 +18,20 @@ using Colors
1818
using DataStructures: CircularBuffer, capacity, length, push!, size, isfull
1919
using Dates: Millisecond, Nanosecond, Period, canonicalize, value
2020
using LinearAlgebra:
21-
Diagonal, I, eigen, eigvals, tril, Symmetric, dot, cholesky, eigmin, opnorm
21+
cond,
22+
Diagonal,
23+
I,
24+
Eigen,
25+
eigen,
26+
eigen!,
27+
eigvals,
28+
tril,
29+
Symmetric,
30+
dot,
31+
cholesky,
32+
eigmin,
33+
opnorm,
34+
mul!
2235
using ManifoldDiff:
2336
adjoint_differential_log_argument,
2437
adjoint_differential_log_argument!,
@@ -93,6 +106,7 @@ using ManifoldsBase:
93106
inner,
94107
inverse_retract,
95108
inverse_retract!,
109+
is_flat,
96110
is_point,
97111
is_vector,
98112
log,
@@ -102,6 +116,7 @@ using ManifoldsBase:
102116
mid_point!,
103117
norm,
104118
number_eltype,
119+
number_of_coordinates,
105120
power_dimensions,
106121
project,
107122
project!,
@@ -128,10 +143,10 @@ using Markdown
128143
using Preferences:
129144
@load_preference, @set_preferences!, @has_preference, @delete_preferences!
130145
using Printf
131-
using Random: shuffle!, rand, randperm
146+
using Random: AbstractRNG, default_rng, shuffle!, rand, randn!, randperm
132147
using Requires
133148
using SparseArrays
134-
using Statistics: cor, cov, mean, std
149+
using Statistics: cor, cov, mean, median, std
135150

136151
include("plans/plan.jl")
137152
# solvers general framework
@@ -142,6 +157,7 @@ include("solvers/alternating_gradient_descent.jl")
142157
include("solvers/augmented_Lagrangian_method.jl")
143158
include("solvers/convex_bundle_method.jl")
144159
include("solvers/ChambollePock.jl")
160+
include("solvers/cma_es.jl")
145161
include("solvers/conjugate_gradient_descent.jl")
146162
include("solvers/cyclic_proximal_point.jl")
147163
include("solvers/difference_of_convex_algorithm.jl")
@@ -416,6 +432,8 @@ export adaptive_regularization_with_cubics,
416432
convex_bundle_method!,
417433
ChambollePock,
418434
ChambollePock!,
435+
cma_es,
436+
cma_es!,
419437
conjugate_gradient_descent,
420438
conjugate_gradient_descent!,
421439
cyclic_proximal_point,
@@ -477,18 +495,24 @@ export StopAfter,
477495
StopWhenAll,
478496
StopWhenAllLanczosVectorsUsed,
479497
StopWhenAny,
498+
StopWhenBestCostInGenerationConstant,
480499
StopWhenChangeLess,
481500
StopWhenCostLess,
482501
StopWhenCostNaN,
502+
StopWhenCovarianceIllConditioned,
483503
StopWhenCurvatureIsNegative,
484504
StopWhenEntryChangeLess,
505+
StopWhenEvolutionStagnates,
485506
StopWhenGradientChangeLess,
486507
StopWhenGradientNormLess,
487508
StopWhenFirstOrderProgress,
488509
StopWhenIterateNaN,
489510
StopWhenLagrangeMultiplierLess,
490511
StopWhenModelIncreased,
512+
StopWhenPopulationCostConcentrated,
491513
StopWhenPopulationConcentrated,
514+
StopWhenPopulationDiverges,
515+
StopWhenPopulationStronglyConcentrated,
492516
StopWhenSmallerOrEqual,
493517
StopWhenStepsizeLess,
494518
StopWhenSubgradientNormLess,

src/plans/stopping_criterion.jl

+17-2
@@ -901,9 +901,22 @@ mutable struct StopWhenAny{TCriteria<:Tuple} <: StoppingCriterionSet
901901
StopWhenAny(c::Vector{StoppingCriterion}) = new{typeof(tuple(c...))}(tuple(c...), "")
902902
StopWhenAny(c::StoppingCriterion...) = new{typeof(c)}(c, "")
903903
end
904+
905+
# _fast_any(f, tup::Tuple) is functionally equivalent to any(f, tup) but on Julia 1.10
906+
# this implementation is faster on heterogeneous tuples
907+
@inline _fast_any(f, tup::Tuple{}) = false
908+
@inline _fast_any(f, tup::Tuple{T}) where {T} = f(tup[1])
909+
@inline function _fast_any(f, tup::Tuple)
910+
if f(tup[1])
911+
return true
912+
else
913+
return _fast_any(f, tup[2:end])
914+
end
915+
end
916+
904917
function (c::StopWhenAny)(p::AbstractManoptProblem, s::AbstractManoptSolverState, i::Int)
905918
(i == 0) && (c.reason = "") # reset on init
906-
if any(subC -> subC(p, s, i), c.criteria)
919+
if _fast_any(subC -> subC(p, s, i), c.criteria)
907920
c.reason = string((get_reason(subC) for subC in c.criteria)...)
908921
return true
909922
end
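
The recursive short-circuit pattern used by `_fast_any` can be tried in isolation. The sketch below is illustrative only: `fast_any` is a hypothetical name, it uses `Base.tail` instead of `tup[2:end]`, and it returns `false` for the empty tuple so that it agrees with `any` in that case as well.

```julia
# Recursion over a tuple: each call sees a concrete tuple type, so the
# compiler can specialize on heterogeneous element types.
fast_any(f, ::Tuple{}) = false  # `any(f, ())` is `false` as well
fast_any(f, tup::Tuple) = f(tup[1]) || fast_any(f, Base.tail(tup))

mixed = (1, 2.0, 0x00, 3 // 4)
same_on_mixed = fast_any(iszero, mixed) == any(iszero, mixed)
same_on_floats = fast_any(isnan, (1.0, 2.0)) == any(isnan, (1.0, 2.0))
same_on_empty = fast_any(iszero, ()) == any(iszero, ())
```

As with `any`, evaluation stops at the first element for which `f` returns `true`, so later stopping criteria in a tuple would not be called once one has fired.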
@@ -957,6 +970,8 @@ function Base.:|(s1::StopWhenAny, s2::T) where {T<:StoppingCriterion}
957970
return StopWhenAny(s1.criteria..., s2)
958971
end
959972

973+
is_active_stopping_criterion(c::StoppingCriterion) = !isempty(c.reason)
974+
960975
@doc raw"""
961976
get_active_stopping_criteria(c)
962977
@@ -974,7 +989,7 @@ end
974989
# for non-array containing stopping criteria, the recursion ends in either
975990
# returning nothing or an 1-element array containing itself
976991
function get_active_stopping_criteria(c::sC) where {sC<:StoppingCriterion}
977-
if c.reason != ""
992+
if is_active_stopping_criterion(c)
978993
return [c] # recursion top
979994
else
980995
return []
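
The convention behind `is_active_stopping_criterion`, namely that a criterion counts as active once its `reason` string is non-empty, can be sketched with a toy criterion. All names below (`ToyCriterion`, `ToyStopAfterIteration`, `is_active`) are hypothetical and only mimic the pattern visible in this diff; they are not part of Manopt.jl.

```julia
abstract type ToyCriterion end

# A toy criterion that stores why it fired in a `reason` field.
mutable struct ToyStopAfterIteration <: ToyCriterion
    max_iterations::Int
    reason::String
end
ToyStopAfterIteration(n::Int) = ToyStopAfterIteration(n, "")

# Calling the criterion with the iteration number `i` returns whether to stop.
function (c::ToyStopAfterIteration)(i::Int)
    (i == 0) && (c.reason = "")  # reset on init
    if i >= c.max_iterations
        c.reason = "Reached $(c.max_iterations) iterations.\n"
        return true
    end
    return false
end

# A criterion is considered active as soon as it has recorded a reason.
is_active(c::ToyCriterion) = !isempty(c.reason)

c = ToyStopAfterIteration(3)
c(1)                       # does not fire, reason stays empty
active_before = is_active(c)
c(3)                       # fires and records a reason
active_after = is_active(c)
```

Keeping the check in one small function means a criterion type that tracks its state differently could provide its own method instead of relying on the `reason` field.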
