This is a Cookiecutter template for creating a Orion Algorithm Plugin.
Install Python requirements to use the template:
$ pip install -r requirements.txt
Create a new project directly from the template on GitHub:
$ cookiecutter gh:Epistimio/cookiecutter-orion.algo
plugin_name []: skopt
author_name []: Xavier Bouthillier
author_short [Author Name]:
author_email []: [email protected]
github_username []: bouthilx
copyright [2021, Author Name]:
short_description [TODO]:
algo_name []: BayesianOptimizer
algo_module_name [bayesianoptimizer]: bayes
Field | Description | |
plugin_name |
Will be used for orion.algo.plugin_name | |
author_name |
For metadata of python package | |
author_short |
For metadata of python package | |
author_email |
For metadata of python package | |
github_username |
Username to build the url for installation | |
copyright |
For the BSD-3 license (You can change the license) | |
short_description |
For metadata of python package | |
algo_name |
Name for the algorithm class | |
algo_module_name |
Name of the algorihtm module |
This will create the following package structure.
orion.algo.{plugin_name} ├── README.rst ├── setup.cfg ├── setup.py ├── MANIFEST.in ├── LICENSE (BSD License) ├── versioneer.py ├── tox.ini ├── dev-requirements.txt ├── doc │ ├── requirements.txt │ └── src │ ├── conf.py │ └── index.rst ├── tests │ ├── requirements.txt │ ├── integration_test.py │ └── benchmark.py └── src └── orion └── algo └── {plugin_name} ├── {algoname}.py ├── __init__.py └── _version.py
The important files to modify are src/orion/algo/{plugin_name}/{module_name}.py
to implement the
algorithm and tests/benchmark/{algo_name}.yaml
to fill the arguments required for the algorithm
you implement.
LICENSE
Note that you are free to change the License, copyright is to your name.
versioneer.py
src/orion/algo/{plugin_name}/_version.py
This serves to version automatically your algo, just ignore these if you don't plan to make releases.
tests
These are the automatically generated tests.
tests/benchmark
Automatically generated benchmark test using the yaml files created in the same folder.
doc
Automatically generated template for documentation
tox.ini
Tox file defining commands to run tests, build doc and publish code.
Tip |
The cookiecutter creates a fully functionning random search plugin. All tests should pass right away. Make sure your environment is setup properly and that all tests are passing as expected before making any changes in the code. |
class YourAlgorithm(BaseAlgorithm):
requires_type = "real"
requires_dist = "linear"
requires_shape = "flattened"
Some algorithms have special requirements on the data. Maybe the search space should be real only,
logarithmic distributions should be linearized, or multidimensional
data should be flattened. The algorithm class can specify
these special requirements through the class attributes
requires_type
, requires_dist
and requires_shape
.
See documentation of build_required_space to see the special
requirements supported.
When Oríon will instantiate an algorithm, it will wrap it
in a SpaceTransformAlgoWrapper and create a transformed space
based on the requirements. The SpaceTransformAlgoWrapper
will
take care of converting the hyperparameters so that they are
compliant with the search space outsite the algorithm and
compliant with the algorithm's requirements when passed to the algorithm.
Suppose the algorithm has requires_type = "real"
, then
Oríon will assign a transformed space to the algorithm in which
all dimensions are transformed to be real. For instance, categorical dimensions are turned to
one-hot vectors. The algorithm can then work in this real space
solely. When the algorithm suggest trials, the
SpaceTransformAlgoWrapper will convert the hyperparameters
of these trials so that they are valid in the original space. For
instance, one-hot vectors are converted back to categorical values.
@property
def space(self):
"""Domain of problem associated with this algorithm's instance."""
return self._space
@space.setter
def space(self, space):
"""Set space."""
self._space = space
At creation time, the algorithm is passed the original space
as argument space
to its constructor __init__
.
The algorithm is then assigned the transformed space through the setter
algorithm.space = transformed_space
.
This is the right place to keep a reference to the original space if you need to.
See skopt's implementation for an example. Keeping a reference to the original space
can be useful for instance to verify if an algorithm has tried every possible values
in the original space. See is_done()
implementation of skopt's implementation for
an example using the original space.
Tests that should pass at this point | Description |
test_get_id |
Test that id of trial is computed properly. |
test_configuration |
Test that configuration contains all expected attributes. |
def __init__(self, space, seed=None, some='other', custom='arguments'):
super(YourAlgorithm, self).__init__(space, seed=seed, some=some, custom=custom)
The initialization of the algorithm must pass space and seed to super().__init__
, but must also
pass any other argument that should be part of the configuration of the algorithm. Any argument passed
to super()
will be assigned as an attribute to the algorithm and will be included in
algo.configuration
, which is used to save the configuration of the algorithm in storage.
def suggest(self, num):
The method to suggest new trials. The argument num
request the number of trials that the algorithm must sample. The algorithm may return less
than num
if it is impossible to return more, and even return an empty list when
it is not possible to suggest a new trial. This may happen for example if there are too many trials
still being executed.
The algorithm should verify with self.has_suggested(trial)
if a trial
was already suggested to make sure trials returned by suggest()
are new ones.
Before returning a new trial, make sure to register it with self.register(trial)
so that self.has_suggested(trial)
works properly.
Tip |
Utility functions in format_trials are convenient to convert sets of hyperparameters from
dictionary format or list of values to Trial objects. |
Tests that should pass at this point | Description |
test_suggest_n |
Test that the algorithm is able to suggest n trials. |
test_has_suggested |
Test that trials suggested are properly registered |
test_n_suggested |
Test that trials suggested are properly registered and counted. |
def observe(self, trials):
The method to observe new status or results of trials.
Observe may be called multiple time for the same trials until they are completed or broken.
See Trial's documentation for more information on their status or results attributes.
An observed trial should be registered using self.register(trial)
so that
self.has_observed(trial)
can work properly. Note that has_observed(trial)
may return
true only if a trial is completed. It will return False if a trial is broken.
All trials registered with self.register(trial)
are stored in a dictionary
self._trials_info
in the format
{self.get_id(trial): (trial, trial.objective.value if trial.objective else None)}
.
Warning |
You should use The ID of a trial may be different in a transformed space than in the original space. This is because the ID is a hash of the trial's hyperparameters, and thus the transformation of the hyperparameters is affecting the hash. Only the ID in the original space matters for Oríon and should be used to detect whether two trials are equal or equivalent in the original search space. |
Tests that should pass at this point | Description |
test_observe |
Test that algorithm can observe without any expection. The validity of observe is not verified here. |
test_has_observed |
Test that observed completed trials are registered properly. |
test_n_observed |
Test that observed completed trials are registered and counted properly. |
test_real_data |
Test that algorithm can work with float values. |
test_int_data |
Test that algorithm can suggest and observe discrete values |
test_cat_data |
Test that algorithm can suggest and observe categorical values |
test_logint_data |
Test that algorithm can work with discrete loguniform dimensions. |
test_logreal_data |
Test that algorithm can work with real loguniform dimensions. |
test_shape_data |
Test that algorithm can work with arrays. |
def seed_rng(self, seed=None):
This method must seed the internal state of the algorithm so that it would always sample the same sequence of points.
You may need to seed global random number generators such as random
or numpy.random
if you
are wrapping a third party library using them.
Tests that should pass at this point | Description |
test_seed_rng |
Test that seeding the algorithm can be seeded properly. |
test_seed_rng_init |
Test that the algorithm is seeded during instantiation. |
@property
def state_dict(self):
The state dict is used to store the algorithm's state in storage so that it can be shared across runners (the main processes running the optimization). Make sure to include any attribute that may change during the execution of the algorithm, including the state of random number generator used. Return copies of these attributes to avoid any side-effects. There is no need to include the values of the algorithm's configuration.
def set_state(self, state_dict):
Stateful attributes of the algorithm are reset using the given state_dict
. Note that
set_state
must be compliant with state_dict
and use
the same structure.
Tests that should pass at this point | Description |
test_state_dict |
Test that state_dict can be used to resume algorithm and sample the same trials deterministically. |
test_has_observed_statedict |
Test that state_dict sets back observed trials. |
test_has_suggested_statedict |
Test that state_dict sets back suggested trials. |
def is_done(self, points, results):
This method is used to signal whether the algorithm is done or not. The base implementation
verifies if all possible values have been tried or whether more than self.max_trials
trials
have been tried. self.max_trials
is an attribute set by Oríon based on the experiments
max_trials
value, you should not add it to your algorithm's configuration.
You will need to overwrite this method if your algorithm use a transformed space, because
transformed space can have a larger domain than the original one
(ex: working with real values instead integers). The original space is the one that matters
for the experiment and it should thus be the space used to infer whether all possible values
have been tried. See skopt's implementation for an example of a custom is_done
implementation to handle transformed space.
Tests that should pass at this point | Description |
test_is_done_cardinality |
Test that algorithms is done when all possible values has been tried. |
test_is_done_max_trials |
Test that algorithm is done when maximum number of trials have been tried. |
test_optimize_branin |
Test that algorithm reaches a reasonable objective when optimizing branin. |
The base algorithm class implements several other methods that rarely requires to be overwritten but can prove very useful. Among In particular, you should pay attention to get_id(trial), fidelity_index, register(trial), has_suggested(trial), and has_observed(trial).
To test the freshly built package, you must first install the requirements. From within the new package, run
$ pip install -r dev-requirements.txt
$ pip install -r tests/requirements.txt
You can then run the unit-tests with
$ pytest tests/integration_test.py
or using tox
,
$ tox -e py36
Note that the algorithm pre-built is random search so that you can start from a fully working environment and test your way through the modifications.
There is also the option of running the toy-benchmark to compare the performance of your algorithm with random search and TPE.
$ python tests/benchmark.py
The benchmark will generate interactive plotly figures saved as Branin_AverageResult.html
and RosenBrock_AverageResult.html
.
Finally, official plugins must follow the same code quality standards than orion.core
. Therefore
there are tests included in the pre-built package for black
, isort
and pylint
. You can
execute them with
$ tox -e black
To fix black
or isort
issues, you can use the tox commands run-black
or run-isort
.
For example:
$ tox -e run-black