- Why should I use Syne Tune, and not Ray Tune, Optuna, ...?
- What are the different installations options supported?
- How can I run on AWS and SageMaker?
- What are the metrics reported by default when calling the Reporter?
- How can I utilize multiple GPUs?
- What is the default mode when performing optimization?
- How are trials evaluated on a local machine?
- What does the output of the tuning contain?
- Where can I find the output of the tuning?
- How can I enable trial checkpointing?
- Which schedulers make use of checkpointing?
- Is the tuner checkpointed?
- Where can I find the output of my trials?
- How can I plot the results of a tuning?
- How can I specify additional tuning metadata?
- How do I append additional information to the results which are stored?
- I don’t want to wait, how can I launch the tuning on a remote machine?
- How can I run many experiments in parallel?
- How can I access results after tuning remotely?
- How can I specify dependencies to remote launcher or when using the SageMaker backend?
- How can I benchmark experiments from the command line?
- What different schedulers do you support? What are the main differences between them?
- How do I define the search space?
- How can I visualize the progress of my tuning experiment with Tensorboard?
- How can I add a new scheduler?
- How can I add a new tabular or surrogate benchmark?
HPO has been an important problem for many years, and a healthy number of commercial and open source tools are available for it. Notable open source examples are Ray Tune and Optuna. Here are some reasons why you may prefer Syne Tune over these alternatives.
- Lightweight and platform agnostic: Syne Tune is designed to work with different execution back-ends, so you are not locked into a particular distributed system architecture. Syne Tune runs with minimal dependencies.
- Wide range of modalities: Syne Tune supports multi-fidelity HPO, constrained HPO, multi-objective HPO, transfer tuning, cost-aware HPO.
- Simple, modular design: Rather than wrapping all sorts of other HPO frameworks, Syne Tune provides simple APIs and scheduler templates, which can easily be extended to your specific needs.
- Industry-strength Bayesian optimization: Syne Tune has special support for Gaussian process based Bayesian optimization. The same code powers modalities like multi-fidelity HPO, constrained HPO, or cost-aware HPO, having been tried and tested for several years in SageMaker services.
- Special support for researchers: Syne Tune allows for rapid development and comparison between different tuning algorithms. Its blackbox repository and simulator back-end run realistic simulations of experiments many times faster than real time.
If you are an AWS customer, there are additional good reasons to use Syne Tune over the alternatives:
- If you use AWS services or SageMaker frameworks day to day, Syne Tune works out of the box and fits into your normal workflow.
- Syne Tune is developed in collaboration with the team behind the Automatic Model Tuning service.
To install Syne Tune with minimal dependencies from pip, you can simply do:
pip install 'syne-tune[core]'
If you want to additionally install our own Gaussian process based optimizers, Ray Tune, or the Bore optimizer, you can run pip install 'syne-tune[X]', where X can be:
- gpsearchers: built-in Gaussian process based optimizers
- aws: AWS SageMaker dependencies
- raytune: Ray Tune optimizers
- benchmarks: dependencies required to run all benchmarks
- blackbox-repository: blackbox repository for simulated tuning
- kde: KDE optimizer
- botorch: Bayesian optimization from BoTorch
- extra: all of the above
- bore: Bore optimizer
For instance, pip install 'syne-tune[gpsearchers]' will install Syne Tune along with the built-in Gaussian process optimizers.
To install the latest version from git, run the following:
pip install git+https://github.com/awslabs/syne-tune.git
For local development, we recommend the following setup, which makes it easy to test your changes:
pip install --upgrade pip
git clone https://github.com/awslabs/syne-tune.git
cd syne-tune
pip install -e '.[extra]'
If you want to launch experiments on SageMaker rather than on your local machine, you will need access to AWS and SageMaker on your machine. Make sure that:
- awscli is installed (see this link)
- AWS credentials have been set properly (see this link)
- the necessary SageMaker role has been created (see this page for instructions; if you have created a SageMaker notebook in the past, this role should already have been created for you)
The following command should run without error if your credentials are available:
python -c "import boto3; print(boto3.client('sagemaker').list_training_jobs(MaxResults=1))"
You can also run the following example that evaluates trials on SageMaker to test your setup.
python examples/launch_height_sagemaker.py
Whenever you call the reporter to log a result, the worker time-stamp, the worker time since the creation of the reporter, and the number of times the reporter was called are logged under the fields st_worker_timestamp, st_worker_time, and st_worker_iter. In addition, when running on SageMaker, a dollar-cost estimate is logged under the field st_worker_cost.
To see this behavior, you can simply call the reporter and inspect those metrics:
from syne_tune.report import Reporter
reporter = Reporter()
for step in range(3):
    reporter(step=step, metric=float(step) / 3)
# [tune-metric]: {"step": 0, "metric": 0.0, "st_worker_timestamp": 1644311849.6071281, "st_worker_time": 0.0001048670000045604, "st_worker_iter": 0}
# [tune-metric]: {"step": 1, "metric": 0.3333333333333333, "st_worker_timestamp": 1644311849.6071832, "st_worker_time": 0.00015910100000837701, "st_worker_iter": 1}
# [tune-metric]: {"step": 2, "metric": 0.6666666666666666, "st_worker_timestamp": 1644311849.60733, "st_worker_time": 0.00030723599996917983, "st_worker_iter": 2}
To utilize multiple GPUs, you can use the LocalBackend, which runs on the GPUs available on the local machine. You can also run on a remote AWS machine with multiple GPUs using the local backend and the remote launcher (see I don’t want to wait, how can I launch the tuning on a remote machine?), or run with the SageMaker backend, which spins up one training job per trial.
When evaluating trials on a local machine with LocalBackend, each trial is by default allocated to the least occupied GPU by setting the CUDA_VISIBLE_DEVICES environment variable.
The default mode is min, i.e. the target metric is minimized. The mode can be configured when instantiating a scheduler.
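For example, if your training script reports an accuracy that should be maximized, pass mode="max" to the scheduler. A minimal sketch (the metric name and search space below are placeholders):
from syne_tune.config_space import loguniform
from syne_tune.optimizer.baselines import RandomSearch

config_space = {"lr": loguniform(1e-6, 1e-2)}
# maximize the reported "accuracy" metric instead of the default minimization
scheduler = RandomSearch(config_space, metric="accuracy", mode="max")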
When trials are executed locally (e.g., with LocalBackend), each trial is evaluated in a separate sub-process. As such, the number of concurrent evaluations (set by n_workers) should account for the capacity of the machine where the trials are executed.
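As an illustration, here is a hedged sketch of a local tuning setup; train.py and the metric name are hypothetical and need to match your training script. On a multi-GPU machine, LocalBackend assigns each concurrent trial to the least occupied GPU, as described above.
from syne_tune import StoppingCriterion, Tuner
from syne_tune.backend import LocalBackend
from syne_tune.config_space import loguniform
from syne_tune.optimizer.baselines import RandomSearch

config_space = {"lr": loguniform(1e-6, 1e-2), "epochs": 10}

tuner = Tuner(
    trial_backend=LocalBackend(entry_point="train.py"),  # hypothetical training script
    scheduler=RandomSearch(config_space, metric="loss"),
    stop_criterion=StoppingCriterion(max_wallclock_time=600),
    n_workers=4,  # at most 4 trials run concurrently as sub-processes
)
tuner.run()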
When running locally, the output of the tuning is saved under ~/syne-tune/{tuner-name}/ by default. When running remotely on SageMaker, the output of the tuning is saved under /opt/ml/checkpoints/ by default, and the tuning output is synced regularly to s3://{sagemaker-default-bucket}/syne-tune/{tuner-name}/.
If you run remote tuning via the CLI, the tuning output is synced to s3://{sagemaker-default-bucket}/syne-tune/{experiment-name}/{tuner-name}/, where experiment-name is the prefix of tuner-name without the datetime extension (in the example above, experiment-name = 'train-height').
To change the path where tuning results are written, set the environment variable SYNETUNE_FOLDER to the folder that you want. For instance, the following runs a tuning where result files are written under ~/new-syne-tune-folder:
export SYNETUNE_FOLDER="~/new-syne-tune-folder"
python examples/launch_height_baselines.py
You can also do the following for instance to permanently change the output folder of Syne Tune:
echo 'export SYNETUNE_FOLDER="~/new-syne-tune-folder"' >> ~/.bashrc && source ~/.bashrc
Syne Tune stores the following files: metadata.json, results.csv.zip, and tuner.dill, which contain respectively the metadata of the tuning job, the results obtained at each time-step, and the state of the tuner.
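If you want to inspect these files directly, here is a small sketch using pandas and json; the tuner name is a placeholder and should be replaced by the folder created for your experiment:
import json
from pathlib import Path

import pandas as pd

# hypothetical tuner name; replace with the folder of your experiment
experiment_path = Path.home() / "syne-tune" / "train-height-2022-01-12-11-08-40-971"

metadata = json.loads((experiment_path / "metadata.json").read_text())
results = pd.read_csv(experiment_path / "results.csv.zip")  # pandas reads the zipped CSV directly

print(metadata)
print(results.head())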
Since trials may be paused and resumed (either by schedulers or when using spot instances), the user may checkpoint intermediate results to avoid starting computation from scratch. Model outputs and checkpoints must be written into a specific local path given by the command line argument st_checkpoint_dir. Saving/loading model checkpoints from this directory makes it possible to save/load the state when the job is stopped/resumed (setting the folder correctly and uniquely per trial is the responsibility of the backend). See checkpoint_example.py for a working example of a tuning script with checkpointing enabled.
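To give a flavor of what this looks like in a training script, here is a heavily simplified PyTorch sketch; build_model and train_one_epoch are placeholders, and checkpoint_example.py remains the authoritative reference:
import argparse
import os

import torch

from syne_tune.report import Reporter


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--lr", type=float, required=True)
    parser.add_argument("--epochs", type=int, required=True)
    # Syne Tune passes the checkpoint directory through this argument
    parser.add_argument("--st_checkpoint_dir", type=str, default=None)
    args = parser.parse_args()

    model = build_model()  # placeholder: your model factory
    optimizer = torch.optim.SGD(model.parameters(), lr=args.lr)
    start_epoch = 1

    ckpt_path = None
    if args.st_checkpoint_dir is not None:
        ckpt_path = os.path.join(args.st_checkpoint_dir, "checkpoint.pt")
        if os.path.exists(ckpt_path):
            # resume from the last checkpoint instead of starting from scratch
            state = torch.load(ckpt_path)
            model.load_state_dict(state["model"])
            optimizer.load_state_dict(state["optimizer"])
            start_epoch = state["epoch"] + 1

    report = Reporter()
    for epoch in range(start_epoch, args.epochs + 1):
        loss = train_one_epoch(model, optimizer)  # placeholder: one epoch of training
        if ckpt_path is not None:
            os.makedirs(args.st_checkpoint_dir, exist_ok=True)
            torch.save(
                {"model": model.state_dict(), "optimizer": optimizer.state_dict(), "epoch": epoch},
                ckpt_path,
            )
        report(epoch=epoch, loss=loss)


if __name__ == "__main__":
    main()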
When using the SageMaker backend or tuning remotely, we use the SageMaker checkpoint mechanism under the hood to sync local checkpoints to S3. Checkpoints are synced to s3://{sagemaker-default-bucket}/syne-tune/{tuner-name}/{trial-id}/, where sagemaker-default-bucket is the default bucket for SageMaker.
There are some convenience functions which help you to implement checkpointing for your training script. Have a look at the example lstm_wikitext2.py:
- Checkpoints have to be written at the end of certain epochs (namely those after which the scheduler may pause the trial). This is dealt with by checkpoint_model_at_rung_level(config, save_model_fn, epoch). Here, epoch is the current epoch, allowing the function to decide whether to checkpoint or not, save_model_fn stores the current mutable state along with epoch to a local path (see below), and config contains arguments provided by the scheduler (see below).
- Before the training loop starts (and optionally), the mutable state to start from has to be loaded from a checkpoint. This is done by resume_from_checkpointed_model(config, load_model_fn). If the checkpoint has been loaded successfully, the training loop may start with epoch resume_from + 1 instead of 1. Here, load_model_fn loads the mutable state from a checkpoint in a local path, returning its epoch value if successful, which is returned as resume_from.
In general, load_model_fn and save_model_fn have to be provided as part of the script. For most PyTorch models, you can use pytorch_load_save_functions to this end. In general, you will want to include the model, the optimizer, and the learning rate scheduler. In our example above, optimizer and learning rate scheduler are home-made, and the state of the latter is contained in mutable_state.
Finally, the scheduler provides additional information about checkpointing in config. You don't have to worry about this: add_checkpointing_to_argparse(parser) adds the corresponding arguments to the parser.
Checkpointing means storing the state of a trial (i.e., model parameters, optimizer or learning rate scheduler parameters), so that it can be paused and potentially resumed at a later point in time, without having to start training from scratch. The following schedulers make use of checkpointing:
- Promotion-based Hyperband: HyperbandScheduler(type='promotion', ...), as well as other asynchronous multi-fidelity schedulers. The code runs without checkpointing, but in this case any trial which is resumed is started from scratch. For example, if a trial was paused after 9 epochs of training and is resumed later, training starts from scratch and the first 9 epochs are wasted effort. Moreover, extra variance is introduced by starting from scratch, since weights may be initialized differently. It is not recommended to run promotion-based Hyperband without checkpointing (a sketch of this scheduler is given after this list).
- Population-based training: PopulationBasedTraining. PBT does not work without checkpointing.
- Synchronous Hyperband: SynchronousGeometricHyperbandScheduler, as well as other synchronous multi-fidelity schedulers. This code runs without checkpointing, but wastes effort in the same sense as promotion-based asynchronous Hyperband.
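For illustration, here is a sketch of a promotion-based Hyperband scheduler; the metric name "loss" and the resource attribute "epoch" are placeholders that must match what your training script reports:
from syne_tune.config_space import loguniform, randint
from syne_tune.optimizer.schedulers import HyperbandScheduler

config_space = {
    "lr": loguniform(1e-6, 1e-2),
    "batch_size": randint(16, 256),
    "epochs": 81,  # maximum number of epochs, passed to the training script
}

scheduler = HyperbandScheduler(
    config_space,
    type="promotion",       # pause-and-resume variant, relies on trial checkpointing
    max_t=81,               # maximum resource level (here: epochs)
    resource_attr="epoch",  # reported by the training script after every epoch
    metric="loss",
    mode="min",
)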
Yes. While the tuning runs, the tuner state is regularly saved on the experiment path under tuner.dill (every 10 seconds by default, configurable with results_update_interval). This makes it possible to use spot instances when running a tuning remotely with the remote launcher. It also allows you to resume a past experiment or analyse the state of the scheduler at any point.
When running LocalBackend locally, the results of trials are saved under ~/syne-tune/{tuner-name}/{trial-id}/, which contains the following files:
- config.json: configuration that is being evaluated in the trial
- std.err: standard error
- std.out: standard output
In addition, any checkpointing files used by a training script, such as intermediate model checkpoints, will also be located there. This is shown in the following example:
tree ~/syne-tune/train-height-2022-01-12-11-08-40-971/
~/syne-tune/train-height-2022-01-12-11-08-40-971/
├── 0
│ ├── config.json
│ ├── std.err
│ ├── std.out
│ └── stop
├── 1
│ ├── config.json
│ ├── std.err
│ ├── std.out
│ └── stop
├── 2
│ ├── config.json
│ ├── std.err
│ ├── std.out
│ └── stop
├── 3
│ ├── config.json
│ ├── std.err
│ ├── std.out
│ └── stop
├── metadata.json
├── results.csv.zip
└── tuner.dill
When running tuning remotely with the remote launcher, only config.json, metadata.json, results.csv.zip and tuner.dill are synced with S3, unless store_logs_localbackend is set to True, in which case the trial logs and information are also persisted.
The easiest way to plot the result of a tuning experiment is to call the following:
tuner = Tuner(
    ...
    tuner_name="plot-results-demo",
)
tuner.run()
tuning_experiment = load_experiment(tuner.name)
tuning_experiment.plot()
This generates a plot of the best value found over time. Note that you can also plot the results while the experiment is running, as results are updated continuously.
By default, Syne Tune stores the time, the names and modes of the metrics being tuned, the name of the entry point, the backend name, and the scheduler name. You can also add custom metadata to your tuning job by setting metadata in Tuner as follows:
tuner = Tuner(
    ...
    tuner_name="plot-results-demo",
    metadata={"tag": "special-tag", "user": "alice"},
)
All Syne Tune and user metadata are saved when the tuner starts under metadata.json.
Results are processed and stored by callbacks passed to Tuner, in particular see tuner_callback.py. In order to add more information to these results, you can inherit from StoreResultsCallback. A good example is given in searcher_callback.py.
If you run experiments with tabulated benchmarks using the SimulatorBackend, as demonstrated in launch_nasbench201_simulated.py, results are stored by SimulatorCallback instead, and you need to inherit from this class, as shown in searcher_callback.py.
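As a hedged sketch (the hook name, its signature, and the import path may differ slightly between Syne Tune versions; searcher_callback.py is the authoritative example), a callback that appends an extra field to every stored result could look like this:
from syne_tune.tuner_callback import StoreResultsCallback


class StoreExtraInfoCallback(StoreResultsCallback):
    """Hypothetical callback that appends an extra field to every stored result."""

    def on_trial_result(self, trial, status, result, decision):
        result = dict(result, experiment_tag="my-ablation")  # extra information to store
        super().on_trial_result(trial, status, result, decision)


# pass it to the Tuner: Tuner(..., callbacks=[StoreExtraInfoCallback()])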
You can use the remote launcher to launch an experiment on a remote machine. The remote launcher supports both LocalBackend and SageMakerBackend. In the former case, multiple trials will be evaluated on the remote machine (one use-case being to use a beefy machine); in the latter case, trials will be evaluated as separate SageMaker training jobs.
Here is an example of how to run tuning with the remote launcher: launch_height_sagemaker_remotely.py. A short code sketch is also given after the list below. It may be more convenient for you to write your own launcher scripts. Examples:
- Local backend: benchmarking/nursery/launch_local/launch_remote.py
- Simulator backend: benchmarking/nursery/benchmark_dehb/launch_remote.py
- SageMaker backend: benchmarking/nursery/launch_sagemaker/launch_remote.py
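As a hedged sketch of remote launcher usage (argument names follow launch_height_sagemaker_remotely.py; the instance type is an assumption):
from syne_tune.remote.remote_launcher import RemoteLauncher

# `tuner` is a regular Tuner object, configured exactly as for a local run
remote_launcher = RemoteLauncher(
    tuner=tuner,
    instance_type="ml.m5.4xlarge",  # assumption: pick an instance type suited to your workload
)
remote_launcher.run(wait=False)  # returns immediately; tuning continues on the remote machine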
You can call the remote launcher multiple times to schedule a list of experiments. In some cases, you will want more flexibility and write your experiment loop directly; check benchmark_loop for an example. Other examples include:
- Local backend: benchmarking/nursery/launch_local/launch_remote.py
- Simulator backend: benchmarking/nursery/benchmark_dehb/launch_remote.py
- SageMaker backend: benchmarking/nursery/launch_sagemaker/launch_remote.py
You can call load_experiment("{tuner-name}"), which will download files from S3 if the experiment is not found locally. You can also sync files from S3 to the ~/syne-tune/ folder in batch, for instance by running:
aws s3 sync s3://{sagemaker-default-bucket}/syne-tune/{tuner-name}/ ~/syne-tune/ --include "*" --exclude "*tuner.dill"
to get all results without the tuner state (omit the include and exclude options if you also want the tuner state).
When you run remote code, you often need to install packages (e.g., scipy) or have custom code available.
- To install packages, you can add a file requirements.txt in the same folder as your entry point script. All those packages will be installed by SageMaker when the Docker container starts.
- To include custom code (for instance a library that you are working on), you can set the parameter dependencies on the remote launcher or on a SageMaker framework to a list of folders. The folders indicated will be compressed, sent to S3, and added to the Python path when the container starts. See launch_remote.py for an example setting dependencies in a SageMaker estimator, and the sketch after this list.
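Building on the second bullet, here is a hedged sketch of passing dependencies to a SageMaker estimator used with SageMakerBackend; train.py, my_library, and the framework versions are assumptions:
from sagemaker import get_execution_role
from sagemaker.pytorch import PyTorch

from syne_tune.backend import SageMakerBackend

estimator = PyTorch(
    entry_point="train.py",          # hypothetical training script
    dependencies=["my_library"],     # local folders, uploaded to S3 and added to the Python path
    role=get_execution_role(),
    instance_type="ml.g4dn.xlarge",  # assumption
    instance_count=1,
    framework_version="1.12",        # assumption
    py_version="py38",               # assumption
)
backend = SageMakerBackend(sm_estimator=estimator)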
The most flexible way to do so is to write a custom remote launcher script. This is explained here. Other examples are here and here for the SageMaker backend. Examples for comparing different methods using the simulator backend (on tabulated benchmarks) are here and here. You can easily configure them to your experimental setup.
We refer to HPO algorithms as schedulers. A scheduler decides which configurations to assign to new trials, but also when to stop a running or resume a paused trial. Some schedulers delegate the first decision to a searcher. The most important differences between schedulers in the single-objective case are:
- Does the scheduler stop trials early or pause and resume trials (HyperbandScheduler), or not (FIFOScheduler)? The former requires a resource dimension (e.g., number of epochs; size of training set) and slightly more elaborate reporting (e.g., evaluation after every epoch), but can outperform the latter by a large margin.
- Does the searcher suggest new configurations by uniform random sampling (searcher='random') or by sequential model-based decision-making (searcher='bayesopt', searcher='kde')? The latter can be more expensive if a lot of trials are run, but can also be more sample-efficient (see the sketch below).
An overview of this landscape is given here.
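To make the second distinction concrete, here is a sketch contrasting a random searcher with a model-based one in a FIFOScheduler; the metric name and search space are placeholders:
from syne_tune.config_space import loguniform, randint
from syne_tune.optimizer.schedulers import FIFOScheduler

config_space = {
    "lr": loguniform(1e-6, 1e-2),
    "batch_size": randint(16, 256),
}

# uniform random sampling: cheap and easy to parallelize
random_scheduler = FIFOScheduler(config_space, searcher="random", metric="loss", mode="min")

# Gaussian process based Bayesian optimization: more sample-efficient, more decision overhead
bo_scheduler = FIFOScheduler(config_space, searcher="bayesopt", metric="loss", mode="min")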
Here is a tutorial for multi-fidelity schedulers. Further schedulers provided by Syne Tune include:
- Population based training (PBT)
- Multi-objective asynchronous successive halving (MOASHA)
- Constrained Bayesian optimization
- Bayesian optimization by density-ratio estimation (BORE)
- Regularized evolution
- Median stopping rule
- Synchronous Hyperband
- Differential Evolution Hyperband (DEHB)
- Hyper-Tune
- Transfer learning schedulers
- Wrappers for Ray Tune schedulers
Most of these methods can be accessed with short names via baselines.py.
While the training script defines the function to be optimized, some care needs to be taken to define the search space for the hyperparameter optimization problem. This being a global optimization problem without gradients easily available, it is most important to reduce the number of parameters. Some advice is given here.
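As a simple illustration, a search space is a dictionary mapping hyperparameter names to domains from syne_tune.config_space; the names below are placeholders and must match the arguments of your training script:
from syne_tune.config_space import choice, loguniform, randint, uniform

config_space = {
    "lr": loguniform(1e-6, 1e-2),          # float on a log scale, suitable for learning rates
    "dropout": uniform(0.0, 0.5),          # float on a linear scale
    "batch_size": randint(16, 256),        # integer
    "optimizer": choice(["adam", "sgd"]),  # categorical
    "epochs": 30,                          # constants are passed through unchanged
}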
A powerful approach is to run experiments in parallel. Namely, split your hyperparameters into groups A, B, such that HPO over B is tractable. Draw a set of N configurations from A at random, then start N HPO experiments in parallel, where in each of them the search space is over B only, while the parameters in A are fixed. Syne Tune supports massively parallel experimentation, see examples in benchmarking/nursery/.
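A hedged sketch of this strategy, assuming the config_space domains expose a sample() method as in recent Syne Tune versions:
from syne_tune.config_space import choice, loguniform, uniform

# group A: fixed within each experiment; group B: searched within each experiment
space_A = {"optimizer": choice(["adam", "sgd"]), "weight_decay": loguniform(1e-6, 1e-2)}
space_B = {"lr": loguniform(1e-6, 1e-2), "dropout": uniform(0.0, 0.5)}

N = 8  # number of parallel experiments
for i in range(N):
    fixed_A = {name: domain.sample() for name, domain in space_A.items()}
    config_space = {**space_B, **fixed_A}  # search over B, with A held fixed
    # launch one experiment with this config_space, e.g. via the remote launcher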
To visualize the progress of Syne Tune in Tensorboard, you can pass the TensorboardCallback to the Tuner object:
tuner = Tuner(
    ...
    callbacks=[TensorboardCallback()],
)
Note that you need to install TensorboardX to use this callback. You can install it by running:
pip install tensorboardX
This will log all metrics that are reported in your training script via the report(...) function. Now, to open Tensorboard, run:
tensorboard --logdir ~/syne-tune/{tuner-name}/tensorboard_output
If you want to plot the cumulative optimum of the metric you want to optimize, you can pass the target_metric argument to TensorboardCallback. This will also report the best found hyperparameter configuration over time.
This is explained in detail in this tutorial. Please do consider contributing back your efforts to the Syne Tune community, thanks!
To add a new dataset of tabular evaluations, you need to:
- Write a blackbox recipe able to regenerate it by extending BlackboxRecipe. In particular, you need to provide the name of the blackbox, the reference so that users are prompted to cite the appropriate paper, and code that can generate it from scratch; see lcbench.py for an example.
- Add your new recipe class in recipes to make it available in Syne Tune.