This folder shows one way to run quick experiments that evaluate different schedulers on different benchmarks, and to plot the results once they are done.
To run all experiments, you can run the following:

```bash
pip install -r benchmarking/benchmark_loop/requirements.txt
python benchmarking/benchmark_loop/benchmark_main.py --experiment_tag "my-new-experiment" --num_seeds 2
```
This runs all combinations of methods/benchmarks/seeds on your local machine (it may take a few hours).
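For orientation, here is a minimal sketch (not the actual script) of the methods/benchmarks/seeds sweep described above; the method and benchmark names are placeholders, the real entries live in `baselines.py` and `benchmark_definitions.py`.

```python
# Hypothetical sketch of the methods/benchmarks/seeds sweep described above;
# names are illustrative placeholders, not the actual contents of the repository.
from itertools import product

methods = ["RS", "ASHA"]          # placeholder method names
benchmarks = ["fcnet", "nas201"]  # placeholder benchmark names
num_seeds = 2

for method, benchmark, seed in product(methods, benchmarks, range(num_seeds)):
    print(f"Evaluating {method} on {benchmark} with seed {seed}")
    # ... run the simulated evaluation for this combination ...
```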
Once all evaluations are done, you can pull the results by running:

```bash
python benchmarking/benchmark_loop/plot_results.py --experiment_tag "my-new-experiment"
```
You will obtain a plot like the one below, showing confidence intervals of performance over time.
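As a rough illustration of what such a plot contains (this is not the code of `plot_results.py`), a mean curve with a confidence band over seeds can be produced from per-seed curves like this:

```python
# Illustrative sketch of a confidence-interval-over-time plot on synthetic data;
# plot_results.py operates on the stored experiment results instead.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
time = np.linspace(0, 3600, 100)  # wallclock time in seconds
curves = np.array([
    1.0 / (1.0 + time / 600) + 0.05 * rng.standard_normal(time.shape)
    for _ in range(5)             # pretend these are 5 seeds
])

mean = curves.mean(axis=0)
std = curves.std(axis=0)

plt.plot(time, mean, label="method")
plt.fill_between(time, mean - std, mean + std, alpha=0.3)
plt.xlabel("wallclock time (s)")
plt.ylabel("best objective found")
plt.legend()
plt.show()
```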
You can also run only one scheduler:

```bash
python benchmarking/benchmark_loop/benchmark_main.py --method RS
```

See `benchmark_main.py` for all supported options.
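For reference, the options mentioned in this README (`--experiment_tag`, `--num_seeds`, `--method`) would typically be declared through `argparse`, along the lines of the sketch below; the actual option definitions and defaults live in `benchmark_main.py`.

```python
# Hypothetical sketch of the command-line options referenced in this README;
# consult benchmark_main.py for the options and defaults it actually defines.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--experiment_tag", type=str, required=True,
                    help="tag used to group and later retrieve results")
parser.add_argument("--num_seeds", type=int, default=1,
                    help="number of random seeds per method/benchmark")
parser.add_argument("--method", type=str, default=None,
                    help="run only this scheduler (e.g. RS); all methods if omitted")
args = parser.parse_args()
```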
To launch the evaluation remotely, you can also run:

```bash
python benchmarking/benchmark_loop/launch_remote.py --experiment_tag "my-new-experiment"
```

which evaluates everything on a remote machine.
To evaluate other methods or benchmarks, you can edit the following files (a sketch of what a method entry might look like is given after the list):

* `baselines.py`: dictionary of HPO methods to be evaluated
* `benchmark_definitions.py`: dictionary of simulated benchmarks to evaluate
* `benchmark_main.py`: script to launch evaluations; runs all combinations by default
* `launch_remote.py`: script to launch evaluations on a remote instance
* `plot_results.py`: script to plot the results obtained
* `requirements.txt`: dependencies to be installed when running on a remote machine
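As a rough illustration only: if `baselines.py` maps method names to factory functions that build a scheduler, adding a method means adding one more entry to that dictionary. The dictionary name, the factory signature, and the import below are assumptions for illustration; follow the pattern of the existing entries in `baselines.py` rather than this sketch.

```python
# Hypothetical sketch of baselines.py as a name -> scheduler-factory dictionary.
# The dictionary name, factory signature, and import are assumptions; mirror the
# existing entries of this repository instead of copying this verbatim.
from syne_tune.optimizer.baselines import RandomSearch

methods = {
    # The name is what --method selects; the factory builds the scheduler for a
    # given benchmark's configuration space and metric.
    "RS": lambda config_space, metric, mode: RandomSearch(
        config_space, metric=metric, mode=mode
    ),
    # A new method would be registered as another name -> factory entry here.
}
```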