
Benchmark loop

This folder shows one way to run quick experiments with different schedulers on different benchmarks and to plot the results once they are done.

To run all experiments, you can run the following:

pip install -r benchmarking/benchmark_loop/requirements.txt
python benchmarking/benchmark_loop/benchmark_main.py --experiment_tag "my-new-experiment" --num_seeds 2

This will run all combinations of methods, benchmarks and seeds on your local machine (it may take a few hours).
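For intuition, here is a minimal sketch of the kind of loop benchmark_main.py runs over those combinations. The names methods, benchmark_definitions and run_experiment below are placeholders for illustration, not the script's actual API:

```python
# Illustrative sketch only: iterate over all method/benchmark/seed combinations.
# `methods`, `benchmark_definitions` and `run_experiment` are hypothetical names.
from itertools import product

methods = ["RS", "ASHA"]                   # e.g. keys of the baselines.py dictionary
benchmark_definitions = ["fcnet-protein"]  # e.g. keys of benchmark_definitions.py
num_seeds = 2

def run_experiment(method, benchmark, seed, experiment_tag):
    # Placeholder: in the real script this would build the scheduler and run a
    # (simulated) tuning job, storing results under `experiment_tag`.
    print(f"Running {method} on {benchmark} with seed {seed} ({experiment_tag})")

for method, benchmark, seed in product(methods, benchmark_definitions, range(num_seeds)):
    run_experiment(method, benchmark, seed, experiment_tag="my-new-experiment")
```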

Once all evaluations are done, you can pull results by running:

python benchmarking/benchmark_loop/plot_results.py --experiment_tag "my-new-experiment"

You will obtain a plot like the one below, showing confidence intervals of performance over time:

[Plot: confidence intervals of performance over time]
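As a rough illustration of such a plot (not the actual plot_results.py implementation), the per-seed learning curves can be aggregated into a mean and a confidence band with matplotlib:

```python
# Simplified sketch: plot the mean best-so-far metric over wall-clock time with
# a confidence band across seeds. Data below is synthetic, for illustration only.
import matplotlib.pyplot as plt
import numpy as np

def plot_with_ci(time_grid, curves, label):
    # curves: array of shape (num_seeds, len(time_grid)), best metric so far per seed
    mean = curves.mean(axis=0)
    std_err = curves.std(axis=0) / np.sqrt(curves.shape[0])
    plt.plot(time_grid, mean, label=label)
    plt.fill_between(time_grid, mean - 2 * std_err, mean + 2 * std_err, alpha=0.3)

# Synthetic example with two seeds
time_grid = np.linspace(0, 3600, 100)
curves = np.minimum.accumulate(np.random.rand(2, 100), axis=1)
plot_with_ci(time_grid, curves, label="RS")
plt.xlabel("wall-clock time (s)")
plt.ylabel("best metric value")
plt.legend()
plt.show()
```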

You can also run only one scheduler with `python benchmarking/benchmark_loop/benchmark_main.py --method RS`; see benchmark_main.py for all supported options.
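For reference, such a command-line interface could be exposed with argparse along the following lines. This is a hedged sketch using only the flags mentioned in this README; the authoritative list of options is in benchmark_main.py itself:

```python
# Sketch of a possible CLI for benchmark_main.py; only the flags shown in this
# README (--experiment_tag, --num_seeds, --method) are assumed to exist.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--experiment_tag", type=str, required=True)
parser.add_argument("--num_seeds", type=int, default=2)
parser.add_argument("--method", type=str, default=None,
                    help="run only this scheduler (e.g. RS); run all methods if omitted")
args = parser.parse_args()

all_methods = ["RS", "ASHA"]  # placeholder for the keys of the baselines.py dictionary
methods_to_run = [args.method] if args.method is not None else all_methods
```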

To launch the evaluation remotely, you can also run `python benchmarking/benchmark_loop/launch_remote.py --experiment_tag "my-new-experiment"`, which evaluates everything on a remote machine.

To evaluate other methods/benchmarks, you can edit the following files:

  • baselines.py: dictionary of HPO methods to be evaluated (see the sketch after this list)
  • benchmark_definitions.py: dictionary of simulated benchmarks to evaluate
  • benchmark_main.py: script to launch evaluations, run all combinations by default
  • launch_remote.py: script to launch evaluations on a remote instance
  • plot_results.py: script to plot results obtained
  • requirements.txt: dependencies to be installed when running on a remote machine
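As an example of the kind of edit meant for baselines.py, the sketch below adds an entry to a dictionary mapping method names to scheduler factories. The dictionary name, factory signature and the Syne Tune baseline classes used here are assumptions; adapt them to the actual structure of the file.

```python
# Hypothetical sketch of extending the baselines.py dictionary with a new method.
# The dictionary name and the factory signature are assumed, not taken from the repo.
from syne_tune.optimizer.baselines import RandomSearch, BayesianOptimization

def _make_rs(config_space, metric, mode, random_seed):
    return RandomSearch(config_space, metric=metric, mode=mode, random_seed=random_seed)

def _make_bo(config_space, metric, mode, random_seed):
    return BayesianOptimization(config_space, metric=metric, mode=mode, random_seed=random_seed)

methods = {
    "RS": _make_rs,
    "BO": _make_bo,  # new entry to be evaluated alongside the existing ones
}
```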