# Benchmarking tests
pytest is used to run some benchmarks for key time-sensitive features of Pepys. We use the pytest-benchmark plugin to do this. This provides a 'test fixture' called `benchmark`, which automatically appears if your test function has an argument called `benchmark`. This fixture is a function that can be called to run a given function with benchmarking - for example:
```python
def test_benchmark(benchmark):
    benchmark(function_to_run)
```
This will run the function under benchmarking and display its timings in the benchmark summary at the end of the pytest run.

To fail the test if it takes too long, we need to access some properties of the benchmark after we've run it, by doing something like this:
```python
if benchmark.stats.stats.mean > 5:
    pytest.fail(
        f"Mean benchmark run time of {benchmark.stats.stats.mean}s exceeded maximum time of 5s"
    )
```
(Yes, the `.stats.stats` bit is necessary, for no good reason.)
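Putting the two snippets above together, a full benchmark test might look something like the sketch below. The test name `test_import_benchmark`, the helper `run_import` and the 5-second threshold are illustrative assumptions rather than real Pepys code.

```python
import pytest


def run_import():
    # Hypothetical stand-in for the time-sensitive operation being measured
    ...


def test_import_benchmark(benchmark):
    # Run the function under benchmarking (pytest-benchmark may call it several times)
    benchmark(run_import)

    # Fail the test if the mean run time is too long
    if benchmark.stats.stats.mean > 5:
        pytest.fail(
            f"Mean benchmark run time of {benchmark.stats.stats.mean}s "
            f"exceeded maximum time of 5s"
        )
```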
A few notes on issues that might trip you up:
- Functions still need to have `test` in the name, so pytest picks them up as tests. The convention I've used is that all benchmarking functions are called `test_something_benchmark` - for example `test_highlighter_on_whole_file_benchmark`
- The `benchmark` fixture will not work with a class-based test (using `setUp` and `tearDown` methods etc). Other methods can be used to provide setup/teardown functionality if needed
- The function passed to `benchmark` will be run multiple times - so make sure it does all the setting up needed, and leaves things in a state where it can be run again (e.g. deletes a SQLite file if created, so the next import doesn't skip all the files) - see the sketch at the end of this page
- You can use a decorator on the benchmarking test function to specify how many rounds and iterations to do - these are used to run the test function multiple times to get a better idea of the average run time. For example: `@pytest.mark.benchmark(min_rounds=1, max_time=2.0, warmup=False)`
- Even with a decorator setting `min_rounds=1` and `iterations=1`, pytest-benchmark will still run the function multiple times to get an idea of how long it takes, so that it can increase `min_rounds` if necessary. To stop this behaviour (which we don't want for a function that takes many minutes to run), you can use code like this:
```python
benchmark.pedantic(
    run_import,
    args=(processor, os.path.join(FILE_DIR, "benchmark_data/bulk_data.rep")),
    iterations=1,
    rounds=1,
)
```
which uses the 'pedantic' mode to run the function for just one iteration and one round - i.e. just once.
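As a rough sketch of how the notes above fit together, the test below wraps a one-off bulk import in `benchmark.pedantic` and cleans up after itself so the work can be repeated on a later run. The definitions of `run_import`, `processor` and the SQLite output path are stand-ins for illustration, not the actual Pepys code.

```python
import os

FILE_DIR = os.path.dirname(os.path.abspath(__file__))
# Hypothetical path for the SQLite file created by the import
OUTPUT_DB = os.path.join(FILE_DIR, "benchmark_output.sqlite")


def run_import(processor, path):
    # Stand-in for the real import - assumed to create OUTPUT_DB as a side effect
    ...


processor = None  # stand-in for the real file processor object


def test_bulk_import_benchmark(benchmark):
    def do_import():
        # Leave things in a state where the import can be run again:
        # delete the SQLite file if a previous run created it, so the next
        # import doesn't skip all the files
        if os.path.exists(OUTPUT_DB):
            os.remove(OUTPUT_DB)
        run_import(processor, os.path.join(FILE_DIR, "benchmark_data/bulk_data.rep"))

    # Pedantic mode: one iteration of one round, i.e. the import runs just once
    benchmark.pedantic(do_import, iterations=1, rounds=1)
```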