# Benchmarking tests
pytest is used to run some benchmarks for key time-sensitive features of Pepys. We use the pytest-benchmark plugin to do this. This provides a 'test fixture' called `benchmark`, which automatically appears if your test function has an argument called `benchmark`. This fixture is a function that can be called to run a given function with benchmarking - for example:
```python
def test_benchmark(benchmark):
    benchmark(function_to_run)
```
This will run the function under benchmarking and display its timings in the benchmark summary at the end of the pytest run.

To fail the test if it takes too long, we need to access some properties of the benchmark after we've run it, by doing something like this:
```python
if benchmark.stats.stats.mean > 5:
    pytest.fail(
        f"Mean benchmark run time of {benchmark.stats.stats.mean}s exceeded maximum time of 5s"
    )
```
(Yes, the `.stats.stats` bit is necessary, for no good reason.)
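Putting the two snippets above together, a full benchmark test might look something like the sketch below. The test name `test_import_benchmark`, the helper `run_import` and the 5-second threshold are illustrative assumptions rather than real Pepys code.

```python
import pytest


def run_import():
    # Hypothetical stand-in for the time-sensitive operation being measured
    ...


def test_import_benchmark(benchmark):
    # Run the function under benchmarking (pytest-benchmark may call it several times)
    benchmark(run_import)

    # Fail the test if the mean run time is too long
    if benchmark.stats.stats.mean > 5:
        pytest.fail(
            f"Mean benchmark run time of {benchmark.stats.stats.mean}s "
            f"exceeded maximum time of 5s"
        )
```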
A few notes on issues that might trip you up:
- Functions still need to have `test` in the name, so pytest picks them up as tests. The convention I've used is that all benchmarking functions are called `test_something_benchmark` - for example `test_highlighter_on_whole_file_benchmark`
- The `benchmark` fixture will not work with a class-based test (using `setUp` and `tearDown` methods etc). Other methods can be used to provide setup/teardown functionality if needed
- The function passed to `benchmark` will be run multiple times - so make sure it does all the setting up needed, and leaves things in a state where it can be run again (e.g. deletes a SQLite file if created, so the next import doesn't skip all the files) - see the sketch at the end of this page
- You can use a decorator on the benchmarking test function to specify how many rounds and iterations to do - these are used to run the test function multiple times to get a better idea of the average run time. For example: `@pytest.mark.benchmark(min_rounds=1, max_time=2.0, warmup=False)`
- Even with a decorator setting `min_rounds=1` and `iterations=1`, pytest-benchmark will still run the function multiple times to get an idea of how long it takes, so that it can increase `min_rounds` if necessary. To stop this behaviour (which we don't want for a function that takes many minutes to run), you can use code like this:
```python
benchmark.pedantic(
    run_import,
    args=(processor, os.path.join(FILE_DIR, "benchmark_data/bulk_data.rep")),
    iterations=1,
    rounds=1,
)
```
which uses the 'pedantic' mode to run the function for just one iteration and one round - i.e. just once.
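As a rough sketch of how the notes above fit together, the test below wraps a one-off bulk import in `benchmark.pedantic` and cleans up after itself so the work can be repeated on a later run. The definitions of `run_import`, `processor` and the SQLite output path are stand-ins for illustration, not the actual Pepys code.

```python
import os

FILE_DIR = os.path.dirname(os.path.abspath(__file__))
# Hypothetical path for the SQLite file created by the import
OUTPUT_DB = os.path.join(FILE_DIR, "benchmark_output.sqlite")


def run_import(processor, path):
    # Stand-in for the real import - assumed to create OUTPUT_DB as a side effect
    ...


processor = None  # stand-in for the real file processor object


def test_bulk_import_benchmark(benchmark):
    def do_import():
        # Leave things in a state where the import can be run again:
        # delete the SQLite file if a previous run created it, so the next
        # import doesn't skip all the files
        if os.path.exists(OUTPUT_DB):
            os.remove(OUTPUT_DB)
        run_import(processor, os.path.join(FILE_DIR, "benchmark_data/bulk_data.rep"))

    # Pedantic mode: one iteration of one round, i.e. the import runs just once
    benchmark.pedantic(do_import, iterations=1, rounds=1)
```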