JUnit Reducer Action is a GitHub Action wrapping a command-line tool that aggregates the JUnit test XML reports from your CI runs and reduces them to a single, lighter set of reports to be downloaded later during CI, to steer your test splitting algorithm (e.g., split_tests). The most typical use case is to regularly update a 'running average' of your recent test reports, which can be downloaded to your test runners in less time and without running an ongoing race condition risk.
Typically, you'll be using this Action within a scheduled cron workflow to take the last X
days of JUnit XML reports, reduce them and upload the results. It's recommended to accumulate the JUnit XML reports from individual CI runs in a cloud storage service like AWS S3 or Google Cloud Storage, as opposed to the caching API available within GitHub Actions, which is designed as an overwrite key-value store.
All inputs are optional. However, these are the most important:
Name | Description | Type | Default |
---|---|---|---|
include |
Glob pattern to find JUnit XML reports to reduce | string |
./**/*.xml |
output-path |
Output path for the reduced JUnit XML reports | string |
./output/ |
exclude |
Glob pattern to omit from included JUnit XML reports | string |
|
version |
Version of junit-reducer command-line tool to use (e.g., v1.2.0 or latest ) |
string |
latest |
Name | Description | Type | Default |
---|---|---|---|
op-suites-time |
Reducer operation for test suite time values. Options: max , mean , median , min , mode or sum |
string |
mean |
op-cases-time |
Reducer operation for test case time values. Options: max , mean , median , min , mode or sum |
string |
mean |
op-suites-assertions |
Reducer operation for test suite assertion counts. Options: max , mean , median , min , mode or sum |
string |
mean |
op-suites-errors |
Reducer operation for test suite error counts. Options: max , mean , median , min , mode or sum |
string |
mean |
op-suites-failed |
Reducer operation for test suite failure counts. Options: max , mean , median , min , mode or sum |
string |
mean |
op-suites-skipped |
Reducer operation for test suite skipped counts. Options: max , mean , median , min , mode or sum |
string |
mean |
op-suites-tests |
Reducer operation for test suite test counts. Options: max , mean , median , min , mode or sum |
string |
mean |
reduce-suites-by |
Specify a key (corresponding to the attribute on the test suite nodes) to group and reduce test suites by. Options: filepath , name or name+filepath |
string |
name+filepath |
reduce-cases-by |
Specify a key (corresponding to the attribute on the test case nodes) to group and reduce test cases by. Options: classname , file or name |
string |
name |
rounding-mode |
Rounding mode for counts (assertions , errors , failed , skipped and tests ) that should be integers in the final result. Options: ceil , floor or round |
string |
round |
name: junit-test-report-averaging
run-name: Create Average JUnit Test Reports
on:
schedule:
# Run every morning at 8AM
- cron: '0 8 * * *'
jobs:
reduce-reports:
runs-on: ubuntu-latest
steps:
# Configure with the Cloud storage provider of your choice.
- name: Setup AWS CLI
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.YOUR_AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.YOUR_AWS_SECRET_ACCESS_KEY }}
aws-region: eu-west-2
# Download all test reports from all CI runs.
# It is recommended to set up a lifecycle rule, to remove objects older
# than a certain age from this bucket/path. This will help to keep the test reports
# current and keep this job from taking too long.
- name: Download test timings
run: |
aws s3 cp s3://your-junit-report-bucket/ci-runs-reports/ reports/ \
--recursive
- name: Reduce reports
uses: willgeorgetaylor/junit-reducer-action@v1
with:
include: ./reports/**/*.xml
output-path: ./average-reports/
# Upload the reduced set of test reports to a dedicated bucket/path.
# In your actual CI process, the CI runners will copy the contents of
# this path locally, to be ingested by the test splitter.
- name: Upload single set of averaged reports
run: |
aws s3 sync ./average-reports s3://your-junit-report-bucket/average-reports/ \
--size-only \
--cache-control max-age=86400
As your test suite grows, you may want to start splitting tests between multiple test runners, to be executed concurrently. While it's relatively simple to divide up your test suites by files, using lines of code (LOC) as a proxy for test duration, the LOC metric is still just an approximation and will result in uneven individual (and therefore overall slower) test run times as your codebase and test suites change.
The preferable approach for splitting test suites accurately is to use recently reported test times, and the most popular format for exchanging test data (including timings) between tools is the JUnit XML reports format. While JUnit itself is a Java project, the schema that defines JUnit reports is equally applicable to any language and reports can be generated by most testing frameworks for JavaScript, Ruby, Python etc.
In busier projects, CI will be uploading reports frequently, so even if you take a small time window (for example, the last 24 hours), you could end up with 20MB+ of test reports. These reports need to be downloaded to every runner in your concurrency set, only to then perform the same splitting operation to yield the exact same time estimates. This means unnecessary and expensive work is being performed by each concurrent runner, potentially delaying the total test time by minutes and increasing CI costs.
You can solve this speed issue by creating a set of reports that looks like the set produced by a single CI run. Importantly, the values for time
taken by test suite (as well as other counts, like errors and tests) are reduced from the wider set of reports, typically by finding the mean
of all of the aggregate time
values. Other reducer operations, like min
/ max
/ mode
/ median
/ sum
, are available to handle non-standard distributions.
In very busy projects, there is also a more problematic race condition possible, with larger downloads and test runners starting at different times. As CI runs from other commits upload their reports to the same remote source that you're downloading from, if any of your concurrent runners download reports with different values, the input data is misaligned and the splitting operation is corrupted. However, because the download and splitting operation is being performed in a distributed manner (across all of the runners concurrently) this misalignment will result in some tests in your run being skipped.
This risk is mitigated by computing the averaged reports in one place, and updating that set as part of a scheduled job. This is exactly the approach outlined in the example workflow section.
This action invokes willgeorgetaylor/junit-reducer to reduce the test reports.
If you prefer to use the command-line tool directly, there are instructions for that on the repository readme.