Summary
Add a lightweight benchmark comparison workflow using benchstat so contributors can compare benchmark output against a baseline instead of reading raw numbers manually.
Why this matters
- raw go test -bench output is hard to review consistently
- we have benchmarks, but not a reliable comparison story
- before and after perf claims should be measurable, especially for indexing and targeted analysis work
Proposed shape
- add a script or make target that:
  - captures a baseline benchmark output
  - captures a candidate benchmark output
  - runs benchstat on the two files
- document a simple workflow for comparing:
  - main vs current branch
  - before vs after a local optimization
- keep this opt-in; no CI gate required initially
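As a rough illustration of the shape above, here is a minimal sketch of what such a helper could look like. The function names (run_bench, compare_bench) and the BENCH_OUT_DIR and BENCH_COUNT variables are hypothetical, not existing project tooling. It assumes benchstat is already on PATH (for example via go install golang.org/x/perf/cmd/benchstat@latest) and that running every benchmark under ./... is acceptable.

```shell
#!/usr/bin/env bash
# Sketch of an opt-in benchmark comparison helper.
# All names here (run_bench, compare_bench, BENCH_OUT_DIR, BENCH_COUNT)
# are illustrative assumptions, not part of the existing repository.
set -euo pipefail

# Where benchmark outputs are saved; defaults to a temp location.
BENCH_OUT_DIR="${BENCH_OUT_DIR:-${TMPDIR:-/tmp}/bench-compare}"
# Repeat each benchmark so benchstat has several samples to compare.
BENCH_COUNT="${BENCH_COUNT:-10}"

# run_bench LABEL
# Runs the benchmark suite in the current checkout and saves the raw
# output to $BENCH_OUT_DIR/LABEL.txt.
run_bench() {
  local label="$1"
  mkdir -p "$BENCH_OUT_DIR"
  # -run='^$' matches no unit tests, so only benchmarks execute.
  go test -run='^$' -bench=. -benchmem -count="$BENCH_COUNT" ./... \
    | tee "$BENCH_OUT_DIR/$label.txt"
}

# compare_bench BASELINE CANDIDATE
# Prints per-benchmark deltas between two saved runs.
compare_bench() {
  benchstat "$BENCH_OUT_DIR/$1.txt" "$BENCH_OUT_DIR/$2.txt"
}

# Example (main vs current branch), after sourcing this file:
#   git checkout main && run_bench baseline
#   git checkout -    && run_bench candidate
#   compare_bench baseline candidate
```

The same two functions cover the before vs after case for a local optimization: run run_bench before the change, apply it, run run_bench again under a different label, then compare. Because nothing here is wired into CI, it stays opt-in.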
Nice to have
- support corpus or profile labels in output
- support both bootstrap and larger benchmark corpora
- optionally save outputs under a temp or ignored directory
Acceptance criteria
- contributors can run one documented command to compare two benchmark runs
- results include clear per-benchmark deltas
- docs explain how to use the comparison workflow for perf-sensitive PRs
- the workflow works with the existing benchmark suite without changing default CI