Unify benchmark scripts with improved robustness #688

jserv · 2026-01-05T20:02:59Z

This consolidates the separate benchmark scripts into a single 'tests/bench.py' with:

- Registration pattern for extensible benchmark definitions
- Adaptive run count: a single run if one pass takes more than 5 minutes, otherwise repeated runs within a 10-minute time budget
- Thread-safe progress indicator with spinner animation
- Proper subprocess management with an explicit kill on timeout
- Parallel execution support (--parallel N)
- Robust error handling that preserves logs on failure
- time.monotonic() for accurate elapsed-time measurement

It also lowers the log level from LOG_TRACE to LOG_WARN to avoid excessive logging overhead during benchmark runs.
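
For readers unfamiliar with the pattern, here is a minimal sketch of how a registration decorator and an adaptive run count could fit together. It is not the code from tests/bench.py: the `BENCHMARKS` registry, `plan_runs`, `run_adaptive`, the `MAX_RUNS` cap of 10, and the decorator's `name` argument are all assumptions made for illustration. Only the 5-minute and 10-minute figures and the use of time.monotonic() come from the description above.

```python
import time
from typing import Callable, Dict, List

# Hypothetical registry: benchmark name -> function that performs one run
# and returns its score.
BENCHMARKS: Dict[str, Callable[[], float]] = {}


def register_benchmark(name: str):
    """Decorator that records a benchmark function under `name` (assumed signature)."""
    def decorator(func: Callable[[], float]) -> Callable[[], float]:
        BENCHMARKS[name] = func
        return func
    return decorator


SINGLE_RUN_THRESHOLD = 5 * 60  # seconds; a slower first run is the only run
TIME_BUDGET = 10 * 60          # seconds; total budget across all runs
MAX_RUNS = 10                  # assumed upper bound on repetitions


def plan_runs(first_run_seconds: float,
              threshold: float = SINGLE_RUN_THRESHOLD,
              budget: float = TIME_BUDGET,
              max_runs: int = MAX_RUNS) -> int:
    """Return the total number of runs, counting the first run already done."""
    if first_run_seconds > threshold:
        return 1
    if first_run_seconds * max_runs <= budget:
        return max_runs
    # Fit as many runs of this length as the budget allows.
    return max(1, int(budget // first_run_seconds))


def run_adaptive(name: str) -> List[float]:
    """Time the first run with a monotonic clock, then repeat as planned."""
    func = BENCHMARKS[name]
    start = time.monotonic()
    scores = [func()]
    elapsed = time.monotonic() - start
    for _ in range(plan_runs(elapsed) - 1):
        scores.append(func())
    return scores


@register_benchmark("dummy")
def dummy() -> float:
    # Placeholder benchmark that returns a constant score.
    return 42.0
```

Under this sketch, adding a benchmark only requires decorating a new function; the runner never needs to be edited. A first pass of 120 seconds would yield five runs in total, since that is all the 10-minute budget allows.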

Summary by cubic

Unified all benchmark scripts into a single tests/bench.py runner with parallel execution and adaptive timing, and reduced logging noise to speed up runs. CI now calls the unified runner and writes a single JSON output.

New Features
- Extensible @register_benchmark pattern.
- Adaptive runs with a 10-minute cap; single run if a pass takes >5 minutes.
- Thread-safe progress spinner; quiet mode for CI.
- Safe subprocess timeouts with explicit kill; logs preserved on failure.
- Parallel execution (--parallel N) with ordered output; JSON saved to benchmark_output.json.
Refactors
- Removed tests/dhrystone.sh, tests/coremark.py, and tests/bench-aggregator.py.
- Updated workflow to run: python3 tests/bench.py --json --quiet.
- Switched log level from LOG_TRACE to LOG_WARN to reduce overhead.
- Standardized timing with time.monotonic().
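
The timeout handling and ordered parallel output summarized above can be sketched as follows. This is an illustration under assumptions, not the implementation in tests/bench.py: `run_with_timeout` and `run_parallel` are hypothetical helper names, and using a thread pool is one possible way to realize `--parallel N`.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional, Sequence, Tuple

Result = Tuple[Optional[int], str, float]


def run_with_timeout(cmd: Sequence[str], timeout: float) -> Result:
    """Run cmd; return (exit code or None on timeout, captured output, elapsed seconds)."""
    start = time.monotonic()  # monotonic clock is unaffected by wall-clock changes
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    try:
        output, _ = proc.communicate(timeout=timeout)
        code: Optional[int] = proc.returncode
    except subprocess.TimeoutExpired:
        # communicate() does not kill the child on timeout, so do it
        # explicitly, then collect whatever output was produced (the log).
        proc.kill()
        output, _ = proc.communicate()
        code = None
    return code, output, time.monotonic() - start


def run_parallel(commands: Sequence[Sequence[str]], timeout: float,
                 workers: int) -> List[Result]:
    """Run commands concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order of its inputs,
        # regardless of which command finishes first.
        return list(pool.map(lambda c: run_with_timeout(c, timeout), commands))
```

Returning the captured output even on timeout is what lets a caller keep the log of a failed run. Note that running benchmarks side by side can perturb their timings, so a parallel mode trades measurement stability for wall-clock time.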

Written for commit 352a2af. Summary will update on new commits.


cubic-dev-ai

No issues found across 6 files

jserv

Benchmarks

Details

| Benchmark suite | Current: `f4aff45` | Previous: `d8129e5` | Ratio |
|---|---|---|---|
| `Dhrystone` | `1635` Average DMIPS over 10 runs | `1615` Average DMIPS over 10 runs | `0.99` |
| `Coremark` | `965.643` Average iterations/sec over 10 runs | `961.802` Average iterations/sec over 10 runs | `1.00` |

This comment was automatically generated by workflow using github-action-benchmark.

jserv force-pushed the refine-bench branch from a8564e4 to 352a2af on January 5, 2026 20:04

cubic-dev-ai bot reviewed Jan 5, 2026


jserv added this to the release-2026.1 milestone Jan 5, 2026

jserv mentioned this pull request Jan 5, 2026

Modernize benchmarking infrastructure and hardware targets #689

Open

5 tasks

jserv commented Jan 5, 2026


jserv requested review from ChinYikMing and visitorckw January 5, 2026 21:18

jserv force-pushed the refine-bench branch from f4aff45 to 352a2af on January 5, 2026 21:34

jserv merged commit c414c25 into master Jan 8, 2026
43 checks passed

jserv deleted the refine-bench branch January 8, 2026 08:49
