@jserv
Contributor

@jserv jserv commented Jan 5, 2026

This consolidates separate benchmark scripts into a single unified 'tests/bench.py' with:

  • Registration pattern for extensible benchmark definitions
  • Adaptive run count: a single run if one pass takes more than 5 minutes; otherwise repeated runs within a 10-minute time budget
  • Thread-safe progress indicator with spinner animation
  • Proper subprocess management with explicit kill on timeout
  • Parallel execution support (--parallel N)
  • Robust error handling with preserved logs on failure
  • time.monotonic() for accurate elapsed time measurement
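
As a rough illustration of the registration pattern (not the actual contents of tests/bench.py), a decorator can record each benchmark in a module-level table so that adding a benchmark means adding one decorated function. The `BENCHMARKS` dict, the decorator signature, the placeholder runner bodies, and `run_all` are assumptions made for this sketch; only the `register_benchmark` name comes from the PR summary.

```python
from typing import Callable, Dict

# Hypothetical registry: maps a benchmark name to the function that runs it.
BENCHMARKS: Dict[str, Callable[[], float]] = {}


def register_benchmark(name: str):
    """Return a decorator that records the decorated function under `name`."""

    def decorator(func: Callable[[], float]) -> Callable[[], float]:
        if name in BENCHMARKS:
            raise ValueError(f"benchmark {name!r} registered twice")
        BENCHMARKS[name] = func
        return func  # leave the function itself unchanged

    return decorator


@register_benchmark("dhrystone")
def run_dhrystone() -> float:
    # Placeholder: a real runner would launch the emulator and parse a score.
    return 0.0


@register_benchmark("coremark")
def run_coremark() -> float:
    # Placeholder, same caveat as above.
    return 0.0


def run_all() -> Dict[str, float]:
    """Run every registered benchmark in registration order."""
    return {name: func() for name, func in BENCHMARKS.items()}
```

With this shape, the runner never needs a hard-coded list of benchmarks; it iterates over whatever has been registered.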

It also changes the log level from LOG_TRACE to LOG_WARN to avoid excessive logging overhead during benchmark runs.
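
For the "explicit kill on timeout" item in the list above, a minimal sketch using the standard `subprocess` module might look like the following. The function name `run_with_timeout` and its return shape are assumptions for illustration, not the PR's code. The kill-then-`communicate()` sequence follows the pattern in the Python documentation, and elapsed time is taken from `time.monotonic()` so that wall-clock adjustments do not skew it.

```python
import subprocess
import sys
import time
from typing import List, Optional, Tuple


def run_with_timeout(cmd: List[str], timeout: float) -> Tuple[Optional[int], str, float]:
    """Run `cmd` and return (returncode, output, elapsed_seconds).

    The returncode is None when the process had to be killed on timeout.
    """
    start = time.monotonic()
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so the log is in one stream
        text=True,
    )
    try:
        output, _ = proc.communicate(timeout=timeout)
        returncode: Optional[int] = proc.returncode
    except subprocess.TimeoutExpired:
        proc.kill()  # the child is not killed automatically on timeout
        output, _ = proc.communicate()  # reap it and keep the partial log
        returncode = None
    return returncode, output, time.monotonic() - start


if __name__ == "__main__":
    rc, log, seconds = run_with_timeout([sys.executable, "-c", "print('ok')"], timeout=30)
    print(rc, log.strip(), f"{seconds:.2f}s")
```

Returning the captured output even on timeout is what allows a caller to preserve the log of a failed run.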


Summary by cubic

Unified all benchmark scripts into a single tests/bench.py runner with parallel execution and adaptive timing, and reduced logging noise to speed up runs. CI now calls the unified runner and writes a single JSON output.
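
One way parallel execution with ordered output could be arranged is sketched below, under stated assumptions: threads via `concurrent.futures`, a hypothetical `run_parallel` helper, and a made-up `{"name", "value"}` record layout for the JSON file. The real runner's worker model and JSON schema are not shown in this conversation.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Benchmark = Tuple[str, Callable[[], float]]


def run_parallel(benchmarks: List[Benchmark], workers: int) -> List[Dict[str, object]]:
    """Run benchmarks on up to `workers` threads; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in submission order, not completion order.
        values = list(pool.map(lambda item: item[1](), benchmarks))
    return [
        {"name": name, "value": value}
        for (name, _func), value in zip(benchmarks, values)
    ]


def save_json(results: List[Dict[str, object]], path: str) -> None:
    """Write all results to a single JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(results, fh, indent=2)
```

Because `map` preserves submission order, a slow first benchmark still appears first in the report even when later ones finish earlier.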

  • New Features

    • Extensible @register_benchmark pattern.
    • Adaptive runs with a 10-minute cap; single run if a pass takes >5 minutes.
    • Thread-safe progress spinner; quiet mode for CI.
    • Safe subprocess timeouts with explicit kill; logs preserved on failure.
    • Parallel execution (--parallel N) with ordered output; JSON saved to benchmark_output.json.
  • Refactors

    • Removed tests/dhrystone.sh, tests/coremark.py, and tests/bench-aggregator.py.
    • Updated workflow to run: python3 tests/bench.py --json --quiet.
    • Switched log level from LOG_TRACE to LOG_WARN to reduce overhead.
    • Standardized timing with time.monotonic().
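
The adaptive run count could be implemented along these lines. This is a sketch, not the code in tests/bench.py: the 5-minute and 10-minute constants come from the description, while `MAX_RUNS = 10`, the function name, the injectable `clock` parameter, and the rule "stop when another pass of similar length would exceed the budget" are assumptions.

```python
import time
from typing import Callable, List

SINGLE_RUN_THRESHOLD = 5 * 60  # seconds; a pass slower than this is not repeated
TIME_BUDGET = 10 * 60          # seconds; cap on total time for one benchmark
MAX_RUNS = 10                  # assumption for this sketch


def adaptive_runs(
    run_once: Callable[[], None],
    threshold: float = SINGLE_RUN_THRESHOLD,
    budget: float = TIME_BUDGET,
    max_runs: int = MAX_RUNS,
    clock: Callable[[], float] = time.monotonic,
) -> List[float]:
    """Call `run_once` repeatedly and return the duration of each pass."""
    start = clock()
    durations: List[float] = []
    while len(durations) < max_runs:
        pass_start = clock()
        run_once()
        elapsed = clock() - pass_start
        durations.append(elapsed)
        if elapsed > threshold:
            break  # this pass was slow: do not schedule another one
        if (clock() - start) + elapsed > budget:
            break  # another pass of similar length would exceed the budget
    return durations
```

Passing the clock in as a parameter keeps the policy testable without waiting minutes; in normal use the default `time.monotonic` applies.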

Written for commit 352a2af. Summary will update on new commits.

@cubic-dev-ai cubic-dev-ai bot left a comment

No issues found across 6 files

Contributor Author

@jserv jserv left a comment

Benchmarks

Details
| Benchmark suite | Current: f4aff45 | Previous: d8129e5 | Ratio |
|---|---|---|---|
| Dhrystone | 1635 Average DMIPS over 10 runs | 1615 Average DMIPS over 10 runs | 0.99 |
| Coremark | 965.643 Average iterations/sec over 10 runs | 961.802 Average iterations/sec over 10 runs | 1.00 |

This comment was automatically generated by workflow using github-action-benchmark.

@jserv jserv merged commit c414c25 into master Jan 8, 2026
43 checks passed
@jserv jserv deleted the refine-bench branch January 8, 2026 08:49