ci: add scheduled benchmark runs (#1639) #1645

Elvis339 · 2026-01-28T11:25:55Z

Why this should be merged

This patch is closing #1639 it adds automated benchmark triggers to catch performance regressions without manual intervention.

Daily (40M→41M): Quick 1M block test for fast regression detection
Weekly (50M→60M): Comprehensive 10M block stress test
Runner: avago-runner-i4i-2xlarge-local-ssd for consistent NVMe performance

Note: should be merged after #1493

How this works

configure-benchmark job: Shell case statement matches cron schedule → benchmark parameters (block range, runner, timeout)
benchmark job: Triggers AvalancheGo's C-Chain reexecution workflow with resolved parameters
github-action-benchmark: Publishes results to benchmark-data branch (bench/ for main, dev/bench/{branch}/ for feature branches)
workflow_run trigger: After benchmark completes, gh-pages.yaml auto-rebuilds to include new results

Manual dispatch still works via workflow inputs.

How this was tested

Verified S3 paths exist
Confirmed storage (~443 GB) fits i4i.4xlarge capacity (3.75 TB)
Manual workflow dispatch validation (pending)

RodrigoVillar

Could you please rebase this PR on top of #1642 (or whatever PR it's dependent on)? I'm not what's needed to be reviewed vs code that's being reviewed in a different PR.

RodrigoVillar · 2026-01-28T15:10:08Z

.github/workflows/track-performance.yml

+          summary-always: true
+          auto-push: true
+          fail-on-alert: true


q: what's the rationale for setting summary-always and fail-on-alert to true?

Both are about visibility and catching regressions early. summary-always ensures results are always visible in the job summary. fail-on-alert ensures we don't silently regress - if performance drops significantly, the workflow fails rather than quietly recording bad data.

if performance drops significantly

Hmm what does it mean for performance to drop significantly?

Good question, I've added explicit alert-threshold: "150%" to define what "significant" means - the workflow will fail if performance degrades by more than 50% compared to the previous run.

fail-on-alert signals that the workflow failed specifically because of a performance regression (vs. an infrastructure issue), and alert-threshold defines the sensitivity. We can tune this based on observed variance once we have more data points.

rkuris

Nit: Overall, this seems like a lot of somewhat-fragile code to parse json instead of just specifying the parameters in the normal github style.

I think we can save a ton of money by picking a much smaller instance size. I think i4i.xlarge will work at roughly 1/8 the cost.

.github/benchmark-schedules.json

.github/workflows/track-performance.yml

Implements workflow to trigger C-Chain reexecution benchmarks in AvalancheGo and track Firewood performance over time. Supports task-based and custom parameter modes. Results stored in GitHub Pages via github-action-benchmark. # Conflicts: # .github/workflows/track-performance.yml

…stfile commands

- Refactor `bench-cchain` to trigger track-performance.yml instead of directly calling AvalancheGo's workflow - Add input validation and helpful error messages for test/custom params - Use commit SHA instead of branch name for reproducibility - Fix AvalancheGo workflow input: use `with-dependencies` format ("firewood=abc,libevm=xyz") instead of separate firewood-ref/libevm-ref - Remove status-cchain, list-cchain, help-cchain (use GitHub UI instead) - Remove debug logging from track-performance.yml

Remove local developer tooling (justfile recipe, flake.nix, METRICS.md) to reduce PR scope. These will be submitted in a follow-up PR after the CI workflow changes are merged.

Daily (40M→41M) and weekly (50M→60M) benchmarks with JSON-based config and matrix strategy for cleaner workflow management.

…coded branch name used for temp. testing

- Replace `avago-runner-i4i-4xlarge-local-ssd` with `avago-runner-i4i-2xlarge-local-ssd` for daily and weekly benchmarks. - Add new input options for `firewood` and `libevm` commits/branches/tags. - Extend runner choices with additional configurations in the workflow.

- Remove `.github/benchmark-schedules.json` in favor of inline configurations. - Replace matrix-based strategy with direct output handling for cleaner and more explicit workflow logic. - Preserve manual and scheduled benchmark support with optimized input handling.

Elvis339 · 2026-01-28T18:26:32Z

Nit: Overall, this seems like a lot of somewhat-fragile code to parse json instead of just specifying the parameters in the normal github style.

Updated was trying to stay consistent with AvalancheGo's JSON config approach but I agree with you inline with native GitHub outputs feels more natural here. No strong need for consistency on this at this point in time.

…ale workspace issues

Elvis339 · 2026-01-29T19:09:34Z

Test case manual dispatch with custom dependencies

Command:

gh workflow run track-performance.yml \
  --ref es/scheduled-perf-tracking \
  -f test=firewood-101-250k \
  -f runner=avago-runner-i4i-2xlarge-local-ssd \
  -f firewood=v0.0.18 \
  -f avalanchego=firewood-benchmark-base

Run: https://github.com/ava-labs/firewood/actions/runs/21489879261

Predefined test: firewood-101-250k
Runner: avago-runner-i4i-2xlarge-local-ssd
Firewood: v0.0.18
AvalancheGo: firewood-benchmark-base branch

Graph: https://ava-labs.github.io/firewood/dev/bench/es-scheduled-perf-tracking/

- Add workflow_dispatch trigger to gh-pages.yaml for manual runs - Trigger gh-pages deployment from track-performance.yml after benchmark results are pushed to benchmark-data branch This ensures benchmark results are immediately visible on GitHub Pages without waiting for a push to main.

Avoids conflict with daily run (which includes Friday) and ensures 10M block benchmark results are available early in the week.

Elvis339 self-assigned this Jan 28, 2026

Elvis339 added the DO NOT MERGE This PR is not meant to be merged in its current state label Jan 28, 2026

Elvis339 requested review from demosdemon and rkuris as code owners January 28, 2026 11:25

Elvis339 added the performance label Jan 28, 2026

Elvis339 requested review from RodrigoVillar and aaronbuchwald as code owners January 28, 2026 11:25

Elvis339 added the c-chain label Jan 28, 2026

RodrigoVillar reviewed Jan 28, 2026

View reviewed changes

rkuris requested changes Jan 28, 2026

View reviewed changes

Elvis339 added 19 commits January 28, 2026 18:53

ci: track performance

d64e77d

docs

a55cb7d

ci(perf): add benchmark workflow with nix-based just commands

d82ed73

docs

8c42437

docs

04d6367

address PR

ac94798

lint: descriptive link text

126d564

docs

a261cdf

ci: update workflow for C-Chain reexecution benchmarks and improve ju…

3b95709

…stfile commands

docs

e7fec1b

debug

f56b9ac

temp

1406e14

docs

aa785b1

docs

119cc33

ci(gh-pages): temp. add workflow_dispatch to rebuild Pages

344bcf3

ci(gh-pages): remove temp. set workflow_dispatch

29ee34c

chore: split PR - extract local tooling to separate PR

1d648ea

Remove local developer tooling (justfile recipe, flake.nix, METRICS.md) to reduce PR scope. These will be submitted in a follow-up PR after the CI workflow changes are merged.

ci(track-performance): add scheduled C-Chain reexecution benchmarks

6ee9ccc

Daily (40M→41M) and weekly (50M→60M) benchmarks with JSON-based config and matrix strategy for cleaner workflow management.

Elvis339 force-pushed the es/scheduled-perf-tracking branch from 2af94f6 to 6ee9ccc Compare January 28, 2026 17:55

chore(bench-cchain-reexecution): use AVALANCHEGO_REF instead of hard-…

c5efed7

…coded branch name used for temp. testing

RodrigoVillar mentioned this pull request Jan 28, 2026

ci: add benchmark workflow #1643

Open

Elvis339 added 3 commits January 28, 2026 19:03

chore(bench-cchain-reexecution): clarify help text wording

5d39f47

Elvis339 requested review from RodrigoVillar and rkuris January 28, 2026 18:28

Elvis339 added DO NOT MERGE This PR is not meant to be merged in its current state and removed DO NOT MERGE This PR is not meant to be merged in its current state labels Jan 28, 2026

Elvis339 added 2 commits January 28, 2026 20:38

docs

0843424

ci(track-performance): limit runners to ephemeral storage to avoid st…

93e4443

…ale workspace issues

Elvis339 added 2 commits January 29, 2026 20:26

ci(track-performance): remove preset runner options

d028428

Elvis339 had a problem deploying to github-pages January 29, 2026 19:47 — with GitHub Actions Failure

Elvis339 added 4 commits January 29, 2026 21:00

ci(gh-pages): repository_dispatch

42a1d9d

test

07ce5ca

revert

13a196f

ci(track-performance): set alert-threshold 150%

34fe2de

Elvis339 requested a review from bernard-avalabs as a code owner January 30, 2026 17:56

ci(track-performance): schedule weekly run on Saturday 00:00 ET

408809a

Avoids conflict with daily run (which includes Friday) and ensures 10M block benchmark results are available early in the week.

Elvis339 removed the DO NOT MERGE This PR is not meant to be merged in its current state label Jan 30, 2026

Elvis339 and others added 5 commits January 30, 2026 19:04

Merge branch 'main' into es/scheduled-perf-tracking

1d90538

ci(github-pages): auto rebuild benchmarks when results are pushed

22e2be1

ci(gh-pages): auto-deploy

54265b9

ci(gh-pages): auto deploy

a44160d

rm temp branch

4e887c3

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ci: add scheduled benchmark runs (#1639) #1645

ci: add scheduled benchmark runs (#1639) #1645

Elvis339 commented Jan 28, 2026 •

edited

Loading

Uh oh!

RodrigoVillar left a comment •

edited

Loading

Uh oh!

RodrigoVillar Jan 28, 2026

Uh oh!

Elvis339 Jan 28, 2026 •

edited

Loading

Uh oh!

RodrigoVillar Jan 28, 2026

Uh oh!

Elvis339 Jan 30, 2026

Uh oh!

rkuris left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Elvis339 commented Jan 28, 2026

Uh oh!

Elvis339 commented Jan 29, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

ci: add scheduled benchmark runs (#1639) #1645

Are you sure you want to change the base?

ci: add scheduled benchmark runs (#1639) #1645

Conversation

Elvis339 commented Jan 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why this should be merged

How this works

How this was tested

Uh oh!

RodrigoVillar left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

RodrigoVillar Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

Elvis339 Jan 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

RodrigoVillar Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

Elvis339 Jan 30, 2026

Choose a reason for hiding this comment

Uh oh!

rkuris left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Elvis339 commented Jan 28, 2026

Uh oh!

Elvis339 commented Jan 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Elvis339 commented Jan 28, 2026 •

edited

Loading

RodrigoVillar left a comment •

edited

Loading

Elvis339 Jan 28, 2026 •

edited

Loading

Elvis339 commented Jan 29, 2026 •

edited

Loading