ci(perf): Track Firewood Performance via AvalancheGo Benchmarks #1493
base: main
Conversation
Implements a workflow to trigger C-Chain reexecution benchmarks in AvalancheGo and track Firewood performance over time. Supports task-based and custom-parameter modes. Results are stored in GitHub Pages via github-action-benchmark.
Unblocked, test runs:

    - c-chain-reexecution-firewood-101-250k
    - c-chain-reexecution-firewood-33m-33m500k
    - c-chain-reexecution-firewood-33m-40m
Q: why do we have these predefined tasks?
Without predefined tasks, you'd need to specify all the granular S3 parameters manually (block-dir-src, current-state-dir-src, etc.). These predefined tasks exist in AvalancheGo's Taskfile for convenience and reusability.
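For contrast, a rough sketch of what a fully manual dispatch might look like without the predefined tasks (the workflow file name and exact input names here are assumptions for illustration, not AvalancheGo's actual interface):

```bash
# Hypothetical manual dispatch: every granular S3 source must be spelled out.
gh workflow run c-chain-reexecution-benchmark.yml \
  --repo ava-labs/avalanchego \
  -f start-block=101 \
  -f end-block=250000 \
  -f block-dir-src="s3://<bucket>/blocks" \
  -f current-state-dir-src="s3://<bucket>/state"
```

With a predefined task, all of that collapses into a single task name.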
Why did you pick these particular ones?
    -f runner="${{ inputs.runner }}" \
    $LIBEVM_FLAG
    sleep 10
Q: what's the purpose of this sleep here?
Looks like there's a race between when you run `gh workflow run` and `gh run list`, as the run might not be registered yet. 10s might work sometimes, but if no runners are available it might take longer. It seems like we should wait a second and loop for up to 60 seconds until it shows up.
Yes, exactly: sometimes the runners are busy, so I've increased this to 3 minutes.

Line 163 in 19d921e

    for i in {1..18}; do
Could we add a comment here explaining why we need to sleep here?
Replaced the sleep with a slightly more robust implementation, with a comment:
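For reference, a minimal sketch of that kind of registration poll (the actual diff isn't quoted here; the `gh run list` filters and JSON fields below are assumptions, not the PR's exact code):

```bash
# Poll until the freshly dispatched run is visible via the API.
# 18 attempts x 10s = up to 3 minutes.
RUN_ID=""
for i in {1..18}; do
  sleep 10
  RUN_ID=$(gh run list --repo ava-labs/avalanchego \
    --limit 1 --json databaseId --jq '.[0].databaseId' || true)
  [ -n "$RUN_ID" ] && break
done
[ -n "$RUN_ID" ] || { echo "workflow run never registered" >&2; exit 1; }
```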
    workflow_dispatch:
      inputs:
        firewood:
          description: 'Firewood commit/branch/tag to test (leave empty for current HEAD)'
Q: Current HEAD of what? main? Or whatever is actually checked out?
The currently checked-out commit; I updated the docs to reduce ambiguity.
    description: 'Firewood commit/branch/tag to test (leave empty to use the commit that triggered the workflow)'
    comment-on-alert: false

    - name: Summary
      if: always()
nit: I don't think you need this
Not needed, you're right; removed.
    id: download
    shell: nix develop ./ffi --command bash {0}
    run: |
      just benchmark-download "${{ steps.trigger.outputs.run_id }}"
Seems like it would be more readable if this command were the same as the name on line 132 -- "just download-benchmark-results"
Good insight. I updated the related commands to be more readable and intuitive:
- benchmark-download → download-benchmark-results
- benchmark-trigger → trigger-benchmark
- benchmark-wait → wait-benchmark
- benchmark-list → list-benchmarks
    auto-push: true
    gh-pages-branch: benchmark-data
    benchmark-data-dir-path: ${{ steps.download.outputs.data-dir }}
    # Don't fail the workflow if there's an issue with benchmark storage
seems like we should know if this happens, otherwise we'll silently stop seeing results, right?
Yes, good catch; added a warning:

    if [ "${{ steps.store.outcome }}" == "failure" ]; then
justfile
| echo "LibEVM: {{ libevm }}" | ||
| fi | ||
|
|
||
| echo "Triggering reexecution benchmark in AvalancheGo..." |
Maybe move this before the others and indent the parameters?
| echo "Triggering reexecution benchmark in AvalancheGo..." | |
| echo "Triggering reexecution benchmark in AvalancheGo with parameters:" | |
| echo " - Firewood: $FIREWOOD" | |
| # ... |
Line 208 in 8008264
| echo "Triggering reexecution benchmark in AvalancheGo with parameters:" |
justfile
| echo "Triggering reexecution benchmark in AvalancheGo..." | ||
| RUN_ID=$(just benchmark-trigger "$FIREWOOD" "{{ avalanchego }}" "{{ task }}" "{{ runner }}" "{{ libevm }}") | ||
| echo " Run ID: $RUN_ID" | ||
| echo " URL: https://github.com/ava-labs/avalanchego/actions/runs/$RUN_ID" |
Let's be more verbose here so someone knows what the URL does. Maybe something like:
| echo " URL: https://github.com/ava-labs/avalanchego/actions/runs/$RUN_ID" | |
| echo " Monitor progress here: https://github.com/ava-labs/avalanchego/actions/runs/$RUN_ID" |
Line 224 in 8008264
| echo " Monitor progress here: https://github.com/ava-labs/avalanchego/actions/runs/$RUN_ID" |
justfile
    benchmark-wait run_id: check-nix
        #!/usr/bin/env bash
        set -euo pipefail
        nix run ./ffi#gh -- run watch "{{ run_id }}" --repo ava-labs/avalanchego --exit-status
This can hang if it never finishes, which is possible if someone introduces an infinite loop somewhere. I think GitHub will let it run for a day, which is way too long. 2h seems like plenty, but perhaps a shorter time makes sense.
Suggested change:

    -nix run ./ffi#gh -- run watch "{{ run_id }}" --repo ava-labs/avalanchego --exit-status
    +nix run ./ffi#gh -- timeout -k 1m 2h run watch "{{ run_id }}" --repo ava-labs/avalanchego --exit-status
Line 185 in 8008264
    timeout -k 1m 2h nix run ./ffi#gh -- run watch "{{ run_id }}" --repo ava-labs/avalanchego --exit-status
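(For context: `timeout -k 1m 2h CMD` sends SIGTERM after two hours and, if the process is still alive one minute later, follows up with SIGKILL.)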
justfile
| echo " URL: https://github.com/ava-labs/avalanchego/actions/runs/$RUN_ID" | ||
| echo "" | ||
|
|
||
| echo "Waiting for benchmark completion..." |
👍 This looks polished. I probably would have just thrown a "set -x" here and called it a day but I like this better.
We need to get ava-labs/avalanchego#4650 merged before merging this PR. DO NOT MERGE.
    cp -rv docs/assets ./_site
    - name: Include benchmark data
      run: |
        git fetch origin benchmark-data || true
why are we ignoring errors here?
    git checkout origin/benchmark-data -- dev bench 2>/dev/null || true
    cp -rv dev _site/ 2>/dev/null || true
    cp -rv bench _site/ 2>/dev/null || true
same here. Seems like if these fail, the build should fail
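A sketch of the stricter variant being asked for here, assuming the benchmark-data branch and its dev/bench directories are expected to exist:

```bash
# Without the `|| true` / stderr suppression, a missing branch or
# missing files fails the build loudly instead of silently.
git fetch origin benchmark-data
git checkout origin/benchmark-data -- dev bench
cp -rv dev bench _site/
```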
    name: Track Performance

    on:
      workflow_dispatch:
Why is this manually dispatched? Shouldn't this happen automatically on pushes to main, or is this just an intermediate step for testing purposes?
If the latter, please change the PR description to show the followup or create another PR/task and link it.
    c-chain-benchmark:
      runs-on: ubuntu-latest
      permissions:
        contents: write # Required for github-action-benchmark to push to gh-pages
This comment was helpful as a good explanation of what this task does. IMO this would be better documented as a block comment or even better by changing the name from "c-chain-benchmark" to "record-bench-to-gh-pages".
    -f runner="${{ inputs.runner }}" \
    $LIBEVM_FLAG
    # Wait for GitHub API to register the workflow run (retry up to 3min)
Is this the same logic we use in just, or are we already out of sync with what just does?
    id: download
    shell: nix develop ./ffi --command bash {0}
    run: |
      just download-benchmark-results "${{ steps.trigger.outputs.run_id }}"
curious as to what happens if the benchmark fails to start at all
    - name: Summary
      run: |
        if [ "${{ steps.store.outcome }}" == "failure" ]; then
          echo "::warning::Benchmark storage failed - results were not saved to GitHub Pages"
looks like we keep going anyway. Is this expected?
Yes, this is intentional. The continue-on-error: true on the store step lets us still show the summary even if publishing to GitHub Pages fails.
The idea is: if the benchmark ran and we got results, we want to surface what was tested and provide links to the AvalancheGo run - even if we couldn't save to gh-pages. The warning ::warning::Benchmark storage failed... makes the failure visible without hiding the useful context.
Alternative would be failing the workflow, but then we lose the summary which makes debugging harder.
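Roughly, the pattern being described (the `uses:` line and step layout below are assumptions stitched together from the hunks quoted in this thread, not the PR's exact workflow):

```yaml
- name: Store benchmark results
  id: store
  continue-on-error: true   # a gh-pages push failure must not fail the job...
  uses: benchmark-action/github-action-benchmark@v1
  with:
    auto-push: true
    gh-pages-branch: benchmark-data

- name: Summary
  if: always()
  run: |
    # ...but it should still be loudly visible in the job summary.
    if [ "${{ steps.store.outcome }}" == "failure" ]; then
      echo "::warning::Benchmark storage failed - results were not saved to GitHub Pages"
    fi
```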
| echo "## Firewood Performance Benchmark Results" >> $GITHUB_STEP_SUMMARY | ||
| echo "" >> $GITHUB_STEP_SUMMARY | ||
| echo "**Configuration:**" >> $GITHUB_STEP_SUMMARY | ||
Group all these together (probably with the ones below too):
| echo "## Firewood Performance Benchmark Results" >> $GITHUB_STEP_SUMMARY | |
| echo "" >> $GITHUB_STEP_SUMMARY | |
| echo "**Configuration:**" >> $GITHUB_STEP_SUMMARY | |
| { | |
| echo "## Firewood Performance Benchmark Results" | |
| echo | |
| echo "**Configuration:**" | |
| # ... | |
| } > $GITHUB_STEP_SUMMARY |
It's a little particular about spacing though. There must be a space after the opening { and a newline or a semicolon ; before the closing }.
justfile
    # Wait for GitHub to register the run (retry up to 3min)
    for i in {1..18}; do
        sleep 10
Can we consolidate this loop and reduce the possibility of these getting out of sync?
A great example: let's assume we want to change this to check every 5 seconds. How many places would have to be updated?
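One hedged way to keep them in sync: have the workflow call the same entry point the justfile defines, so the cadence lives in exactly one place (the recipe name below is hypothetical):

```yaml
# Hypothetical workflow step delegating to the justfile's poll loop
# instead of inlining its own copy of the sleep/attempt counts.
- name: Wait for run to register
  shell: nix develop ./ffi --command bash {0}
  run: just wait-for-run-registration
```

Then changing to a 5-second cadence means editing one recipe.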
- Add bench-cchain-reexecution.sh script to trigger and monitor AvalancheGo benchmarks
- Add just commands: bench, bench-status, bench-list, bench-help
- Add track-performance.yml workflow to publish results to GitHub Pages
- Support predefined tests and custom block ranges via env vars
    # just bench <test> - predefined test
    # FIREWOOD_REF=v0.1.0 just bench firewood-101-250k - with specific ref
    # Custom: START_BLOCK=101 END_BLOCK=250000 BLOCK_DIR_SRC=<s3> CURRENT_STATE_DIR_SRC=<s3> just bench
    bench test="":
just provides the golden path for developers to run benchmarks locally. Same commands work in CI.
    fetch-depth: 0 # Needed for github-action-benchmark

    - name: Trigger C-Chain Reexecution Benchmark
      run: ./scripts/bench-cchain-reexecution.sh trigger "${{ inputs.test }}"
Not using just here since it's a thin wrapper around this script; this avoids installing Nix/just in CI for no benefit.
Why
Track C-Chain reexecution benchmark performance over time. Catch regressions before production.
Closes #1494
How
Firewood → triggers AvalancheGo benchmark → downloads results → publishes to GitHub Pages
Changes
scripts/bench-cchain-reexecution.sh - trigger, wait, download
.github/workflows/track-performance.yml - orchestrates benchmark + publish
justfile - bench, bench-status, bench-list, bench-help

Usage
    just bench firewood-101-250k
    FIREWOOD_REF=v0.1.0 just bench firewood-33m-40m
    just bench-status <run_id>
Related
AvalancheGo: ava-labs/avalanchego#4650