aaronbuchwald
Collaborator

This PR updates the Prometheus parameters passed to the run monitored tmpnet command.

Follow up to: #4149

@aaronbuchwald aaronbuchwald marked this pull request as ready for review September 25, 2025 18:21
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

Updates Prometheus configuration parameters in C-chain reexecution benchmark workflows to align with changes introduced in PR #4149.

  • Adds prometheus_url parameter alongside existing push URL parameters
  • Updates parameter naming from kebab-case to snake_case for consistency
  • Defines new prometheus-url input in the reusable action

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| .github/workflows/c-chain-reexecution-benchmark-gh-native.yml | Updates Prometheus parameter names and adds prometheus_url parameter |
| .github/workflows/c-chain-reexecution-benchmark-container.yml | Updates Prometheus parameter names and adds prometheus_url parameter |
| .github/actions/c-chain-reexecution-benchmark/action.yml | Adds new prometheus-url input definition |

@aaronbuchwald aaronbuchwald marked this pull request as draft September 25, 2025 18:31
@aaronbuchwald
Collaborator Author

TODO: align underscores and dashes for parameters between re-execution and run monitored tmpnet command to reduce confusion.

@aaronbuchwald
Collaborator Author

> TODO: align underscores and dashes for parameters between re-execution and run monitored tmpnet command to reduce confusion.

Opened a PR here, but I would prefer to keep kebab-case, since it appears to be more standard, as mentioned in the candidate PR description: #4349
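For illustration, the naming mismatch discussed above (for example `prometheus_url` versus `prometheus-url`) can be sketched in Go. This is a minimal sketch, not code from either PR: `toKebab` and `sameParam` are hypothetical helpers that only show how two spellings of one parameter could be compared after normalizing to kebab-case.

```go
package main

import (
	"fmt"
	"strings"
)

// toKebab rewrites a snake_case parameter name as kebab-case.
// Hypothetical helper for illustration only.
func toKebab(name string) string {
	return strings.ReplaceAll(name, "_", "-")
}

// sameParam reports whether two names refer to the same parameter
// once underscores and dashes are treated as equivalent.
func sameParam(a, b string) bool {
	return toKebab(a) == toKebab(b)
}

func main() {
	fmt.Println(toKebab("prometheus_url"))
	fmt.Println(sameParam("prometheus_url", "prometheus-url"))
}
```

Normalizing in one direction only keeps the comparison simple; the same idea works if snake_case is chosen as the canonical form instead.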

@aaronbuchwald aaronbuchwald marked this pull request as ready for review September 25, 2025 20:16
@maru-ava
Contributor

maru-ava commented Sep 26, 2025

Have you seen metrics being collected with this PR where it wasn't collected from master? That would be awfully strange, because only the push URL is used for metrics collection. The non-push URL is only used to query Prometheus for a non-zero set of the expected labels, and afaik the re-execute jobs are not performing such a check.
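The label check described above can be sketched roughly as follows. This is not tmpnet's implementation: `buildCountQuery` and `parseCount` are hypothetical names, and the sketch assumes only the query form that appears in the local log output later in this thread (`count({network_uuid="..."})`) and the standard response shape of the Prometheus HTTP API instant-query endpoint. No network request is made; the response is parsed from a sample body.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// buildCountQuery renders a PromQL count over a label selector,
// e.g. count({network_uuid="abc"}). Labels are sorted for stable output.
func buildCountQuery(labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, fmt.Sprintf("%s=%q", k, labels[k]))
	}
	return "count({" + strings.Join(parts, ",") + "})"
}

// queryResponse mirrors the subset of the Prometheus instant-query
// response that this sketch needs.
type queryResponse struct {
	Status string `json:"status"`
	Data   struct {
		Result []struct {
			// Value is [timestamp, "sample value as a string"].
			Value [2]interface{} `json:"value"`
		} `json:"result"`
	} `json:"data"`
}

// parseCount extracts the count from a response body. An empty result
// is treated as zero, since count() over no series returns no samples.
func parseCount(body []byte) (int, error) {
	var resp queryResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return 0, err
	}
	if resp.Status != "success" {
		return 0, fmt.Errorf("query status %q", resp.Status)
	}
	if len(resp.Data.Result) == 0 {
		return 0, nil
	}
	s, ok := resp.Data.Result[0].Value[1].(string)
	if !ok {
		return 0, errors.New("sample value is not a string")
	}
	return strconv.Atoi(s)
}

func main() {
	// "example-uuid" is a placeholder, not a real network UUID.
	fmt.Println(buildCountQuery(map[string]string{"network_uuid": "example-uuid"}))

	sample := []byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{},"value":[1759252450.05,"696"]}]}}`)
	count, err := parseCount(sample)
	fmt.Println(count, err)
}
```

A check built this way passes only when the count is non-zero, which is why it depends on the query (non-push) URL and on the metrics carrying the expected labels.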

@aaronbuchwald
Collaborator Author

> Have you seen metrics being collected with this PR where it wasn't collected from master? That would be awfully strange, because only the push URL is used for metrics collection. The non-push URL is only used to query Prometheus for a non-zero set of the expected labels, and afaik the re-execute jobs are not performing such a check.

Hmm, you're right, this is working fine. I'm not sure what the reported issue is, and I suspect it's related to a unique setup.

@aaronbuchwald
Collaborator Author

Closing for now as @maru-ava pointed out this is not the issue. May re-open with the suggested check that metrics have been successfully collected, but not sure where this is.

auto-merge was automatically disabled September 26, 2025 14:34

Pull request was closed

@github-project-automation github-project-automation bot moved this to Done 🎉 in avalanchego Sep 26, 2025
@aaronbuchwald aaronbuchwald reopened this Sep 26, 2025
@github-project-automation github-project-automation bot moved this from Done 🎉 to In Progress 🏗️ in avalanchego Sep 26, 2025
@aaronbuchwald
Collaborator Author

> Closing for now as @maru-ava pointed out this is not the issue. May re-open with the suggested check that metrics have been successfully collected, but not sure where this is.

Re-opened with the recommended check and populating network_uuid to fit tmpnet's expected set of labels.

@aaronbuchwald
Collaborator Author

Strange, tmpnet.CheckMetricsExist(...) is now failing in an unexpected way.

I can see the metrics show up including when adding the network_uuid filter here: https://grafana-poc.avax-dev.network/d/Gl1I20mnk/c-chain?orgId=1&refresh=10s&var-filter=is_ephemeral_node%7C%3D%7Cfalse&var-filter=gh_repo%7C%3D%7Cava-labs%2Favalanchego&var-filter=gh_run_id%7C%3D%7C18042855154&var-filter=gh_run_attempt%7C%3D%7C1&var-filter=gh_job_id%7C%3D%7Cc-chain-reexecution&var-filter=network_uuid%7C%3D%7Cb82518dd-ebca-4c8b-a263-2d17611bcb85&from=2025-09-26T15:59:56.000Z&to=2025-09-26T16:14:56.000Z&timezone=America%2FNew_York&var-datasource=P1809F7CD0C75ACF3&var-chain=C
[Screenshot 2025-09-30 at 13 16 01]

However, it fails to make a request to get the metrics and thus triggers a failure in CI: https://github.com/ava-labs/avalanchego/actions/runs/18042855154/job/51346204250?pr=4343#step:3:1486.

Running locally, this check works fine with the following output and using the shared Prometheus env vars:

[09-30|13:13:58.046] INFO c-chain-reexecution c/vm_reexecute_test.go:207 shutting down DB
[09-30|13:14:10.050] INFO prometheus tmpnet/check_monitoring.go:178 checking if metrics exist {"url": "https://prometheus-poc.avax-dev.network", "query": "count({network_uuid=\"9decc783-93f4-4b0b-8913-53f08d902580\"})"}
[09-30|13:14:10.137] INFO prometheus tmpnet/check_monitoring.go:45 collected count is non-zero {"type": "metrics", "count": 696}
BenchmarkReexecuteRange/[101,10000]-Config-default-Runner-dev-12                       1               190.3 mgas/s
PASS
ok      github.com/ava-labs/avalanchego/tests/reexecute/c       18.045s

I suspect this indicates that the environment variables (perhaps the username/password?) are not being set as expected within this CI run.
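One way to test that suspicion is to report which monitoring variables are unset or empty at the start of the job. The sketch below is hypothetical and not part of this PR: the variable names are assumptions for illustration and should be replaced with whatever the workflow actually exports. It prints only variable names, never values, so credentials do not end up in CI logs.

```go
package main

import (
	"fmt"
	"os"
)

// missingVars returns the names that are unset or empty according to lookup.
// Passing lookup as a parameter keeps the function testable without
// touching the real process environment.
func missingVars(names []string, lookup func(string) (string, bool)) []string {
	var missing []string
	for _, name := range names {
		if v, ok := lookup(name); !ok || v == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// Assumed variable names; adjust to match the actual CI configuration.
	required := []string{
		"PROMETHEUS_URL",
		"PROMETHEUS_PUSH_URL",
		"PROMETHEUS_USERNAME",
		"PROMETHEUS_PASSWORD",
	}
	if missing := missingVars(required, os.LookupEnv); len(missing) > 0 {
		fmt.Println("missing or empty:", missing)
		return
	}
	fmt.Println("all monitoring variables are set")
}
```

Treating an empty value the same as an unset one matters in CI, where a secret that is unavailable to a job can be substituted as an empty string rather than left undefined.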

@aaronbuchwald aaronbuchwald added this pull request to the merge queue Sep 30, 2025
Merged via the queue into master with commit e207817 Sep 30, 2025
35 checks passed
@aaronbuchwald aaronbuchwald deleted the aaronbuchwald/update-reexecution-prometheus-config branch September 30, 2025 18:03
@github-project-automation github-project-automation bot moved this from In Progress 🏗️ to Done 🎉 in avalanchego Sep 30, 2025