@zirain
Member

@zirain zirain commented Dec 9, 2025

xref: #7606

Add a default stats_tags config to improve the Prometheus metric output.

  • add rule for to_zone and from_zone
# TYPE envoy_sds_default_example_cert_init_fetch_timeout counter
envoy_cluster_zone_1__upstream_rq_time_count{envoy_cluster_name="tracing"} 12 # before
envoy_cluster_zone_upstream_rq_time_count{from_zone="1",envoy_cluster_name="default/tracing-datadog/default/backend"} 0 #after
  • add rule for socket_match_name
# TYPE envoy_cluster_total_match_count counter
envoy_cluster_default_total_match_count{envoy_cluster_name="default/datadog-tracing"} 7 #before
envoy_cluster_total_match_count{socket_match_name="default",envoy_cluster_name="default/datadog-tracing"} 7 #after
  • add rule for circuit breakers priority
# TYPE envoy_cluster_circuit_breakers_default_cx_open gauge
envoy_cluster_circuit_breakers_default_cx_open{envoy_cluster_name="default/datadog-tracing"} 0 #before
envoy_cluster_circuit_breakers_cx_open{priority="default",envoy_cluster_name="default/datadog-tracing"} 0 #after
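
For readers unfamiliar with how a stats tag rule rewrites a metric, here is a minimal Python sketch of the circuit breakers priority rule. It assumes Envoy's documented tag semantics (the first capture group is removed from the stat name, the second becomes the tag value). The helper `extract_tag` and the raw stat name `cluster.example.circuit_breakers.default.cx_open` are hypothetical and only for illustration; the sketch does not model the default `envoy_cluster_name` extraction or the Prometheus name formatting.

```python
import re


def extract_tag(stat_name: str, pattern: str, tag_name: str):
    """Apply one tag rule to a stat name (hypothetical helper, not Envoy code).

    Capture group 1 is cut out of the name; capture group 2 (or group 1
    when the pattern has only one group) is used as the tag value.
    """
    match = re.search(pattern, stat_name)
    if match is None:
        # The rule does not apply: name unchanged, no tags.
        return stat_name, {}
    start, end = match.span(1)
    value = match.group(2) if match.re.groups >= 2 else match.group(1)
    return stat_name[:start] + stat_name[end:], {tag_name: value}


# Assumed raw stat name; the cluster name "example" is made up.
raw = "cluster.example.circuit_breakers.default.cx_open"
name, tags = extract_tag(raw, r"circuit_breakers\.((.+?)\.).+", "priority")
print(name, tags)
```

Under these assumptions the priority segment moves out of the metric name and into a `priority` tag, which is the same shape as the before/after lines above.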

Before:
before.txt

After:

after.txt

@zirain zirain requested a review from a team as a code owner December 9, 2025 05:40
@codecov

codecov bot commented Dec 9, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.30%. Comparing base (b9123a8) to head (136dd59).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7701      +/-   ##
==========================================
+ Coverage   72.28%   72.30%   +0.02%     
==========================================
  Files         234      234              
  Lines       34480    34480              
==========================================
+ Hits        24924    24931       +7     
+ Misses       7767     7760       -7     
  Partials     1789     1789              

jukie
jukie previously approved these changes Dec 9, 2025
Comment on lines +26 to +36
stats_config:
  use_all_default_tags: true
  stats_tags:
  - regex: \.zone(\.(([^\.]+)\.))
    tag_name: from_zone
  - regex: \.zone\.[^\.]+\.(([^\.]+)\.)
    tag_name: to_zone
  - regex: "^cluster(\\..+\\.(.+))\\.total_match_count$"
    tag_name: socket_match_name
  - regex: "circuit_breakers\\.((.+?)\\.).+"
    tag_name: priority
Contributor

What would it look like for a user to opt out of this? One consideration is that zone metrics are likely only relevant when using Zone Aware Routing and would only work if TopologyInjector is enabled (which is the default, so not a huge deal).

Member Author

That's something we need to figure out.
This PR makes some breaking changes, which IMO are good ones. Should we add a feature gate for this?

Contributor

By breaking change you mean the metric name, right? That doesn't seem too disruptive, so as long as it's mentioned in the release notes I'm still okay with it.

Having this as the default seems reasonable so I don't think a feature gate is necessary. Users could modify the EnvoyProxy config to revert this if needed.

Member Author

I 100% agree with you.

cc @envoyproxy/gateway-maintainers PTAL

@jukie jukie requested review from a team December 10, 2025 02:13
@arkodg
Contributor

arkodg commented Dec 11, 2025

@zirain can you add a diff of before/after to the PR description, so it's easier to understand the breaking stat changes?

@zirain
Member Author

zirain commented Dec 11, 2025

@zirain can you add a diff of before/after to the PR description, so it's easier to understand the breaking stat changes?

updated
