-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve gorouter Route Registry Message Metrics #445
Comments
Submodule src/code.cloudfoundry.org/gorouter e768fec0..5174c184: > Fix integration maxheader test for gorouter (cloudfoundry#445) > Remove extra printf > Set ReadHeaderTimeout on gorouter and route-services servers Submodule src/code.cloudfoundry.org/routing-acceptance-tests b6732e58..2fc354ee: > Add read header timeout to address G112 (cloudfoundry#24)
At first I want to apologize for the late reply :-/ If I understood correctly from the issue description and I would suggest that you don't touch the existing metrics, add new metrics for the things that you need and hide them behind some config parameter. We've done something already with @stefanlay when we added config param to control the emission of HTTP start stop events... I know that this might not be the best solution, but at the moment with the current tools at hand, I don't see any other option. |
I agree with Jovan that the existing metrics should not be replaced, if at all possible. Using a tag to identify the action, as an alternative to adding new metrics seems fine to me. I don't view new tags as a breaking change, and they should allow querying along the dimensions that you're interested in. However, if there are limitations with the dropsonde library that makes tags hard to use, I'd suggest the easiest way forward to be what Jovan suggested: keep the old metrics as they are, add new metrics per-action using a feature flag to toggle them on and off. |
Thanks for all the input! In our setup we won't be able to support tags with the loggregator stack because it relies on the graphite line protocol. We are now investing in making the gorouter metrics available via prometheus (#451) to be able to scrape them directly, including the tags. |
Just a small clarification: Loggregator it self doesn't rely on Graphite. The problem which we have is that we have an internal consumer which relies on Graphite which doesn't support tags. |
Proposed Change
As a developer and/or operator,
I want to know the number of routes added, updated and not modified via route register Registry Messages. I also want to know the number of routes removed and endpoints removed via route unregister Registry Messages.
So that I can measure the number of each action.
Currently, gorouter is emitting one metric for Registry.Register, and one metric for Registry.Unregister. Hence, we know the sum of routes added, updated and unmodified, and the number of Endpoints unregistered, including endpoint and route unregistration.
As performance cost for these actions differ (e.g. route update is more expensive than route unmodified), it is favorable to enhance the current metric for (un)registration, so that it can be split by actions.
Acceptance criteria
Route.Register
are enhanced by the update action (added, updated, unmodified)Route.Unregister
are enhanced by the update action (endpoint removed, route removed)Proposed solution
Pull-Request: cloudfoundry/gorouter#456
For
route.Register
, we replaceregistry_message[.component]
with these metrics:registry_message.endpoint-added[.component]
registry_message.endpoint-updated[.component]
registry_message.endpoint-not-updated[.component]
For
route.Unregister
, we replaceunregistry_message[.component]
with these metrics:unregistry_message.endpoint-unregistered[.component]
unregistry_message.endpoint-not-unregistered[.component]
Additionally, when the route is [removed|https://github.com/cloudfoundry/gorouter/blob/aa6650d27dc7ba82f6b75bcae1a061f7594c5b58/registry/registry.go#L203], we emit:
unregistry_message.route-unregistered[.component]
The log messages have also been renamed, so metric names and log messages are the same in each case.
The text was updated successfully, but these errors were encountered: