Healthcheck: correctly label nodes removed from metrics #4178
Commits a75df89 and 3cdd20d added new labels (dc/rack) to the healthcheck metrics and started setting them. Unfortunately, those labels weren't taken into account when removing the metrics of removed clusters/nodes. As a result, those metrics were not removed until SM restart. This commit fixes the issue and slightly refactors the way healthcheck labels are applied, so that it's harder to make such a mistake in the future. Fixes #4017
@karol-kokoszka @VAveryanov8 not sure about automated integration testing.
If there had been a test for this in integration-tests, it would have caught the bug before scylla-manager was released with it.
But I agree it's too much effort to add one and cover that scenario, especially since we are not really testing metrics in e2e scenarios.
Commits a75df89 and 3cdd20d added new labels (dc/rack) to the healthcheck metrics and started setting them. Unfortunately, those labels weren't taken into account when removing the metrics of removed clusters/nodes. As a result, those metrics were not removed until SM restart. This commit fixes the issue and slightly refactors the way healthcheck labels are applied, so that it's harder to make such a mistake in the future. Fixes #4017 (cherry picked from commit a16e511)
Commits a75df89 and 3cdd20d added and started setting new labels (dc/rack) in healthcheck metrics.
Unfortunately, those labels weren't taken into consideration when removing metrics of removed clusters/nodes:
scylla-manager/pkg/service/healthcheck/runner.go
Lines 116 to 121 in cafa851
As a result, those metrics were not removed until SM restart.
This commit fixes the issue and slightly refactors the way healthcheck labels are applied, so that it's harder to make such a mistake in the future.
Fixes #4017