Alert grouping test with amtool #3003

Open
freeseacher opened this issue Jul 13, 2022 · 4 comments · May be fixed by #4208

Comments

@freeseacher

What did you do?
We are now actively using amtool config routes test and find it extremely useful, but we recently realized that we should also check that alert grouping is as expected.
For example, right now we check the following:

% amtool config routes test --config.file alertmanager.yaml --tree \
--verify.receivers wire-team-opsgenie 'team=wire'
Matching routes:
.
└── default-route
    └── {team=~"^(?:^(wire)$)$"}  receiver: wire-team-opsgenie
wire-team-opsgenie

It would be useful if we could pass something like:

% amtool config routes test --config.file alertmanager.yaml \
--tree --verify.receivers wire-team-opsgenie \
--verify.grouping=env,cluster,priority 'team=wire'

Matching routes:
.
└── default-route
    └── {team=~"^(?:^(wire)$)$"}  receiver: wire-team-opsgenie
wire-team-opsgenie, grouping: [env,cluster,priority]

@gotjosh
Member

gotjosh commented Jul 14, 2022

I'm not sure I follow the usefulness of this - in your example where you include the grouping, what changed?

@freeseacher
Author

The main reason is routing with custom subroutes.
For example, I have something like:

- receiver: wire-team-opsgenie
  group_by:
    - env
    - cluster
    - priority
  match_re:
    team: ^(wire)$
  routes:
    - receiver: wire-team-opsgenie
      group_by:
        - alertname
        - cve
        - cluster
      match:
        alert_topic: security
    - receiver: wire-team-opsgenie
      group_by:
        - alertname
        - service
        - project
        - team
      match:
        alertname: QuotaCanBeReached

You can see that each alert is sent to the same receiver but with different grouping.
After Opsgenie, we create a Jira issue, and the alert grouping is the key to knowing whether we already had the same incident previously. So instead of opening a new Jira issue, we can append to the one already created.
That is why it's crucial to check that grouping is correct when changing Alertmanager configs.
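
To make the point concrete, here is a minimal Python sketch of how the effective grouping depends on which subroute matches. It is not Alertmanager's code: `Route` and `resolve` are hypothetical names, the tree mirrors the config above, and the sketch assumes the first matching child wins (it ignores `continue: true`) and that a route without its own `group_by` inherits the parent's.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Route:
    """Hypothetical stand-in for one node of the routing tree."""
    receiver: str
    group_by: Optional[list] = None  # None means "inherit from parent"
    match: dict = field(default_factory=dict)
    match_re: dict = field(default_factory=dict)
    routes: list = field(default_factory=list)

    def matches(self, labels):
        # A missing label is treated as an empty string.
        exact = all(labels.get(k, "") == v for k, v in self.match.items())
        regex = all(
            re.fullmatch(pattern, labels.get(k, "")) is not None
            for k, pattern in self.match_re.items()
        )
        return exact and regex


def resolve(route, labels, inherited=None):
    """Return (receiver, group_by) for the deepest matching route, or None."""
    if not route.matches(labels):
        return None
    group_by = route.group_by if route.group_by is not None else inherited
    for child in route.routes:
        hit = resolve(child, labels, group_by)
        if hit is not None:
            return hit  # assumption: first matching child wins
    return route.receiver, group_by


# The route from the config above, written out by hand.
tree = Route(
    receiver="wire-team-opsgenie",
    group_by=["env", "cluster", "priority"],
    match_re={"team": "^(wire)$"},
    routes=[
        Route("wire-team-opsgenie", ["alertname", "cve", "cluster"],
              match={"alert_topic": "security"}),
        Route("wire-team-opsgenie", ["alertname", "service", "project", "team"],
              match={"alertname": "QuotaCanBeReached"}),
    ],
)

for labels in ({"team": "wire"}, {"team": "wire", "alert_topic": "security"}):
    print(labels, "->", resolve(tree, labels))
```

Both label sets resolve to the same receiver, so a receiver-only check cannot tell them apart; only the grouping differs.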

I propose two things:

  1. Show the receiver's grouping when displaying the routing tree, perhaps here: https://github.com/prometheus/alertmanager/blob/main/cli/routing.go#L89
{team=~"^(?:^(wire)$)$"}  receiver: wire-team-opsgenie
wire-team-opsgenie, *grouping: [env,cluster,priority]*
  2. Add a new flag, verify.grouping, that checks whether the receiver got the expected grouping. Something like --verify.grouping[0]=[alertname,cve,cluster] might do the trick.
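
As a rough illustration of what the second proposal could check, the Python sketch below compares a comma-separated expectation against a route's grouping labels. Everything here is an assumption for illustration: `parse_grouping` and `verify_grouping` are hypothetical helpers, `--verify.grouping` is only the flag proposed in this issue (amtool may implement it differently or not at all), and the comparison assumes label order does not matter.

```python
def parse_grouping(flag_value):
    """Split a value such as 'env,cluster,priority' into label names."""
    return [name.strip() for name in flag_value.split(",") if name.strip()]


def verify_grouping(expected_flag, actual_group_by):
    """Return a list of problems; an empty list means the grouping matches.

    Assumption: grouping is compared as a set, so order is ignored.
    """
    expected = set(parse_grouping(expected_flag))
    actual = set(actual_group_by)
    problems = []
    missing = sorted(expected - actual)
    unexpected = sorted(actual - expected)
    if missing:
        problems.append("missing group_by labels: " + ", ".join(missing))
    if unexpected:
        problems.append("unexpected group_by labels: " + ", ".join(unexpected))
    return problems


# Expectation from the proposed flag versus the security subroute's grouping.
print(verify_grouping("env,cluster,priority", ["alertname", "cve", "cluster"]))
```

Returning a list of problems rather than a boolean lets a CLI print which labels differ before exiting non-zero.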

@freeseacher
Author

Hey folks! Any updates on this one?

@heartwilltell

Hey folks! Any updates on this one?

I have created a PR to implement your feature request.
