[RFE] Customizable criticality strings in detectors program text #305
Comments
Hello @chungktran, I understand the issue, and it is a legitimate request. That said, the current configuration is an intentionally opinionated implementation. For now, the severity of an alert follows the notification binding policy. Taking your example again, the system-common module has a cpu detector using unchangeable
The alert will still have the same severity, true, but no on-call is raised, only a non-disruptive Slack message, because it is not important in this case. Beyond the complexity that making this dynamic would involve, keeping it static is also a way to standardize a recommended severity (and its notification destination) and to force contributors to think about the most relevant severity when they create new detectors. Please tell me whether this notification binding makes sense for you and, if not, why it cannot work for you. Thanks.
@xp-1000 Thanks for your feedback. I'll see if I can have my requirements met by doing what you suggested. If you don't mind, please keep this request open for now.
@xp-1000 I reviewed the variables to see whether I could make what you suggested work for us; unfortunately, it doesn't. My use case is this: I have a team in OpsGenie that takes all alerts and applies alert routing rules based on Priority. With the current design of the detectors, I have to create a different team in OpsGenie for CRIT and MAJOR alerts because the detectors don't allow the With my proposal I could have just one OpsGenie team and make CRIT detectors send
Hello @chungktran, OK, thank you for the feedback! As I said before, the way we create and template our detectors is opinionated and "per severity, per rule" oriented, so as you can imagine it will be difficult to make this fully customizable. For example, I am thinking of the variable naming itself, which contains the severity rule name (e.g. If I understood your use case correctly, being able to change only the severity attribute: https://registry.terraform.io/providers/splunk-terraform/signalfx/latest/docs/resources/detector#severity should be good enough for you, am I right? This will be a little weird, because you will still have to use the variables of the original severity for a detector/rule whose severity you changed to another one (e.g. changing a severity rule from Is this enough for you? Honestly, making the whole thing fully dynamic, including the variable names, will be hard, and maybe the only solution is to use Terraform CDK to get all the features of a real programming language.
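As a rough illustration of the "change the severity attribute only" option discussed above, the sketch below exposes the `severity` of a rule as a variable while the published label stays `CRIT`. The variable name `cpu_crit_severity`, the resource name, and the simplified SignalFlow program are assumptions made for this example and are not taken from the module.

```hcl
# Hypothetical variable: only the rule severity is overridable.
# Its name still carries the original "crit" rule name.
variable "cpu_crit_severity" {
  description = "Severity reported by the rule bound to the CRIT label"
  type        = string
  default     = "Critical"

  validation {
    condition     = contains(["Critical", "Major", "Minor", "Warning", "Info"], var.cpu_crit_severity)
    error_message = "The severity must be one of Critical, Major, Minor, Warning or Info."
  }
}

resource "signalfx_detector" "cpu" {
  name = "System cpu utilization"

  # Simplified program for illustration; the published label is unchanged.
  program_text = <<-EOF
    signal = data('cpu.utilization').mean(over='5m')
    detect(when(signal > 90, lasting='5m')).publish('CRIT')
  EOF

  rule {
    description  = "is too high > 90"
    severity     = var.cpu_crit_severity # overridable, e.g. "Major"
    detect_label = "CRIT"                # still matches the publish label
  }
}
```

With this shape, a caller could set `cpu_crit_severity = "Major"` without touching the program text, at the cost of the naming mismatch mentioned in the comment above.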
Is your feature request related to a problem? Please describe.
Currently, the detectors' program text publishes CRIT, MAJOR, or WARN strings (I do not see INFO or DEBUG), which are in turn used by `detect_label` in the `rule`s to send out notifications. I would like to propose turning those into string variables so that a CRIT can be overridden to be a MAJOR, for example. The reason for this request is that an event that is CRIT for one team is not necessarily a CRIT for another team. A good example of this is the `smart-agent_system-common` detectors.
Describe the solution you'd like
Using `smart-agent_system-common` as an example.
In this file: https://github.com/claranet/terraform-signalfx-detectors/blob/master/modules/smart-agent_system-common/variables-gen.tf, define extra parameters for each detector. For example, `heartbeat_crit_value`, `heartbeat_major_value`, and `heartbeat_warn_value`. The majority of the modules will only need the first two extra parameters.
Then, in the module's detector tf file, https://github.com/claranet/terraform-signalfx-detectors/blob/master/modules/smart-agent_system-common/detectors-gen.tf, update the `program_text` of each resource to `.publish('${var.heartbeat_crit_value}')` or `.publish('${var.heartbeat_major_value}')`.
With the changes above, a CRIT event can be turned into any other event just by overriding the variable.
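A minimal sketch of what this proposal might look like for a single heartbeat detector follows. It assumes a simplified detector rather than the module's generated code: the resource name, the `20m` duration, and the unfiltered `cpu.utilization` signal are placeholders. Because a rule's `detect_label` has to match a label published in `program_text`, the sketch also assumes the rule would reference the same variable.

```hcl
# Variable named as proposed above; the default keeps the current behaviour.
variable "heartbeat_crit_value" {
  description = "Label published by the heartbeat detector for its critical rule"
  type        = string
  default     = "CRIT"
}

resource "signalfx_detector" "heartbeat" {
  name = "System heartbeat"

  # The publish label is interpolated from the variable instead of hard-coded.
  program_text = <<-EOF
    from signalfx.detectors.not_reporting import not_reporting
    signal = data('cpu.utilization').publish('signal')
    not_reporting.detector(stream=signal, resource_identifier=None, duration='20m').publish('${var.heartbeat_crit_value}')
  EOF

  rule {
    description  = "has not reported in 20m"
    severity     = "Critical"
    detect_label = var.heartbeat_crit_value # must match the published label
  }
}
```

Note that in this sketch overriding the variable only renames the published label; the `severity` attribute of the rule is a separate setting and stays `Critical` unless it is parameterized as well.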
Describe alternatives you've considered
I have read through the modules and have not found a solution for this request.