Operator Level 4 Capability Level - Deep Insights: Monitoring and Alerting #205

Open
4 tasks
rm3l opened this issue Feb 16, 2024 · 2 comments
Labels
jira Issue will be sync'ed to Red Hat JIRA

Comments

@rm3l
Member

rm3l commented Feb 16, 2024

See https://sdk.operatorframework.io/docs/overview/operator-capabilities/#level-4---deep-insights

Goal
Set up full monitoring and alerting for your operand. All resources, such as Prometheus rules (alerts) and Grafana dashboards, should be created by the operator when the operand CR is instantiated.

TODO

  • Add the ability in the CRD to create a ServiceMonitor resource (follow-up to "Integration with OpenShift logging and Monitoring", #180)
  • Implement Prometheus metrics in the RHDH Operator for Backstage CR reconciliation failure/success
  • Implement Grafana dashboards to monitor: a) whether the operator is up and running, and how long it has been running; b) memory and CPU consumption by the operator
  • Implement alerts so that when the operator is down, certain actions are triggered, e.g., a notification is sent to the user's Slack channel, a Jira ticket is created, etc.
@github-actions bot added the jira label Feb 16, 2024
@rm3l changed the title from "Add ability in CRD to create a ServiceMonitor resource." to "[Epic] Monitoring" Feb 16, 2024
@jianrongzhang89 changed the title from "[Epic] Monitoring" to "[Epic] RHDH Operator Monitoring" Feb 16, 2024
@gazarenkov
Member

Do we also consider upstream (vanilla K8s) here, or OpenShift only?

@rm3l
Member Author

rm3l commented Feb 26, 2024

Do we also consider upstream (vanilla K8s) here, or OpenShift only?

Not only OpenShift, I think. It should work on both.
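
One possible way to support both, sketched below under stated assumptions: ServiceMonitor is a custom resource from the Prometheus Operator (API group `monitoring.coreos.com`), so on a vanilla cluster it may not be installed. The operator could create the ServiceMonitor only when the CR asks for it and the cluster serves that API group. The function `shouldCreateServiceMonitor` and its parameters are hypothetical. In a real operator, the list of served groups would come from the Kubernetes discovery API rather than a hard-coded slice.

```go
package main

import "fmt"

// serviceMonitorGroup is the API group of the Prometheus Operator CRDs,
// which includes ServiceMonitor.
const serviceMonitorGroup = "monitoring.coreos.com"

// shouldCreateServiceMonitor reports whether a ServiceMonitor should be
// created: only when monitoring was requested in the CR and the cluster
// serves the Prometheus Operator API group.
// Hypothetical helper for illustration; not taken from the operator's code.
func shouldCreateServiceMonitor(requested bool, servedGroups []string) bool {
	if !requested {
		return false
	}
	for _, g := range servedGroups {
		if g == serviceMonitorGroup {
			return true
		}
	}
	return false
}

func main() {
	// Example API group lists; real values would come from cluster discovery.
	withoutPromOperator := []string{"apps", "batch", "networking.k8s.io"}
	withPromOperator := []string{"apps", "batch", "networking.k8s.io", serviceMonitorGroup}

	fmt.Println(shouldCreateServiceMonitor(true, withoutPromOperator)) // false
	fmt.Println(shouldCreateServiceMonitor(true, withPromOperator))    // true
}
```

With a check like this, enabling monitoring on a cluster without the Prometheus Operator would skip the ServiceMonitor instead of failing the reconciliation.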

@rm3l changed the title from "[Epic] RHDH Operator Monitoring" to "RHDH Operator Monitoring" Feb 26, 2024
@rm3l changed the title from "RHDH Operator Monitoring" to "Operator Level 4 Capability Level - Deep Insights: Monitoring and Alerting" Feb 27, 2024