jorgelon/metrics-stack

# Metrics stack

This is a non-production, opinionated deployment of a Prometheus instance that collects metrics from Kubernetes and other applications.
It is meant to be deployed in the "monitoring" namespace of a kubeadm Kubernetes cluster, but it can work in other Kubernetes clusters.

## Components

| Name                                     | Notes                 | Link                                                              |
|------------------------------------------|-----------------------|-------------------------------------------------------------------|
| Kustomize                                |                       | <https://github.com/kubernetes-sigs/kustomize>                    |
| Prometheus Operator                      | (official manifests)  | <https://prometheus-operator.dev/>                                |
| Grafana Operator                         | (official manifests)  | <https://grafana.github.io/grafana-operator/>                     |
| Metrics server                           |                       | <https://github.com/kubernetes-sigs/metrics-server>               |
| Kube State Metrics                       | (hand made manifests) | <https://github.com/kubernetes/kube-state-metrics>                |
| Prometheus Node exporter                 | (hand made manifests) | <https://github.com/prometheus/node_exporter>                     |
| Some argocd related settings             | (like waves)          | <https://argo-cd.readthedocs.io/en/stable/user-guide/sync-waves/> |
| Stakater Reloader annotations in grafana |                       | <https://github.com/stakater/Reloader>                            |

### Additional rules and dashboards

| Name                          | Link                                                            |
|-------------------------------|-----------------------------------------------------------------|
| Prometheus Alert Rules        | <https://github.com/bdossantos/prometheus-alert-rules>          |
| Awesome Prometheus Alerts     | <https://samber.github.io/awesome-prometheus-alerts/rules.html> |
| Grafana Dashboards Kubernetes | <https://github.com/dotdc/grafana-dashboards-kubernetes>        |
| Kubernetes Monitoring Mixin   | <https://github.com/kubernetes-monitoring/kubernetes-mixin>     |
| Monitoring Mixins             | <https://monitoring.mixins.dev/>                                |

## How to use it

- Clone a tag of this repository.
- Create a kustomization.yaml that loads the stack and the desired apps and addons. See the notes below about configuring each app.
- Configure an AlertmanagerConfig resource to send alerts somewhere.
- Expose the desired services (Grafana, Karma, Prometheus, ...) using an Ingress, the Gateway API, or another solution.
- Add the required label `app.kubernetes.io/part-of: metrics-stack`.

Example:

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../releases/edge/addons/karma
  - ../../releases/edge/apps/control-plane
  - ../../releases/edge/apps/core
  - ../../releases/edge/apps/coredns
  - ../../releases/edge/stack
  - alertmanagerconfig.yaml
  - grafana-env.yaml
  - ingress.yaml
  - pvc-grafana.yaml
patches:
  - path: overlays/prometheus.yaml
  - path: overlays/grafana-datasource-loki.yaml
labels:
  - pairs:
      app.kubernetes.io/part-of: metrics-stack # required
      app.kubernetes.io/managed-by: kustomize
      app.kubernetes.io/instance: MY-CLUSTER
      app.kubernetes.io/version: PUT-THE-TAG-HERE
```

## Prometheus

The Prometheus instance is very basic and has no spec.storage section. Use Kustomize patches to customize it.
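
As an illustration, a patch such as the `overlays/prometheus.yaml` file referenced in the example above could add persistent storage. This is only a sketch: the resource name `prometheus`, the `monitoring` namespace in the patch, the retention period and the volume size are assumptions, and the name and namespace must match the Prometheus resource actually defined by the stack.

```yaml
# overlays/prometheus.yaml (hypothetical content)
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus        # assumption: must match the name used by the stack
  namespace: monitoring   # assumption: must match the namespace used by the stack
spec:
  retention: 7d           # illustrative value
  storage:
    volumeClaimTemplate:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 20Gi # illustrative value
```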

## Grafana

- The Grafana instance expects a PVC called "grafana", which you must create beforehand.
- It can be configured via environment variables using a secret called "grafana-env".
- It has not been designed to run with more than 1 replica.
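
A minimal sketch of the two resources mentioned above could look like this. The resource names come from this README; the namespace, storage size, access mode and the variables in the secret are assumptions to adapt. Grafana reads settings from environment variables named `GF_<SECTION>_<KEY>`.

```yaml
# pvc-grafana.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana            # name expected by the Grafana instance
  namespace: monitoring    # assumption
spec:
  accessModes:
    - ReadWriteOnce        # enough for a single replica
  resources:
    requests:
      storage: 1Gi         # illustrative value
---
# grafana-env.yaml
apiVersion: v1
kind: Secret
metadata:
  name: grafana-env        # name expected by the Grafana instance
  namespace: monitoring    # assumption
type: Opaque
stringData:
  GF_SECURITY_ADMIN_USER: admin          # placeholder
  GF_SECURITY_ADMIN_PASSWORD: CHANGE-ME  # placeholder
```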

## Alertmanager

- Create an AlertmanagerConfig resource to send alerts where you want.
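
For example, an `alertmanagerconfig.yaml` that forwards every alert to a webhook might be sketched as follows. The resource name, receiver name, timings and URL are hypothetical, and whether the Alertmanager instance selects this resource depends on how the stack configures its selectors.

```yaml
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: default-receiver   # hypothetical name
  namespace: monitoring    # assumption
spec:
  route:
    receiver: webhook
    groupBy:
      - alertname
    groupWait: 30s         # illustrative timings
    groupInterval: 5m
    repeatInterval: 4h
  receivers:
    - name: webhook
      webhookConfigs:
        - url: https://alerts.example.com/hook  # placeholder URL
          sendResolved: true
```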

## Keycloak

This stack does not include a ServiceMonitor for Keycloak. You must create it yourself.
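
A ServiceMonitor for Keycloak could be sketched as below. Everything specific here is an assumption: the `keycloak` namespace, the service label, the port name `management` and the `/metrics` path all depend on how Keycloak is deployed, and Keycloak must have metrics enabled. Listing the file in the kustomization shown earlier would also apply the required `app.kubernetes.io/part-of: metrics-stack` label.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: keycloak                         # hypothetical name
  namespace: monitoring                  # assumption
spec:
  namespaceSelector:
    matchNames:
      - keycloak                         # assumption: namespace where Keycloak runs
  selector:
    matchLabels:
      app.kubernetes.io/name: keycloak   # assumption: label on the Keycloak Service
  endpoints:
    - port: management                   # assumption: name of the Service port exposing metrics
      path: /metrics
      interval: 30s                      # illustrative value
```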

## x509-certificate-exporter

- CloudNativePG self-signed certificates

The CloudNativePG operator automatically renews its certificates by default, so the x509-certificate-exporter alerts on them are false positives.
To ignore them, you can patch the whole PrometheusRule:

```yaml
  - alert: "CertificateRenewal"
    expr: (x509_cert_not_after - time()) < (28 * 86400) and x509_cert_not_after{secret_name!="cnpg-webhook-cert",subject_CN!="streaming_replica"}
  - alert: "CertificateExpiration"
    expr: (x509_cert_not_after - time()) < (14 * 86400) and x509_cert_not_after{secret_name!="cnpg-webhook-cert",subject_CN!="streaming_replica"}
```

## ETCD

The etcd app only works if etcd metrics are exposed on 0.0.0.0. Sometimes etcd metrics are only exposed on 127.0.0.1. In that case, find your own way to send the metrics to Prometheus; you can still use the dashboard under the etcd folder.
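
On a kubeadm cluster, one possible way to expose the metrics on all interfaces is the etcd flag `listen-metrics-urls`, set through the kubeadm ClusterConfiguration. This is a sketch assuming the `v1beta3` kubeadm API (where `extraArgs` is a map) and the metrics port 2381 that kubeadm uses by default. Note that this endpoint is served over plain HTTP without authentication, so restrict access to it at the network level.

```yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
etcd:
  local:
    extraArgs:
      # kubeadm binds this to 127.0.0.1 by default
      listen-metrics-urls: http://0.0.0.0:2381
```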

The etcd dashboard is the official one, slightly modified.
