Allow Grafana dashboards to filter by namespace #33

@NT-florianernst

Hello there,

In the course of integrating InspectIT Ocelot into the Novatec CPJ context (cf. TC-301), I found that the Grafana dashboards for Prometheus don't allow filtering by Kubernetes namespace; they always display the services found across all namespaces.

I understand that such filtering isn't needed in many contexts, e.g. when monitoring a plain Docker-based environment or when no containers are involved at all. But in a Kubernetes context where the cluster is partitioned by namespace and similar workloads (with identical names) run in separate namespaces, as during CPJ or when running multiple stages in a single cluster, it really is a must-have (unless one wants a separate monitoring infrastructure per namespace). So for CPJ purposes I adjusted the Grafana dashboards for Prometheus ad hoc to allow such filtering; you will find my forked versions attached for illustration. I'm not posting a PR, as I don't consider them complete enough for integration (see below).
These versions introduce a `namespace` variable that queries `label_values(namespace)`, adjust the `service` variable to respect it by querying `label_values({namespace="$namespace"},service)` (i.e. a selector not bound to a specific metric), and add the `namespace` variable to all dashboard expressions that reference `service`.
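
The adjustment described above could be sketched roughly as follows. This is only an illustration in Python, not the content of the attached dashboards: the dictionary shape loosely follows Grafana's dashboard JSON for query variables, the helper names are hypothetical, and the metric name `http_in_count` in the usage note is a placeholder.

```python
import re

def namespace_variable() -> dict:
    # Query variable listing all values of the "namespace" label.
    return {
        "name": "namespace",
        "type": "query",
        "query": "label_values(namespace)",
        "refresh": 1,
    }

def service_variable() -> dict:
    # Service list restricted to the selected namespace; the selector
    # is not bound to a specific metric.
    return {
        "name": "service",
        "type": "query",
        "query": 'label_values({namespace="$namespace"},service)',
        "refresh": 1,
    }

# Matches service="$service" or service=~"$service" in a label selector.
_SERVICE_MATCHER = re.compile(r'\bservice\s*(?:=~|=)\s*"\$service"')

def add_namespace_matcher(expr: str) -> str:
    """Put a namespace matcher in front of every $service matcher.

    Assumes the expression does not already contain a namespace matcher;
    applying the function twice would add the matcher twice.
    """
    return _SERVICE_MATCHER.sub(
        lambda m: 'namespace="$namespace", ' + m.group(0), expr
    )
```

For example, `add_namespace_matcher('rate(http_in_count{service="$service"}[1m])')` would yield `rate(http_in_count{namespace="$namespace", service="$service"}[1m])`.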

What these versions don't yet do is default to showing data from all namespaces (if there are any) or all data (if there are no namespaces at all). If they did, their defaults would be functionally equivalent to the current dashboards while still allowing the filtering required in a Kubernetes cluster context, and they would then be basically suitable for replacing the current ones.
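
One possible way to get such a default, sketched under assumptions rather than tested against the attached dashboards: give the namespace variable an "All" option whose value is the regex `.*`, and switch the namespace matchers from `=` to `=~`. In PromQL a regex matcher that also matches the empty string selects series lacking the label entirely, so this should cover the "no namespaces at all" case too. A selector consisting only of such matchers is not valid, which is why the service query below adds `service!=""`. The helper names are hypothetical.

```python
def with_all_option(variable: dict) -> dict:
    """Return a copy of a Grafana query variable that defaults to 'All'."""
    updated = dict(variable)
    updated.update({
        "includeAll": True,   # offer an "All" entry in the dropdown
        "allValue": ".*",     # what $namespace expands to for "All"
        "current": {"text": "All", "value": "$__all"},
    })
    return updated

def service_query() -> str:
    # service!="" keeps the selector valid when $namespace expands to ".*".
    return 'label_values({namespace=~"$namespace", service!=""}, service)'

def to_regex_matcher(expr: str) -> str:
    """Turn exact namespace matchers into regex matchers."""
    return expr.replace('namespace="$namespace"', 'namespace=~"$namespace"')

namespace_var = with_all_option({
    "name": "namespace",
    "type": "query",
    "query": "label_values(namespace)",
})
```

With this, an expression such as `up{namespace="$namespace", service="$service"}` would become `up{namespace=~"$namespace", service="$service"}` and match every namespace while "All" is selected.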
Also, it seems that some dashboard parts cannot yet handle multiple separately labeled metrics for one service, as can easily happen with horizontal (auto-)scaling on Kubernetes. Those parts then either display the dreaded "Only queries that return single series/table is supported" error or show multiple identically named graphs for separate containers, making it hard to tell which values apply to which container.
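
Two directions that might help here, again only as a hedged Python sketch: aggregate the per-container series into one series for panels that expect a single value, and include a distinguishing label in the legend for graphs. The label name `pod`, the metric name `jvm_memory_used`, and the helper names are assumptions; which label actually identifies a container depends on the Prometheus scrape configuration, and `sum` is not the right aggregation for every metric.

```python
def aggregate(expr: str, by=("namespace", "service"), op: str = "sum") -> str:
    """Collapse per-container series into one series per service."""
    return f'{op} by ({", ".join(by)}) ({expr})'

def graph_target(expr: str, label: str = "pod") -> dict:
    """Build a Prometheus panel target whose legend shows the given label."""
    return {
        "expr": expr,
        # Grafana replaces {{label}} with the label's value per series.
        "legendFormat": "{{" + label + "}}",
    }

single_value_expr = aggregate('jvm_memory_used{service="$service"}')
target = graph_target('jvm_memory_used{service="$service"}')
```

Here `single_value_expr` is `sum by (namespace, service) (jvm_memory_used{service="$service"})`, and `target["legendFormat"]` is `{{pod}}`.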
Furthermore, these versions don't cover the Grafana dashboards for InfluxDB, of course, as that has been out of scope for my purposes so far.

As such, I'm filing this issue mainly to raise awareness of the troubles involved in a Kubernetes context, without having a complete solution for them. But I'm willing to investigate further how these troubles could best be tackled if there is interest in seeing this integrated into Ocelot.

All the best,
Florian

PS: Here are my local versions of the dashboards (renamed to .txt, as GitHub won't accept them as .json):
ocelot-gc.json.txt
ocelot-http.json.txt
ocelot-jvm.json.txt
ocelot-self.json.txt
ocelot-service.json.txt

Labels: enhancement