Commit 3cd94b8

docs: correct metric names to match actual exported Prometheus names

The OpenTelemetry Prometheus exporter appends `_total` to counter instruments (see open-telemetry/opentelemetry-go#3360), so the metrics exposed at /metrics do not match the names listed in this document.

- Align the metric names in the table and sample output with the names the driver actually registers in pkg/secrets-store/stats_reporter.go.
- Add the `provider` tag to rotation_reconcile_total and rotation_reconcile_error_total, which the reporter already sets.
- Fix the k8s secret sync histogram name (k8s_secret_duration_sec).
- Note that metrics only appear on driver pods that have served a mount/unmount/rotation, which can be confusing when port-forwarding.

Fixes kubernetes-sigs#1937

Signed-off-by: Maksym Lushpenko <iviakciivi@gmail.com>
1 parent 0aad886 commit 3cd94b8

1 file changed

Lines changed: 44 additions & 40 deletions

docs/book/src/topics/metrics.md

@@ -6,16 +6,18 @@ Prometheus is the only exporter that's currently supported with the driver.
 
 ## List of metrics provided by the driver
 
+> **Note:** The OpenTelemetry Prometheus exporter appends a `_total` suffix to counter instruments (see [open-telemetry/opentelemetry-go#3360](https://github.com/open-telemetry/opentelemetry-go/pull/3360)). The names below are the names as they appear on the `/metrics` endpoint.
+
 | Metric | Description | Tags |
 | ------------------------------- | ------------------------------------------------------------------------- | --------------------------------------------------------------------------------- |
-| total_node_publish | Total number of successful volume mount requests | `os_type=<runtime os>`<br>`provider=<provider name>` |
-| total_node_unpublish | Total number of successful volume unmount requests | `os_type=<runtime os>` |
-| total_node_publish_error | Total number of errors with volume mount requests | `os_type=<runtime os>`<br>`provider=<provider name>`<br>`error_type=<error code>` |
-| total_node_unpublish_error | Total number of errors with volume unmount requests | `os_type=<runtime os>` |
-| total_sync_k8s_secret | Total number of k8s secrets synced | `os_type=<runtime os>`<br>`provider=<provider name>` |
-| sync_k8s_secret_duration_sec | Distribution of how long it took to sync k8s secret | `os_type=<runtime os>` |
-| total_rotation_reconcile | Total number of rotation reconciles | `os_type=<runtime os>`<br>`rotated=<true or false>` |
-| total_rotation_reconcile_error | Total number of rotation reconciles with error | `os_type=<runtime os>`<br>`rotated=<true or false>`<br>`error_type=<error code>` |
+| node_publish_total | Total number of successful volume mount requests | `os_type=<runtime os>`<br>`provider=<provider name>` |
+| node_unpublish_total | Total number of successful volume unmount requests | `os_type=<runtime os>` |
+| node_publish_error_total | Total number of errors with volume mount requests | `os_type=<runtime os>`<br>`provider=<provider name>`<br>`error_type=<error code>` |
+| node_unpublish_error_total | Total number of errors with volume unmount requests | `os_type=<runtime os>` |
+| sync_k8s_secret_total | Total number of k8s secrets synced | `os_type=<runtime os>`<br>`provider=<provider name>` |
+| k8s_secret_duration_sec | Distribution of how long it took to sync k8s secret | `os_type=<runtime os>` |
+| rotation_reconcile_total | Total number of rotation reconciles | `os_type=<runtime os>`<br>`provider=<provider name>`<br>`rotated=<true or false>` |
+| rotation_reconcile_error_total | Total number of rotation reconciles with error | `os_type=<runtime os>`<br>`provider=<provider name>`<br>`rotated=<true or false>`<br>`error_type=<error code>` |
 | rotation_reconcile_duration_sec | Distribution of how long it took to rotate secrets-store content for pods | `os_type=<runtime os>` |
 
 Metrics are served from port 8095, but this port is not exposed outside the pod by default. Use kubectl port-forward to access the metrics over localhost:
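
Reviewer note on the table hunk: dashboards or alert rules written against the previously documented names would need the same renames. The following is a minimal sketch of that mapping, assuming the old and new names are exactly those in the removed and added table rows. `RENAMES` and `migrate_expr` are hypothetical helper names and are not part of the driver.

```python
import re

# Previously documented name -> name listed in the updated table.
# Taken from the removed/added rows of the hunk; not read from the driver source.
RENAMES = {
    "total_node_publish": "node_publish_total",
    "total_node_unpublish": "node_unpublish_total",
    "total_node_publish_error": "node_publish_error_total",
    "total_node_unpublish_error": "node_unpublish_error_total",
    "total_sync_k8s_secret": "sync_k8s_secret_total",
    "sync_k8s_secret_duration_sec": "k8s_secret_duration_sec",
    "total_rotation_reconcile": "rotation_reconcile_total",
    "total_rotation_reconcile_error": "rotation_reconcile_error_total",
}

# Match whole metric names only (underscores count as word characters, so
# "total_node_publish" does not match inside "total_node_publish_error").
# An optional histogram series suffix is captured and preserved.
_OLD_NAME = re.compile(
    r"\b("
    + "|".join(re.escape(n) for n in sorted(RENAMES, key=len, reverse=True))
    + r")(_bucket|_sum|_count)?\b"
)


def migrate_expr(expr: str) -> str:
    """Rewrite old metric names in a query string to the updated names."""
    return _OLD_NAME.sub(
        lambda m: RENAMES[m.group(1)] + (m.group(2) or ""), expr
    )
```

For example, `migrate_expr('rate(total_node_publish_error{provider="azure"}[5m])')` rewrites only the metric name and leaves the selector and range untouched. The sketch does not add the new `provider` tag to rotation queries; that remains a manual decision.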
@@ -25,39 +27,41 @@ kubectl port-forward ds/csi-secrets-store -n kube-system 8095:8095 &
 curl localhost:8095/metrics
 ```
 
+> **Note:** Metrics are only emitted from driver pods that have actually performed the corresponding work (e.g. a volume mount, unmount, or rotation reconcile). When port-forwarding a single pod for validation, it is possible to hit a pod that has not yet served any requests and therefore exposes no driver-specific metrics. Trigger the relevant action against that pod, or query all pods, to see the metrics.
+
 ### Sample Metrics output
 
 ```shell
-# HELP sync_k8s_secret_duration_sec Distribution of how long it took to sync k8s secret
-# TYPE sync_k8s_secret_duration_sec histogram
-sync_k8s_secret_duration_sec_bucket{os_type="linux",le="0.1"} 0
-sync_k8s_secret_duration_sec_bucket{os_type="linux",le="0.2"} 0
-sync_k8s_secret_duration_sec_bucket{os_type="linux",le="0.3"} 0
-sync_k8s_secret_duration_sec_bucket{os_type="linux",le="0.4"} 1
-sync_k8s_secret_duration_sec_bucket{os_type="linux",le="0.5"} 1
-sync_k8s_secret_duration_sec_bucket{os_type="linux",le="1"} 1
-sync_k8s_secret_duration_sec_bucket{os_type="linux",le="1.5"} 1
-sync_k8s_secret_duration_sec_bucket{os_type="linux",le="2"} 1
-sync_k8s_secret_duration_sec_bucket{os_type="linux",le="2.5"} 1
-sync_k8s_secret_duration_sec_bucket{os_type="linux",le="3"} 1
-sync_k8s_secret_duration_sec_bucket{os_type="linux",le="5"} 1
-sync_k8s_secret_duration_sec_bucket{os_type="linux",le="10"} 1
-sync_k8s_secret_duration_sec_bucket{os_type="linux",le="15"} 1
-sync_k8s_secret_duration_sec_bucket{os_type="linux",le="30"} 1
-sync_k8s_secret_duration_sec_bucket{os_type="linux",le="+Inf"} 1
-sync_k8s_secret_duration_sec_sum{os_type="linux"} 0.3115892
-sync_k8s_secret_duration_sec_count{os_type="linux"} 1
-# HELP total_node_publish Total number of node publish calls
-# TYPE total_node_publish counter
-total_node_publish{os_type="linux",provider="azure"} 1
-# HELP total_node_publish_error Total number of node publish calls with error
-# TYPE total_node_publish_error counter
-total_node_publish_error{error_type="ProviderBinaryNotFound",os_type="linux",provider="azure"} 2
-total_node_publish_error{error_type="SecretProviderClassNotFound",os_type="linux",provider=""} 4
-# HELP total_node_unpublish Total number of node unpublish calls
-# TYPE total_node_unpublish counter
-total_node_unpublish{os_type="linux"} 1
-# HELP total_sync_k8s_secret Total number of k8s secrets synced
-# TYPE total_sync_k8s_secret counter
-total_sync_k8s_secret{os_type="linux",provider="azure"} 1
+# HELP k8s_secret_duration_sec Distribution of how long it took to sync k8s secret
+# TYPE k8s_secret_duration_sec histogram
+k8s_secret_duration_sec_bucket{os_type="linux",le="0.1"} 0
+k8s_secret_duration_sec_bucket{os_type="linux",le="0.2"} 0
+k8s_secret_duration_sec_bucket{os_type="linux",le="0.3"} 0
+k8s_secret_duration_sec_bucket{os_type="linux",le="0.4"} 1
+k8s_secret_duration_sec_bucket{os_type="linux",le="0.5"} 1
+k8s_secret_duration_sec_bucket{os_type="linux",le="1"} 1
+k8s_secret_duration_sec_bucket{os_type="linux",le="1.5"} 1
+k8s_secret_duration_sec_bucket{os_type="linux",le="2"} 1
+k8s_secret_duration_sec_bucket{os_type="linux",le="2.5"} 1
+k8s_secret_duration_sec_bucket{os_type="linux",le="3"} 1
+k8s_secret_duration_sec_bucket{os_type="linux",le="5"} 1
+k8s_secret_duration_sec_bucket{os_type="linux",le="10"} 1
+k8s_secret_duration_sec_bucket{os_type="linux",le="15"} 1
+k8s_secret_duration_sec_bucket{os_type="linux",le="30"} 1
+k8s_secret_duration_sec_bucket{os_type="linux",le="+Inf"} 1
+k8s_secret_duration_sec_sum{os_type="linux"} 0.3115892
+k8s_secret_duration_sec_count{os_type="linux"} 1
+# HELP node_publish_total Total number of node publish calls
+# TYPE node_publish_total counter
+node_publish_total{os_type="linux",provider="azure"} 1
+# HELP node_publish_error_total Total number of node publish calls with error
+# TYPE node_publish_error_total counter
+node_publish_error_total{error_type="ProviderBinaryNotFound",os_type="linux",provider="azure"} 2
+node_publish_error_total{error_type="SecretProviderClassNotFound",os_type="linux",provider=""} 4
+# HELP node_unpublish_total Total number of node unpublish calls
+# TYPE node_unpublish_total counter
+node_unpublish_total{os_type="linux"} 1
+# HELP sync_k8s_secret_total Total number of k8s secrets synced
+# TYPE sync_k8s_secret_total counter
+sync_k8s_secret_total{os_type="linux",provider="azure"} 1
 ```
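
Reviewer note on the second hunk: the new note about pods that expose no driver-specific metrics can be checked mechanically against the text returned by `curl localhost:8095/metrics`. The sketch below is a deliberately simplified parser for that text, assuming output shaped like the sample above. It ignores escaped characters inside label values and skips lines that carry a trailing timestamp. `EXPECTED`, `parse_samples`, `missing_metrics`, and `counter_total` are hypothetical names chosen for illustration, and `EXPECTED` is copied from the updated table rather than from the driver source.

```python
import re

# Metric names from the updated table (assumption: the table is complete).
EXPECTED = (
    "node_publish_total",
    "node_unpublish_total",
    "node_publish_error_total",
    "node_unpublish_error_total",
    "sync_k8s_secret_total",
    "k8s_secret_duration_sec",
    "rotation_reconcile_total",
    "rotation_reconcile_error_total",
    "rotation_reconcile_duration_sec",
)

# name, optional {labels}, then a single value token.
_SAMPLE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def parse_samples(text):
    """Return a list of (name, labels, value) tuples for each sample line."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if match is None:
            continue  # unsupported shape, e.g. a trailing timestamp
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def missing_metrics(text):
    """Return the expected metric names that have no '# TYPE' line."""
    declared = set()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE":
            declared.add(parts[2])
    return sorted(set(EXPECTED) - declared)


def counter_total(samples, name):
    """Sum one counter across all of its label combinations."""
    return sum(value for n, _, value in samples if n == name)
```

Fed the sample output above, `missing_metrics` would list the unmount-error and rotation metrics, which is consistent with a pod that has not yet served those requests, and `counter_total(parse_samples(text), "node_publish_error_total")` adds the two error series (2 and 4). For anything beyond a quick validation, a Prometheus client library parser is likely the more robust choice.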
