
Add metrics unavailable UI and checks #600

Open: psanghvi17 wants to merge 1 commit into main from feat/metrics-unavailable-banner


@psanghvi17 psanghvi17 commented Jun 22, 2026

Contributor

PR Checklist

  • Linting Test is passing
  • Code is well documented
  • If applicable, a PR in the epinio/docs repository has been opened

Summary

Fixes #
The dashboard and application detail pages display CPU and memory metrics. When no metrics server is installed, those views previously showed a broken or empty UI with no explanation. This PR detects unavailable metrics and shows a clear warning banner, with graceful fallbacks in the instances table.

Occurred changes and/or fixed issues

Dashboard (pages/c/_cluster/dashboard.vue)

Track metrics availability via a new metricsStatus state (unknown | available | unavailable).
Mark metrics unavailable when the node metrics schema is missing or the metrics API fetch fails.
Show a warning banner with guidance when metrics are unavailable.
Show the existing “available resources” info banner only when metrics are available.
In Rancher extension mode, also detect unavailable metrics from application replica metricsOk flags.

Application detail (detail/applications.vue)

Show a warning banner instead of the CPU/memory stats table when the app has instances but metrics are unavailable.
Format the Milli CPUs and RAM instance columns as “not available” when metricsOk is false.

Application model (models/applications.js)

Add a metricsOk getter derived from deployment.replicas; it returns true when all replicas report metrics as available.

Localization (l10n/en-us.yaml)

Add epinio.intro.metrics.notAvailable (full banner message).
Add epinio.intro.metrics.notAvailableShort (table cell fallback).

No backend changes; the UI uses the existing replica metricsOk data and node metrics API responses.
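
For reviewers who want a mental model of the metricsOk getter, here is a minimal sketch written as a standalone function rather than a model getter. It assumes deployment.replicas is either an array or a name-keyed map of replica objects, and it treats a replica with no metricsOk field as "not OK"; both points are assumptions, not something this description confirms.

```javascript
// Hypothetical standalone version of the getter. In the PR it is a getter on
// the application model and would read this.deployment instead of a parameter.
function metricsOk(deployment) {
  // Object.values handles both an array and a name-keyed map of replicas.
  const replicas = Object.values((deployment && deployment.replicas) || {});

  // every() returns true for an empty list, which gives the "no replicas yet
  // means no warning" default described in the technical notes.
  // Assumption: a replica without a metricsOk field counts as not OK.
  return replicas.every((replica) => replica.metricsOk === true);
}
```

A single replica reporting metricsOk: false is enough to make the whole application report false, which matches the "all replicas" wording above.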

Technical notes summary

Dashboard standalone mode: checks for METRIC.NODE schema and fetches /k8s/clusters/{id}/v1/metrics.k8s.io.nodemetrics; failures set metricsStatus to unavailable.
Dashboard Rancher extension mode: scans loaded applications and checks deployment.replicas[].metricsOk because cluster-level metrics may not be directly available the same way.
Application detail: showMetricsUnavailable is true when the app has instances and app.metricsOk is false (any replica missing metrics).
Stats table (min/max/avg CPU and memory) is hidden when metrics are unavailable; instances table still renders with “not available” in metric columns.
metricsOk defaults to true when there are no replicas yet, so apps without deployments do not show a false warning.
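
The two detection paths above could look roughly like the sketch below. It is illustrative only: hasNodeMetricsSchema and fetchNodeMetrics are hypothetical injected helpers standing in for the schema lookup and the node metrics request, and the shape of the applications list is assumed.

```javascript
// The three states named in the PR description.
const MetricsStatus = Object.freeze({
  UNKNOWN: 'unknown',
  AVAILABLE: 'available',
  UNAVAILABLE: 'unavailable',
});

// Standalone mode: a missing schema or a failed fetch both mean "unavailable".
// hasNodeMetricsSchema and fetchNodeMetrics are hypothetical helpers.
async function detectStandaloneStatus({ hasNodeMetricsSchema, fetchNodeMetrics }) {
  if (!hasNodeMetricsSchema()) {
    return MetricsStatus.UNAVAILABLE; // early return, nothing to fetch
  }
  try {
    await fetchNodeMetrics();
    return MetricsStatus.AVAILABLE;
  } catch (err) {
    return MetricsStatus.UNAVAILABLE;
  }
}

// Rancher extension mode: look at replica flags on the loaded applications.
// Assumption: replicas is an array or a name-keyed map, and only an explicit
// metricsOk === true counts as available.
function anyAppReportsMetricsUnavailable(applications) {
  return applications.some((app) => {
    const replicas = Object.values((app.deployment && app.deployment.replicas) || {});
    return replicas.some((replica) => replica.metricsOk !== true);
  });
}
```

Applications without replicas never trigger the warning in this sketch, in line with the default described in the last note.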

Areas or cases that should be tested

Open the Epinio dashboard: the warning banner appears (“Metrics are not enabled…”).
Confirm the old broken/empty metrics UI is not shown.
Open an application detail page with running instances: the warning banner appears in place of the stats table.
Instances tab: the Milli CPUs and RAM columns show “not available” instead of broken values.
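
When testing the instance columns, it may help to keep the expected fallback in mind. The sketch below is a hypothetical wrapper, not the PR's actual formatter: the function name and the injected formatWhenAvailable callback are assumptions, and the literal string stands in for the epinio.intro.metrics.notAvailableShort translation.

```javascript
// Stand-in for the localized epinio.intro.metrics.notAvailableShort string.
const NOT_AVAILABLE = 'not available';

// Hypothetical cell formatter: fall back to the short message when the
// replica has no metrics, otherwise defer to the existing column formatter.
function formatMetricCell(value, metricsOk, formatWhenAvailable) {
  if (!metricsOk) {
    return NOT_AVAILABLE;
  }
  return formatWhenAvailable(value);
}
```

With metrics available, the output should be exactly what the previous column formatter produced, which is the regression to watch for.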

Areas which could experience regressions

Dashboard metrics info banner: logic changed from v-if="showMetricsInfo" to v-else-if="showMetricsInfo" behind the unavailable check; confirm the info banner still appears when metrics work.
calcAvailableResources: new early return when the schema is missing; verify it does not block other dashboard initialization.
Rancher extension detection: showMetricsUnavailable uses app replica data in single-product mode; false positives or negatives are possible if metricsOk is missing or stale in the API response.
Application stats table: hidden entirely when metrics are unavailable; users lose the min/max/avg view even if partial data existed before.
Instance table formatters: custom formatters replace the milliCPUs and memory column formatters; verify that the display when metrics are available matches previous behavior.
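
The banner precedence from the first item can be expressed as a small pure function, which makes the v-if / v-else-if ordering easy to reason about. This is only a model of the template logic; the function name and return values are hypothetical.

```javascript
// Hypothetical model of the dashboard template chain:
//   v-if="showMetricsUnavailable"  -> warning banner
//   v-else-if="showMetricsInfo"    -> info banner
function visibleBanner({ showMetricsUnavailable, showMetricsInfo }) {
  if (showMetricsUnavailable) {
    return 'warning';
  }
  if (showMetricsInfo) {
    return 'info';
  }
  return null; // neither banner is rendered
}
```

The case worth checking by hand is showMetricsUnavailable false with showMetricsInfo true, since that is the path that used to be a plain v-if.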

Screenshot/Video

Show a clear warning when cluster metrics are not available, instead of a broken or empty UI on the dashboard and application detail screens.
- Add a `metricsOk` getter on `EpinioApplicationModel` to aggregate replica metric state (aligned with CLI behaviour)
- Track `metricsStatus` and `showMetricsUnavailable` on the cluster dashboard, with detection for standalone and Rancher-embedded modes
- Display a warning banner on the dashboard and application detail page when metrics are unavailable
- Replace the CPU/RAM stats table with the banner on app detail when metrics are disabled
- Show "not available" in instance table CPU/RAM columns instead of misleading zero values
- Add i18n strings for the unavailable metrics message