OpenVox GUI performance tuning

Shipped in stable 3.10.6 (July 2026). Pre-release train: 3.10.5-dev.1 through 3.10.5-dev.5.

How to keep the web UI snappy as fleets and chart pages grow. This is about the GUI application (FastAPI + React + uvicorn), not Puppet Server / PuppetDB JVM tuning — for those, see TUNING.md and ovox infra.

Symptoms and likely causes

| What you feel | Likely cause |
| --- | --- |
| Graphs take a long time after the spinner | PuppetDB query cost + Recharts paint of large series |
| UI freezes briefly on every auto-refresh | Full-page loader unmounting charts on poll (fixed in 3.10.5+) or main-thread chart animation |
| Whole app sluggish under multi-tab use | Single uvicorn worker, uncached dashboard, concurrent PDB load |
| First navigation to a metrics page is slow | Large JS chunk download (mitigated by code-split + vendor chunks) |

Overview | Dashboard + graph-heavy Insights pages

Dashboard-specific (largest cold-path win)

Why it used to feel slowest:

  1. Cold API path pulled up to 20k full PuppetDB report documents (metrics/resources/logs included) just to build hourly status trends.
  2. The UI showed a full-page spinner until that entire payload returned — no progressive paint.
  3. Auto-refresh unmounted the page on every poll (loading=true), so the ring + trends chart re-mounted repeatedly.

What we do now on Dashboard:

| Change | Effect |
| --- | --- |
| PuppetDB extract of certname, status, noop, receive_time only | Orders-of-magnitude smaller JSON; trends still correct |
| 20s server TTL + single-flight | Concurrent tabs/users share one PDB hit |
| Lighter chart (monotone, height 320) + deferred casual mascot | Faster first paint of ring + trends |
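As a rough illustration of why the four-field extract is enough for trends, the sketch below buckets lean report rows by hour and status. The row shape mirrors the fields named in the table above; the function name `hourly_status_trends`, the ISO-8601 timestamp format, the choice to count noop runs as their own series, and the sample rows are all assumptions, not the GUI's actual code.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def hourly_status_trends(rows):
    """Count report statuses per UTC hour from lean rows (hypothetical helper)."""
    buckets = defaultdict(Counter)
    for row in rows:
        # Assumption: receive_time is ISO-8601 with a trailing "Z".
        ts = datetime.fromisoformat(row["receive_time"].replace("Z", "+00:00"))
        hour = ts.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
        # Assumption: noop runs are charted as their own series.
        label = "noop" if row.get("noop") else row["status"]
        buckets[hour.isoformat()][label] += 1
    return {hour: dict(counts) for hour, counts in sorted(buckets.items())}


# Made-up sample rows carrying only the four extracted fields.
rows = [
    {"certname": "web01.example.com", "status": "changed", "noop": False,
     "receive_time": "2026-07-01T10:05:00Z"},
    {"certname": "web02.example.com", "status": "unchanged", "noop": False,
     "receive_time": "2026-07-01T10:40:00Z"},
    {"certname": "db01.example.com", "status": "failed", "noop": False,
     "receive_time": "2026-07-01T11:02:00Z"},
]
trends = hourly_status_trends(rows)
```

Nothing in the aggregation touches metrics, resources, or logs, which is why dropping them from the PuppetDB response does not change the chart.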

Shared UI pattern (all graph-heavy pages)

useApi({ cacheKey, cacheValidate }) + default keep-previous-data:

| Page | Session cache key prefix |
| --- | --- |
| Dashboard | openvox_dashboard_data_v2 |
| Compliance | openvox_metrics_compliance_v1_* |
| Run Performance | openvox_metrics_performance_v1_* |
| Fact Distribution | openvox_metrics_fact_overview_v1 |
| Class Coverage | openvox_metrics_class_coverage_v1 |
| Heatmap | openvox_metrics_heatmap_v1 |
| Classification | openvox_metrics_classification_v1 |
| Timeline | openvox_metrics_timeline_v1_* |
| Node Health | openvox_metrics_node_health_v1 |
| Environments | openvox_metrics_environments_v1 |
| Server Health | openvox_metrics_puppetserver_health_v1 |
| OpenVoxDB Health | openvox_metrics_puppetdb_health_v1 |

Return visits in the same tab paint the last-good snapshot immediately, then refresh in the background (“Refreshing…”). Auto-refresh no longer blanks charts.

If the Dashboard is still slow on the first login of the day, the remaining cost is co-located PuppetDB/CA latency for get_live_nodes() (active nodes ∩ signed certs). Check ovox infra health and PDB heap before raising GUI workers further.

What we optimized in the product

  1. Dashboard /api/dashboard/data — lean report extract + ≈20s TTL (single-flight); UI SWR + session cache (see above).
  2. Metrics / performance endpoint TTL cache (≈45s) — shared warm responses for compliance, fact overview, JMX health, run performance.
  3. GZip middleware — large JSON payloads compress over the wire.
  4. PuppetDB httpx pool — keep-alive connection limits so multi-chart pages reuse TLS sessions.
  5. uvicorn multi-worker + concurrency limits in the systemd unit (--workers, --limit-concurrency, --backlog, LimitNOFILE).
  6. Recharts animations off on operational charts; poll defaults 30s; monitoring history capped; series downsampled before bind.
  7. Vite manual chunks — recharts / Mantine / icons split so non-chart routes stay lighter.
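The TTL + single-flight behaviour from items 1 and 2 can be sketched as follows. This is a minimal asyncio illustration under stated assumptions, not the shipped implementation: the class name `SingleFlightTTLCache`, the `slow_fetch` stand-in for the PuppetDB call, and the payload shape are hypothetical.

```python
import asyncio
import time


class SingleFlightTTLCache:
    """Cache one async result for `ttl` seconds; concurrent misses share one fetch."""

    def __init__(self, fetch, ttl=20.0):
        self._fetch = fetch                  # async callable producing the payload
        self._ttl = ttl
        self._value = None
        self._expires_at = float("-inf")     # forces a fetch on first use
        self._lock = asyncio.Lock()

    async def get(self):
        if time.monotonic() < self._expires_at:
            return self._value               # warm hit, no lock needed
        async with self._lock:
            # Re-check: another waiter may have refreshed while we queued.
            if time.monotonic() < self._expires_at:
                return self._value
            self._value = await self._fetch()
            self._expires_at = time.monotonic() + self._ttl
            return self._value


async def demo():
    calls = 0

    async def slow_fetch():
        # Hypothetical stand-in for the expensive PuppetDB query.
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.05)
        return {"nodes": 3}

    cache = SingleFlightTTLCache(slow_fetch, ttl=20.0)
    # Five "tabs" ask at once; only one upstream fetch should run.
    results = await asyncio.gather(*(cache.get() for _ in range(5)))
    return calls, results


calls, results = asyncio.run(demo())
```

A cache like this lives in one process, so with `--workers N` each worker would warm its own copy. That is the gap the Redis row in the roadmap table addresses.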

Serving settings (uvicorn / systemd)

Defaults (template unit)

--workers 2
--limit-concurrency 100
--timeout-keep-alive 5
--backlog 2048
LimitNOFILE=65536
TasksMax=512

Raise workers for multi-core control planes

In /opt/openvox-gui/config/.env (or equivalent install dir):

# Explicit worker count (preferred). Each worker is a full Python process (~100–200 MB RSS).
OPENVOX_GUI_UVICORN_WORKERS=4

Then re-run deploy/update so the unit is rewritten, or edit ExecStart by hand and run systemctl daemon-reload && systemctl restart openvox-gui.

Guidance (co-located with Puppet Server + PuppetDB):

| Host CPUs | Suggested GUI workers | Notes |
| --- | --- | --- |
| 2 | 1–2 | Lab / tiny |
| 4 | 2 | Leave CPU for Server/PDB |
| 8+ | 3–4 | Cap at 4–6 unless GUI is dedicated |
| Dedicated GUI host | min(8, nproc-1) | Watch RAM |

Do not set workers so high that Puppet Server JRuby + PDB + GUI starve each other. Prefer ovox infra recommend for the Java side first.
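The guidance above can be turned into a small helper for picking a starting value. This is only a sketch: the function name `suggest_gui_workers` is hypothetical, it returns the conservative end of each range, and the values for CPU counts the table does not list (3, 5 to 7) are an interpolation rather than project guidance.

```python
import os


def suggest_gui_workers(cpus, dedicated=False):
    """Map host CPU count to a conservative starting uvicorn worker count."""
    if dedicated:
        # Dedicated GUI host: min(8, nproc-1), but never below one worker.
        return max(1, min(8, cpus - 1))
    if cpus <= 2:
        return 1      # table allows 1-2; start low on lab hosts
    if cpus < 8:
        return 2      # leave CPU for Puppet Server / PuppetDB
    return 3          # low end of the 3-4 range for 8+ CPUs


# os.cpu_count() can return None, so fall back to 1.
suggested = suggest_gui_workers(os.cpu_count() or 1)
print(f"OPENVOX_GUI_UVICORN_WORKERS={suggested}")
```

Treat the output as a first guess and confirm it against RSS per worker and Puppet Server / PuppetDB load before committing it to the .env file.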

Optional one-shot override during update

UVICORN_WORKERS_OVERRIDE=4 sudo -E bash scripts/update_local.sh

Verify

systemctl cat openvox-gui | grep -E 'ExecStart|LimitNOFILE|TasksMax'
ps -o pid,nlwp,rss,pcpu,cmd -C uvicorn
# or:
pgrep -af 'uvicorn app.main'

You should see a supervisor process plus N workers when --workers N is active (N>1).
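To check the configured count from a script, the ExecStart line printed by systemctl cat can be parsed. The sketch below assumes the unit passes `--workers N` or `--workers=N` on a single ExecStart line; the helper name and the sample path are hypothetical.

```python
import shlex


def workers_from_execstart(line):
    """Return the --workers value from an ExecStart line, or None if absent."""
    _, _, command = line.partition("=")      # drop the "ExecStart" key
    args = shlex.split(command)
    for i, arg in enumerate(args):
        if arg == "--workers" and i + 1 < len(args):
            return int(args[i + 1])
        if arg.startswith("--workers="):
            return int(arg.split("=", 1)[1])
    return None


# Hypothetical ExecStart line; the real path depends on the install dir.
sample = ("ExecStart=/opt/openvox-gui/venv/bin/uvicorn app.main:app "
          "--workers 4 --limit-concurrency 100")
configured = workers_from_execstart(sample)
```

Comparing this number with the worker processes listed by ps or pgrep shows whether the running service has picked up the rewritten unit.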

Frontend operator tips

  • Prefer 30s or 60s auto-refresh on graph-heavy pages (defaults are 30s).
  • Close unused Monitoring tabs — the SPA still collects history in the background (throttled when the tab is hidden).
  • After a deploy, hard-refresh once so new hashed chunks load.

Reverse proxy (Apache / nginx)

  • Prefer gzip/brotli for application/json and static assets (nginx sample already enables gzip for JSON).
  • Do not cache authenticated /api/* at the proxy without careful cache keys — app-level TTL is safer.
  • Keep proxy buffering defaults unless you stream very large Bolt output.

Measuring before/after

# Authenticated cookie/token required for /api/dashboard/data
time curl -sk -o /dev/null -w '%{time_total}s %{size_download}\n' \
  -H "Authorization: Bearer $TOKEN" \
  https://127.0.0.1:4567/api/dashboard/data

# Second call within 20s should be much faster (cache hit)

In the browser: DevTools → Network (API timing) and Performance (long tasks during chart paint).
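The same cold/warm comparison can be scripted. This is a minimal sketch using only the standard library; the helper names `timed` and `fetch_dashboard` are assumptions, and the demo uses a fake fetch so it runs without a live server.

```python
import ssl
import time
import urllib.request


def timed(fetch):
    """Run fetch() once and return (seconds, response size in bytes)."""
    start = time.perf_counter()
    body = fetch()
    return time.perf_counter() - start, len(body)


def fetch_dashboard(url, token):
    """GET the endpoint with a bearer token, skipping TLS checks like curl -k."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req, context=ctx, timeout=60) as resp:
        return resp.read()


# Fake fetch for the demo: the first call is slow (cold), later calls are fast.
state = {"calls": 0}


def fake_fetch():
    state["calls"] += 1
    if state["calls"] == 1:
        time.sleep(0.05)     # simulate the cold PuppetDB path
    return b"{}"


cold_s, cold_bytes = timed(fake_fetch)
warm_s, warm_bytes = timed(fake_fetch)
print(f"cold {cold_s:.3f}s ({cold_bytes} B), warm {warm_s:.3f}s ({warm_bytes} B)")

# Against a real install, something like:
# timed(lambda: fetch_dashboard("https://127.0.0.1:4567/api/dashboard/data", TOKEN))
```

Run the real call twice within the TTL window and again after it expires to see both the cache hit and the cold path.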

Roadmap / further options (not all shipped)

| Idea | Benefit | Cost |
| --- | --- | --- |
| React Query / SWR client cache | Cross-page request dedupe, stale-while-revalidate | New dependency + migration |
| Web Workers for heavy series transform | Keeps UI thread free | Complexity |
| uPlot / Canvas charts for wallboards | Faster than SVG Recharts at high point counts | Chart rewrite |
| Server-side pre-aggregates for trends | Smaller payloads | Background job + storage |
| Redis shared cache across workers | One cache for all workers | Ops dependency |

For most fleets, workers + TTL cache + no chart animation + sane poll intervals is the right place to start.

Related docs