Add --tls_reload_interval_secs for automatic TLS cert hot reload #7236

Open
oliverlii wants to merge 1 commit into dragonflydb:main from oliverlii:feature/tls-cert-hot-reload

@oliverlii

Summary

Production TLS deployments rotate certificates periodically (e.g. via cert-manager, Vault, or ACME). Today Dragonfly requires a manual CONFIG SET tls_cert_file / tls_key_file / tls true sequence via redis-cli to pick up renewed certs, which is error-prone and does not scale across large fleets.

Changes

  • Add --tls_reload_interval_secs flag (minimum 60, default 0 = disabled)
  • Add TlsReloadScheduling() background fiber that periodically stat()s the configured cert, key, and CA cert files; when any mtime changes, calls ReconfigureTLS() on every listener
  • Zero downtime: SSL_CTX is refcounted, so existing connections keep their sessions and only new handshakes use the updated certs
  • Fiber exits immediately when TLS is not enabled or interval is 0
  • Wire up Init/Shutdown lifecycle (tls_reload_fb_, tls_reload_done_)

Testing

  • Integration test test_tls_hot_reload: starts the server with TLS, overwrites the cert/key files on disk with certs from a different CA, waits for the reload timer, then verifies that the old CA is rejected, the new CA is accepted, and an existing connection survives

Production TLS deployments rotate certificates periodically (e.g. via
cert-manager, Vault, or ACME). Today Dragonfly requires a manual
CONFIG SET tls_cert_file / tls_key_file / tls true sequence via
redis-cli to pick up renewed certs, which is error-prone and does not
scale across large fleets.

Add a new flag --tls_reload_interval_secs (minimum 60, default 0 =
disabled) that launches a background fiber to periodically stat() the
configured cert, key, and CA files. When any file's mtime changes, the
fiber calls ReconfigureTLS() on every listener, which atomically swaps
the SSL_CTX. Existing connections are unaffected (SSL_CTX is
refcounted); only new handshakes use the updated certificates. Zero
downtime, zero keyspace impact.
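
The refcounting argument can be illustrated with a small model. This is a sketch under stated assumptions, not Dragonfly's listener code: `TlsContext` stands in for SSL_CTX, `SketchListener` is a hypothetical class, `std::shared_ptr` plays the role of the reference count, and a `std::mutex` is used where the real server would rely on its own fiber-aware synchronization.

```cpp
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Stand-in for SSL_CTX; a real context would hold parsed certs and keys.
struct TlsContext {
  std::string cert_label;
};

// Hypothetical listener model: holds the context used for new handshakes.
class SketchListener {
 public:
  explicit SketchListener(std::shared_ptr<const TlsContext> ctx)
      : ctx_(std::move(ctx)) {}

  // Called per new handshake: the connection takes its own reference.
  std::shared_ptr<const TlsContext> AcquireContext() const {
    std::lock_guard<std::mutex> lk(mu_);
    return ctx_;
  }

  // Called on reload: swaps the pointer. The old context stays alive for as
  // long as any existing connection still holds a reference to it.
  void Reconfigure(std::shared_ptr<const TlsContext> fresh) {
    std::lock_guard<std::mutex> lk(mu_);
    ctx_ = std::move(fresh);
  }

 private:
  mutable std::mutex mu_;
  std::shared_ptr<const TlsContext> ctx_;
};
```

A connection that called `AcquireContext()` before `Reconfigure()` keeps seeing the old context, while later callers get the new one. This mirrors OpenSSL, where each SSL object created with SSL_new holds its own reference to its SSL_CTX.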

- Add TlsReloadScheduling() fiber with mtime-based change detection
- Wire up Init/Shutdown lifecycle (tls_reload_fb_, tls_reload_done_)
- Add integration test: overwrite certs on disk, wait for reload,
  verify old CA rejected and new CA accepted

@oliverlii force-pushed the feature/tls-cert-hot-reload branch from cc8185f to 8735e4a (April 29, 2026 20:30)