📌 Proposal: Sandbox Session Heartbeat Monitoring + Snapshot Recovery #288
Related to #380
Background / Motivation
Currently, our sandbox session management suffers from several issues:
This can eventually exhaust system capacity ("blast" the resource limits).
Runtime configurations such as MCP settings, tool list caches, and environment variables are lost.
Our goals are:
🎯 Feature Objectives
Heartbeat Mechanism
Automatically record a container's last active time (last_active_timestamp).
Update the heartbeat whenever the user interacts with the sandbox session.
Store mapping in Redis or in-memory:
Idle Timeout Handling
now - last_active > timeout (default: 5 minutes, configurable) → trigger container destruction.
Snapshot Saving
Snapshot Restoration
add_mcp_servers(...)
set_runtime_config(...)
Extensibility
Design snapshot format with versioning for future additions:
{ "version": "v1", "mcp_servers": [...], "runtime_config": {...}, "secret_token": "..." }
Add more runtime state in later iterations without breaking compatibility.
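One way the versioned format could stay compatible is sketched below. It assumes readers reject unknown versions, default any missing fields, and carry unrecognized keys along untouched. The names `load_snapshot`, `KNOWN_FIELDS`, and the `extra` key are illustrative and not part of the proposal.

```python
import json
from typing import Any, Dict

SNAPSHOT_VERSION = "v1"  # schema version named in the proposal
# Fields listed in the proposal's v1 example.
KNOWN_FIELDS = {"version", "mcp_servers", "runtime_config", "secret_token"}


def load_snapshot(raw: str) -> Dict[str, Any]:
    """Parse a snapshot document and normalize it to the v1 shape."""
    data = json.loads(raw)
    version = data.get("version")
    if version != SNAPSHOT_VERSION:
        # A real reader might migrate older versions here instead of failing.
        raise ValueError(f"unsupported snapshot version: {version!r}")
    return {
        "version": version,
        "mcp_servers": data.get("mcp_servers", []),
        "runtime_config": data.get("runtime_config", {}),
        "secret_token": data.get("secret_token"),
        # Keys this reader does not know are preserved rather than dropped.
        "extra": {k: v for k, v in data.items() if k not in KNOWN_FIELDS},
    }
```

Defaulting missing fields lets a reader accept snapshots written before a field existed, and preserving unknown keys avoids losing state written by a newer component.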
🏗 Module Design
1. Heartbeat Recording Module
On each sandbox interaction (e.g., _establish_connection), call update_heartbeat(session_ctx_id).
Data structure:
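A minimal in-memory sketch of such a structure follows, assuming the mapping is keyed by `session_ctx_id` and stores an epoch timestamp in seconds. The `HeartbeatStore` class, its `clock` parameter, and the `last_active` and `remove` helpers are hypothetical. Only `update_heartbeat` and `session_ctx_id` come from the proposal.

```python
import threading
import time
from typing import Callable, Dict, Optional


class HeartbeatStore:
    """In-memory mapping: session_ctx_id -> last_active_timestamp (epoch seconds)."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock            # injectable so tests can control time
        self._lock = threading.Lock()  # the scanner may run on another thread
        self._last_active: Dict[str, float] = {}

    def update_heartbeat(self, session_ctx_id: str) -> float:
        """Record 'now' as the session's last active time and return it."""
        now = self._clock()
        with self._lock:
            self._last_active[session_ctx_id] = now
        return now

    def last_active(self, session_ctx_id: str) -> Optional[float]:
        """Return the stored timestamp, or None for an unknown session."""
        with self._lock:
            return self._last_active.get(session_ctx_id)

    def remove(self, session_ctx_id: str) -> None:
        """Forget a session, e.g. after its container is destroyed."""
        with self._lock:
            self._last_active.pop(session_ctx_id, None)
```

A Redis-backed variant could expose the same three methods so that callers do not depend on where the mapping lives.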
2. Heartbeat Scanning Module
Periodically scan session_mapping for sessions whose idle time exceeds the timeout.
Example scan interval:
HEARTBEAT_SCAN_INTERVAL = 60 seconds
HEARTBEAT_TIMEOUT = 300 seconds
3. Snapshot Saving Module
During the release() process:
Save snapshot.json in the mount directory.
Upload the snapshot to object storage (storage_path).
Snapshot file example:
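A minimal sketch of how such a file could be written is shown here, using the field names from the versioned format described earlier in this proposal. The `save_snapshot` function and its parameters are hypothetical, and the upload to object storage is left out.

```python
import json
from pathlib import Path
from typing import Any, Dict, List, Optional

SNAPSHOT_VERSION = "v1"  # schema version named in the proposal


def save_snapshot(
    mount_dir: str,
    mcp_servers: List[Dict[str, Any]],
    runtime_config: Dict[str, Any],
    secret_token: Optional[str] = None,
) -> Path:
    """Write snapshot.json into the mount directory and return its path."""
    snapshot = {
        "version": SNAPSHOT_VERSION,
        "mcp_servers": mcp_servers,
        "runtime_config": runtime_config,
        "secret_token": secret_token,
    }
    path = Path(mount_dir) / "snapshot.json"
    # Write to a temporary file first, then rename, so a crash during
    # release() cannot leave a half-written snapshot.json behind.
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(snapshot, indent=2), encoding="utf-8")
    tmp.replace(path)
    return path
```

Writing to a temporary file and renaming it is an assumption about the desired robustness. A plain `write_text` to the final path would also work if partial writes are not a concern.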
4. Snapshot Restoration Module
Copy snapshot.json to the new mount directory.
add_mcp_servers(snapshot.mcp_servers, overwrite=true)
set_runtime_config(snapshot.runtime_config)
Restore secret_token if applicable.
🔄 End-to-End Flow
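The four modules can be tied together roughly as follows. This is a sketch under stated assumptions rather than the intended implementation: the `Sandbox` class is a hypothetical stand-in for a real container, snapshots are held in a plain dict instead of object storage, `secret_token` handling is omitted, and the signatures of `add_mcp_servers` and `set_runtime_config` are guessed from the calls named in this proposal.

```python
from typing import Any, Dict, List

HEARTBEAT_TIMEOUT = 300  # seconds (example value from the proposal)


class Sandbox:
    """Hypothetical stand-in for a sandbox container."""

    def __init__(self) -> None:
        self.mcp_servers: List[Any] = []
        self.runtime_config: Dict[str, Any] = {}

    def add_mcp_servers(self, servers: List[Any], overwrite: bool = False) -> None:
        self.mcp_servers = list(servers) if overwrite else self.mcp_servers + list(servers)

    def set_runtime_config(self, config: Dict[str, Any]) -> None:
        self.runtime_config = dict(config)


def take_snapshot(sandbox: Sandbox) -> Dict[str, Any]:
    """Capture the runtime state that should survive container destruction."""
    return {
        "version": "v1",
        "mcp_servers": list(sandbox.mcp_servers),
        "runtime_config": dict(sandbox.runtime_config),
    }


def scan_once(
    session_mapping: Dict[str, float],
    sandboxes: Dict[str, Sandbox],
    snapshots: Dict[str, Dict[str, Any]],
    now: float,
    timeout: float = HEARTBEAT_TIMEOUT,
) -> List[str]:
    """Snapshot and destroy every session idle for longer than `timeout`."""
    expired = [sid for sid, last in session_mapping.items() if now - last > timeout]
    for sid in expired:
        snapshots[sid] = take_snapshot(sandboxes.pop(sid))  # save, then destroy
        del session_mapping[sid]
    return expired


def restore(session_ctx_id: str, snapshots: Dict[str, Dict[str, Any]]) -> Sandbox:
    """Create a fresh sandbox and re-apply the saved state, if any exists."""
    sandbox = Sandbox()
    snap = snapshots.get(session_ctx_id)
    if snap is not None:
        sandbox.add_mcp_servers(snap["mcp_servers"], overwrite=True)
        sandbox.set_runtime_config(snap["runtime_config"])
    return sandbox
```

In a deployment, `scan_once` would be invoked every HEARTBEAT_SCAN_INTERVAL seconds with the current time, and `restore` would run when a user reconnects to a session whose container has been reclaimed.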
📦 Data Storage Layout
Redis Keys:
Object Storage Paths:
⚙️ Configurable Parameters
HEARTBEAT_TIMEOUT → idle timeout duration (seconds)
HEARTBEAT_SCAN_INTERVAL → period between scans (seconds)
SNAPSHOT_VERSION → current snapshot schema version
REDIS_ENABLED → whether to store heartbeat data in Redis or in memory
📈 Benefits
🗳 Review Points