OpenSandbox is a universal sandbox platform designed for AI application scenarios, providing a complete solution with multi-language SDKs, standardized sandbox protocols, and flexible runtime implementations. This document describes the overall architecture and design philosophy of OpenSandbox.
The OpenSandbox architecture consists of four main layers:
- SDKs Layer - Client libraries for interacting with sandboxes
- Specs Layer - OpenAPI specifications defining the protocols
- Runtime Layer - Server implementations managing sandbox lifecycle
- Sandbox Instances Layer - Running sandbox containers with injected execution daemons
The SDK layer provides high-level abstractions for developers to interact with sandboxes. It handles communication with both the Sandbox Lifecycle API and the Sandbox Execution API.
The Sandbox class is the primary entry point for managing sandbox lifecycle:
- Create: Provision new sandbox instances from container images
- Manage: Monitor sandbox state, renew expiration, retrieve endpoints
- Destroy: Terminate sandbox instances when no longer needed
Key Features:
- Async/await support for non-blocking operations
- Automatic state polling for provisioning progress
- Resource quota management (CPU, memory, GPU)
- Metadata and environment variable injection
- TTL-based automatic expiration with renewal
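The create step above boils down to assembling a request body for `POST /sandboxes`. A minimal sketch in Python, assuming illustrative field names (`resources`, `env`, `metadata`, `ttl_seconds`) rather than the exact schema from the Lifecycle Spec:

```python
import json

def build_create_request(image, entrypoint=None, cpu=None, memory=None,
                         env=None, metadata=None, ttl_seconds=None):
    """Assemble a body for POST /sandboxes.

    Field names here are illustrative assumptions, not the exact
    schema defined in specs/sandbox-lifecycle.yml.
    """
    body = {"image": image}
    if entrypoint is not None:
        body["entrypoint"] = entrypoint
    resources = {k: v for k, v in {"cpu": cpu, "memory": memory}.items()
                 if v is not None}
    if resources:
        body["resources"] = resources
    if env:
        body["env"] = env
    if metadata:
        body["metadata"] = metadata
    if ttl_seconds is not None:
        body["ttl_seconds"] = ttl_seconds
    return json.dumps(body)

payload = build_create_request("python:3.11", cpu=2, memory="2Gi",
                               env={"MODE": "dev"}, ttl_seconds=600)
```

Optional fields are omitted entirely rather than sent as null, which keeps the request minimal for servers that apply defaults.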
The Filesystem component provides comprehensive file operations within sandboxes:
- CRUD Operations: Create, read, update, and delete files and directories
- Bulk Operations: Upload/download multiple files efficiently
- Search: Glob-based file searching with pattern matching
- Permissions: Manage file ownership, group, and mode (chmod)
- Metadata: Retrieve file info including size, timestamps, permissions
Use Cases:
- Uploading code files and dependencies
- Downloading execution results and artifacts
- Managing workspace directories
- Searching for files by pattern
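To illustrate the glob-based search, here is a client-side sketch of how pattern matching behaves, using Python's stdlib `fnmatch`; the real matching happens inside the sandbox via `GET /files/search`, and the sample paths are invented:

```python
from fnmatch import fnmatch

# Hypothetical paths as the search endpoint might walk them
paths = [
    "/workspace/main.py",
    "/workspace/utils/helpers.py",
    "/workspace/README.md",
    "/workspace/data/results.csv",
]

def glob_search(paths, pattern):
    """Return paths whose full path matches a glob pattern."""
    return [p for p in paths if fnmatch(p, pattern)]

# Note: fnmatch's "*" also crosses "/" separators, so this matches
# .py files at any depth under /workspace
matches = glob_search(paths, "*.py")
```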
The Commands component enables shell command execution within sandboxes:
- Foreground Execution: Run commands synchronously with real-time output streaming
- Background Execution: Launch long-running processes in detached mode
- Stream Support: Capture stdout/stderr via Server-Sent Events (SSE)
- Process Control: Interrupt running commands via context cancellation
- Working Directory: Specify custom working directory for command execution
Use Cases:
- Running build commands (e.g., `npm install`, `pip install`)
- Executing system utilities (e.g., `git`, `docker`)
- Starting web servers or services
- Running test suites
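Since output arrives as Server-Sent Events, a client needs to split the byte stream into events. A minimal parser sketch, assuming illustrative event names (`stdout`, `stderr`, `exit`) rather than the exact ones in the Execution Spec:

```python
def parse_sse(raw):
    """Split a raw SSE stream into (event, data) pairs.

    Events are separated by a blank line; each may carry "event:"
    and one or more "data:" fields per the SSE format.
    """
    events = []
    for block in raw.strip().split("\n\n"):
        event, data = None, []
        for line in block.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data.append(line[len("data:"):].strip())
        events.append((event, "\n".join(data)))
    return events

# Simulated stream from a command run (event names are assumptions)
stream = (
    "event: stdout\ndata: added 12 packages\n\n"
    "event: stderr\ndata: npm warn deprecated left-pad\n\n"
    "event: exit\ndata: 0\n\n"
)
events = parse_sse(stream)
```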
The CodeInterpreter component provides stateful code execution across multiple programming languages:
- Multi-Language Support: Python, Java, JavaScript, TypeScript, Go, Bash
- Session Management: Maintain execution state across multiple code blocks
- Jupyter Integration: Built on Jupyter kernel protocol for robust execution
- Result Streaming: Real-time output via SSE with execution counts
- Error Handling: Structured error responses with tracebacks
Key Features:
- Variable persistence across executions within same session
- Display data in multiple MIME types (text, HTML, images)
- Execution interruption support
- Execution timing and performance metrics
Use Cases:
- Interactive coding environments (e.g., Jupyter notebooks)
- AI code generation and execution
- Data analysis and visualization
- Educational coding platforms
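The session flow is a two-step protocol: create a context once, then execute any number of code blocks against it. A sketch expressed as request descriptors, with assumed JSON field names (`language`, `context_id`, `code`) and a hypothetical context id:

```python
def create_context_request(language):
    """Descriptor for POST /code/context (field names assumed)."""
    return ("POST", "/code/context", {"language": language})

def execute_request(context_id, code):
    """Descriptor for POST /code, targeting an existing session."""
    return ("POST", "/code", {"context_id": context_id, "code": code})

ctx = "ctx-123"  # hypothetical id returned by POST /code/context
first = execute_request(ctx, "x = 41")
second = execute_request(ctx, "print(x + 1)")  # sees x from the first call
```

Because both executions carry the same `context_id`, the second call can reference variables defined by the first, which is what makes iterative AI code generation possible.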
OpenSandbox provides SDKs in multiple languages:
- Python SDK (`sdks/sandbox/python`, `sdks/code-interpreter/python`)
- Java/Kotlin SDK (`sdks/sandbox/kotlin`, `sdks/code-interpreter/kotlin`)
- TypeScript SDK (Roadmap)
All SDKs follow the same design patterns and provide consistent APIs across languages.
The Specs layer defines two core OpenAPI specifications that establish the contract between SDKs and runtime implementations.
File: `specs/sandbox-lifecycle.yml`
The Lifecycle Spec defines the API for managing sandbox instances throughout their lifecycle.
| Operation | Endpoint | Description |
|---|---|---|
| Create | `POST /sandboxes` | Create a new sandbox from a container image |
| List | `GET /sandboxes` | List sandboxes with filtering and pagination |
| Get | `GET /sandboxes/{id}` | Retrieve sandbox details and status |
| Delete | `DELETE /sandboxes/{id}` | Terminate a sandbox |
| Pause | `POST /sandboxes/{id}/pause` | Pause a running sandbox |
| Resume | `POST /sandboxes/{id}/resume` | Resume a paused sandbox |
| Renew | `POST /sandboxes/{id}/renew-expiration` | Extend sandbox TTL |
| Endpoint | `GET /sandboxes/{id}/endpoints/{port}` | Get public URL for a port |
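The table above can be captured as a small route map that resolves an operation name to a concrete method and path — a sketch of what a client's routing layer might look like:

```python
# Path templates taken directly from the Lifecycle Spec table
LIFECYCLE_ROUTES = {
    "create":   ("POST",   "/sandboxes"),
    "list":     ("GET",    "/sandboxes"),
    "get":      ("GET",    "/sandboxes/{id}"),
    "delete":   ("DELETE", "/sandboxes/{id}"),
    "pause":    ("POST",   "/sandboxes/{id}/pause"),
    "resume":   ("POST",   "/sandboxes/{id}/resume"),
    "renew":    ("POST",   "/sandboxes/{id}/renew-expiration"),
    "endpoint": ("GET",    "/sandboxes/{id}/endpoints/{port}"),
}

def route(op, **params):
    """Resolve an operation name to a concrete (method, path)."""
    method, template = LIFECYCLE_ROUTES[op]
    return method, template.format(**params)

method, path = route("renew", id="sbx-42")  # "sbx-42" is a made-up id
```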
File: `specs/execd-api.yaml`
The Execution Spec defines the API for interacting with running sandbox instances. This API is implemented by the execd daemon injected into each sandbox.
Health

- `GET /ping` - Health check

Code Interpreting

- `POST /code/context` - Create execution context
- `POST /code` - Execute code with streaming output
- `DELETE /code` - Interrupt code execution

Command Execution

- `POST /command` - Execute shell command
- `DELETE /command` - Interrupt command

Filesystem

- `GET /files/info` - Get file metadata
- `DELETE /files` - Remove files
- `POST /files/permissions` - Change permissions
- `POST /files/mv` - Rename/move files
- `GET /files/search` - Search files by glob pattern
- `POST /files/replace` - Replace file content
- `POST /files/upload` - Upload files
- `GET /files/download` - Download files
- `POST /directories` - Create directories
- `DELETE /directories` - Remove directories

Metrics

- `GET /metrics` - Get system metrics snapshot
- `GET /metrics/watch` - Stream metrics via SSE
The Runtime layer implements the Sandbox Lifecycle Spec and manages the orchestration of sandbox containers.
Location: `server/`
The OpenSandbox server is a FastAPI-based service providing:
- Lifecycle Management: Create, monitor, pause, resume, and terminate sandboxes
- Pluggable Runtimes: Docker and Kubernetes (both production-ready)
- Async Provisioning: Background creation to reduce latency
- Automatic Expiration: Configurable TTL with renewal support
- Access Control: API key authentication
- Observability: Unified status tracking with transition logging
Features:
- Direct Docker API integration
- Two networking modes:
- Host Mode: Containers share host network (single instance)
- Bridge Mode: Isolated networking with HTTP routing
- Container lifecycle management
- Resource quota enforcement
- Private registry authentication
- Volume mounting for execd injection
- Automatic cleanup on expiration
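Resource quota enforcement ultimately means translating human-readable quotas into the units Docker expects (bytes for memory, nano-CPUs for CPU shares). A sketch of that translation, not the server's actual code:

```python
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}

def memory_to_bytes(spec):
    """Convert a memory string like "512m" or "2g" to bytes."""
    spec = spec.strip().lower()
    if spec[-1] in UNITS:
        return int(float(spec[:-1]) * UNITS[spec[-1]])
    return int(spec)  # bare number: already bytes

def cpus_to_nano_cpus(cpus):
    """Docker's NanoCpus field expresses CPUs in units of 1e-9."""
    return int(cpus * 1_000_000_000)
```

For example, a quota of 1.5 CPUs and "2g" of memory would be passed to the container runtime as 1,500,000,000 nano-CPUs and 2,147,483,648 bytes.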
Key Responsibilities:
- Pull container images (with auth support)
- Create containers with resource limits
- Inject execd binary and start script
- Monitor container state
- Handle pause/resume operations
- Clean up terminated containers
Features:
- Built-in BatchSandbox runtime with sandbox pooling, high-throughput batch creation, and heterogeneous task orchestration; also compatible with SIG agent-sandbox as an alternative runtime
- Support for different secure container runtimes (e.g., kata-containers, gVisor)
- Helm-based deployment for the controller and server (see the deployment documentation)
Planned Features:
- Unified network storage mounting (ossfs, NAS, custom PVC) in both pooled and non-pooled modes
- Pause/resume support
The pluggable architecture allows implementing custom runtimes by:
- Implementing the Lifecycle Spec APIs
- Managing sandbox provisioning and cleanup
- Injecting execd into sandbox instances
- Reporting sandbox state transitions
Purpose: Provides HTTP/HTTPS load balancing to sandbox instance ports.
Features:
- Dynamic endpoint generation based on sandbox ID and port
- Supports both domain-based and wildcard routing
- Reverse proxy to sandbox container ports
- Automatic cleanup when sandbox terminates
Endpoint Format: {domain}/sandboxes/{sandboxId}/port/{port}
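The endpoint format above is simple enough to express directly; a sketch of the URL builder, with an invented example domain and sandbox id:

```python
def public_endpoint(domain, sandbox_id, port):
    """Build the router's public URL: {domain}/sandboxes/{sandboxId}/port/{port}."""
    return f"{domain}/sandboxes/{sandbox_id}/port/{port}"

# Domain and sandbox id here are placeholders for illustration
url = public_endpoint("https://sandbox.example.com", "sbx-42", 8080)
```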
Use Cases:
- Accessing web applications running in sandboxes
- Connecting to development servers (e.g., VS Code Server)
- Exposing APIs and services
- VNC and remote desktop access
Sandbox instances are running containers that host user workloads with an injected execution daemon.
Each sandbox instance consists of:
- Base Container: User-specified image (e.g., `ubuntu:22.04`, `python:3.11`)
- execd Daemon: Injected execution agent implementing the Execution Spec
- Entrypoint Process: User-defined main process
Location: `components/execd/`
execd is a Go-based HTTP daemon built on the Beego framework.
- Code Execution: Manage Jupyter kernel sessions for multi-language code execution
- Command Execution: Run shell commands with output streaming
- File Operations: Provide filesystem API for remote file management
- Metrics Collection: Monitor and report CPU, memory usage
Technology Stack:
- Language: Go 1.24+
- Web Framework: Beego
- Jupyter Integration: WebSocket-based Jupyter protocol client
- Streaming: Server-Sent Events (SSE)
Package Structure:
- `pkg/flag/` - Configuration and CLI flags
- `pkg/web/` - HTTP layer (controllers, models, router)
- `pkg/runtime/` - Execution dispatcher
- `pkg/jupyter/` - Jupyter kernel client
- `pkg/util/` - Utilities and helpers
execd integrates with Jupyter Server running inside the container:
- Session Management: Create and maintain kernel sessions
- WebSocket Communication: Real-time bidirectional communication
- Message Protocol: Jupyter message spec implementation
- Stream Parsing: Parse execution results, outputs, errors
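The stream parsing step reduces Jupyter iopub messages to a simpler shape. A sketch following the Jupyter messaging spec's field names (`msg_type`, `content`, `text/plain`), though execd's actual internal representation may differ:

```python
def summarize_iopub(msg):
    """Reduce a Jupyter iopub message to a (kind, payload) pair.

    Field names follow the Jupyter messaging spec that execd's
    WebSocket client consumes.
    """
    mtype, content = msg["msg_type"], msg["content"]
    if mtype == "stream":
        # content["name"] is "stdout" or "stderr"
        return content["name"], content["text"]
    if mtype == "execute_result":
        return "result", content["data"].get("text/plain", "")
    if mtype == "error":
        return "error", f'{content["ename"]}: {content["evalue"]}'
    return "other", None

kind, payload = summarize_iopub({
    "msg_type": "execute_result",
    "content": {"data": {"text/plain": "42"}, "execution_count": 1},
})
```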
Supported Kernels:
- Python (IPython)
- Java (IJava)
- JavaScript (IJavaScript)
- TypeScript (ITypeScript)
- Go (gophernotes)
- Bash
The execd daemon is injected into sandbox containers during creation:
Docker Runtime Injection Process:
- Pull execd Image: Retrieve the execd container image
- Extract Binary: Copy execd binary from image to temporary location
- Volume Mount: Mount execd binary and startup script into target container
- Entrypoint Override: Modify container entrypoint to start execd first
- User Process Launch: execd forks and executes the user's entrypoint
Startup Sequence:
```
# Container starts with modified entrypoint
/opt/opensandbox/start.sh
  ↓
# Start Jupyter Server
jupyter notebook --port=54321 --no-browser --ip=0.0.0.0
  ↓
# Start execd daemon
/opt/opensandbox/execd --jupyter-host=http://127.0.0.1:54321 --port=44772
  ↓
# Execute user entrypoint
exec "${USER_ENTRYPOINT[@]}"
```

Benefits:
- Transparent to user code
- No image modification required
- Dynamic injection at runtime
- Works with any base image
Sandbox Creation Flow:

```
User/SDK
  │
  │ 1. POST /sandboxes (image, entrypoint, resources)
  ▼
Server (Lifecycle API)
  │
  │ 2. Pull container image
  │ 3. Inject execd binary
  │ 4. Create container with entrypoint override
  │ 5. Start container
  ▼
Sandbox Instance
  │
  │ 6. Start execd daemon
  │ 7. Start Jupyter Server
  │ 8. Execute user entrypoint
  ▼
Running (State)
```
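On the client side, the SDK's automatic state polling tracks this flow until the sandbox is usable. A minimal sketch with an injected status callable (in a real SDK it would call `GET /sandboxes/{id}`); the state names are illustrative:

```python
import time

def wait_until_running(get_status, timeout=60.0, interval=1.0):
    """Poll a status callable until the sandbox is Running.

    get_status: any callable returning the current state string.
    State names ("Pending", "Running", "Failed", ...) are assumptions.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = get_status()
        if state == "Running":
            return state
        if state in ("Failed", "Terminated"):
            raise RuntimeError(f"sandbox ended in state {state}")
        time.sleep(interval)
    raise TimeoutError("sandbox did not reach Running in time")

# Simulated provisioning sequence: Pending -> Creating -> Running
states = iter(["Pending", "Creating", "Running"])
final = wait_until_running(lambda: next(states), timeout=5, interval=0)
```

Injecting the status function keeps the polling logic independent of any HTTP client, which also makes it trivial to test.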
Code Execution Flow:

```
User/SDK
  │
  │ 1. Create sandbox
  │ 2. Get execd endpoint
  ▼
CodeInterpreter SDK
  │
  │ 3. POST /code/context (create session)
  │ 4. POST /code (execute code)
  ▼
execd (Execution API)
  │
  │ 5. Route to Jupyter runtime
  ▼
Jupyter Runtime
  │
  │ 6. WebSocket to Jupyter Server
  │ 7. Send execute_request
  ▼
Jupyter Kernel (Python/Java/etc.)
  │
  │ 8. Execute code
  │ 9. Stream output events
  ▼
execd
  │
  │ 10. Convert to SSE events
  │ 11. Stream to client
  ▼
CodeInterpreter SDK
  │
  │ 12. Parse events
  │ 13. Return result to user
  ▼
User/Application
```
File Upload Flow:

```
User/SDK
  │
  │ 1. Upload files
  ▼
Filesystem SDK
  │
  │ 2. POST /files/upload (multipart)
  ▼
execd (Execution API)
  │
  │ 3. Write to filesystem
  │ 4. Set permissions
  ▼
Sandbox Container Filesystem
```
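Step 2 of this flow requires a multipart/form-data body. A stdlib-only sketch of assembling one for `POST /files/upload`; the form field name `file` is an assumption, not taken from the spec:

```python
import uuid

def multipart_body(field, filename, content: bytes):
    """Assemble a multipart/form-data body and its Content-Type header.

    One file part with a random boundary; the field name is an
    illustrative assumption.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    body = head + content + tail
    content_type = f"multipart/form-data; boundary={boundary}"
    return body, content_type

body, ctype = multipart_body("file", "main.py", b"print('hello')")
```

A real SDK would typically delegate this to an HTTP library, but the wire format is worth seeing once: it is exactly what execd's upload endpoint receives.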
- All interactions defined by OpenAPI specifications
- Clear contracts between components
- Enables polyglot implementations
- Supports custom runtime implementations
- SDK: Client-side abstraction and convenience
- Specs: Protocol definition and documentation
- Runtime: Sandbox orchestration and lifecycle
- execd: In-sandbox execution and operations
- Pluggable runtime implementations
- Custom sandbox images
- Multiple SDK languages
- Additional Jupyter kernels
- API key authentication for lifecycle operations
- Token-based authentication for execution operations
- Isolated sandbox environments
- Resource quota enforcement
- Network isolation options
- Structured state transitions
- Real-time metrics streaming
- Comprehensive logging
- Health check endpoints
AI models (like Claude, GPT-4, Gemini) generate code that needs to be executed safely:
- Isolation: Run untrusted AI-generated code in sandboxes
- Multi-Language: Support various programming languages
- Iteration: Maintain state across multiple code generations
- Feedback: Capture execution results and errors for AI refinement
Examples: claude-code, gemini-cli, codex-cli
Build web-based coding platforms and notebooks:
- Code Execution: Run code in isolated environments
- File Management: Upload/download project files
- Terminal Access: Execute shell commands
- Collaboration: Share sandbox instances
Examples: code-interpreter
Automate web browsers for testing and scraping:
- Headless Browsers: Chrome, Playwright
- Remote Debugging: DevTools protocol
- VNC Access: Visual debugging
- Network Isolation: Controlled environment
Examples: chrome, playwright
Provide cloud-based development workspaces:
- VS Code Server: Full IDE in browser
- Desktop Environments: VNC-based desktops
- Tool Pre-installation: Language runtimes, build tools
- Port Forwarding: Access development servers
Run build and test pipelines in isolated environments:
- Reproducible Builds: Consistent container images
- Parallel Execution: Multiple sandbox instances
- Artifact Collection: Download build outputs
- Resource Limits: Prevent resource exhaustion
OpenSandbox provides a complete, production-ready platform for building AI-powered applications that require safe code execution, file management, and command execution in isolated environments. The architecture is designed to be:
- Universal: Works with any container image
- Extensible: Pluggable runtimes and custom implementations
- Developer-Friendly: Multi-language SDKs with consistent APIs
- Production-Ready: Robust lifecycle management and observability
- Secure: Isolated environments with access control
The protocol-first design ensures that all components can evolve independently while maintaining compatibility. Whether you're building AI coding assistants, interactive notebooks, or remote development environments, OpenSandbox provides the foundation you need.