Proposal: Support non-expiring sandboxes / manual cleanup mode across Docker and Kubernetes #442

@ctlaltlaltc

Summary

I’d like to discuss adding a "non-expiring" / "manual cleanup" mode for sandbox creation, so that sandboxes do not require TTL-based auto-expiration and can instead be cleaned up explicitly by the application layer.

This should ideally work consistently across both Docker and Kubernetes backends.

Why this is needed

Currently, the sandbox lifecycle is strongly TTL-driven:

  • create requires timeout
  • server computes expiresAt
  • backend auto-expires the sandbox
  • caller can renew expiration, but cannot disable expiration
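To make the constraint concrete, here is a minimal sketch of the TTL-driven flow as described in the list above. The names (`Sandbox`, `create_sandbox`, `renew`, `is_expired`) are hypothetical and are not taken from the project's code; the sketch only assumes that a positive timeout is mandatory and that renewal can move the deadline but never remove it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Sandbox:
    sandbox_id: str
    expires_at: datetime  # always set in this model: there is no opt-out


def create_sandbox(sandbox_id: str, timeout_seconds: int, now: datetime) -> Sandbox:
    # Hypothetical server-side step: the timeout is required and must be positive.
    if timeout_seconds <= 0:
        raise ValueError("timeout is required and must be positive")
    return Sandbox(sandbox_id, now + timedelta(seconds=timeout_seconds))


def renew(sandbox: Sandbox, timeout_seconds: int, now: datetime) -> None:
    # Renewal pushes the deadline out but cannot disable it.
    if timeout_seconds <= 0:
        raise ValueError("timeout is required and must be positive")
    sandbox.expires_at = now + timedelta(seconds=timeout_seconds)


def is_expired(sandbox: Sandbox, now: datetime) -> bool:
    # The backend would delete the sandbox once this returns True.
    return now >= sandbox.expires_at
```

In this model an application that owns its own lifecycle has to keep calling `renew` on a timer just to keep a sandbox alive, which is the overhead the proposal aims to remove.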

This works well for short-lived workloads, but it is limiting for application-integrated scenarios where the upper layer already owns lifecycle and cleanup.

Use Cases

1. Session/workspace sandboxes

An application creates one sandbox per user session or workspace and wants to delete it only when the session ends.

Examples:

  • web IDE
  • notebook workspace
  • coding interview environment

2. External workflow/orchestrator-managed cleanup

A higher-level system already decides when a sandbox should be deleted.

Examples:

  • cleanup after workflow completion
  • cleanup tied to external job state
  • business-driven retry/recovery flows

3. Manual debugging / review

A sandbox should remain available until a human explicitly cleans it up.

Examples:

  • failure investigation
  • QA reproduction environment
  • post-run inspection

4. Stateful application integration

A sandbox may need to stay alive while the application coordinates volume export, snapshot, or handoff.

Proposal

Introduce an explicit expiration mode instead of relying only on timeout.

For example:

  • ttl: current behavior, sandbox auto-expires
  • manual: no server-side TTL expiration; sandbox is deleted only by explicit API call or external control-plane cleanup

I think an explicit mode is better than magic values like timeout=0, timeout=-1, or a far-future timestamp, because the intent is visible in the request and can be validated.
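As a rough illustration of what an explicit mode could look like on the request side, the sketch below models the two modes as an enum and validates the combination with the timeout. The field names (`expiration_mode`, `timeout_seconds`, `image`) and the class names are assumptions made for this example, not an existing API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ExpirationMode(str, Enum):
    TTL = "ttl"        # current behavior: sandbox auto-expires
    MANUAL = "manual"  # no server-side expiration; caller deletes explicitly


@dataclass(frozen=True)
class CreateSandboxRequest:
    image: str
    expiration_mode: ExpirationMode = ExpirationMode.TTL  # TTL stays the default
    timeout_seconds: Optional[int] = None

    def __post_init__(self) -> None:
        if self.expiration_mode is ExpirationMode.TTL:
            # TTL mode keeps today's rule: a positive timeout is required.
            if self.timeout_seconds is None or self.timeout_seconds <= 0:
                raise ValueError("ttl mode requires a positive timeout")
        elif self.timeout_seconds is not None:
            # Reject ambiguous requests instead of silently ignoring the timeout.
            raise ValueError("manual mode does not accept a timeout")
```

With this shape, `timeout=0` or `timeout=-1` is simply an invalid request rather than a hidden switch, and a manual sandbox is requested by name.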

Expected behavior for manual mode

  • no auto-expiration in Docker
  • no auto-expiration in Kubernetes
  • expiresAt may need to be nullable / omitted for this mode
  • cleanup responsibility belongs to the caller/application
  • unsupported providers should fail clearly
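The expected behavior above could be sketched on the server side roughly as follows, assuming a nullable `expires_at` marks a manual sandbox. Everything here is hypothetical: the `SandboxRecord` type, the `MANUAL_CAPABLE_PROVIDERS` set, the `"other"` provider name, and the error type are illustrative stand-ins, not the project's actual backend interface.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional

# Assumption: providers that would implement manual mode under this proposal.
MANUAL_CAPABLE_PROVIDERS = {"docker", "kubernetes"}


class UnsupportedExpirationMode(Exception):
    """Raised when a provider cannot honor manual cleanup mode."""


@dataclass
class SandboxRecord:
    sandbox_id: str
    provider: str
    expires_at: Optional[datetime]  # None means manual mode: never auto-expire


def check_manual_supported(provider: str) -> None:
    # Fail clearly at create time rather than falling back to a TTL silently.
    if provider not in MANUAL_CAPABLE_PROVIDERS:
        raise UnsupportedExpirationMode(
            f"provider {provider!r} does not support manual expiration mode"
        )


def select_expired(records: List[SandboxRecord], now: datetime) -> List[str]:
    # Only sandboxes with a deadline are eligible for automatic cleanup.
    return [
        r.sandbox_id
        for r in records
        if r.expires_at is not None and r.expires_at <= now
    ]
```

A manual sandbox never appears in `select_expired`, so it stays alive until the caller deletes it explicitly, which matches the responsibility split described in the list.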

Non-Goals

This proposal is not asking to remove TTL support.

TTL should remain the default and recommended mode for most short-lived workloads. The request is to add an opt-in manual cleanup mode for integrations that need it.

Labels: feature (New feature or request)
