diff --git a/.spec/SessionBackends/00-overview.md b/.spec/SessionBackends/00-overview.md new file mode 100644 index 0000000..eee8490 --- /dev/null +++ b/.spec/SessionBackends/00-overview.md @@ -0,0 +1,102 @@ +# Session Backend Abstraction — Design Overview + +## Spec Documents + +| Doc | Contents | +|-----|----------| +| [01-tmux-audit.md](01-tmux-audit.md) | Every tmux touchpoint in the codebase, categorized | +| [02-session-backend-interface.md](02-session-backend-interface.md) | `SessionBackend` interface, data model changes, migration plan | +| [03-tmux-backend.md](03-tmux-backend.md) | `TmuxSessionBackend` implementation, method mapping, file layout | +| [04-ambient-backend.md](04-ambient-backend.md) | `AmbientSessionBackend` implementation, API mapping, behavioral differences | +| [05-agentcore-feasibility.md](05-agentcore-feasibility.md) | AWS Bedrock AgentCore feasibility analysis | + +## Summary + +### What exists today + +- **tmux hardcoded everywhere**: 10+ functions in `tmux.go`, called directly from lifecycle + handlers, liveness loop, broadcast, introspect, approve, and reply. +- **`AgentBackend` interface** in `agent_backend.go` (PR #47) with `Spawn/Stop/List/Name`. + Only used by `handleCreateAgents`. All other code bypasses it. +- **`AgentUpdate.TmuxSession`** is the field that links an agent to its session. Used across + types, DB models, handlers, frontend, scripts, and docs. +- **`tmuxDefaultSession`** (PR #49, open) proposes space-scoped names `{space}-{agent}` — not adopted here. + +### What this design introduces + +- **`SessionBackend` interface** with 14 methods covering the full surface: identity + (`Name`, `Available`), lifecycle (`CreateSession`, `KillSession`, `SessionExists`, + `ListSessions`), status (`GetStatus`), observability (`IsIdle`, `CaptureOutput`, + `CheckApproval`), interaction (`SendInput`, `Approve`, `Interrupt`), and discovery + (`DiscoverSessions`). 
+- **Role interfaces** (`SessionLifecycle`, `SessionObserver`, `SessionActor`) for + narrow consumer dependencies and easier testing. +- **`TmuxSessionBackend`** — wraps existing tmux functions. Preserves current + `agentdeck_*` naming convention. Zero behavior change. +- **`AmbientSessionBackend`** — backed by the ACP public API (`POST /sessions`, + `POST /message`, `GET /output`, `DELETE /sessions/{id}`, `POST /interrupt`, etc.). + Depends on platform PR #855. +- **Subsumes `AgentBackend`** — the existing interface from PR #47 is folded into + `SessionBackend`. `agent_backend.go` is deleted. +- **`AgentUpdate.SessionID`** + **`AgentUpdate.BackendType`** — replaces `TmuxSession` + with backend-agnostic fields. No backward-compat shim (clean break). +- **`Server.backends`** registry — map of backend name to implementation. Agents carry + their backend type; the server resolves the right implementation per-agent. +- **`SessionStatus`** enum — unified status model (`unknown`, `pending`, `running`, `idle`, + `completed`, `failed`, `missing`) that all backends map into. +- **`BackendOpts interface{}`** — backend-specific creation options. Each backend defines + its own options struct (`TmuxCreateOpts`, `AmbientCreateOpts`), keeping backend-specific + code contained within each backend. 
+ +### Interface at a glance + +```go +type SessionBackend interface { + Name() string + Available() bool + + CreateSession(ctx context.Context, opts SessionCreateOpts) (string, error) + KillSession(ctx context.Context, sessionID string) error + SessionExists(sessionID string) bool + ListSessions() ([]string, error) + + GetStatus(ctx context.Context, sessionID string) (SessionStatus, error) + + IsIdle(sessionID string) bool + CaptureOutput(sessionID string, lines int) ([]string, error) + CheckApproval(sessionID string) ApprovalInfo + + SendInput(sessionID string, text string) error + Approve(sessionID string) error + Interrupt(ctx context.Context, sessionID string) error + + DiscoverSessions() (map[string]string, error) +} +``` + +### Scope of changes + +| Area | Files affected | Nature of change | +|------|---------------|-----------------| +| Interface definition | New: `session_backend.go` | New file | +| Tmux backend | New: `session_backend_tmux.go` | Wraps existing functions | +| Old backend | Delete: `agent_backend.go` | Superseded (folded into SessionBackend) | +| Tmux primitives | `tmux.go` | Unchanged (kept as unexported helpers) | +| Data model | `types.go`, `db/models.go`, `db/convert.go`, `db_adapter.go` | Rename `TmuxSession` -> `SessionID`, add `BackendType` | +| Server | `server.go` | Add `backends` map, `backendFor()` helper | +| Lifecycle | `lifecycle.go` | Route through backend | +| Liveness | `liveness.go` | Route through backend | +| Handlers | `handlers_agent.go` | Route through backend, rename API endpoint | +| Broadcast | `tmux.go` (orchestration funcs) | Route through backend | +| Frontend | `types/index.ts`, `AgentDetail.vue`, `client.ts` | Rename `tmux_session` -> `session_id` | +| Tests | `server_test.go`, `hierarchy_test.go`, `lifecycle_test.go` | Update field names, add mock backend tests | + +### Known gaps (deferred) + +| Gap | Notes | +|-----|-------| +| Context/tool injection for Ambient | Ambient sessions don't inherit local boss 
commands. Needs workflow or MCP server approach. Deferred to Phase 2. | +| Cross-space session name collisions | Current `agentdeck_*` naming doesn't include space. Same agent name in two spaces can collide. PR #49 proposes a fix but is out of scope here. | +| Session ownership/filtering | `tmuxListSessions` returns all sessions, not just agent-boss. Mitigated by naming convention but not solved. | +| Idle detection brittleness | `isShellPrompt` relies on PS1 heuristics. Claude Code hooks would be cleaner. | +| Model switching compaction risk | Switching from opus to haiku with large context triggers compaction. Needs separate evaluation. | diff --git a/.spec/SessionBackends/01-tmux-audit.md b/.spec/SessionBackends/01-tmux-audit.md new file mode 100644 index 0000000..d83137a --- /dev/null +++ b/.spec/SessionBackends/01-tmux-audit.md @@ -0,0 +1,129 @@ +# Tmux Usage Audit + +Every place in the codebase that directly touches tmux, categorized by purpose. + +## 1. Low-Level Tmux Primitives (`tmux.go`) + +| Function | What it does | Called by | +|----------|-------------|-----------| +| `tmuxAvailable()` | Checks if `tmux` binary is in PATH | `TmuxAutoDiscover`, `BroadcastCheckIn`, `SingleAgentCheckIn`, `checkAllSessionLiveness` | +| `tmuxListSessions()` | Runs `tmux list-sessions -F #S`. **Note:** returns ALL tmux sessions on the machine, not just agent-boss sessions. Needs filtering/tagging mechanism (see §Session Ownership below). 
| `tmuxSessionExists`, `TmuxAutoDiscover` | +| `tmuxSessionExists(session)` | Checks if a named session is in the list | `handleAgentSpawn`, `handleAgentStop`, `handleAgentRestart`, `handleAgentIntrospect`, `BroadcastCheckIn`, `handleApproveAgent`, `handleReplyAgent`, `handleSpaceTmuxStatus`, `TmuxBackend.Spawn`, `TmuxBackend.Stop`, `checkAllSessionLiveness` | +| `tmuxCapturePaneLines(session, n)` | Runs `tmux capture-pane -t session -p`, returns last N non-empty lines | `tmuxIsIdle`, `tmuxCheckApproval`, `handleAgentIntrospect`, `handleSpaceTmuxStatus` | +| `tmuxCapturePaneLastLine(session)` | Wrapper: captures last 1 line | `handleSpaceTmuxStatus` | +| `tmuxIsIdle(session)` | Checks last 10 lines for idle indicators (shell prompts, Claude Code `>` prompt, etc.) | `tmuxCheckApproval`, `BroadcastCheckIn`, `checkAllSessionLiveness` | +| `tmuxCheckApproval(session)` | Scans pane for "Do you want...?" + numbered choices pattern | `checkAllSessionLiveness`, `handleAgentIntrospect`, `handleApproveAgent`, `handleSpaceTmuxStatus` | +| `tmuxApprove(session)` | Sends `Enter` key to session | `handleApproveAgent` | +| `tmuxSendKeys(session, text)` | Sends text + `C-m` (Enter) to session | `runAgentCheckIn`, `handleAgentSpawn`, `handleAgentRestart`, `handleReplyAgent`, `handleCreateAgents` (ignite), `TmuxBackend.Spawn` | +| `parseTmuxAgentName(session)` | Extracts agent name from `agentdeck_{name}_{id}` pattern | `TmuxAutoDiscover` | + +## 2. Idle Detection Helpers (`tmux.go`) + +| Function | What it does | +|----------|-------------| +| `lineIsIdleIndicator(line)` | Returns true if a line matches known idle patterns: `>`, shell `$`/`%`/`#`, Claude Code hints, status bars | +| `isShellPrompt(line)` | Detects `$`, `%`, `>`, `#` as trailing prompt characters. **Brittle:** assumes PS1 follows convention. A cleaner approach would be using [Claude Code hooks](https://code.claude.com/docs/en/hooks) to emit structured idle/busy signals instead of parsing terminal output. 
| +| `waitForIdle(session, timeout)` | Polls `tmuxIsIdle` every 3s until idle or timeout | +| `waitForBoardPost(space, agent, since, timeout)` | Polls `agentUpdatedAt` every 3s (not tmux-specific, but used exclusively by broadcast which is tmux-only) | + +## 3. Broadcast / Check-In (`tmux.go`) + +| Function | What it does | +|----------|-------------| +| `runAgentCheckIn(space, agent, tmuxSession, checkModel, workModel, result)` | Switches model, sends `/boss.check`, waits for board post, restores model. All via `tmuxSendKeys` + `waitForIdle`. | +| `BroadcastCheckIn(space, checkModel, workModel)` | Iterates all agents with `TmuxSession`, calls `runAgentCheckIn` concurrently. | +| `SingleAgentCheckIn(space, agent, checkModel, workModel)` | Single-agent version of broadcast. | +| `BroadcastResult` + helpers | Result accumulator for sent/skipped/errors. | + +## 4. Lifecycle Handlers (`lifecycle.go`) + +| Handler | Tmux operations performed | +|---------|--------------------------| +| `handleAgentSpawn` | `tmuxSessionExists`, `exec tmux new-session -d`, `tmuxSendKeys` (command), `tmuxSendKeys` (ignite) | +| `handleAgentStop` | Gets `agent.TmuxSession`, `tmuxSessionExists`, `exec tmux kill-session` | +| `handleAgentRestart` | Gets `agent.TmuxSession`, `tmuxSessionExists`, `exec tmux kill-session`, `exec tmux new-session`, `tmuxSendKeys` (command + ignite) | +| `handleAgentIntrospect` | Gets `agent.TmuxSession`, `tmuxSessionExists`, `tmuxIsIdle`, `tmuxCapturePaneLines`, `tmuxCheckApproval` | +| `isNonTmuxAgent(agent)` | Checks `agent.Registration.AgentType != "tmux"` to gate lifecycle endpoints | +| `nonTmuxLifecycleError(w, type)` | Returns 422 for non-tmux agents hitting tmux-only endpoints | +| `inferAgentStatus(exists, idle, needsApproval)` | Pure function mapping booleans to string status (not tmux-specific logic) | + +## 5. 
Liveness Loop (`liveness.go`) + +| Function | Tmux operations performed | +|----------|--------------------------| +| `checkAllSessionLiveness` | `tmuxAvailable`, iterates all agents with `TmuxSession`, calls `tmuxSessionExists`, `tmuxIsIdle`, `tmuxCheckApproval`. Updates `InferredStatus`, records interrupts, triggers nudges. Broadcasts SSE `tmux_liveness` event. | +| `executeNudge` | Calls `SingleAgentCheckIn` (which uses tmux) | + +## 6. Agent Handlers (`handlers_agent.go`) + +| Handler | Tmux operations performed | +|---------|--------------------------| +| `handleSpaceAgent` (POST) | Preserves `TmuxSession` as sticky field on agent update | +| `handleIgnition` (GET) | Accepts `?tmux_session=` query param, stores on agent record, references it in ignition text | +| `handleSpaceTmuxStatus` (GET) | `TmuxAutoDiscover`, iterates agents, calls `tmuxSessionExists`, `tmuxIsIdle`, `tmuxCapturePaneLastLine`, `tmuxCheckApproval` | +| `handleApproveAgent` (POST) | Gets `agent.TmuxSession`, `tmuxSessionExists`, `tmuxCheckApproval`, `tmuxApprove` | +| `handleReplyAgent` (POST) | Gets `agent.TmuxSession`, `tmuxSessionExists`, `tmuxSendKeys` | +| `handleCreateAgents` (POST) | Uses `AgentBackend` interface for spawn, but then calls `tmuxSendKeys` directly for ignite | + +## 7. Backend Interface (`agent_backend.go`) — already exists but incomplete + +| Method | `TmuxBackend` impl | `CloudBackend` impl | +|--------|-------------------|---------------------| +| `Name()` | `"tmux"` | `"cloud"` | +| `Spawn(ctx, spec)` | Creates tmux session, sends command | Returns `ErrNotImplemented` | +| `Stop(ctx, space, name)` | Kills tmux session | Returns `ErrNotImplemented` | +| `List(ctx, space)` | Lists all tmux sessions | Returns `ErrNotImplemented` | + +**Only used by:** `handleCreateAgents`. All other lifecycle/liveness/broadcast code bypasses this interface entirely. + +## 8. 
Data Model References + +| Location | Field | Notes | +|----------|-------|-------| +| `types.go:96` | `AgentUpdate.TmuxSession` | JSON tag `tmux_session` | +| `db/models.go:82` | `Agent.TmuxSession` | SQLite column | +| `db/convert.go:35` | `AgentRow.TmuxSession` | DB-to-coordinator conversion | +| `db/convert.go:61,113` | `FromAgentFields(..., tmuxSession, ...)` | Coordinator-to-DB conversion | +| `db_adapter.go:317,371` | References `TmuxSession` | DB adapter layer | +| `db/migrate_from_json.go:40` | `jsonAgent.TmuxSession` | JSON migration | + +## 9. Frontend References + +| File | Usage | +|------|-------| +| `frontend/src/types/index.ts:49,118` | `tmux_session?: string` on agent types | +| `frontend/src/api/client.ts:257,275` | Spawn/restart return `tmux_session` | +| `frontend/src/components/AgentDetail.vue:133,562,918,921` | Displays tmux session, gates pane/controls sections | + +## 10. Scripts and Documentation + +| File | Usage | +|------|-------| +| `scripts/boss.sh` | `get_tmux_session()`, passes `-e TMUX_SESSION` | +| `scripts/agent-ignition.sh` | `create_tmux_session()` | +| `scripts/coordination-client.py` | Passes `tmux_session` to ignition | +| `commands/boss.ignite.md` | References `?tmux_session=` | +| `commands/boss.check.md` | Notes `tmux_session` is sticky | +| `docs/AGENT_PROTOCOL.md` | Documents `tmux_session` field, ignition params | +| `docs/lifecycle-spec.md` | Spawn/stop/restart reference `tmux_session` | +| `docs/api-reference.md` | API docs reference `tmux_session` | +| `docs/hierarchy-design.md` | Compares parent stickiness to `TmuxSession` | + +## Session Ownership + +`tmuxListSessions()` returns ALL tmux sessions on the machine — not just those +created by agent-boss. This means discovery and liveness can incorrectly interact +with unrelated sessions. 
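As a rough illustration of the naming-convention mitigation, the sketch below filters a raw session list down to names that carry the legacy `agentdeck_` prefix. `filterAgentSessions` is a hypothetical helper, not a function in the codebase, and prefix matching alone does not prove ownership: a user session can still happen to use the same prefix.

```go
package main

import (
	"fmt"
	"strings"
)

// agentSessionPrefix is the legacy naming prefix described in this audit.
const agentSessionPrefix = "agentdeck_"

// filterAgentSessions is a hypothetical helper that keeps only session
// names starting with the agentdeck_ prefix and having a non-empty suffix.
func filterAgentSessions(all []string) []string {
	owned := make([]string, 0, len(all))
	for _, name := range all {
		if strings.HasPrefix(name, agentSessionPrefix) && len(name) > len(agentSessionPrefix) {
			owned = append(owned, name)
		}
	}
	return owned
}

func main() {
	// Example input standing in for the output of `tmux list-sessions`.
	all := []string{"agentdeck_FE_1700000000", "scratch", "agentdeck_", "work-notes"}
	fmt.Println(filterAgentSessions(all))
}
```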
+ +Currently, agent-boss sessions are identified by naming convention only: +- Legacy: `agentdeck_{name}_{timestamp}` (parsed by `parseTmuxAgentName`) +- PR #49: `{space}-{agent}` (parsed by space prefix matching) + +Neither convention provides a strong ownership guarantee. Options for improvement: +- **tmux user option**: set `@agent_boss=true` on sessions at creation, + filter by it during listing +- **Dedicated tmux server**: use `tmux -L agent-boss` to isolate sessions entirely +- **Prefix convention**: require a fixed prefix (e.g., `ab-{space}-{agent}`) that + is unlikely to collide with user sessions + +This is out of scope for the current refactoring but should be addressed. diff --git a/.spec/SessionBackends/02-session-backend-interface.md b/.spec/SessionBackends/02-session-backend-interface.md new file mode 100644 index 0000000..b3b4485 --- /dev/null +++ b/.spec/SessionBackends/02-session-backend-interface.md @@ -0,0 +1,590 @@ +# SessionBackend Interface Design + +## Problem Statement + +Tmux session management is hardcoded throughout the coordinator. The existing `AgentBackend` interface +(Spawn/Stop/List/Name) only covers spawning, stopping, and listing, and is used by a single handler (`handleCreateAgents`). +All other session operations — liveness polling, idle detection, approval checking, introspection, +broadcasting, sending input — call tmux functions directly. + +This makes it impossible to swap in a different session manager (e.g., Ambient Code Platform sessions) +without forking the entire coordinator. + +## Design Goals + +1. **Single interface** that covers the full lifecycle: create, destroy, observe, interact +2. **Subsume `AgentBackend`** — the new `SessionBackend` replaces the existing `AgentBackend` + interface from PR #47. `AgentBackend.Spawn` maps to `CreateSession`, `Stop` maps to + `KillSession`, `List` maps to `ListSessions`, `Name` maps to `Name`. +3. **Drop-in tmux implementation** that wraps existing functions with zero behavior change +4. 
**Ambient backend** implementable against the ACP public API +5. **Per-agent backend selection** — agents in the same space can use different backends + +## Non-Goals + +- Changing the agent protocol (blackboard, messages, tasks, SSE) +- Modifying the frontend beyond renaming JSON fields +- Implementing the Ambient backend in this design doc (separate spec) +- Backward compatibility with `tmux_session` JSON field or `?tmux_session=` query param + (no production agents running — clean break) + +--- + +## Interface Definition + +```go +// SessionBackend is the interface for managing agent sessions. +// Each backend (tmux, ambient, etc.) implements this interface. +// The coordinator routes operations through it instead of calling +// tmux functions directly. +// +// This replaces the existing AgentBackend interface (Spawn/Stop/List/Name) +// with full lifecycle coverage. +type SessionBackend interface { + // --- Identity --- + + // Name returns the backend identifier ("tmux", "ambient", etc.). + Name() string + + // Available reports whether this backend is operational. + // For tmux: checks if the binary is in PATH. + // For ambient: checks if the API is reachable. + Available() bool + + // --- Lifecycle --- + + // CreateSession creates a new session and launches the given command. + // Returns the backend-specific session ID. + // For tmux: creates a detached session and sends the command. + // For ambient: calls POST /sessions with the command as the task. + CreateSession(ctx context.Context, opts SessionCreateOpts) (string, error) + + // KillSession permanently destroys a session by ID. + // For tmux: kills the tmux session (gone forever). + // For ambient: calls DELETE /sessions/{id} (permanent removal). + KillSession(ctx context.Context, sessionID string) error + + // SessionExists checks whether a session with the given ID is alive. + SessionExists(sessionID string) bool + + // ListSessions returns all session IDs managed by this backend. 
+ ListSessions() ([]string, error) + + // --- Status --- + + // GetStatus returns the current status of a session. + // For tmux: derives from SessionExists + IsIdle + CheckApproval. + // For ambient: maps directly from the API response status field + // and the latest run status. + GetStatus(ctx context.Context, sessionID string) (SessionStatus, error) + + // --- Observability --- + + // IsIdle reports whether the session is waiting for user input + // (no agent or process actively running). + // For tmux: checks terminal output for idle indicators (prompts, etc.). + // For ambient: session is "running" AND latest run is completed/error. + IsIdle(sessionID string) bool + + // CaptureOutput returns the last N non-empty lines from the session. + // For tmux: captures terminal pane lines. + // For ambient: fetches transcript messages and formats as + // "[role] content" lines. + CaptureOutput(sessionID string, lines int) ([]string, error) + + // CheckApproval inspects the session output for a pending tool-use + // approval prompt (e.g., "Do you want to run Bash?"). + // For tmux: parses terminal output for approval patterns. + // For ambient: always returns NeedsApproval=false (sessions run + // with configured permissions, no interactive prompts). + CheckApproval(sessionID string) ApprovalInfo + + // --- Interaction --- + + // SendInput sends text to the session. + // For tmux: sends keystrokes followed by Enter. + // For ambient: calls POST /sessions/{id}/message (creates a new run). + SendInput(sessionID string, text string) error + + // Approve sends an approval response to a pending prompt. + // For tmux: sends Enter key to accept. + // For ambient: no-op (returns nil). + Approve(sessionID string) error + + // Interrupt cancels the session's current work without killing it. + // The session remains alive and can accept new messages. + // For tmux: sends Escape key to interrupt Claude Code. + // For ambient: calls POST /sessions/{id}/interrupt. 
+ // Note: this is a new capability — no equivalent exists in the + // current codebase. Claude Code uses Escape (not Ctrl-C) to interrupt. + Interrupt(ctx context.Context, sessionID string) error + + // --- Discovery --- + + // DiscoverSessions finds sessions that match known agent naming + // conventions and returns a map of agentName -> sessionID. + // For tmux: parses agentdeck_* session names. + // For ambient: lists sessions and matches by display_name. + // Backends that don't support discovery return an empty map. + DiscoverSessions() (map[string]string, error) +} +``` + +**Method count: 14.** The 4 existing `AgentBackend` methods map 1:1; the other 10 are +additional operations, most of which are currently hardcoded as direct tmux calls. + +### Role Interfaces + +The 14-method interface is large. Backends that don't support certain roles (e.g., +Ambient has no approval flow) must implement no-op methods. To support smaller +consumers and cleaner testing, the interface is decomposable into role interfaces: + +```go +// SessionLifecycle covers session creation and destruction. +type SessionLifecycle interface { + CreateSession(ctx context.Context, opts SessionCreateOpts) (string, error) + KillSession(ctx context.Context, sessionID string) error + SessionExists(sessionID string) bool + ListSessions() ([]string, error) +} + +// SessionObserver covers session status and observability. +type SessionObserver interface { + GetStatus(ctx context.Context, sessionID string) (SessionStatus, error) + IsIdle(sessionID string) bool + CaptureOutput(sessionID string, lines int) ([]string, error) + CheckApproval(sessionID string) ApprovalInfo +} + +// SessionActor covers session interaction. +type SessionActor interface { + SendInput(sessionID string, text string) error + Approve(sessionID string) error + Interrupt(ctx context.Context, sessionID string) error +} +``` + +`SessionBackend` embeds all three plus identity and discovery. 
Smaller consumers +(e.g., the liveness loop only needs `SessionObserver`) can depend on the narrow +interface. This makes testing easier — mock only the role you're testing. + +**Implementation note:** All backends still implement the full `SessionBackend`. +The role interfaces are for *consumers*, not *providers*. Go's structural typing +means any `SessionBackend` automatically satisfies all three role interfaces. + +### Supporting Types + +```go +// SessionStatus represents the state of a session. +type SessionStatus string + +const ( + SessionStatusUnknown SessionStatus = "unknown" // can't determine (backend unavailable) + SessionStatusPending SessionStatus = "pending" // created but not yet running (ambient only) + SessionStatusRunning SessionStatus = "running" // actively working + SessionStatusIdle SessionStatus = "idle" // alive but waiting for input + SessionStatusCompleted SessionStatus = "completed" // finished + SessionStatusFailed SessionStatus = "failed" // errored + SessionStatusMissing SessionStatus = "missing" // session does not exist +) + +// SessionCreateOpts holds common parameters for creating a new session. +// Backend-specific options are passed via the BackendOpts field. +type SessionCreateOpts struct { + SessionID string // desired session name/ID (backend may adjust) + Command string // tmux: shell command to run; ambient: initial task/prompt + BackendOpts interface{} // backend-specific options (TmuxCreateOpts, AmbientCreateOpts, etc.) +} + +// TmuxCreateOpts holds tmux-specific session creation options. +type TmuxCreateOpts struct { + WorkDir string // working directory to cd into before launching + Width int // terminal width (default 220) + Height int // terminal height (default 50) +} + +// AmbientCreateOpts holds Ambient-specific session creation options. 
+type AmbientCreateOpts struct { + DisplayName string // human-readable session name + Model string // Claude model to use + Repos []SessionRepo // repositories to clone into the session +} + +// SessionRepo describes a repository to attach to an ambient session. +type SessionRepo struct { + URL string `json:"url"` + Branch string `json:"branch,omitempty"` +} + +// ApprovalInfo describes a pending tool-use approval prompt. +// Exported from the existing unexported approvalInfo in tmux.go. +type ApprovalInfo struct { + NeedsApproval bool `json:"needs_approval"` + ToolName string `json:"tool_name,omitempty"` + PromptText string `json:"prompt_text,omitempty"` +} +``` + +### Rename: `approvalInfo` -> `ApprovalInfo` + +The existing `approvalInfo` struct in `tmux.go` is unexported. Since it's now part of the +interface contract, it gets exported. The struct body is unchanged. + +--- + +## Data Model Changes + +### `AgentUpdate` field rename + +```go +// Before: +TmuxSession string `json:"tmux_session,omitempty"` + +// After: +SessionID string `json:"session_id,omitempty"` +BackendType string `json:"backend_type,omitempty"` // "tmux", "ambient", etc. +``` + +**No backward compatibility shim.** No production agents are running. This is a clean +break — all references to `tmux_session` are updated in a single pass. + +### DB schema + +```sql +-- Rename column (SQLite requires table rebuild via GORM automigrate) +-- agents.tmux_session -> agents.session_id +-- Add new column: agents.backend_type TEXT DEFAULT '' +``` + +### Ignition query param + +``` +-- Before: +GET /spaces/{space}/ignition/{agent}?tmux_session=X + +-- After: +GET /spaces/{space}/ignition/{agent}?session_id=X&backend=tmux +``` + +Old `?tmux_session=` param is removed. No compat path. + +--- + +## Server Integration + +### Backend registry on `Server` + +```go +type Server struct { + // ... existing fields ... + + // backends maps backend name -> implementation. + // Populated at startup. 
At minimum: {"tmux": &TmuxSessionBackend{}}. + backends map[string]SessionBackend + + // defaultBackend is the name of the backend to use when none is specified. + // Defaults to "tmux". + defaultBackend string +} +``` + +### Resolving the backend for an agent + +Every operation that currently reads `agent.TmuxSession` and calls a tmux function needs to: + +1. Read `agent.BackendType` (defaulting to `"tmux"` if empty) +2. Look up the backend in `s.backends[agent.BackendType]` +3. Call the backend method with `agent.SessionID` + +Helper: + +```go +// backendFor returns the SessionBackend for the given agent. +// Returns the default backend if the agent has no BackendType set. +func (s *Server) backendFor(agent *AgentUpdate) SessionBackend { + if agent.BackendType != "" { + if b, ok := s.backends[agent.BackendType]; ok { + return b + } + } + return s.backends[s.defaultBackend] +} +``` + +--- + +## Reconciliation with `AgentBackend` (PR #47) + +The existing `AgentBackend` interface from PR #47 is **folded into** `SessionBackend`. +`agent_backend.go` is deleted, and all callers are migrated. 
+ +| `AgentBackend` method | `SessionBackend` equivalent | Notes | +|----------------------|---------------------------|-------| +| `Name()` | `Name()` | Identical | +| `Spawn(ctx, spec)` | `CreateSession(ctx, opts)` | `AgentSpec` fields map to `SessionCreateOpts` + `TmuxCreateOpts` | +| `Stop(ctx, space, name)` | `KillSession(ctx, sessionID)` | Callers must resolve the session ID first | +| `List(ctx, space)` | `ListSessions()` | Backend returns all sessions; caller filters by space | + +### `AgentSpec` -> `SessionCreateOpts` mapping + +```go +// AgentSpec (PR #47): +type AgentSpec struct { + Space, Name, Command, WorkDir string + Width, Height int +} + +// Becomes: +opts := SessionCreateOpts{ + SessionID: spec.Name, + Command: spec.Command, + BackendOpts: TmuxCreateOpts{ + WorkDir: spec.WorkDir, + Width: spec.Width, + Height: spec.Height, + }, +} +``` + +### `handleCreateAgents` migration + +This is the only consumer of `AgentBackend`. It currently calls `s.backend.Spawn(ctx, spec)`. +After migration, it calls `s.backendFor(agent).CreateSession(ctx, opts)`. + +--- + +## Migration Plan: Which Code Changes + +### Phase 1: Interface + TmuxBackend (this PR) + +| Current code | Change | +|-------------|--------| +| `tmux.go` top-level functions | Keep as-is. `TmuxSessionBackend` delegates to them. | +| `agent_backend.go` | Delete entirely. `AgentBackend`, `TmuxBackend`, `CloudBackend`, `AgentSpec`, `AgentInfo` all superseded. `tmuxDefaultSession` and `shellQuote` move to `session_backend_tmux.go`. 
| +| `lifecycle.go` handlers | Route through `s.backendFor(agent)` instead of calling tmux directly | +| `liveness.go` loop | Route through backend instead of calling tmux directly | +| `handlers_agent.go` approve/reply/introspect/tmux-status | Route through backend | +| `handlers_agent.go` `handleCreateAgents` | Replace `s.backend.Spawn` with `s.backendFor(agent).CreateSession` | +| `tmux.go` broadcast/check-in | Route through backend for sendkeys/idle/approve | +| `types.go` `AgentUpdate` | Rename `TmuxSession` -> `SessionID`, add `BackendType` | +| `db/models.go` | Rename column | +| `db/convert.go` | Update field mappings | +| `handlers_agent.go` ignition | Replace `?tmux_session=` with `?session_id=&backend=` | +| `server.go` | Add `backends` map, initialize with tmux backend | +| Frontend types | Rename `tmux_session` -> `session_id` | + +### Phase 2: Ambient Backend (follow-up PR) + +Implement `AmbientSessionBackend` using ACP public API. Separate spec. + +--- + +## Migration Sequencing + +The rename and refactoring must be done in a specific order to avoid breaking +the build at any intermediate step: + +``` +Step 1: Add new files + - session_backend.go (interface, types) + - session_backend_tmux.go (TmuxSessionBackend) + Both compile independently. Existing code unchanged. + +Step 2: Add backend registry to Server + - server.go: add backends map, defaultBackend, backendFor() + - Initialize with TmuxSessionBackend in NewServer() + Existing code still works — backends is additive. + +Step 3: Migrate handlers one at a time + - Each handler switches from direct tmux calls to s.backendFor(agent) + - Do this file-by-file: lifecycle.go, liveness.go, handlers_agent.go, tmux.go (broadcast) + - Each file is independently testable after migration. 
+ +Step 4: Rename data model fields + - types.go: TmuxSession -> SessionID, add BackendType + - db/models.go, db/convert.go, db_adapter.go: rename column + - handlers_agent.go: update ignition query param + - Frontend: update types and components + This is the "big bang" step — do it all at once since field + names are referenced across the stack. + +Step 5: Delete old code + - Delete agent_backend.go + - Remove any remaining direct tmux calls from handlers + - Remove isNonTmuxAgent / nonTmuxLifecycleError helpers +``` + +Each step produces a compilable, testable codebase. + +--- + +## Handler Migration Details + +### `handleAgentSpawn` -> route through backend + +``` +Before: + exec.Command("tmux", "new-session", ...) + tmuxSendKeys(session, command) + tmuxSendKeys(session, igniteCmd) + +After: + backend := s.backendFor(agent) // or from request + sessionID, err := backend.CreateSession(ctx, opts) + backend.SendInput(sessionID, igniteCmd) +``` + +### `handleAgentStop` -> route through backend + +``` +Before: + tmuxSessionExists(session) -> exec.Command("tmux", "kill-session", ...) 
+ +After: + backend := s.backendFor(agent) + backend.KillSession(ctx, agent.SessionID) +``` + +### `handleAgentRestart` -> route through backend + +``` +Before: + kill old tmux session -> create new tmux session -> send command + ignite + +After: + backend := s.backendFor(agent) + backend.KillSession(ctx, agent.SessionID) + newID, _ := backend.CreateSession(ctx, opts) + backend.SendInput(newID, igniteCmd) +``` + +### `handleAgentIntrospect` -> route through backend + +``` +Before: + tmuxSessionExists -> tmuxIsIdle -> tmuxCapturePaneLines -> tmuxCheckApproval + +After: + backend := s.backendFor(agent) + exists := backend.SessionExists(agent.SessionID) + idle := backend.IsIdle(agent.SessionID) + lines, _ := backend.CaptureOutput(agent.SessionID, 50) + approval := backend.CheckApproval(agent.SessionID) +``` + +### `checkAllSessionLiveness` -> route through backend + +``` +Before: + if !tmuxAvailable() { return } + for each agent with TmuxSession: + tmuxSessionExists -> tmuxIsIdle -> tmuxCheckApproval + +After: + for each agent with SessionID: + backend := s.backendFor(agent) + if !backend.Available() { continue } + exists := backend.SessionExists(agent.SessionID) + idle := backend.IsIdle(agent.SessionID) + approval := backend.CheckApproval(agent.SessionID) +``` + +### `handleApproveAgent` -> route through backend + +``` +Before: + tmuxSessionExists -> tmuxCheckApproval -> tmuxApprove + +After: + backend := s.backendFor(agent) + backend.SessionExists(agent.SessionID) + backend.CheckApproval(agent.SessionID) + backend.Approve(agent.SessionID) +``` + +### `handleReplyAgent` -> route through backend + +``` +Before: + tmuxSessionExists -> tmuxSendKeys(session, message) + +After: + backend := s.backendFor(agent) + backend.SessionExists(agent.SessionID) + backend.SendInput(agent.SessionID, message) +``` + +### `handleSpaceTmuxStatus` -> generalize to `handleSpaceSessionStatus` + +Rename route from `/api/tmux-status` to `/api/session-status`. 
+Response struct rename `tmuxAgentStatus` -> `agentSessionStatus`. + +### `BroadcastCheckIn` / `SingleAgentCheckIn` -> route through backend + +``` +Before: + if !tmuxAvailable() { error } + TmuxAutoDiscover(...) + tmuxSessionExists -> tmuxIsIdle -> tmuxSendKeys(checkModel) -> waitForIdle -> tmuxSendKeys(check) + +After: + backend := s.backendFor(agent) + if !backend.Available() { error } + // discovery only for tmux (other backends register explicitly) + backend.SessionExists(sessionID) + backend.IsIdle(sessionID) + backend.SendInput(sessionID, "/model "+checkModel) + // waitForIdle uses backend.IsIdle in its poll loop + backend.SendInput(sessionID, "/boss.check ...") +``` + +### `TmuxAutoDiscover` -> route through backend + +``` +Before: + tmuxListSessions -> parseTmuxAgentName -> match to agents + +After: + backend := s.backends["tmux"] // discovery is tmux-specific + discovered := backend.DiscoverSessions() + // match discovered sessions to agents +``` + +--- + +## API Response Changes + +### `/api/tmux-status` -> `/api/session-status` + +```json +{ "agent": "FE", "session_id": "myspace-FE", "backend": "tmux", "registered": true, "exists": true, "idle": false, "needs_approval": true } +``` + +### Agent JSON + +```json +{ "status": "active", "session_id": "FE", "backend_type": "tmux", ... } +``` + +### Spawn/restart responses + +```json +{ "ok": true, "session_id": "FE", "backend": "tmux" } +``` + +--- + +## SSE Event Changes + +### `tmux_liveness` -> `session_liveness` + +Rename the event type. No alias — clean break. + +--- + +## Test Strategy + +1. **All existing tests pass** after refactoring (behavior-preserving) +2. **New unit tests** for `TmuxSessionBackend` implementing `SessionBackend` +3. **Mock backend** for integration tests that don't require tmux +4. **Role interface tests** — verify backends satisfy `SessionObserver`, etc. 
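Items 3 and 4 of the test strategy can be sketched as a small in-memory mock plus a compile-time assertion. This is an illustration under assumptions: the method set of `SessionObserver` shown here (`IsIdle`, `CaptureOutput`, `CheckApproval`) is inferred from the "observability" grouping in the overview, and `mockBackend` and `idleSessions` are hypothetical names that do not appear in the spec.

```go
package main

// ApprovalInfo mirrors the exported type named in the spec; only the
// NeedsApproval field is shown in the source, so only that field is used here.
type ApprovalInfo struct {
	NeedsApproval bool
}

// SessionObserver is an ASSUMED shape for the role interface. The real
// definition lives in 02-session-backend-interface.md and may differ.
type SessionObserver interface {
	IsIdle(sessionID string) bool
	CaptureOutput(sessionID string, lines int) ([]string, error)
	CheckApproval(sessionID string) ApprovalInfo
}

// mockBackend is a hypothetical in-memory backend for tests that must not
// depend on a tmux binary.
type mockBackend struct {
	idle      map[string]bool
	output    map[string][]string
	approvals map[string]bool
}

// Compile-time check: the build fails if mockBackend stops satisfying
// SessionObserver.
var _ SessionObserver = (*mockBackend)(nil)

func (m *mockBackend) IsIdle(id string) bool { return m.idle[id] }

// CaptureOutput returns the last `lines` entries recorded for the session.
func (m *mockBackend) CaptureOutput(id string, lines int) ([]string, error) {
	out := m.output[id]
	if lines > 0 && len(out) > lines {
		out = out[len(out)-lines:]
	}
	return out, nil
}

func (m *mockBackend) CheckApproval(id string) ApprovalInfo {
	return ApprovalInfo{NeedsApproval: m.approvals[id]}
}

// idleSessions is a hypothetical consumer that depends only on the narrow
// observer role, not on the full SessionBackend.
func idleSessions(obs SessionObserver, ids []string) []string {
	var idle []string
	for _, id := range ids {
		if obs.IsIdle(id) {
			idle = append(idle, id)
		}
	}
	return idle
}
```

The same `var _ Interface = (*Type)(nil)` line could be repeated for each backend and each role interface, so a missing method surfaces at compile time rather than in a handler at runtime.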
diff --git a/.spec/SessionBackends/03-tmux-backend.md b/.spec/SessionBackends/03-tmux-backend.md new file mode 100644 index 0000000..ce8bff5 --- /dev/null +++ b/.spec/SessionBackends/03-tmux-backend.md @@ -0,0 +1,288 @@ +# TmuxSessionBackend Design + +Implementation of `SessionBackend` that wraps the existing tmux functions with zero behavior change. + +## Struct + +```go +// TmuxSessionBackend implements SessionBackend using local tmux sessions. +type TmuxSessionBackend struct { + // sessionAliases maps tmux session name fragments to canonical agent names. + // Used by DiscoverSessions to match agentdeck_* session names. + // e.g., "control-plane" -> "CP", "boss-app" -> "" (skip) + sessionAliases map[string]string +} +``` + +## Method Mapping + +Every method delegates to an existing function from `tmux.go`. No new tmux logic is introduced +except `GetStatus` (composite of existing checks) and `Interrupt` (new: sends Escape key). + +| SessionBackend method | Delegates to | Notes | +|----------------------|-------------|-------| +| `Name()` | — | Returns `"tmux"` | +| `Available()` | `tmuxAvailable()` | `exec.LookPath("tmux")` | +| `CreateSession(ctx, opts)` | `exec.Command("tmux", "new-session", ...)` + `tmuxSendKeys` | Extracted from `handleAgentSpawn` and `TmuxBackend.Spawn` | +| `KillSession(ctx, id)` | `exec.Command("tmux", "kill-session", ...)` | Extracted from `handleAgentStop` | +| `SessionExists(id)` | `tmuxSessionExists(id)` | Unchanged | +| `ListSessions()` | `tmuxListSessions()` | Unchanged | +| `GetStatus(ctx, id)` | `tmuxSessionExists` + `tmuxIsIdle` | New composite: missing/idle/running/unknown | +| `IsIdle(id)` | `tmuxIsIdle(id)` | Unchanged — all idle detection logic preserved | +| `CaptureOutput(id, n)` | `tmuxCapturePaneLines(id, n)` | Unchanged | +| `CheckApproval(id)` | `tmuxCheckApproval(id)` | Returns exported `ApprovalInfo` instead of `approvalInfo` | +| `SendInput(id, text)` | `tmuxSendKeys(id, text)` | Unchanged | +| `Approve(id)` | 
`tmuxApprove(id)` | Unchanged | +| `Interrupt(ctx, id)` | `tmux send-keys -t id Escape` | New: sends Escape key (Claude Code interrupt). Not Ctrl-C. | +| `DiscoverSessions()` | `tmuxListSessions()` + `parseTmuxAgentName()` | Returns `map[agentName]sessionID` using existing `agentdeck_*` naming | + +## Session Naming + +Session naming is **unchanged** from the current codebase. Sessions use the +`agentdeck_{name}_{timestamp}` convention, parsed by `parseTmuxAgentName()`. + +### Known issue: cross-space collisions + +The current naming convention does not include the space name. If the same agent +name (e.g., "FE") exists in two spaces, their sessions could collide. PR #49 +(open) proposes `tmuxDefaultSession(space, agent)` → `{space}-{agent}` to fix +this, but that change is **out of scope** for this refactoring. + +This must be resolved before multi-space deployments are common. Options include: +- Adopt PR #49's `{space}-{agent}` convention +- Use `agentdeck_{space}_{agent}_{timestamp}` to preserve backward compat with discovery +- Add `Space` to `SessionCreateOpts` so the backend can incorporate it + +When the naming convention does change, `DiscoverSessions()` and +`parseTmuxAgentName()` will need corresponding updates. 
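The collision described above can be made concrete with a short sketch. The helpers below are hypothetical stand-ins, not the real `parseTmuxAgentName()`: they assume the timestamp is a decimal Unix time and that the agent name is everything between the `agentdeck_` prefix and the final underscore. The `discover` function mirrors the `discovered[name] = session` loop in `DiscoverSessions()`.

```go
package main

import (
	"strconv"
	"strings"
)

const sessionPrefix = "agentdeck_"

// buildSessionName formats agentdeck_{name}_{timestamp}.
// Hypothetical helper; the timestamp encoding is an assumption.
func buildSessionName(agent string, unixTS int64) string {
	return sessionPrefix + agent + "_" + strconv.FormatInt(unixTS, 10)
}

// parseAgentName is an illustrative stand-in for parseTmuxAgentName.
// It returns "" for any session that does not follow the convention.
func parseAgentName(session string) string {
	if !strings.HasPrefix(session, sessionPrefix) {
		return ""
	}
	rest := strings.TrimPrefix(session, sessionPrefix)
	i := strings.LastIndex(rest, "_")
	if i <= 0 {
		return ""
	}
	if _, err := strconv.ParseInt(rest[i+1:], 10, 64); err != nil {
		return "" // trailing segment is not a timestamp
	}
	return rest[:i]
}

// discover builds the agentName -> sessionID map the same way
// DiscoverSessions does: a later session overwrites an earlier one
// that parses to the same agent name.
func discover(sessions []string) map[string]string {
	out := make(map[string]string)
	for _, s := range sessions {
		if name := parseAgentName(s); name != "" {
			out[name] = s
		}
	}
	return out
}
```

Two sessions for agent "FE" started from different spaces carry no space component in their names, so both parse to the key `FE` and only one survives in the discovery map. Any of the three options listed above would need to change both the builder and the parser together.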
+ +## Implementation + +```go +func NewTmuxSessionBackend() *TmuxSessionBackend { + return &TmuxSessionBackend{ + sessionAliases: map[string]string{ + "control-plane": "CP", + "boss-app": "", // skip + }, + } +} + +func (b *TmuxSessionBackend) Name() string { return "tmux" } + +func (b *TmuxSessionBackend) Available() bool { + return tmuxAvailable() +} + +func (b *TmuxSessionBackend) CreateSession(ctx context.Context, opts SessionCreateOpts) (string, error) { + sessionID := opts.SessionID + if sessionID == "" { + return "", fmt.Errorf("session ID is required") + } + if tmuxSessionExists(sessionID) { + return "", fmt.Errorf("tmux session %q already exists", sessionID) + } + + // Extract tmux-specific options + var tmuxOpts TmuxCreateOpts + if opts.BackendOpts != nil { + if to, ok := opts.BackendOpts.(TmuxCreateOpts); ok { + tmuxOpts = to + } + } + + width := tmuxOpts.Width + if width <= 0 { + width = 220 + } + height := tmuxOpts.Height + if height <= 0 { + height = 50 + } + + createCtx, cancel := context.WithTimeout(ctx, tmuxCmdTimeout) + defer cancel() + if err := exec.CommandContext(createCtx, "tmux", "new-session", "-d", "-s", sessionID, + "-x", fmt.Sprintf("%d", width), "-y", fmt.Sprintf("%d", height)).Run(); err != nil { + return "", fmt.Errorf("create tmux session: %w", err) + } + + // cd to work dir if specified + if tmuxOpts.WorkDir != "" { + time.Sleep(300 * time.Millisecond) + if err := tmuxSendKeys(sessionID, "cd "+shellQuote(tmuxOpts.WorkDir)); err != nil { + exec.CommandContext(ctx, "tmux", "kill-session", "-t", sessionID).Run() + return "", fmt.Errorf("cd to workdir: %w", err) + } + } + + // Launch command + if opts.Command != "" { + time.Sleep(300 * time.Millisecond) + if err := tmuxSendKeys(sessionID, opts.Command); err != nil { + exec.CommandContext(ctx, "tmux", "kill-session", "-t", sessionID).Run() + return "", fmt.Errorf("launch command: %w", err) + } + } + + return sessionID, nil +} + +func (b *TmuxSessionBackend) KillSession(ctx 
context.Context, sessionID string) error { + killCtx, cancel := context.WithTimeout(ctx, tmuxCmdTimeout) + defer cancel() + return exec.CommandContext(killCtx, "tmux", "kill-session", "-t", sessionID).Run() +} + +func (b *TmuxSessionBackend) SessionExists(sessionID string) bool { + return tmuxSessionExists(sessionID) +} + +func (b *TmuxSessionBackend) ListSessions() ([]string, error) { + return tmuxListSessions() +} + +func (b *TmuxSessionBackend) GetStatus(ctx context.Context, sessionID string) (SessionStatus, error) { + if !b.Available() { + return SessionStatusUnknown, nil + } + if !tmuxSessionExists(sessionID) { + return SessionStatusMissing, nil + } + if tmuxIsIdle(sessionID) { + return SessionStatusIdle, nil + } + return SessionStatusRunning, nil +} + +func (b *TmuxSessionBackend) IsIdle(sessionID string) bool { + return tmuxIsIdle(sessionID) +} + +func (b *TmuxSessionBackend) CaptureOutput(sessionID string, lines int) ([]string, error) { + return tmuxCapturePaneLines(sessionID, lines) +} + +func (b *TmuxSessionBackend) CheckApproval(sessionID string) ApprovalInfo { + result := tmuxCheckApproval(sessionID) + return ApprovalInfo(result) // type alias or direct conversion +} + +func (b *TmuxSessionBackend) SendInput(sessionID string, text string) error { + return tmuxSendKeys(sessionID, text) +} + +func (b *TmuxSessionBackend) Approve(sessionID string) error { + return tmuxApprove(sessionID) +} + +func (b *TmuxSessionBackend) Interrupt(ctx context.Context, sessionID string) error { + // Claude Code uses Escape to interrupt, not Ctrl-C. 
+ interruptCtx, cancel := context.WithTimeout(ctx, tmuxCmdTimeout) + defer cancel() + return exec.CommandContext(interruptCtx, "tmux", "send-keys", "-t", sessionID, "Escape").Run() +} + +func (b *TmuxSessionBackend) DiscoverSessions() (map[string]string, error) { + sessions, err := tmuxListSessions() + if err != nil { + return nil, err + } + discovered := make(map[string]string) + for _, session := range sessions { + name := parseTmuxAgentName(session) + if name == "" { + continue + } + // Apply aliases + if alias, ok := b.sessionAliases[name]; ok { + if alias == "" { + continue // skip + } + name = alias + } + discovered[name] = session + } + return discovered, nil +} +``` + +## What stays in `tmux.go` + +The low-level functions remain in `tmux.go` as unexported helpers: + +- `tmuxAvailable()` +- `tmuxListSessions()` +- `tmuxSessionExists(session)` +- `tmuxCapturePaneLines(session, n)` +- `tmuxCapturePaneLastLine(session)` — only used by tmux-status handler, can stay +- `tmuxIsIdle(session)` +- `lineIsIdleIndicator(line)` — pure function, stays +- `isShellPrompt(line)` — pure function, stays (see idle detection note below) +- `tmuxCheckApproval(session)` +- `tmuxApprove(session)` +- `tmuxSendKeys(session, text)` +- `parseTmuxAgentName(session)` — used by DiscoverSessions +- `shellQuote(s)` — used by CreateSession + +### Idle detection brittleness + +`isShellPrompt` and `lineIsIdleIndicator` rely on heuristic terminal output +matching (checking for `$`, `%`, `>`, `#` as prompt characters). This is +inherently fragile — non-standard PS1 configurations will break it. + +A cleaner approach for future work would be to use +[Claude Code hooks](https://code.claude.com/docs/en/hooks) to emit structured +idle/busy signals instead of parsing terminal output. This is out of scope for +the current refactoring but noted as a known limitation. 
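To illustrate why prompt-character matching is fragile, here is a minimal sketch of that style of heuristic. `looksLikePrompt` is a hypothetical function written for this example; it is not the real `isShellPrompt` or `lineIsIdleIndicator`, whose exact rules are not reproduced in this document.

```go
package main

import "strings"

// looksLikePrompt reports whether a captured terminal line ends in one of
// the common prompt characters. Hypothetical simplification of the kind of
// check described above.
func looksLikePrompt(line string) bool {
	trimmed := strings.TrimSpace(line)
	if trimmed == "" {
		return false
	}
	switch trimmed[len(trimmed)-1] {
	case '$', '%', '>', '#':
		return true
	}
	return false
}
```

Under this sketch, `user@host:~$ ` is classified as a prompt, but so is an ordinary output line such as `Download complete: 100%` (a false positive), while a customized prompt ending in `λ` is missed (a false negative). Structured idle/busy signals would avoid both failure modes.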
+ +## What moves out of `tmux.go` + +These functions currently in `tmux.go` are **coordinator-level orchestration**, not tmux primitives. +They move to the coordinator layer and use the `SessionBackend` interface: + +| Function | New location | Reason | +|----------|-------------|--------| +| `waitForIdle(session, timeout)` | Stays in coordinator, calls `backend.IsIdle()` in loop | Orchestration, not tmux | +| `waitForBoardPost(...)` | Stays as-is (already not tmux-specific) | Not tmux-related | +| `BroadcastCheckIn(...)` | Stays in coordinator, routes through backend | Orchestration | +| `SingleAgentCheckIn(...)` | Stays in coordinator, routes through backend | Orchestration | +| `runAgentCheckIn(...)` | Stays in coordinator, routes through backend | Orchestration | +| `BroadcastResult` + helpers | Stay as-is (not tmux-specific) | Data types | +| `TmuxAutoDiscover(...)` | Becomes `AutoDiscoverSessions(...)`, uses `backend.DiscoverSessions()` | Generalized | + +## What gets deleted + +| Item | Reason | +|------|--------| +| `agent_backend.go` `AgentBackend` interface | Superseded by `SessionBackend` | +| `agent_backend.go` `TmuxBackend` struct | Superseded by `TmuxSessionBackend` | +| `agent_backend.go` `CloudBackend` struct | Superseded by future `AmbientSessionBackend` | +| `agent_backend.go` `AgentSpec` struct | Replaced by `SessionCreateOpts` + `TmuxCreateOpts` | +| `agent_backend.go` `AgentInfo` struct | No longer needed; `SessionID` + `BackendType` on agent record | +| `agent_backend.go` `tmuxDefaultSession` | Out of scope for this refactoring (PR #49 concern) | +| `agent_backend.go` `shellQuote` | Moves to `session_backend_tmux.go` | +| `tmuxSessionAliases` global var | Moves into `TmuxSessionBackend.sessionAliases` field | + +## Session Ownership / Filtering + +Currently `tmuxListSessions()` returns ALL tmux sessions on the machine, not just +agent-boss sessions. This is a pre-existing issue that the refactoring preserves +but does not fix. 
+ +Sessions are identified by naming convention only (`agentdeck_{name}_{timestamp}`). +For stronger ownership guarantees, a future enhancement could: +- Tag sessions with a tmux user option (e.g., `@agent_boss=true`) +- Use a dedicated tmux server socket (`tmux -L agent-boss`) + +This is out of scope for the current refactoring. + +## File Layout After Refactoring + +``` +internal/coordinator/ + session_backend.go # SessionBackend interface, SessionCreateOpts, ApprovalInfo, role interfaces + session_backend_tmux.go # TmuxSessionBackend implementation + shellQuote + tmux.go # Low-level tmux primitives (unchanged, unexported) + # agent_backend.go # DELETED (superseded) +``` diff --git a/.spec/SessionBackends/04-ambient-backend.md b/.spec/SessionBackends/04-ambient-backend.md new file mode 100644 index 0000000..67d6f6b --- /dev/null +++ b/.spec/SessionBackends/04-ambient-backend.md @@ -0,0 +1,623 @@ +# AmbientSessionBackend Design + +Implementation of `SessionBackend` backed by the Ambient Code Platform public API. + +**Dependency:** Requires [platform PR #855](https://github.com/ambient-code/platform/pull/855) +to be merged. Assumes no major changes to the OpenAPI spec. 
+ +## Ambient Public API Summary + +| Endpoint | Method | Purpose | +|----------|--------|---------| +| `/v1/sessions` | GET | List sessions | +| `/v1/sessions` | POST | Create session (task, display_name, model, repos) | +| `/v1/sessions/{id}` | GET | Get session (status: pending/running/completed/failed) | +| `/v1/sessions/{id}` | DELETE | Delete session permanently | +| `/v1/sessions/{id}/message` | POST | Send user message (creates a run) | +| `/v1/sessions/{id}/output` | GET | Get output (transcript/compact/events format) | +| `/v1/sessions/{id}/runs` | GET | List runs (status, timestamps, event counts) | +| `/v1/sessions/{id}/runs` | POST | Create run (low-level AG-UI) | +| `/v1/sessions/{id}/start` | POST | Resume a stopped/completed session | +| `/v1/sessions/{id}/stop` | POST | Stop session (pod terminated, session preserved) | +| `/v1/sessions/{id}/interrupt` | POST | Cancel current run without killing session | + +Authentication: Bearer token via `Authorization` header. +Scoping: `X-Ambient-Project` header selects the target namespace. + +--- + +## Conceptual Mapping: Tmux vs Ambient + +The two backends have fundamentally different interaction models: + +| Concept | Tmux | Ambient | +|---------|------|---------| +| Session identity | tmux session name (local) | Kubernetes resource ID (remote) | +| "Exists" | `tmux list-sessions` contains name | `GET /sessions/{id}` returns 200 | +| "Idle" | Terminal shows shell prompt or Claude `>` | Session is `running` AND latest run is `completed` or `error` (no active run) | +| "Busy" | Terminal shows active output, no prompt | Latest run status is `running` | +| "Capture output" | Read terminal pane lines | Fetch transcript messages, format as lines | +| "Send input" | `tmux send-keys` text + Enter | `POST /sessions/{id}/message` | +| "Approval check" | Parse terminal for "Do you want...?" | Not applicable (sessions run with configured permissions). Returns `NeedsApproval: false`. 
| +| "Approve" | `tmux send-keys Enter` | Not applicable. No-op. | +| "Kill" (permanent) | `tmux kill-session` (gone forever) | `DELETE /sessions/{id}` (permanent removal) | +| "Stop" (preserve) | Not available | `POST /sessions/{id}/stop` (pod terminated, session data preserved) | +| "Create" | `tmux new-session -d -s name` | `POST /sessions` (async pod creation) | +| "Discovery" | Parse `agentdeck_*` session names | List sessions, match by `display_name` convention | +| "Interrupt" | `tmux send-keys Escape` | `POST /sessions/{id}/interrupt` | +| "Status" | Inferred from existence + idle + approval | Explicit: pending/running/completed/failed | +| "Resume" | Not possible (create new session) | `POST /sessions/{id}/start` | + +### Kill vs Stop + +Ambient distinguishes between two levels of session termination: + +- **`DELETE /sessions/{id}`** — permanent. Removes the session resource and all + associated data. This is the semantic equivalent of `tmux kill-session`. +- **`POST /sessions/{id}/stop`** — graceful. Stops the pod but preserves session + data, output history, and the session resource. The session can be resumed + later with `POST /sessions/{id}/start`. + +`KillSession` maps to `DELETE` (permanent), matching the tmux behavior. The +stop/resume flow is a separate Ambient capability not exposed in the current +interface. A future `StopSession`/`ResumeSession` pair could be added as an +optimization — the coordinator's `handleAgentRestart` currently does kill + create, +which works for both backends. + +--- + +## Known Gap: Context and Tool Injection + +**Problem:** Tmux sessions are launched in an environment where agent-boss commands +(`/boss.check`, `/boss.ignite`, etc.) are available as Claude Code slash commands +or via the CLAUDE.md configuration. The session inherits the local filesystem +context, MCP servers, and tool configurations. + +Ambient sessions are remote Kubernetes pods. They do not automatically have access +to agent-boss commands. 
The boss commands must be provided through one of: + +1. **Ambient workflows** — structured configs defining system prompts, slash + commands, and tool access. This is the Ambient-native approach but requires + designing a workflow specifically for agent-boss. +2. **Session creation options** — some minimal context can be passed at creation + time via the `task` field and repo configuration. +3. **MCP server configuration** — Ambient sessions can connect to MCP servers. + An agent-boss MCP server could expose boss commands as tools. + +**Decision:** Defer to Phase 2 implementation. This gap will surface during +Ambient backend integration and must be resolved before Ambient agents can +participate in the broadcast/check-in flow. + +--- + +## Interface Gaps Revealed by Ambient + +The Ambient API exposes capabilities that the current `SessionBackend` interface +(from `02-session-backend-interface.md`) does not cover. These need to be added. + +### Gap 1: `GetStatus` — structured session status + +**Problem:** The current interface only has `SessionExists(id) bool` and `IsIdle(id) bool`. +Ambient sessions have four distinct states (`pending`, `running`, `completed`, `failed`). +The coordinator needs richer status to make correct decisions (e.g., don't send a message +to a `pending` session that hasn't started yet). + +**Addition to interface:** + +```go +// SessionStatus represents the state of a session. 
+type SessionStatus string + +const ( + SessionStatusUnknown SessionStatus = "unknown" // can't determine (e.g., tmux binary missing) + SessionStatusPending SessionStatus = "pending" // created but not yet running + SessionStatusRunning SessionStatus = "running" // session is active + SessionStatusIdle SessionStatus = "idle" // session is running but waiting for input + SessionStatusCompleted SessionStatus = "completed" // session finished + SessionStatusFailed SessionStatus = "failed" // session errored + SessionStatusMissing SessionStatus = "missing" // session does not exist +) + +// GetStatus returns the current status of a session. +// For tmux: derives from Available + SessionExists + IsIdle. +// For ambient: maps from the API response status field; for "running" +// sessions the latest run decides between running and idle. +GetStatus(ctx context.Context, sessionID string) (SessionStatus, error) +``` + +**Tmux mapping:** + +``` +session not found -> SessionStatusMissing +session exists + idle -> SessionStatusIdle +session exists + busy -> SessionStatusRunning +tmux unavailable -> SessionStatusUnknown +``` + +**Ambient mapping:** + +``` +GET /sessions/{id} 404 -> SessionStatusMissing +status: "pending" -> SessionStatusPending +status: "running" + latest run "running" -> SessionStatusRunning +status: "running" + latest run "completed"/"error"/no runs -> SessionStatusIdle +status: "completed" -> SessionStatusCompleted +status: "failed" -> SessionStatusFailed +API error -> SessionStatusUnknown +``` + +### Gap 2: `Interrupt` — cancel current work without killing session + +**Problem:** Ambient has `POST /sessions/{id}/interrupt` to cancel the current run +while keeping the session alive. This is semantically different from both `KillSession` +(destroys the session) and `SendInput` (sends new work). The coordinator needs this +for scenarios like: "agent is stuck on a bad task, cancel and reassign." + +**Addition to interface:** + +```go +// Interrupt cancels the session's current work without killing the session. 
+// The session remains alive and can accept new messages. +// For tmux: sends Escape key to the session (Claude Code interrupt). +// For ambient: calls POST /sessions/{id}/interrupt. +Interrupt(ctx context.Context, sessionID string) error +``` + +Note: this is a new capability. No equivalent exists in the current codebase. +For tmux, the implementation sends the Escape key (Claude Code uses Escape, not +Ctrl-C, to interrupt). + +### Gap 3: `SessionCreateOpts` needs backend-specific options + +**Problem:** Ambient sessions need `task` (initial prompt), `display_name`, `model`, +and `repos`. These are not relevant to tmux. + +**Solution:** Use a generic `BackendOpts interface{}` field on `SessionCreateOpts`. +Each backend defines its own options struct and type-asserts at runtime. + +```go +type SessionCreateOpts struct { + SessionID string // desired session name/ID + Command string // tmux: shell command; ambient: mapped to task + BackendOpts interface{} // backend-specific (TmuxCreateOpts, AmbientCreateOpts, etc.) +} + +type AmbientCreateOpts struct { + DisplayName string // human-readable session name + Model string // Claude model to use + Repos []SessionRepo // repositories to clone +} +``` + +### Non-Gap: `CheckApproval` and `Approve` + +These are tmux-specific concepts. Ambient sessions run with configured permissions +and don't present terminal approval prompts. The Ambient backend returns no-op values: + +- `CheckApproval` -> `ApprovalInfo{NeedsApproval: false}` +- `Approve` -> `nil` (no-op) + +This is correct behavior, not a gap. The coordinator already checks `NeedsApproval` +before acting, so a backend that never needs approval simply never triggers that path. + +### Non-Gap: Resume + +Ambient supports `POST /sessions/{id}/start` to resume a stopped session. Tmux does not +(you create a new session). 
This is valuable but not needed in the interface for the +initial implementation — the coordinator's `handleAgentRestart` already does +kill + create, which works for both backends. Resume can be added later as an optimization. + +--- + +## AmbientSessionBackend Implementation + +### Configuration + +```go +type AmbientSessionBackend struct { + apiURL string // e.g., "https://public-api-ambient-code.apps.okd1.timslab/v1" + token string // Bearer token + project string // X-Ambient-Project header value + httpClient *http.Client // with timeouts +} + +type AmbientBackendConfig struct { + APIURL string `json:"api_url"` + Token string `json:"token"` + Project string `json:"project"` +} + +func NewAmbientSessionBackend(cfg AmbientBackendConfig) *AmbientSessionBackend { + return &AmbientSessionBackend{ + apiURL: strings.TrimRight(cfg.APIURL, "/"), + token: cfg.Token, + project: cfg.Project, + httpClient: &http.Client{Timeout: 30 * time.Second}, + } +} +``` + +### Method Implementations + +#### `Name() string` + +```go +func (b *AmbientSessionBackend) Name() string { return "ambient" } +``` + +#### `Available() bool` + +Calls `GET /sessions` with a short timeout. Returns true if the API is reachable +(any 2xx or 4xx response), false on network errors or a 502. +Caches the result for 30 seconds to avoid hammering the API on every liveness tick. + +```go +func (b *AmbientSessionBackend) Available() bool { + // Check cached result (30s TTL) + // If stale: GET /sessions and record whether the API responded + // Note: any 2xx/4xx means the API is reachable (available). + // Only network errors or 502 mean unavailable. +} +``` + +#### `CreateSession(ctx, opts) (string, error)` + +Maps to `POST /v1/sessions`. 
+ +```go +func (b *AmbientSessionBackend) CreateSession(ctx context.Context, opts SessionCreateOpts) (string, error) { + body := map[string]interface{}{ + "task": opts.Command, // Command maps to the initial task/prompt + } + + // Extract ambient-specific options + var ambientOpts AmbientCreateOpts + if opts.BackendOpts != nil { + if ao, ok := opts.BackendOpts.(AmbientCreateOpts); ok { + ambientOpts = ao + } + } + + if ambientOpts.DisplayName != "" { + body["display_name"] = ambientOpts.DisplayName + } else if opts.SessionID != "" { + body["display_name"] = opts.SessionID + } + if ambientOpts.Model != "" { + body["model"] = ambientOpts.Model + } + if len(ambientOpts.Repos) > 0 { + body["repos"] = ambientOpts.Repos + } + + // POST /v1/sessions + // Returns {"id": "session-abc123", "message": "Session created"} + // Return the session ID +} +``` + +**Note on `Command` -> `task` mapping:** For tmux, `Command` is the shell command +to execute (e.g., `claude --dangerously-skip-permissions`). For Ambient, the platform +handles launching Claude — `Command` is repurposed as the initial prompt/task. If +the caller sets `opts.Command` to a shell command, it becomes the session's task. +This is acceptable because agents spawned through the coordinator always get an +ignite prompt as their first message anyway. + +#### `KillSession(ctx, id) error` + +Maps to `DELETE /v1/sessions/{id}`. Permanently removes the session. + +```go +func (b *AmbientSessionBackend) KillSession(ctx context.Context, sessionID string) error { + // DELETE /v1/sessions/{id} + // Accept 200 (deleted) or 404 (already gone) as success +} +``` + +#### `SessionExists(id) bool` + +Maps to `GET /v1/sessions/{id}`. Returns true for any status; false on 404. 
+ +```go +func (b *AmbientSessionBackend) SessionExists(sessionID string) bool { + // GET /v1/sessions/{id} + // 200 -> true (any status counts as "exists") + // 404 -> false + // Error -> false +} +``` + +#### `ListSessions() ([]string, error)` + +Maps to `GET /v1/sessions`. Returns IDs of all sessions in the project. + +```go +func (b *AmbientSessionBackend) ListSessions() ([]string, error) { + // GET /v1/sessions + // Extract .items[].id +} +``` + +#### `IsIdle(id) bool` + +Checks if the session is `running` but has no active run. + +```go +func (b *AmbientSessionBackend) IsIdle(sessionID string) bool { + // Step 1: GET /v1/sessions/{id} -> check status == "running" + // If not running -> not idle (it's stopped, pending, or failed) + // + // Step 2: GET /v1/sessions/{id}/runs -> check latest run + // If no runs exist -> idle (session running, nothing to do) + // If latest run status == "completed" or "error" -> idle + // If latest run status == "running" -> not idle (working) +} +``` + +#### `CaptureOutput(id, lines) ([]string, error)` + +Maps to `GET /v1/sessions/{id}/output?format=transcript`. Formats the last N +transcript messages as human-readable lines (matching the `[]string` contract). + +```go +func (b *AmbientSessionBackend) CaptureOutput(sessionID string, lines int) ([]string, error) { + // GET /v1/sessions/{id}/output?format=transcript + // Format each message as: "[role] content" (truncated to ~200 chars) + // Return last N lines + // + // Example output: + // "[user] /boss.check FE my-project" + // "[assistant] I'll check in now. Reading the blackboard..." + // "[tool] Bash: curl -s http://localhost:8899/spaces/my-project/raw" + // "[assistant] Posted status update to the blackboard." +} +``` + +#### `CheckApproval(id) ApprovalInfo` + +Ambient sessions run with configured permissions. No terminal approval prompts. 
+ +```go +func (b *AmbientSessionBackend) CheckApproval(sessionID string) ApprovalInfo { + return ApprovalInfo{NeedsApproval: false} +} +``` + +#### `SendInput(id, text) error` + +Maps to `POST /v1/sessions/{id}/message`. + +```go +func (b *AmbientSessionBackend) SendInput(sessionID string, text string) error { + // POST /v1/sessions/{id}/message + // Body: {"content": text} + // Accept 202 as success + // Return error on 422 (session not running) — caller should check status +} +``` + +#### `Approve(id) error` + +No-op for Ambient. + +```go +func (b *AmbientSessionBackend) Approve(sessionID string) error { + return nil // Ambient sessions don't have terminal approval prompts +} +``` + +#### `DiscoverSessions() (map[string]string, error)` + +Lists all sessions and matches by `display_name`. Convention: sessions created by +agent-boss use `display_name` = agent name. + +```go +func (b *AmbientSessionBackend) DiscoverSessions() (map[string]string, error) { + // GET /v1/sessions + // For each session where status is "running": + // discovered[session.display_name] = session.id + // Return map +} +``` + +#### `GetStatus(ctx, id) (SessionStatus, error)` + +Maps directly from the API response. + +```go +func (b *AmbientSessionBackend) GetStatus(ctx context.Context, sessionID string) (SessionStatus, error) { + // GET /v1/sessions/{id} + // 404 -> SessionStatusMissing, nil + // 200 -> map status field: + // "pending" -> SessionStatusPending + // "completed" -> SessionStatusCompleted + // "failed" -> SessionStatusFailed + // "running" -> check runs: + // latest run "running" -> SessionStatusRunning + // else -> SessionStatusIdle + // Error -> SessionStatusUnknown, err +} +``` + +#### `Interrupt(ctx, id) error` + +Maps to `POST /v1/sessions/{id}/interrupt`. 
+ +```go +func (b *AmbientSessionBackend) Interrupt(ctx context.Context, sessionID string) error { + // POST /v1/sessions/{id}/interrupt + // Accept 200 as success +} +``` + +--- + +## Behavioral Differences from Tmux + +### 1. Asynchronous session creation + +Tmux `CreateSession` is synchronous — by the time it returns, the tmux session +exists and the command is running. Ambient `CreateSession` is asynchronous — the +API returns a session ID immediately, but the pod may take seconds to start +(status transitions: `pending` -> `running`). + +**Impact on coordinator:** After `CreateSession`, the coordinator currently sends +an ignite command after a 5-second sleep. For Ambient, it should poll `GetStatus` +until the session reaches `running` (or `idle`) before sending the first message. + +```go +// After creating an ambient session: +for i := 0; i < 30; i++ { // up to 60s + status, _ := backend.GetStatus(ctx, sessionID) + if status == SessionStatusRunning || status == SessionStatusIdle { + break + } + time.Sleep(2 * time.Second) +} +``` + +### 2. No terminal = no idle heuristics + +Tmux idle detection reads terminal output and matches against patterns (shell +prompts, Claude `>` indicator, status bar keywords). This is inherently heuristic. + +Ambient idle detection is structural: check the session status and the latest run +status. It's deterministic — no false positives from prompt-like text in output. + +### 3. `SendInput` creates a run + +In tmux, `SendInput` types text into a terminal. The text could be anything — a +slash command, a prompt, arbitrary keystrokes. There's no concept of "runs." + +In Ambient, `SendInput` calls `POST /message`, which creates a new AG-UI run. +Each `SendInput` call is a discrete unit of work with its own run ID, start time, +end time, and event stream. This is important for: + +- **Broadcast check-in:** Each `/boss.check` creates a run. 
The coordinator can + poll `GET /runs` to know exactly when the check-in completed (run status = + `completed`) instead of heuristically waiting for idle. +- **Model switching:** `/model sonnet` as a tmux command becomes an Ambient + message. This may not work as intended — Ambient sessions have a fixed model + set at creation. Model switching may need to be a no-op or handled differently. + +### 4. Model switching concern + +The broadcast check-in flow switches agents to a cheaper model (`/model haiku`) +before sending the check-in prompt, then restores the work model afterward. This +is problematic for two reasons: + +- **Ambient:** Sessions have a fixed model set at creation. `/model` is a Claude + Code slash command, not an Ambient API concept. Sending it as a message would + be interpreted as a task, not a model switch. +- **Context compaction risk:** Even for tmux, switching from opus (1M context) to + haiku with 300K tokens in context would trigger compaction, potentially losing + important context. + +For non-tmux backends, the coordinator should skip model switching entirely. +For tmux, the model-switch behavior is preserved but the compaction risk should +be evaluated separately. + +### 5. No approval flow + +Ambient sessions run with the permissions configured at session creation. There +are no interactive approval prompts. The entire approval detection + interrupt +ledger pipeline in the liveness loop becomes a no-op for Ambient agents. + +### 6. Persistent sessions + +Tmux sessions are ephemeral — if the machine reboots, they're gone. Ambient +sessions are persistent Kubernetes resources with stored state. `KillSession` +(= `DELETE`) permanently removes the session. For non-destructive pause, use +the Ambient-specific `POST /stop` endpoint (not exposed in the interface). 
+ +- Sessions can be resumed (`POST /start`) after being stopped (not killed) +- Session output/history survives stops +- The coordinator could reconnect to existing sessions after its own restart + +--- + +## Configuration and Initialization + +### Environment variables + +```bash +AMBIENT_API_URL=https://public-api-ambient-code.apps.okd1.timslab/v1 +AMBIENT_TOKEN= +AMBIENT_PROJECT=my-project +``` + +### Server initialization + +```go +func NewServer(port, dataDir string) *Server { + s := &Server{ + // ... existing fields ... + backends: make(map[string]SessionBackend), + defaultBackend: "tmux", + } + + // Always register tmux backend + s.backends["tmux"] = NewTmuxSessionBackend() + + // Register ambient backend if configured + if apiURL := os.Getenv("AMBIENT_API_URL"); apiURL != "" { + cfg := AmbientBackendConfig{ + APIURL: apiURL, + Token: os.Getenv("AMBIENT_TOKEN"), + Project: os.Getenv("AMBIENT_PROJECT"), + } + s.backends["ambient"] = NewAmbientSessionBackend(cfg) + + // If ambient is configured and tmux is not available, + // default to ambient + if !s.backends["tmux"].Available() { + s.defaultBackend = "ambient" + } + } + + return s +} +``` + +--- + +## Impact on Broadcast / Check-In + +The broadcast system (`BroadcastCheckIn`, `runAgentCheckIn`) is the most complex +tmux-dependent flow. Here's how it adapts for Ambient: + +### Current tmux flow (per agent) + +``` +1. Switch to check model: tmuxSendKeys("/model haiku") +2. Wait for idle: poll tmuxIsIdle() every 3s +3. Send check-in: tmuxSendKeys("/boss.check Agent Space") +4. Wait for board post: poll agentUpdatedAt() every 3s +5. Restore work model: tmuxSendKeys("/model sonnet") +6. Wait for idle: poll tmuxIsIdle() every 3s +``` + +### Ambient adaptation + +``` +1. Skip model switch: Ambient sessions have a fixed model; model switching + risks context compaction even for tmux (see §4 above) +2. Check status: backend.GetStatus() == idle +3. Send check-in: backend.SendInput("/boss.check Agent Space") +4. 
Wait for board post: poll agentUpdatedAt() every 3s (same — blackboard is boss-side) +5. Skip model restore: (see step 1) +6. Check status: backend.GetStatus() == idle +``` + +The coordinator should check `backend.Name()` and skip model switching for +non-tmux backends. + +--- + +## File Layout + +``` +internal/coordinator/ + session_backend.go # Interface, types, SessionStatus, role interfaces + session_backend_tmux.go # TmuxSessionBackend + session_backend_ambient.go # AmbientSessionBackend (this spec) + tmux.go # Low-level tmux primitives (unchanged) +``` diff --git a/.spec/SessionBackends/05-agentcore-feasibility.md b/.spec/SessionBackends/05-agentcore-feasibility.md new file mode 100644 index 0000000..5e242ba --- /dev/null +++ b/.spec/SessionBackends/05-agentcore-feasibility.md @@ -0,0 +1,221 @@ +# AWS Bedrock AgentCore — SessionBackend Feasibility Analysis + +Can the `SessionBackend` interface (13 methods) support an AgentCore backend? + +**Short answer:** Yes, but with significant client-side state management. +AgentCore is a lower-level hosting primitive than Ambient — it provides +"invoke" and "stop", not managed session lifecycle. The interface design +is sound and requires no new methods, but an AgentCore implementation +would be substantially more complex than the Ambient one. + +--- + +## What is AgentCore? + +AgentCore is a **"bring your own agent code" serverless hosting platform**. +You package your agent as a Docker container exposing `/invocations` (request +handler) and `/ping` (health check) on port 8080. AgentCore runs it in +isolated microVMs with per-session compute, memory, and filesystem isolation. + +This is fundamentally different from Ambient, which is a **managed Claude +session service** — you say "create a session with this task" and the platform +runs Claude Code for you. 
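
As a rough illustration of the container contract described above, the following Go sketch serves `/invocations` and `/ping` from a single process. It is not AgentCore's reference implementation. The `agent` type, the `probe` helper, the JSON field names, and the echo behavior are assumptions made for the example. Only the two paths, port 8080, and the `Healthy`/`HealthyBusy` status values come from this document.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync/atomic"
)

// agent is a hypothetical stand-in for the containerized agent process.
type agent struct {
	busy atomic.Bool // true while an invocation is in flight
}

// handlePing answers the health check. The "status" field name is an assumption.
func (a *agent) handlePing(w http.ResponseWriter, r *http.Request) {
	status := "Healthy"
	if a.busy.Load() {
		status = "HealthyBusy"
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]string{"status": status})
}

// handleInvocations receives a prompt and returns a reply.
// A real handler would run the agent here and stream its output.
func (a *agent) handleInvocations(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "POST required", http.StatusMethodNotAllowed)
		return
	}
	a.busy.Store(true)
	defer a.busy.Store(false)
	prompt, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, "cannot read body", http.StatusBadRequest)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]string{"output": "echo: " + string(prompt)})
}

// newAgentMux wires both endpoints onto one router.
func newAgentMux(a *agent) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/ping", a.handlePing)
	mux.HandleFunc("/invocations", a.handleInvocations)
	return mux
}

// probe exercises a handler in-process and returns the status code and body.
func probe(h http.Handler, method, path, body string) (int, string) {
	req := httptest.NewRequest(method, path, strings.NewReader(body))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, strings.TrimSpace(rec.Body.String())
}

func main() {
	mux := newAgentMux(&agent{})
	// In the container this would be http.ListenAndServe(":8080", mux).
	fmt.Println(probe(mux, http.MethodGet, "/ping", ""))
	fmt.Println(probe(mux, http.MethodPost, "/invocations", "hello"))
}
```

The `busy` flag is the only state shared between the two handlers, which is why the sketch uses an atomic rather than a mutex. A real agent that serves concurrent invocations would likely need a counter instead of a boolean.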
+ +### Two-Tier API + +| Tier | Purpose | Key Operations | +|------|---------|----------------| +| **Control Plane** | Manage runtime *definitions* (like a Deployment template) | `CreateAgentRuntime`, `GetAgentRuntime`, `UpdateAgentRuntime`, `DeleteAgentRuntime`, `ListAgentRuntimes` | +| **Data Plane** | Interact with individual sessions | `InvokeAgentRuntime`, `StopRuntimeSession` | + +The control plane manages the *deployment* of your agent code. The data plane +manages *sessions* within that deployment. For our use case, the coordinator +would pre-deploy a Claude Code agent as an AgentCore Runtime, then manage +individual sessions via the data plane. + +### Session Model + +- Sessions are **implicit** — created automatically on first `InvokeAgentRuntime` + call with a new `runtimeSessionId`. +- Each session gets a dedicated **microVM** with isolated resources. +- Sessions auto-terminate after **15 minutes of inactivity** (configurable). +- Maximum session lifetime: **8 hours**. +- Session state is **ephemeral** — no persistence after termination. +- There is **no "list sessions" API** — you must track session IDs yourself. + +### Authentication + +AWS IAM or OAuth 2.0 (via AgentCore Identity). Go SDK: +`github.com/aws/aws-sdk-go-v2/service/bedrockagentcore` + +--- + +## Interface Mapping + +### Methods that map cleanly (5 of 13) + +| Method | AgentCore Mapping | Notes | +|--------|------------------|-------| +| `Name()` | Returns `"agentcore"` | Trivial | +| `Available()` | `GetAgentRuntime(runtimeARN)` returns 200 | Checks if the pre-deployed runtime exists and is active | +| `KillSession(ctx, id)` | `StopRuntimeSession(runtimeARN, sessionID)` | Direct mapping. Terminates the microVM. | +| `SendInput(id, text)` | `InvokeAgentRuntime(runtimeARN, sessionID, payload)` | Sends payload to agent. Returns streaming response. 
| +| `Approve(id)` | No-op (return nil) | Agent code handles its own tool permissions | + +### Methods that work but with semantic differences (2 of 13) + +| Method | AgentCore Mapping | Semantic Difference | +|--------|------------------|-------------------| +| `CreateSession(ctx, opts)` | `InvokeAgentRuntime` with a new UUID as `runtimeSessionId` | No explicit "create" — session springs into existence on first invoke. The initial `opts.Command` becomes the first invocation payload. Backend must wait for streaming response to confirm the session started. | +| `CheckApproval(id)` | Returns `ApprovalInfo{NeedsApproval: false}` | Same as Ambient — agent code manages its own permissions. No terminal to parse. | + +### Methods with significant gaps (6 of 13) + +| Method | Gap | Workaround | +|--------|-----|-----------| +| `SessionExists(id)` | **No API to query session existence.** Sessions are either active (accepting invocations) or terminated (gone). No status endpoint. | Backend must maintain a **local session registry** — track created sessions and mark them terminated on stop/timeout. Alternatively, attempt a lightweight invoke and interpret errors, but this is fragile. | +| `ListSessions()` | **No `ListRuntimeSessions` API exists.** You can list *runtimes* but not *sessions within a runtime*. | Backend must maintain a **local session registry** of all session IDs it has created. | +| `GetStatus(ctx, id)` | **No external session status API.** The `/ping` endpoint is internal to the agent container — it's how AgentCore infrastructure monitors the agent, not how external callers query status. | Backend must **infer status** from local state: just created → `pending`/`running`, last invoke returned output → `running`, invoke failed with session-not-found → `missing`, locally marked stopped → `completed`. Very approximate. 
| +| `IsIdle(id)` | **`/ping` is internal.** Reports `Healthy` (idle) or `HealthyBusy` (working) but only to AgentCore infrastructure, not to external API callers. | Backend could embed a custom status endpoint in the agent code that the coordinator calls directly. Or track idle state based on whether the last streaming invoke response has completed. Both require custom agent code changes. | +| `CaptureOutput(id, lines)` | **No transcript/output API.** Output comes back as a streaming response from `InvokeAgentRuntime`. There's no after-the-fact "get output" endpoint. | Backend must **capture and buffer** streaming output from every `InvokeAgentRuntime` call. Store the last N lines in memory or a local store. This means the coordinator must be the one invoking (or subscribing to) the agent to capture its output. | +| `DiscoverSessions()` | **No list sessions API.** | Backend returns only sessions it has locally registered. Cannot discover sessions created by other coordinators or external callers. | + +### Methods with partial gaps (1 of 13) + +| Method | Gap | Workaround | +|--------|-----|-----------| +| `Interrupt(ctx, id)` | `StopRuntimeSession` is **destructive**: it terminates the microVM entirely. There's no "cancel current work but keep session alive" equivalent, unlike Ambient's `POST /interrupt`, which cancels the current run while preserving the session. | If the agent code supports it, send a special "interrupt" message via `InvokeAgentRuntime`. But this requires custom agent code and won't work if the agent is busy (AgentCore won't accept new invocations while status is `HealthyBusy`). In practice, interrupt = kill + recreate for AgentCore.
| + +--- + +## Conceptual Comparison + +| Concept | Tmux | Ambient | AgentCore | +|---------|------|---------|-----------| +| What runs the agent | Local tmux + Claude CLI | Platform-managed Claude pod | Your Docker container in a microVM | +| Session creation | Explicit (`tmux new-session`) | Explicit (`POST /sessions`) | Implicit (first invoke) | +| Session listing | `tmux list-sessions` | `GET /sessions` | **Not available** | +| Session status | Inferred from terminal | `GET /sessions/{id}` status field | **Not available externally** | +| Idle detection | Parse terminal output | Check run status via API | `/ping` internal only | +| Output capture | Read terminal pane | `GET /sessions/{id}/output` | Streaming during invoke only | +| Send input | `tmux send-keys` | `POST /sessions/{id}/message` | `InvokeAgentRuntime` | +| Kill session | `tmux kill-session` | `DELETE /sessions/{id}` | `StopRuntimeSession` | +| Interrupt (non-destructive) | `tmux send-keys C-c` | `POST /sessions/{id}/interrupt` | **Not available** | +| Session persistence | Ephemeral (lost on reboot) | Persistent (K8s resource) | Ephemeral (lost on terminate) | +| Discovery | Parse session names | List + match display_name | **Not available** | +| Model flexibility | Any (runs locally) | Configured at creation | Any (you deploy the model call) | + +--- + +## Architecture: What an AgentCore Backend Would Require + +### Pre-requisite: Deploy Claude Code as an AgentCore Runtime + +Before the backend can create sessions, a Claude Code agent must be +deployed as an AgentCore Runtime. This is a one-time setup: + +``` +1. Package Claude Code CLI in a container +2. Implement /invocations handler (receives prompts, runs Claude, streams output) +3. Implement /ping handler (reports Healthy/HealthyBusy) +4. CreateAgentRuntime with the container image +5.
Store the Runtime ARN in coordinator config +``` + +This is entirely outside the `SessionBackend` interface — it's infrastructure +setup, similar to having tmux installed or Ambient deployed. + +### Client-Side State Store + +The AgentCore backend would need a local state store to compensate for +the missing APIs: + +```go +type AgentCoreSessionBackend struct { + runtimeARN string + client *bedrockagentcore.Client + + mu sync.RWMutex + sessions map[string]*agentCoreSession // sessionID -> state +} + +type agentCoreSession struct { + id string + createdAt time.Time + status SessionStatus // locally tracked + output *ring.Buffer // circular buffer of last N output lines + lastInvoke time.Time +} +``` + +### Estimated Implementation Complexity + +| Backend | Lines of code (est.) | External dependencies | Client-side state needed | +|---------|---------------------|----------------------|------------------------| +| Tmux | ~150 | tmux binary | None | +| Ambient | ~300 | HTTP client | None (API is stateful) | +| AgentCore | ~500-700 | AWS SDK Go v2 | Session registry, output buffer, status tracking | + +--- + +## Verdict + +### The interface works — no changes needed + +All 13 methods can be implemented against AgentCore. No new methods are +required. The gaps are all solvable with client-side state management. + +### But the implementation is substantially more complex + +AgentCore is designed as a low-level hosting platform, not a managed +session service. It gives you two operations — invoke and stop — and +expects you to build session management on top. This means: + +1. **Session tracking**: Must maintain a local registry of all sessions +2. **Output buffering**: Must capture and store streaming output +3. **Status inference**: Must derive status from local state rather than querying an API +4. **No interrupt**: Must accept kill-and-recreate as the "interrupt" pattern +5. **No discovery**: Can only find sessions the coordinator itself created +6. 
**Custom agent code**: Need to build and deploy a Claude Code container + +### Comparison to Ambient + +Ambient's API was practically designed for this interface — nearly 1:1 mapping +with rich session lifecycle, status, output, and interrupt APIs. AgentCore +requires the backend to replicate what Ambient provides natively. + +### When AgentCore makes sense + +Despite the complexity, AgentCore would be the right choice when: + +- Running on AWS infrastructure (native IAM integration) +- Need model flexibility beyond Claude (AgentCore is model-agnostic) +- Want per-session compute isolation (dedicated microVMs) +- Already have agent code that isn't Claude Code (LangGraph, CrewAI, etc.) +- Need the AWS ecosystem (CloudWatch, X-Ray, IAM, VPC integration) + +### Recommendation + +AgentCore support is feasible as a Phase 3 backend (after tmux and Ambient), +but it's a larger effort. The interface design is validated — it accommodates +AgentCore's minimal API surface without requiring new methods. The complexity +lives entirely in the implementation, not the interface. 
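
The client-side state management that the verdict relies on (session tracking, output buffering, status inference) can be sketched roughly as follows. This is a minimal in-memory illustration under stated assumptions, not the proposed implementation. `sessionRegistry`, `trackedSession`, and their methods are hypothetical names, `SessionStatus` is assumed to be string-backed, and a plain slice stands in for the circular buffer mentioned earlier. No AgentCore calls are made.

```go
package main

import (
	"fmt"
	"sync"
)

// SessionStatus is assumed to be string-backed; the spec defines the real type.
type SessionStatus string

const (
	SessionStatusRunning   SessionStatus = "running"
	SessionStatusCompleted SessionStatus = "completed"
	SessionStatusMissing   SessionStatus = "missing"
)

// trackedSession is the locally held state for one session (hypothetical).
type trackedSession struct {
	status SessionStatus
	output []string // most recent lines, oldest first
}

// sessionRegistry is a hypothetical in-memory registry keyed by session ID.
type sessionRegistry struct {
	mu       sync.RWMutex
	maxLines int
	sessions map[string]*trackedSession
}

func newSessionRegistry(maxLines int) *sessionRegistry {
	return &sessionRegistry{maxLines: maxLines, sessions: map[string]*trackedSession{}}
}

// Register records a session once its first invoke has succeeded.
func (r *sessionRegistry) Register(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.sessions[id] = &trackedSession{status: SessionStatusRunning}
}

// AppendOutput buffers one streamed line, dropping the oldest when full.
func (r *sessionRegistry) AppendOutput(id, line string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	s, ok := r.sessions[id]
	if !ok {
		return
	}
	s.output = append(s.output, line)
	if len(s.output) > r.maxLines {
		s.output = s.output[len(s.output)-r.maxLines:]
	}
}

// MarkStopped records that the session was stopped or timed out.
func (r *sessionRegistry) MarkStopped(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if s, ok := r.sessions[id]; ok {
		s.status = SessionStatusCompleted
	}
}

// Status infers status from local state only; unknown IDs report missing.
func (r *sessionRegistry) Status(id string) SessionStatus {
	r.mu.RLock()
	defer r.mu.RUnlock()
	if s, ok := r.sessions[id]; ok {
		return s.status
	}
	return SessionStatusMissing
}

// Capture returns a copy of up to n of the most recent buffered lines.
func (r *sessionRegistry) Capture(id string, n int) []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	s, ok := r.sessions[id]
	if !ok || n <= 0 {
		return nil
	}
	if n > len(s.output) {
		n = len(s.output)
	}
	return append([]string(nil), s.output[len(s.output)-n:]...)
}

func main() {
	reg := newSessionRegistry(2)
	reg.Register("s1")
	for _, line := range []string{"a", "b", "c"} {
		reg.AppendOutput("s1", line)
	}
	fmt.Println(reg.Capture("s1", 5), reg.Status("s1"), reg.Status("unknown"))
	reg.MarkStopped("s1")
	fmt.Println(reg.Status("s1"))
}
```

Because the registry lives in coordinator memory, it is lost on coordinator restart, which is consistent with the "no discovery" limitation listed above. Persisting it would be a separate design decision.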
+ +--- + +## Sources + +- [Amazon Bedrock AgentCore Overview](https://aws.amazon.com/bedrock/agentcore/) +- [AgentCore Runtime — How it Works](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/runtime-how-it-works.html) +- [InvokeAgentRuntime API Reference](https://docs.aws.amazon.com/bedrock-agentcore/latest/APIReference/API_InvokeAgentRuntime.html) +- [StopRuntimeSession API Reference](https://docs.aws.amazon.com/bedrock-agentcore/latest/APIReference/API_StopRuntimeSession.html) +- [CreateAgentRuntime API Reference](https://docs.aws.amazon.com/bedrock-agentcore-control/latest/APIReference/API_CreateAgentRuntime.html) +- [GetAgentRuntime API Reference](https://docs.aws.amazon.com/bedrock-agentcore-control/latest/APIReference/API_GetAgentRuntime.html) +- [ListAgentRuntimes API Reference](https://docs.aws.amazon.com/bedrock-agentcore-control/latest/APIReference/API_ListAgentRuntimes.html) +- [Session Isolation](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/runtime-sessions.html) +- [Lifecycle Configuration](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/runtime-lifecycle-settings.html) +- [Async and Long-Running Agents](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/runtime-long-run.html) +- [HTTP Protocol Contract](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/runtime-http-protocol-contract.html) +- [AgentCore Observability](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/observability.html) +- [Go SDK — bedrockagentcore package](https://pkg.go.dev/github.com/aws/aws-sdk-go-v2/service/bedrockagentcore) +- [AgentCore Python SDK (GitHub)](https://github.com/aws/bedrock-agentcore-sdk-python)