docs: add SessionBackend abstraction design specs#50
Conversation
Design specs for replacing hardcoded tmux with a pluggable SessionBackend interface supporting tmux and Ambient backends.
| `CheckApproval`), interaction (`SendInput`, `Approve`, `Interrupt`), and discovery | ||
| (`DiscoverSessions`). | ||
| - **`TmuxSessionBackend`** — pure wrapper around existing tmux functions. Zero behavior change. | ||
| - **`AmbientSessionBackend`** — backed by the ACP public API (`POST /sessions`, |
There was a problem hiding this comment.
Note that this requires ambient-code/platform#855 to be merged and assumes no major changes to the openapi spec.
| | Function | What it does | Called by | | ||
| |----------|-------------|-----------| | ||
| | `tmuxAvailable()` | Checks if `tmux` binary is in PATH | `TmuxAutoDiscover`, `BroadcastCheckIn`, `SingleAgentCheckIn`, `checkAllSessionLiveness` | | ||
| | `tmuxListSessions()` | Runs `tmux list-sessions -F #S` | `tmuxSessionExists`, `TmuxAutoDiscover` | |
There was a problem hiding this comment.
This assumes all tmux sessions are used by agent-boss. Do we need some sort of filtering / tagging mechanism?
jsell-rh
left a comment
There was a problem hiding this comment.
Design Review: SessionBackend Abstraction
Overall: This is solid, thorough design work. The tmux audit (doc 01) is especially valuable — it's the kind of audit that prevents missed callsites during implementation. A few things to address before this is ready to drive implementation.
🔴 Conflict with existing agent_backend.go
PR #47 (already merged) added an AgentBackend interface in agent_backend.go with Spawn/Stop/List/Name and a concrete TmuxBackend. Your spec proposes deleting this file as "superseded."
Before implementing: reconcile the two designs. AgentBackend.Spawn covers session creation with the same SessionCreateOpts-equivalent params. Either:
- Fold
AgentBackendinto the newSessionBackend(preferred — avoids two competing interfaces) - Explain why both are needed
Also: PR #49 (just merged) added tmuxDefaultSession(space, agent) to agent_backend.go that namespaces sessions as {space}-{agent}. This needs to carry forward into the TmuxSessionBackend implementation — the spec currently shows session names as bare agent names which will reintroduce the cross-space collision bug.
🔴 DiscoverSessions will miss space-scoped session names
The ambient DiscoverSessions maps display_name -> session.id. For tmux, the current convention after PR #49 is {space}-{agent} as the session name, not just the agent name.
The discovery implementation needs to know which space it's operating in so it can filter/match correctly, or DiscoverSessions needs to return {sessionID, agentName, space} tuples. The current map[string]string (agentName -> sessionID) signature doesn't carry space information.
🟡 13-method interface is large — consider role segregation
A 13-method interface means every backend must implement all 13, even if most are no-ops. The spec acknowledges several Ambient no-ops already (CheckApproval, Approve). Consider splitting into composable role interfaces:
type SessionLifecycle interface { CreateSession; KillSession; SessionExists; ListSessions }
type SessionObserver interface { GetStatus; IsIdle; CaptureOutput; CheckApproval }
type SessionActor interface { SendInput; Approve; Interrupt }A full SessionBackend embeds all three. Smaller consumers (e.g., the liveness loop) can depend only on SessionObserver. This makes testing easier and intent clearer.
🟡 SessionID rename: migration burden is higher than the spec implies
The spec shows a UnmarshalJSON shim for the tmux_session -> session_id field rename, but the migration affects:
- All running agents (they POST
tmux_sessionin their status updates — old agents will silently stop registering their session) - The DB schema (needs
ALTER TABLEor a migration) - Scripts, docs, the ignition template
- Frontend (
AgentDetail.vuegates entire pane sections on this field)
The spec lists these as "files affected" but doesn't show the migration sequencing. A phased approach — accept both tmux_session and session_id until all agents are updated — should be spelled out explicitly.
🟢 Strengths
- The tmux usage audit is comprehensive and accurate (I verified against the codebase)
- Behavioral differences section (async creation, no approval flow, persistent sessions) is exactly the right level of detail
- Broadcast adaptation plan is realistic — the model-switch skip for non-tmux is the right call
Available()caching (30s TTL) prevents liveness loop hammeringGetStatusreturningSessionStatusPendingfor ambient catches the async startup gap cleanly
Verdict
Do not merge as-is. The conflict with the existing AgentBackend interface and the session naming issue with tmuxDefaultSession need to be resolved before implementation starts. Happy to help reconcile once the design decisions are made on the interface shape.
| 5. **Backward compatible** — existing `TmuxSession` field, ignition `?tmux_session=` param, and | ||
| all API responses continue to work | ||
|
|
There was a problem hiding this comment.
No need to be backwards compatible
| | Function | What it does | | ||
| |----------|-------------| | ||
| | `lineIsIdleIndicator(line)` | Returns true if a line matches known idle patterns: `>`, shell `$`/`%`/`#`, Claude Code hints, status bars | | ||
| | `isShellPrompt(line)` | Detects `$`, `%`, `>`, `#`, `>` as trailing prompt characters | |
There was a problem hiding this comment.
This is hilariously brittle. Fine if there is no formal way to determine, but would prefer something that doesn't assume PS1 follows convention.
There was a problem hiding this comment.
Strongly agree. Perhaps using hooks would be a cleaner approach. https://code.claude.com/docs/en/hooks
| // The session remains alive and can accept new messages. | ||
| // For tmux: sends Ctrl-C (C-c) to the session. | ||
| // For ambient: calls POST /sessions/{id}/interrupt. | ||
| Interrupt(ctx context.Context, sessionID string) error |
There was a problem hiding this comment.
Pretty sure it's ESC to interrupt claude code, not Ctrl-C.
This raises a bigger design question: tmux is an interface to multiple potential agent backends.
Do we want to support, for example, tmux + claude code, tmux + opencode, etc.?
At least, do we want to design such that it's possible to support different local agents via tmux?
There was a problem hiding this comment.
I think that is pretty out of scope. For now, lets focus on tmux + claude code, but I can definitely see tmux + opencode and tmux + gemini being separate backends.
|
|
||
| // Interrupt cancels the session's current work without killing it. | ||
| // The session remains alive and can accept new messages. | ||
| // For tmux: sends Ctrl-C (C-c) to the session. |
There was a problem hiding this comment.
I didn't actually see this in the tmux backend spec and AFAIK it doesn't exist today
| **Backward compatibility:** The JSON tag changes from `tmux_session` to `session_id`. Since agents | ||
| POST status updates and the server preserves `SessionID` as a sticky field, old agents sending | ||
| `tmux_session` will need a migration shim in the JSON unmarshaler: | ||
|
|
||
| ```go | ||
| func (u *AgentUpdate) UnmarshalJSON(data []byte) error { | ||
| type Alias AgentUpdate | ||
| aux := &struct { | ||
| TmuxSession string `json:"tmux_session,omitempty"` | ||
| *Alias | ||
| }{Alias: (*Alias)(u)} | ||
| if err := json.Unmarshal(data, aux); err != nil { | ||
| return err | ||
| } | ||
| // Backward compat: accept tmux_session if session_id is empty | ||
| if u.SessionID == "" && aux.TmuxSession != "" { | ||
| u.SessionID = aux.TmuxSession | ||
| u.BackendType = "tmux" | ||
| } | ||
| return nil | ||
| } | ||
| ``` |
There was a problem hiding this comment.
Drop backwards compatibility. No need - no production agents running.
| ### Current tmux flow (per agent) | ||
|
|
||
| ``` | ||
| 1. Switch to check model: tmuxSendKeys("/model haiku") |
There was a problem hiding this comment.
This model switching can be a problem if, for example, an agent is opus [1m] and has 300K tokens in context. haiku would trigger a compaction.
I'm not sure how I feel about this in general
| | "Send input" | `tmux send-keys` text + Enter | `POST /sessions/{id}/message` | | ||
| | "Approval check" | Parse terminal for "Do you want...?" | Not applicable (sessions run with configured permissions). Returns `NeedsApproval: false`. | | ||
| | "Approve" | `tmux send-keys Enter` | Not applicable. No-op. | | ||
| | "Kill" | `tmux kill-session` (gone forever) | `POST /sessions/{id}/stop` (session data preserved) | |
There was a problem hiding this comment.
I don't think these relate. "Kill" is the same as DELETE /sessions/{id} in ambient. We need a new line here highliting that "Stop" functionality exists in ambient, but not in tmux.
Maybe if we did implement it in tmux, it would exit the claude code session, record the session ID so it can claude --continue <sessionid>. Not necessary, but its probably possible.
| DisplayName string // human-readable session name | ||
| Model string // Claude model to use | ||
| Repos []SessionRepo // repositories to clone | ||
| } |
There was a problem hiding this comment.
This is overloading the interface. When not support a generic BackendOpts interface and allow the backends to define their options?
There was a problem hiding this comment.
Ideally, the backends would contain all backend-specific code.
|
I think missing from these specs entirely is how context and tools are injected into sessions. Right now, tmux sessions are assumed to already have access to the agent-boss commands. How will those commands, or similar instructions and tools, be provided to the ambient sessions? Perhaps this is something we can follow-up with after the session interface and backend support is added - noting is as something that will almost certainly cause issues with the ambient backend integration. |
@tiwillia Agree, this is also a significant gap in the current implementation. Is this something supported by the Ambient API? Or must one configure (for example) available MCP servers via Ambient first? |
|
When ambient sessions are created, you can provide some minimal context - but ambient is built around providing instructions / context to agents via workflows. I'd hate to have to design a workflow specifically for this given its an ambient-specific concept, but it might be the best way. I'll have to dig into it. For now, I think we can consider it after the implementation of ambient support. |
…ity analysis - Fold AgentBackend (PR jsell-rh#47) into SessionBackend — reconcile competing interfaces - Drop backward compat (no production agents) — remove UnmarshalJSON shim - Replace SessionCreateOpts union with BackendOpts interface{} pattern - Add role interfaces (SessionLifecycle, SessionObserver, SessionActor) - Fix Interrupt: sends Escape (not Ctrl-C) for Claude Code - Add GetStatus and Interrupt to tmux backend method table - Fix ambient KillSession mapping: DELETE (permanent), not POST /stop - Add migration sequencing (5-step build-safe plan) - Add context/tool injection gap for ambient (deferred to Phase 2) - Note session ownership filtering and idle detection brittleness - Note cross-space name collision as future issue (naming unchanged) - Note model switching compaction risk in broadcast flow - Add AWS Bedrock AgentCore feasibility analysis (05)
|
Updated the specs to address all review feedback. Here's what changed: RED items resolvedAgentBackend reconciliation: DiscoverSessions + session naming: Session naming is unchanged — we preserve the existing YELLOW items addressedRole interfaces: Added Migration sequencing: New 5-step migration plan in doc 02 showing the exact order of changes, each producing a compilable codebase:
Other feedback incorporated
New: AgentCore feasibility analysis (doc 05)Added a feasibility analysis for AWS Bedrock AgentCore as a potential third backend. The interface works without changes, but implementation would be substantially more complex — AgentCore only exposes "invoke" and "stop", requiring client-side state management for session tracking, output buffering, and status inference. Recommended as Phase 3. Ready for another look. @jsell-rh |
| - PR #49: `{space}-{agent}` (parsed by space prefix matching) | ||
|
|
||
| Neither convention provides a strong ownership guarantee. Options for improvement: | ||
| - **tmux environment variable**: set `@agent_boss=true` on sessions at creation, |
There was a problem hiding this comment.
This seems like the best proposal
jsell-rh
left a comment
There was a problem hiding this comment.
Re-Review: SessionBackend Abstraction (updated)
@tiwillia — this is a significant improvement. You've addressed every blocking concern from round 1. Here's my updated assessment:
✅ RED items resolved
AgentBackend reconciliation — the new goal #2 ("Subsume AgentBackend") is explicit and clear. The method mapping (, , , ) eliminates any ambiguity. Deleting as part of step 5 of the migration plan is correct sequencing.
DiscoverSessions + session naming — I accept the decision to preserve existing naming conventions and document the cross-space collision as a known gap. This is the right pragmatic call for a refactoring spec — changing naming in parallel would scope-creep the PR. The gap is real and correctly catalogued.
One note: PR #49 (now merged) changed and to use naming for NEW sessions created via the spawn UI. The 5-step migration will need to handle sessions using both naming conventions during the transition — worth noting in doc 03 implementation notes when the time comes.
✅ YELLOW items resolved
Role interfaces (, , ) — the composable design is clean. Consumers that only need liveness polling (liveness loop → ) don't need to import the full interface. This is exactly right.
Migration sequencing — the 5-step plan with "each step produces a compilable codebase" is exactly what was needed. Step 4 (data model rename) is correctly identified as the big-bang step that affects the most files. The sequence is sound.
✅ Additional improvements (appreciated)
- Clean break (no shim) — correct given no production agents are running
- **** with typed / — keeps backend-specific concerns self-contained; better than mixing tmux/ambient fields in one struct
- Interrupt → Escape key — important detail for Claude Code compatibility
- Context/tool injection gap documented — this is the right call. The three approaches listed (startup prompt, MCP injection, workflow) are the realistic options; punting to Phase 2 is correct
- AgentCore feasibility analysis — the "invoke + stop only" limitation you identified is the key constraint. The state management complexity assessment is accurate
- Model switching compaction risk documented — good operational detail
🟡 One remaining implementation concern (not blocking spec approval)
The spec removes the query param backward compat and the JSON field alias. The clean break is correct in principle, but the migration plan (step 4) should explicitly call out:
- The DB column rename ( → ) requires a migration file, not just (GORM's AutoMigrate doesn't rename columns)
- The ignition prompt template (hardcoded in Go) references in the ignite command examples — update needed
These are implementation details, not spec problems. Flag them in the migration tracking issue when this lands.
Verdict: APPROVE ✅
This spec is ready to drive implementation. The design is coherent, the interface is well-sized with the role segregation, the migration plan is realistic, and the known gaps are properly documented rather than papered over.
Suggested next step: create a tracking task for the implementation in phases as laid out in the migration plan. Phase 1 (add new files + backend registry) can start immediately without touching existing code.
Design specs for replacing hardcoded tmux with a pluggable SessionBackend interface supporting tmux and Ambient backends.