15 changes: 15 additions & 0 deletions Docs/Development/ACP_Production_Readiness.md
@@ -13,6 +13,13 @@ evidence recorded below. Remaining live-backend and host-runtime caveats are
called out explicitly and should be resolved before release notes claim fully
verified production deployment on a specific host.

The release-caveat tracker
[#2398](https://github.com/rmusser01/tldw_server/issues/2398) is reconciled in
[ACP Release-Caveat Closeout - 2026-06-21](ACP_Release_Caveat_Closeout_2026_06_21.md).
All child issues are closed; remaining sandbox, artifact, reviewer-loop,
failure-diagnostic, and documented-only profile caveats are expected support
boundaries rather than open tracker blockers.

## Issue Map

| Issue | Workstream | Readiness role |
@@ -32,6 +39,7 @@ verified production deployment on a specific host.
| [#1538](https://github.com/rmusser01/tldw_server/issues/1538) | ACP output artifact mapping | Maps ACP run outputs to the artifact contract and separates execution artifacts from promoted workspace work products. |
| [#1532](https://github.com/rmusser01/tldw_server/issues/1532) | ACP-adjacent release work tracker | Tracks artifact storage/API, promotion, UI, export, verification, compatibility, and product-state follow-ups after the first ACP productionization pass. |
| [#1704](https://github.com/rmusser01/tldw_server/issues/1704) | ACP artifact release verification | Records release-grade verification for the first accepted ACP-to-workspace-artifact golden path. |
| [#2398](https://github.com/rmusser01/tldw_server/issues/2398) | ACP release-caveat closeout tracker | Reconciles the final child issue outcomes, surface-language consistency, and remaining expected support boundaries before closing the release-caveat tracker. |
| [#2401](https://github.com/rmusser01/tldw_server/issues/2401) | Artifact retention and transcript redaction release policy | Makes the release retention/redaction boundaries explicit for ACP session evidence, audit records, diagnostics, artifacts, run previews, and promoted workspace artifacts. |
| [#2400](https://github.com/rmusser01/tldw_server/issues/2400) | Sandbox host-runtime release verification | Records release-host evidence for selected ACP sandbox runtimes before sandbox-backed support claims are made. |
| [#2402](https://github.com/rmusser01/tldw_server/issues/2402) | Live-agent caveat verification | Records deeper Goose, Hermes, and OpenCode evidence for workspace binding and non-empty MCP server injection while preserving artifact, sandbox, reviewer-loop, and failure-diagnostic caveats. |
@@ -72,6 +80,13 @@ must keep `supported_with_caveats` wording because live agent-produced ACP
artifacts, sandbox-backed execution, reviewer-loop behavior, and failure
diagnostic payloads remain unverified for those profiles.

## Release-Caveat Closeout

[#2398](https://github.com/rmusser01/tldw_server/issues/2398) is the final
tracker for the June 2026 ACP release-caveat work. The reconciled child map and
expected remaining support boundaries are recorded in
[ACP Release-Caveat Closeout - 2026-06-21](ACP_Release_Caveat_Closeout_2026_06_21.md).

## Readiness Matrix

| Surface | Owner modules | Required evidence | Verification commands | Pass/fail gate | Runtime caveats |
61 changes: 61 additions & 0 deletions Docs/Development/ACP_Release_Caveat_Closeout_2026_06_21.md
@@ -0,0 +1,61 @@
# ACP Release-Caveat Closeout - 2026-06-21

This document records the final reconciliation for
[GitHub #2398](https://github.com/rmusser01/tldw_server/issues/2398). The
goal is to make the release evidence boundaries explicit after the ACP
release-caveat child workstreams closed.

## Child Issue Outcomes

| Issue | Outcome | Release interpretation |
| --- | --- | --- |
| [#2404](https://github.com/rmusser01/tldw_server/issues/2404) | Closed. Live-backend browser E2E passed across ACP Playground, Agent Registry, Agent Tasks, and Research Workspace diagnostics. | User-facing ACP flows have live-backend browser evidence when run with the bundled runner home from `Config_Files/config.txt`. |
| [#2403](https://github.com/rmusser01/tldw_server/issues/2403) | Closed. Go runner verification passed on the recorded macOS host. | The runner build/test refresh is not a remaining release blocker. |
| [#2401](https://github.com/rmusser01/tldw_server/issues/2401) | Closed, with [#2408](https://github.com/rmusser01/tldw_server/issues/2408) split out and closed. | Retention and redaction policy covers ACP session evidence, audit records, diagnostics, artifacts, run previews, and promoted workspace artifacts. |
| [#2400](https://github.com/rmusser01/tldw_server/issues/2400) | Closed. Docker is the only sandbox runtime with current release-host pass evidence. | Release surfaces may claim Docker-backed sandbox runtime lifecycle evidence for the recorded host only. Lima, VZ, all-runtime sandbox support, and named-agent sandbox support remain unverified. |
| [#2402](https://github.com/rmusser01/tldw_server/issues/2402) | Closed. Goose, Hermes, and OpenCode passed `workspace-live-e2e` on the host runner with workspace binding and non-empty MCP server injection. | These agents remain `supported_with_caveats`; artifact-producing workflows, sandbox-backed execution, reviewer-loop behavior, and failure diagnostic payloads remain unverified. |
| [#2399](https://github.com/rmusser01/tldw_server/issues/2399) | Closed. Guardrails preserve conservative Aider, Continue, and custom-profile status. | Aider and Continue remain `documented_unverified`; the seeded `custom` profile remains template-only until a distinct named profile has live evidence. |
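
The release interpretations above can be read as a small lookup. The sketch below is illustrative only: `SUPPORT_STATES`, `LIVE_AGENT_CAVEATS`, `may_promote`, and `remaining_caveats` are hypothetical names and are not part of the repository. Only the agent names, state strings, and caveat wording come from this record, and the promotion rule is an assumption drawn from the table.

```python
# Hypothetical summary of the support states recorded in the table above.
SUPPORT_STATES = {
    "goose": "supported_with_caveats",
    "hermes": "supported_with_caveats",
    "opencode": "supported_with_caveats",
    "aider": "documented_unverified",
    "continue": "documented_unverified",
}

# Caveats that remain unverified for the live-agent profiles.
LIVE_AGENT_CAVEATS = (
    "artifact-producing workflows",
    "sandbox-backed execution",
    "reviewer-loop behavior",
    "failure diagnostic payloads",
)


def may_promote(agent: str) -> bool:
    """Assumed rule: only profiles with live evidence may be promoted.

    Unknown names, including the seeded `custom` template, are not promoted.
    """
    return SUPPORT_STATES.get(agent) == "supported_with_caveats"


def remaining_caveats(agent: str) -> tuple[str, ...]:
    """Return the caveats a release surface should still state for an agent."""
    if may_promote(agent):
        return LIVE_AGENT_CAVEATS
    return ("no live evidence recorded",)
```

Under these assumptions, `may_promote("custom")` is false, which mirrors the template-only status of the seeded custom profile.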

## Surface Reconciliation

- `ACP_Production_Readiness.md` now links this closeout record from the issue
map and status summary.
- `ACP_Compatibility_Matrix.md` already distinguishes host E2E, workspace/MCP
evidence, sandbox evidence, artifact evidence, reviewer-loop evidence, and
failure-diagnostic evidence.
- `tldw_Server_API/Config_Files/agents.yaml` records Goose, Hermes, and
OpenCode workspace/MCP evidence while keeping all three at
`supported_with_caveats`.
- The setup guide and Agent Registry consume registry compatibility metadata and
continue to avoid promoting documented-only profiles.

## Remaining Caveats

These are expected support boundaries, not blockers for closing #2398:

- Named downstream agents are not sandbox-supported until they have passing
agent-specific sandbox evidence, preferably from `workspace-live-e2e` with
`ACP_E2E_EXPECT_SANDBOX=1`.
- Goose, Hermes, and OpenCode did not produce ACP artifacts during the
workspace-live runs, so artifact-producing workflows remain unverified.
- Reviewer-loop behavior and failure diagnostic payloads remain unverified for
those live-agent profiles because the passing success paths did not exercise
those cases.
- Aider, Continue, and seeded custom profiles remain conservative
documented-only entries until concrete ACP commands or named custom profiles
pass live certification.

## Verification

Closeout validation for this reconciliation slice:

- GitHub GraphQL issue-state check confirmed #2398 is open and #2399, #2400,
#2401, #2402, #2403, #2404, and #2408 are closed.
- Targeted `rg` audits checked ACP readiness docs, compatibility docs, setup
guide surfaces, seeded registry metadata, runner config, Agent Registry, and
ACP setup-guide fallback copy for stale caveats or overclaims.
- Focused registry metadata tests passed for the shipped `agents.yaml` parser
path.
- Bandit was run on the touched pytest file and reported the existing test-file
`B101` assertion baseline only. The metadata checks use ordinary pytest
assertions and avoid pinning the workspace-live evidence to one exact commit.
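
As a rough illustration of what the registry metadata checks look for, the sketch below tests a `compatibility_notes` string for the two phrases that record the workspace-live evidence and for the stale caveat wording. The phrases are taken from the registry test changes in this PR. The `check_notes` function and the shortened sample strings are hypothetical and exist only for this example; the repository tests use plain pytest assertions rather than a helper.

```python
# Phrases that should appear once workspace/MCP evidence is recorded.
REQUIRED_PHRASES = (
    "workspace-live-e2e",
    "non-empty MCP server injection",
)

# Old caveat wording that should no longer appear for Goose, Hermes, OpenCode.
STALE_PHRASE = "non-empty MCP injection, artifact-producing workflows"


def check_notes(notes: str) -> list[str]:
    """Return a list of problems found in a compatibility_notes string."""
    problems = [f"missing: {p}" for p in REQUIRED_PHRASES if p not in notes]
    if STALE_PHRASE in notes:
        problems.append(f"stale: {STALE_PHRASE}")
    return problems


# Shortened sample strings (hypothetical, abbreviated from agents.yaml).
updated_notes = (
    "June 20, 2026 workspace-live-e2e verified workspace binding and "
    "non-empty MCP server injection with mcp_server_count=1. Sandbox, "
    "artifact-producing workflows, reviewer-loop behavior, and failure "
    "diagnostic payloads remain unverified."
)
old_notes = (
    "Sandbox, non-empty MCP injection, artifact-producing workflows, "
    "and reviewer-loop behavior remain unverified."
)

print(check_notes(updated_notes))
print(check_notes(old_notes))
```

Substring checks of this kind stay valid when the evidence commit changes, which is the reason the tests avoid pinning an exact commit hash.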
22 changes: 22 additions & 0 deletions IMPLEMENTATION_PLAN_acp_final_reconciliation_2398.md
@@ -0,0 +1,22 @@
# ACP Final Reconciliation (#2398)

## Stage 1: Evidence Inventory

**Goal**: Confirm all #2398 child issues and follow-ups are closed or explicitly accounted for.
**Success Criteria**: A final child-issue map identifies the outcome for #2404, #2403, #2401, #2400, #2402, #2408, and #2399.
**Tests**: GitHub issue state checks and local evidence-doc presence checks.
**Status**: Complete
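
A minimal sketch of how the issue-state check for this stage might be expressed, assuming the GitHub GraphQL `repository(owner:, name:)` and `issue(number:)` fields with the `state` value. `build_issue_state_query`, `unexpected_states`, and `EXPECTED` are hypothetical names for illustration. The expected states mirror the closeout record in this PR, and sending the request (which needs an authenticated client) is deliberately left out.

```python
from typing import Iterable, Mapping

# Expected states as recorded in the closeout document.
EXPECTED = {
    2398: "OPEN",
    2399: "CLOSED",
    2400: "CLOSED",
    2401: "CLOSED",
    2402: "CLOSED",
    2403: "CLOSED",
    2404: "CLOSED",
    2408: "CLOSED",
}


def build_issue_state_query(owner: str, repo: str, numbers: Iterable[int]) -> str:
    """Build one GraphQL query that aliases each issue as i<number>."""
    fields = "\n".join(
        f"    i{n}: issue(number: {n}) {{ state }}" for n in numbers
    )
    return (
        "query {\n"
        f'  repository(owner: "{owner}", name: "{repo}") {{\n'
        f"{fields}\n"
        "  }\n"
        "}"
    )


def unexpected_states(data: Mapping, expected: Mapping[int, str]) -> dict[int, str]:
    """Return issues whose reported state differs from the expected one."""
    repo = data["repository"]
    return {
        n: repo[f"i{n}"]["state"]
        for n, want in expected.items()
        if repo[f"i{n}"]["state"] != want
    }


print(build_issue_state_query("rmusser01", "tldw_server", EXPECTED))
```

An empty result from `unexpected_states` on the response's `data` object would correspond to the child-issue map this stage is meant to confirm.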

## Stage 2: Surface Reconciliation

**Goal**: Audit ACP docs, setup/registry surfaces, and compatibility language for stale caveats or overclaims.
**Success Criteria**: Readiness, compatibility, retention/redaction, sandbox, live-agent, and setup-guide surfaces agree with final evidence state.
**Tests**: Targeted `rg` searches over ACP docs, API setup-guide code, registry config, runner config, and frontend Agent Registry copy.
**Status**: Complete

## Stage 3: Closeout

**Goal**: Apply any minimal corrections, validate, and update #2398 with a final reconciliation note.
**Success Criteria**: Any drift is fixed, verification is recorded, and #2398 can be closed or left open with a concrete blocker.
**Tests**: `git diff --check`, docs/search verification, and focused tests only if code surfaces change.
**Status**: Complete
@@ -0,0 +1,76 @@
---
id: TASK-2396
title: Finalize ACP release caveat reconciliation for epic 2398
status: In Progress
labels:
- ACP
- release-closeout
- github-2398
references:
- https://github.com/rmusser01/tldw_server/issues/2398
- https://github.com/rmusser01/tldw_server/pull/2422
modified_files:
- IMPLEMENTATION_PLAN_acp_final_reconciliation_2398.md
- Docs/Development/ACP_Release_Caveat_Closeout_2026_06_21.md
- Docs/Development/ACP_Production_Readiness.md
- tldw_Server_API/Config_Files/agents.yaml
- tldw_Server_API/tests/Agent_Client_Protocol/test_acp_agent_registry.py
documentation:
- 'Implementation notes - GraphQL confirmed #2398 open and all child issues closed;
audit covered readiness docs plus compatibility docs plus setup guide plus seeded
registry plus runner config plus Agent Registry UI; corrected stale Goose Hermes
OpenCode registry caveats so June 20 workspace-live-e2e evidence is reflected while
artifact sandbox reviewer-loop and failure-diagnostic caveats remain.'
- Verification - registry pytest passed; git diff check passed; targeted stale wording
search passed with the old non-empty MCP caveat only on Codex as expected; Bandit
was run on the touched pytest file and reported the existing B101 test assert baseline.
- Review follow-up - removed custom assertion helpers and exact workspace-live commit
hash checks from the registry tests.
- PR - https://github.com/rmusser01/tldw_server/pull/2422
---

## Description

<!-- SECTION:DESCRIPTION:BEGIN -->

<!-- SECTION:DESCRIPTION:END -->

## Acceptance Criteria
<!-- AC:BEGIN -->
- [ ] #1 All #2398 child issues are confirmed closed or explicitly accounted for.
- [ ] #2 ACP docs and setup/registry surfaces agree with final evidence state.
- [ ] #3 Release support claims distinguish backend E2E, host stdio, sandbox, artifact, reviewer-loop, and failure-diagnostic evidence.
- [ ] #4 Any drift is corrected with minimal changes and validation is recorded.
- [ ] #5 #2398 receives a final reconciliation comment and can be closed when appropriate.
<!-- AC:END -->

## Implementation Plan

<!-- SECTION:PLAN:BEGIN -->
1. Audit ACP readiness, compatibility, setup, sandbox, live-agent, and retention/redaction docs against the now-closed child issue evidence.
2. Search API/setup-guide and Agent Registry surfaces for stale release-caveat wording that conflicts with the final evidence state.
3. Apply only minimal docs/setup corrections if drift exists.
4. Record verification and final reconciliation evidence for #2398, then open a narrow PR or close #2398 directly if no repo changes are needed.
<!-- SECTION:PLAN:END -->

## Implementation Notes

<!-- SECTION:IMPLEMENTATION_NOTES:BEGIN -->

<!-- SECTION:IMPLEMENTATION_NOTES:END -->

## Final Summary

<!-- SECTION:FINAL_SUMMARY:BEGIN -->

<!-- SECTION:FINAL_SUMMARY:END -->

## Definition of Done
<!-- DOD:BEGIN -->
- [ ] #1 Acceptance criteria completed
- [ ] #2 Tests or verification recorded
- [ ] #3 Documentation updated when relevant
- [ ] #4 Bandit run for touched code when applicable or document non-code/environment skip
- [ ] #5 Final summary added
- [ ] #6 Known skips or blockers documented
<!-- DOD:END -->
6 changes: 3 additions & 3 deletions tldw_Server_API/Config_Files/agents.yaml
@@ -139,7 +139,7 @@ agents:
docs_url: "https://github.com/block/goose"
support_state: supported_with_caveats
verification_level: live_e2e_tested
-compatibility_notes: "Goose native ACP completed May 23, 2026 backend live E2E through the macOS host runner on branch codex/acp-goose-backend-live-e2e commit f9ff03f88 with tldw-agent runner 0.1.0: health/setup-guide, session-new, prompt, redacted support views, diagnostics endpoint, cancel, and close passed. Sandbox, non-empty MCP injection, artifact-producing workflows, and reviewer-loop behavior remain unverified."
+compatibility_notes: "Goose native ACP completed May 23, 2026 backend live E2E through the macOS host runner on branch codex/acp-goose-backend-live-e2e commit f9ff03f88 with tldw-agent runner 0.1.0: health/setup-guide, session-new, prompt, redacted support views, diagnostics endpoint, cancel, and close passed. June 20, 2026 workspace-live-e2e on branch codex/acp-live-agent-caveats commit ac93d96d9c verified workspace binding and non-empty MCP server injection with mcp_server_count=1. Sandbox, artifact-producing workflows, reviewer-loop behavior, and failure diagnostic payloads remain unverified."
compatibility_docs_url: "/docs-static/Development/ACP_Compatibility_Matrix.md"
entrypoint_strategy: native_acp
acp_command: goose
@@ -159,7 +159,7 @@ agents:
default: false
support_state: supported_with_caveats
verification_level: live_e2e_tested
-compatibility_notes: "Hermes native ACP completed May 23, 2026 backend live E2E through the macOS host runner on branch codex/acp-hermes-live-e2e-certification commit 5e6672f8f with tldw-agent runner 0.1.0: health/setup-guide, session-new, prompt, redacted support views, diagnostics endpoint, cancel, and close passed. Sandbox, non-empty MCP injection, artifact-producing workflows, and reviewer-loop behavior remain unverified."
+compatibility_notes: "Hermes native ACP completed May 23, 2026 backend live E2E through the macOS host runner on branch codex/acp-hermes-live-e2e-certification commit 5e6672f8f with tldw-agent runner 0.1.0: health/setup-guide, session-new, prompt, redacted support views, diagnostics endpoint, cancel, and close passed. June 20, 2026 workspace-live-e2e on branch codex/acp-live-agent-caveats commit ac93d96d9c verified workspace binding and non-empty MCP server injection with mcp_server_count=1. Sandbox, artifact-producing workflows, reviewer-loop behavior, and failure diagnostic payloads remain unverified."
compatibility_docs_url: "/docs-static/Development/ACP_Compatibility_Matrix.md"
entrypoint_strategy: native_acp
acp_command: hermes
@@ -203,7 +203,7 @@ agents:
default: false
support_state: supported_with_caveats
verification_level: live_e2e_tested
-compatibility_notes: "OpenCode v1.15.7 native ACP completed May 23, 2026 backend live E2E through the macOS host runner on branch codex/acp-opencode-aider-llamacpp-certification commit 53c018269 with tldw-agent runner 0.1.0 using a local llama.cpp OpenAI-compatible server at 127.0.0.1:9099: health/setup-guide, session-new, prompt, redacted support views, diagnostics endpoint, cancel, and close passed. Sandbox, non-empty MCP injection, artifact-producing workflows, and reviewer-loop behavior remain unverified."
+compatibility_notes: "OpenCode v1.15.7 native ACP completed May 23, 2026 backend live E2E through the macOS host runner on branch codex/acp-opencode-aider-llamacpp-certification commit 53c018269 with tldw-agent runner 0.1.0 using a local llama.cpp OpenAI-compatible server at 127.0.0.1:9099: health/setup-guide, session-new, prompt, redacted support views, diagnostics endpoint, cancel, and close passed. June 20, 2026 workspace-live-e2e on branch codex/acp-live-agent-caveats commit ac93d96d9c verified workspace binding and non-empty MCP server injection with mcp_server_count=1. Sandbox, artifact-producing workflows, reviewer-loop behavior, and failure diagnostic payloads remain unverified."
compatibility_docs_url: "/docs-static/Development/ACP_Compatibility_Matrix.md"
entrypoint_strategy: native_acp
acp_command: opencode
@@ -338,6 +338,9 @@ def test_default_agents_yaml_includes_hermes_native_acp_entrypoint() -> None:
    assert entry.acp_args == ["acp", "--accept-hooks"]
    assert entry.support_state == "supported_with_caveats"
    assert entry.verification_level == "live_e2e_tested"
    assert "workspace-live-e2e" in entry.compatibility_notes
    assert "non-empty MCP server injection" in entry.compatibility_notes
    assert "non-empty MCP injection, artifact-producing workflows" not in entry.compatibility_notes


def test_default_agents_yaml_includes_goose_backend_live_e2e_metadata() -> None:
@@ -362,6 +365,9 @@ def test_default_agents_yaml_includes_goose_backend_live_e2e_metadata() -> None:
    assert entry.verification_level == "live_e2e_tested"
    assert "backend live E2E" in entry.compatibility_notes
    assert "commit f9ff03f88" in entry.compatibility_notes
    assert "workspace-live-e2e" in entry.compatibility_notes
    assert "non-empty MCP server injection" in entry.compatibility_notes
    assert "non-empty MCP injection, artifact-producing workflows" not in entry.compatibility_notes


def test_default_agents_yaml_includes_opencode_backend_live_e2e_metadata() -> None:
@@ -387,6 +393,9 @@ def test_default_agents_yaml_includes_opencode_backend_live_e2e_metadata() -> None:
    assert "backend live E2E" in entry.compatibility_notes
    assert "local llama.cpp" in entry.compatibility_notes
    assert "commit 53c018269" in entry.compatibility_notes
    assert "workspace-live-e2e" in entry.compatibility_notes
    assert "non-empty MCP server injection" in entry.compatibility_notes
    assert "non-empty MCP injection, artifact-producing workflows" not in entry.compatibility_notes


def test_default_agents_yaml_includes_codex_backend_live_e2e_metadata() -> None: