feat(phase3-step9): Background Execution + Permission Gate #19

Merged: AVADSA25 merged 8 commits into main from feat/phase3-step9-implementation on May 3, 2026
AVADSA25 (Owner) commented May 3, 2026

Summary

Phase 3 Step 9 — Background Execution + Permission Gate. The runtime layer.

The codec-agent-runner PM2 daemon picks up plans with status=approved (from Step 8) and executes their checkpoints autonomously via Qwen-3.6 ↔ skill loops. The permission gate enforces the manifest on every action, and the strict-consent gate (reused from Step 3 §1.7) applies to destructive ops. After a PM2 restart, an agent resumes from its last atomic checkpoint (Q5). At most 3 agents run concurrently (Q6), and a blocked agent still occupies a slot (Q8). Plan-hash tamper detection runs at run start (Q13).

No UI yet — Step 10 picks that up. Step 9 alone is shippable: agents actually run; you observe via audit.log + notifications.json.
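
The plan-hash tamper check (Q13) could look roughly like the sketch below. The canonical form (sorted-key JSON), the SHA-256 choice, and the names `plan_hash` and `is_tampered` are assumptions for illustration; the PR description does not say how codec_agent_runner.py computes the hash.

```python
import hashlib
import json


def plan_hash(plan: dict) -> str:
    """Hash a plan deterministically (assumed canonical form: sorted-key JSON)."""
    canonical = json.dumps(plan, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_tampered(plan: dict, approved_hash: str) -> bool:
    """Return True when the plan no longer matches the hash recorded at approval."""
    return plan_hash(plan) != approved_hash


# Hypothetical plan contents, for illustration only.
approved = {"id": "a1", "checkpoints": ["fetch", "summarize"]}
recorded = plan_hash(approved)
edited = {**approved, "checkpoints": ["fetch", "delete_everything"]}
```

Here `is_tampered(approved, recorded)` is false and `is_tampered(edited, recorded)` is true, which is the condition under which the runner would abort instead of moving to running.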

Reference

  • Blueprint: docs/PHASE3-BLUEPRINT.md §3
  • TDD plan: docs/PHASE3-STEP9-PLAN.md
  • Resolved Q&A (blueprint §8): Q5 resume from last atomic checkpoint, Q6 3 concurrent, Q7 destructive timeout = blocked not aborted, Q8 blocked occupies slot, Q13 plan-hash tamper
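
As a rough illustration of Q6 and Q8 together, slot accounting might be sketched as follows. `SLOT_HOLDING` and `free_slots` are hypothetical names. Whether paused or crashed_resumed agents also hold a slot is not stated in this description, so the sketch leaves them out.

```python
MAX_CONCURRENT_AGENTS = 3  # Q6: at most 3 agents at once

# Assumption: running agents and both blocked states hold a slot (Q8).
SLOT_HOLDING = frozenset(
    {"running", "blocked_on_permission", "blocked_on_destructive"}
)


def free_slots(statuses: list) -> int:
    """Count how many more approved plans the daemon could start right now."""
    used = sum(1 for status in statuses if status in SLOT_HOLDING)
    return max(0, MAX_CONCURRENT_AGENTS - used)
```

With one running and one blocked agent, `free_slots(["running", "blocked_on_permission", "approved"])` leaves a single slot, so only one more approved plan would be picked up.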

Files

| Path | Type | Purpose |
| --- | --- | --- |
| codec_agent_runner.py | NEW (~700 LOC) | Daemon loop + per-agent run + permission gate + checkpoint executor + qwen driver |
| tests/test_agent_runner.py | NEW (~770 LOC, 30 tests) | Full coverage: audit constants, state machine, permission gate, qwen driver, strict-consent, executor, run_agent paths, daemon, endpoints |
| codec_audit.py | MOD (+18) | Step 9 event constants + PHASE3_STEP9_EVENTS frozenset |
| codec_agent_plan.py | MOD (+15) | Extended _VALID_TRANSITIONS with Step 9 runtime statuses |
| routes/agents.py | MOD (+82) | abort/pause/resume/grant endpoints + GrantBody Pydantic model |
| ecosystem.config.js | MOD (+22) | PM2 codec-agent-runner entry |
| AGENTS.md | MOD (+65) | Step 9 module description, audit events, don't-touch list |

Audit envelope

8 new schema:1 events. agent_started opens the per-agent operation envelope; subsequent agent_* events all share that single correlation_id (multi-emit op per Step 1 §1.4 contract).
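
A minimal sketch of the multi-emit envelope idea, assuming records are plain dicts. The field names, the helpers `open_envelope` and `emit`, and the second event name are placeholders; only agent_started and the shared correlation_id come from this description.

```python
import uuid


def open_envelope(agent_id: str) -> dict:
    """Create the per-agent envelope; every later event reuses its correlation_id."""
    return {"agent_id": agent_id, "correlation_id": uuid.uuid4().hex}


def emit(envelope: dict, event: str, sink: list) -> None:
    """Append one schema:1 record carrying the envelope's correlation_id."""
    sink.append({"schema": 1, "event": event, **envelope})


log = []
envelope = open_envelope("agent-1")
emit(envelope, "agent_started", log)        # opens the envelope
emit(envelope, "agent_example_event", log)  # placeholder for any later agent_* event
```

Grouping records by correlation_id then reconstructs one agent run from the log, which is what a test asserting "paired correlation_id" would check.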

Permission gate (the core safety enforcement)

Every Action returned by Qwen goes through permission_gate(action, agent_grants, global_grants):

  • skill not in (per-agent ∪ global) → PermissionViolation(skill_not_authorized)
  • write path not in (per-agent ∪ global) write paths → PermissionViolation(path_not_authorized)
  • network domain not in (per-agent ∪ global) → PermissionViolation(domain_not_authorized)

Destructive ops STILL hit Step 3 §1.7 strict-consent (universal floor) — even pre-approved.
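
The three rules above can be sketched as below. The `Action` and `Grants` shapes, prefix matching for write paths, and exact matching for domains are assumptions for illustration; only the `permission_gate(action, agent_grants, global_grants)` signature, the `PermissionViolation` name, and the three reason codes come from this description.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional


class PermissionViolation(Exception):
    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


@dataclass(frozen=True)
class Grants:  # hypothetical shape of a grant set
    skills: frozenset = frozenset()
    write_paths: frozenset = frozenset()
    domains: frozenset = frozenset()


@dataclass(frozen=True)
class Action:  # hypothetical shape of a Qwen-proposed action
    skill: str
    write_path: Optional[str] = None
    domain: Optional[str] = None


def _path_allowed(path: str, roots) -> bool:
    """Assumed rule: the path equals a granted root or sits beneath one."""
    target = PurePosixPath(path)
    if ".." in target.parts:  # refuse traversal instead of resolving it
        return False
    return any(
        PurePosixPath(root) == target or PurePosixPath(root) in target.parents
        for root in roots
    )


def permission_gate(action: Action, agent_grants: Grants, global_grants: Grants) -> None:
    """Raise PermissionViolation unless the per-agent and global grants cover the action."""
    if action.skill not in (agent_grants.skills | global_grants.skills):
        raise PermissionViolation("skill_not_authorized")
    if action.write_path is not None and not _path_allowed(
        action.write_path, agent_grants.write_paths | global_grants.write_paths
    ):
        raise PermissionViolation("path_not_authorized")
    if action.domain is not None and action.domain not in (
        agent_grants.domains | global_grants.domains
    ):
        raise PermissionViolation("domain_not_authorized")
```

The gate returns nothing on success and raises on the first failed rule, so a caller can map the reason code straight onto a blocked_on_permission transition. The strict-consent check for destructive ops is a separate layer and is not modeled here.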

State machine extension

draft_pending → awaiting_approval → approved → running → completed
                                                       → aborted
                                                       → paused → running
                                                       → blocked_on_permission → running | aborted
                                                       → blocked_on_destructive → running | aborted
                                                       → crashed_resumed → running | aborted

completed, aborted, rejected are terminal.
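
Read as a lookup table, the diagram above corresponds roughly to the following sketch. It is reconstructed from the diagram and the approved → aborted deviation noted in this PR, not copied from codec_agent_plan.py; `TRANSITIONS_SKETCH` and `can_transition` are hypothetical names, and the edge leading into rejected is not shown in the diagram, so it is omitted.

```python
TRANSITIONS_SKETCH = {
    "draft_pending": frozenset({"awaiting_approval"}),
    "awaiting_approval": frozenset({"approved"}),
    "approved": frozenset({"running", "aborted"}),
    "running": frozenset({
        "completed", "aborted", "paused",
        "blocked_on_permission", "blocked_on_destructive", "crashed_resumed",
    }),
    "paused": frozenset({"running"}),
    "blocked_on_permission": frozenset({"running", "aborted"}),
    "blocked_on_destructive": frozenset({"running", "aborted"}),
    "crashed_resumed": frozenset({"running", "aborted"}),
    # Terminal statuses have no outgoing edges.
    "completed": frozenset(),
    "aborted": frozenset(),
    "rejected": frozenset(),
}


def can_transition(current: str, target: str) -> bool:
    """Return True when the sketch allows moving from current to target."""
    return target in TRANSITIONS_SKETCH.get(current, frozenset())
```

Using frozensets keeps the table immutable at runtime, and an unknown status simply has no allowed targets.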

Process supervision (Q15 adapted)

Q15 in the blueprint asked for a codec-heartbeat HTTP probe of codec-agent-runner. During implementation it became clear that codec-heartbeat only probes HTTP services (LLM, Whisper, Kokoro, Vision), while codec-agent-runner is a daemon by design (no HTTP). PM2's autorestart: true already provides crash recovery, so no additional probe is needed. Documented in AGENTS.md §10.

Test plan

  • 🧪 tests/test_agent_runner.py → 30 passed
  • 🧪 Full suite — 900 passed / 20 failed / 73 skipped (same 20/73 baseline as main, +30 new tests from Step 9)
  • Permission gate matrix coverage (skill / path / domain × in-manifest / in-global / outside)
  • All 8 Step 9 audit events emit with paired correlation_id
  • All 5 _run_agent paths (happy / blocked / aborted / tampered / resume) tested
  • Multi-agent concurrency (Q6=3, Q8 blocked occupies slot) tested
  • AGENT_RUNNER_ENABLED=false global kill switch tested
  • Post-merge deploy:
    git pull
    pm2 start ecosystem.config.js --only codec-agent-runner
    pm2 restart codec-dashboard  # picks up new abort/pause/resume/grant endpoints
  • Real-world test: POST /api/agents with a project description, approve via /api/agents/{id}/approve, watch ~/.codec/audit.log for agent_started + agent_checkpoint_* events. With Qwen-3.6 alive, the agent should actually run the plan to completion (or block on permission if manifest is tight)
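
For the real-world test, a small filter over the audit log may be handier than reading it raw. The sketch below assumes the log is one JSON object per line with an "event" field, which this description does not confirm; `agent_events` and the sample records are hypothetical.

```python
import json


def agent_events(lines):
    """Yield parsed records whose event name starts with 'agent_'."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON rather than failing
        if isinstance(record, dict) and str(record.get("event", "")).startswith("agent_"):
            yield record


# Made-up sample lines; "skill_invoked" is a placeholder for a non-agent event.
sample = [
    '{"event": "agent_started", "correlation_id": "c1"}',
    '{"event": "skill_invoked", "correlation_id": "c2"}',
    "not json",
]
```

Against the real file, the same function can be fed an open handle on `os.path.expanduser("~/.codec/audit.log")`, since it only needs an iterable of lines.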

One plan deviation

The plan's _VALID_TRANSITIONS map omitted approved → aborted — needed for the plan-tamper code path which detects tampering before transitioning to running. Added "aborted" to approved's frozenset. The Task 2 test asserts "running" in _VALID_TRANSITIONS["approved"] (subset check, not equality), so it remains green.
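
The difference between the subset check and an equality check can be seen in a two-line illustration. The variable names are hypothetical and the sets mirror only the approved entry described above.

```python
planned = frozenset({"running"})   # approved's targets as written in the plan
shipped = planned | {"aborted"}    # approved's targets after this PR

membership_check_passes = "running" in shipped  # what the Task 2 test asserts
equality_check_passes = shipped == planned      # what a stricter test would assert
```

The membership check stays true after the extra target is added, while an equality assertion would have gone red.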

Out of scope (Step 10)

  • Project mode UI / chat dropdown / status pills
  • Proactive messaging from agent → user (chat thread integration)
  • Auto-escalation from chat mode
  • Reading messages.jsonl for agent timeline UI

🤖 Generated with Claude Code

AVADSA25 merged commit 9579697 into main on May 3, 2026
1 check passed