diff --git a/.github/ISSUE_TEMPLATE/bug_report.yml b/.github/ISSUE_TEMPLATE/bug_report.yml new file mode 100644 index 0000000..5fb7cca --- /dev/null +++ b/.github/ISSUE_TEMPLATE/bug_report.yml @@ -0,0 +1,91 @@ +name: Bug report +description: Something isn't working as expected +title: "[Bug] " +labels: ["bug"] +body: + - type: markdown + attributes: + value: | + Thanks for taking the time to report a bug. Please fill in as much detail as you can. + - type: textarea + id: what-happened + attributes: + label: What happened? + description: A clear and concise description of what the bug is. + placeholder: When I do X, Y happens instead of Z. + validations: + required: true + - type: textarea + id: reproduce + attributes: + label: Steps to reproduce + description: Minimal sequence to reproduce the issue. + placeholder: | + 1. Open `/chat` + 2. Click 'Project' mode + 3. Type "..." + 4. See error + validations: + required: true + - type: textarea + id: expected + attributes: + label: Expected behavior + description: What you expected to happen. + validations: + required: false + - type: textarea + id: logs + attributes: + label: Relevant logs / audit entries + description: | + Paste relevant lines from `~/.codec/audit.log` or `pm2 logs codec-dashboard --lines 50 --nostream`. + Strip any sensitive data (paths, account names, tokens). + render: shell + validations: + required: false + - type: input + id: codec-version + attributes: + label: CODEC version / commit + description: Output of `git -C ~/codec-repo rev-parse --short HEAD` and `python3.13 --version`. + placeholder: "abc1234, Python 3.13.x" + validations: + required: false + - type: dropdown + id: pm2-services + attributes: + label: Which PM2 service is involved? 
+ multiple: true + options: + - codec-dashboard + - codec-agent-runner + - codec-observer + - codec-mcp-http + - codec-heartbeat + - codec-hotkey + - codec-dictate + - open-codec + - codec-imessage + - codec-telegram + - codec-watchdog + - codec-overlay + - none / not applicable + validations: + required: false + - type: input + id: macos-version + attributes: + label: macOS version + placeholder: "14.5 (Sonoma) on M1 Ultra" + validations: + required: false + - type: checkboxes + id: terms + attributes: + label: Pre-flight check + options: + - label: I searched existing issues and discussions to confirm this isn't already reported + required: true + - label: I confirmed the issue happens with all default kill switches enabled (`AGENT_*`, `OBSERVER_*`, `TRIGGERS_*`, etc.) + required: false diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml new file mode 100644 index 0000000..2b9419d --- /dev/null +++ b/.github/ISSUE_TEMPLATE/config.yml @@ -0,0 +1,8 @@ +blank_issues_enabled: false +contact_links: + - name: Discussion / Q&A / "is this a bug?" + url: https://github.com/AVADSA25/codec/discussions + about: For open-ended questions, setup help, or "I'm not sure if this is a bug yet" — use Discussions instead of opening an issue. + - name: Enterprise setup + url: https://avadigital.ai + about: Custom integration, deployment across a team, or paid setup work. diff --git a/.github/ISSUE_TEMPLATE/feature_request.yml b/.github/ISSUE_TEMPLATE/feature_request.yml new file mode 100644 index 0000000..bc383f6 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/feature_request.yml @@ -0,0 +1,64 @@ +name: Feature request +description: Suggest a new skill, capability, or improvement +title: "[Feature] " +labels: ["enhancement"] +body: + - type: markdown + attributes: + value: | + Thanks for the suggestion. CODEC's design is opinionated — we keep things small, local-first, and reversible. The clearer your description, the easier it is to scope. 
+ - type: textarea + id: problem + attributes: + label: What problem does this solve? + description: Describe the user pain or workflow gap, not the solution. + placeholder: "Right now I have to manually X every time I Y. It takes 5 minutes and I do it 10x/day." + validations: + required: true + - type: textarea + id: proposal + attributes: + label: Proposed solution + description: What should CODEC do? If it's a new skill, describe the trigger phrase + output. If it's a UI change, describe where it goes. + validations: + required: true + - type: dropdown + id: category + attributes: + label: Which area? + options: + - New skill (drop-in `skills/*.py`) + - Existing skill enhancement + - PWA / dashboard UI + - Voice / wake-word path + - MCP integration (claude.ai, Cursor, etc.) + - Agent system (Phase 3 plan-and-build) + - Memory / search + - Notifications / alerts + - Documentation + - Other + validations: + required: true + - type: textarea + id: alternatives + attributes: + label: Alternatives considered + description: Any workarounds you tried? Other tools that solve this differently? + validations: + required: false + - type: textarea + id: scope + attributes: + label: Scope guess + description: How big is this? 1-line config flag, new skill (~100 LOC), or new module? 
+ validations: + required: false + - type: checkboxes + id: privacy + attributes: + label: Privacy alignment + options: + - label: This proposal works locally (no required cloud dependency for the core feature) + required: false + - label: This proposal respects the local-first principle (no automatic data exfiltration) + required: false diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md new file mode 100644 index 0000000..c418d7a --- /dev/null +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -0,0 +1,52 @@ + + +## Summary + + + +## Reference + + + +## What changes + +| Path | Type | Purpose | +|---|---|---| +| `path/to/file.py` | NEW \| MOD | what this file is responsible for | + + + +## Test plan + +- [ ] 🧪 New tests added (file: `tests/test_*.py`) +- [ ] 🧪 Full pytest passes — same baseline 20 failed / 73 skipped, only new passing tests added +- [ ] All new audit events emit with correct `correlation_id` per Step 1 §1.4 envelope contract +- [ ] No writes to `~/.codec/*` from tests (verify `temp_codec_dir` fixture covers `codec_audit._AUDIT_LOG`) +- [ ] All kill switches still work (env var disables the feature) + +**Manual smoke test after merge:** +- [ ] `git pull && pm2 restart ` +- [ ] [describe the user-facing test sequence here] + +## Audit emits added + + + +## Kill switches added or modified + + + +## Out of scope (explicitly deferred) + + + +## Self-review checklist + +- [ ] Read every line of the diff myself +- [ ] No commented-out code, no `print()` left in +- [ ] No emojis added to code/files unless explicitly requested by the user +- [ ] No `~/.codec/*` paths hand-written that should go through atomic R/W helpers +- [ ] Followed existing patterns; didn't refactor unrelated code diff --git a/docs/API.md b/docs/API.md index bc5d89a..c249c72 100644 --- a/docs/API.md +++ b/docs/API.md @@ -140,6 +140,115 @@ List saved custom agents. --- +## Autonomous Agents (Phase 3 — drop-a-project mode) + +The agent system added in Phase 3. 
User drops a project description; Qwen-3.6 drafts a structured plan with explicit permission manifest; user approves once; `codec-agent-runner` PM2 daemon executes autonomously with permission gate enforcement, tamper detection, and resume-after-restart. + +For full design, see `docs/PHASE3-BLUEPRINT.md`. For runtime architecture, see `docs/ARCHITECTURE.md` (Phase 3 sequence diagram). + +### POST /api/agents +Create a new agent and draft its plan via Qwen-3.6 (typical 2–10 s). + +```bash +curl -X POST -H "Authorization: Bearer $TOKEN" \ + -H "Content-Type: application/json" \ + -d '{"title":"Marbella property bot","description":"Build a Telegram bot that monitors Marbella property listings under €500k and pings me on new ones"}' \ + http://localhost:8090/api/agents +# → {"agent_id": "agent_abc123", "status": "awaiting_approval"} +``` + +### GET /api/agents +List all agents with current status. Polled by the PWA every 5 s for the agent status pills above the chat input. + +```bash +curl -H "Authorization: Bearer $TOKEN" http://localhost:8090/api/agents +# → {"agents": [{"agent_id":"...","title":"...","status":"running","created_at":"..."}]} +``` + +### GET /api/agents/{id} +Full agent state — manifest + plan + state + grants in one response. The PWA's "View plan" button calls this. + +### POST /api/agents/{id}/approve +Approve drafted plan. Re-validates skills against registry, computes plan_hash (sha256), writes grants.json, transitions `awaiting_approval → approved`. The daemon picks up `approved` agents within 5 s. + +### POST /api/agents/{id}/reject +Body: `{"reason": "..."}` (optional). Transitions to `rejected`; plan dir kept 7 days for review then auto-deleted. + +### POST /api/agents/{id}/revise +Body: `{"edited_plan": { ... full Plan dict ... }}`. User-edited plan, re-validated, transitions `awaiting_approval → revised → awaiting_approval`. + +### POST /api/agents/{id}/abort +Atomic transition to `aborted`. Daemon checks status before each operation. 
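The abort behaviour described above (the daemon re-checks status before each operation) can be pictured with a small loop. This is a minimal sketch, not the runner's actual code: it assumes `state.json` carries a top-level `status` field, and `read_status` and `run_operations` are hypothetical names introduced only for illustration.

```python
import json
import tempfile
from pathlib import Path


def read_status(agent_dir: Path) -> str:
    """Return the agent's current status from state.json (assumed layout)."""
    return json.loads((agent_dir / "state.json").read_text())["status"]


def run_operations(agent_dir: Path, operations) -> int:
    """Run callables in order, re-reading the status before each one.

    Stops as soon as the status is no longer "running" and returns the
    number of operations that actually executed.
    """
    done = 0
    for op in operations:
        if read_status(agent_dir) != "running":
            break
        op()
        done += 1
    return done


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        agent_dir = Path(tmp)
        state = agent_dir / "state.json"
        # Demo only: real state files should go through the atomic write helpers.
        state.write_text(json.dumps({"status": "running"}))

        def abort():
            state.write_text(json.dumps({"status": "aborted"}))

        executed = run_operations(agent_dir, [lambda: None, abort, lambda: None])
        print(executed)
```

Because the status is re-read on every iteration, an abort written between two operations takes effect before the next one starts; an operation already in flight is not interrupted in this sketch.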
+ +### POST /api/agents/{id}/pause and POST /api/agents/{id}/resume +`running → paused` (pause) or `paused → running` (resume). Idempotent. + +### POST /api/agents/{id}/grant +Grant a missing permission to a `blocked_on_permission` agent. Per-agent only (not global). + +```bash +curl -X POST -H "Authorization: Bearer $TOKEN" \ + -H "Content-Type: application/json" \ + -d '{"kind":"skills","value":"calculator"}' \ + http://localhost:8090/api/agents/agent_abc123/grant +``` + +`kind` ∈ `skills` / `read_paths` / `write_paths` / `network_domains`. + +### POST /api/agents/{id}/extend_budget +Bump the current checkpoint's `step_budget`. Only valid when `status=paused` AND `status_reason=step_budget_exhausted`. + +```bash +curl -X POST -H "Authorization: Bearer $TOKEN" \ + -H "Content-Type: application/json" \ + -d '{"additional_steps":20}' \ + http://localhost:8090/api/agents/agent_abc123/extend_budget +``` + +Returns `{previous_budget, new_budget, status:"running"}`. The override is written to `state.json` (the plan stays immutable, so the tamper-hash check remains intact). + +### GET /api/agents/{id}/messages +Return the full message timeline from `~/.codec/agents/{id}/messages.jsonl`. + +```json +{"messages":[ + {"ts":"2026-05-03T12:15:00Z","type":"agent_update","title":"Checkpoint 2/5: Scaffolded bot","body":"...","actions":[...]} +]} +``` + +`type` ∈ `agent_update` / `agent_blocked` / `agent_question` / `agent_done` / `agent_aborted` / `user_reply`. + +### POST /api/agents/{id}/messages +Post a user reply; the daemon picks it up between checkpoints. + +```bash +curl -X POST -H "Authorization: Bearer $TOKEN" \ + -H "Content-Type: application/json" \ + -d '{"body":"please skip the email step and continue"}' \ + http://localhost:8090/api/agents/agent_abc123/messages +``` + +### POST /api/agents/{id}/silence +Toggle banner silence per agent. When silenced, timeline messages are still written but the `notifications.json` banner is skipped (no badge spam). 
+ +```bash +curl -X POST -d '{"silenced":true}' http://localhost:8090/api/agents/agent_abc123/silence +``` + +### Global allowlist (cross-agent permissions, Q4) + +#### GET /api/agent_global_grants +Read the global allowlist. + +#### POST /api/agent_global_grants +Add an entry. Body: `{"kind":"network_domains","value":"github.com"}`. Items added here are auto-approved on every future plan. + +#### DELETE /api/agent_global_grants +Remove an entry. Same body shape. + +`kind` ∈ `network_domains` / `read_paths` / `write_paths` / `skills`. + +--- + ## Schedules ### GET /api/schedules diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md new file mode 100644 index 0000000..45becf4 --- /dev/null +++ b/docs/ARCHITECTURE.md @@ -0,0 +1,317 @@ +# CODEC Architecture + +**Sovereign AI Workstation** runs as a swarm of small Python processes coordinated by PM2. No single monolith — each service has one clear responsibility, communicates through atomic file writes (`~/.codec/*.json`) or HTTP localhost calls, and can be killed without breaking the others. + +This doc is for engineers who want to understand the runtime topology before reading the code. For per-feature design rationale, see `docs/PHASE*-*.md`. + +--- + +## Process topology (PM2 services) + +```mermaid +graph TB + subgraph User["User-facing surfaces"] + PWA[PWA / Browser
localhost:8090] + Voice[Wake-word / Voice
open-codec] + Hotkey[Global hotkeys
codec-hotkey] + Dictate[Dictation
codec-dictate] + iMessage[iMessage
codec-imessage] + Telegram[Telegram
codec-telegram] + end + + subgraph Core["Core services"] + Dashboard[codec-dashboard
FastAPI · port 8090
chat / audit / settings UI] + MCP[codec-mcp-http
port 8091
OAuth 2.1 · claude.ai bridge] + Heartbeat[codec-heartbeat
20-min daemon
service health probes] + Watchdog[codec-watchdog
PM2 supervisor] + end + + subgraph Phase2["Phase 2 — observation + automation"] + Observer[codec-observer
5s tick · RingBuffer
active window + clipboard] + end + + subgraph Phase3["Phase 3 — autonomous agents"] + AgentRunner[codec-agent-runner
5s tick · MAX_CONCURRENT=3
Qwen ↔ skill loops] + end + + subgraph LLMs["Local LLM services"] + Qwen[qwen3.6
OpenAI-compatible · port 8090] + Whisper[whisper-stt
STT] + Kokoro[kokoro-82m
TTS] + end + + subgraph Storage["State (atomic R/W via tmp+rename)"] + AuditLog[~/.codec/audit.log
schema:1 · 30-day rotation] + Memory[~/.codec/memory.db
SQLite · 250K context] + Notifications[~/.codec/notifications.json] + Agents[~/.codec/agents/«id»/
plan · grants · state · messages] + Config[~/.codec/config.json] + Skills[~/.codec/skills/«name».py
user skills] + Plugins[~/.codec/plugins/«name».py
lifecycle hooks] + end + + PWA --> Dashboard + Voice --> Dashboard + Hotkey --> Dashboard + Dictate --> Dashboard + iMessage --> Dashboard + Telegram --> Dashboard + + Dashboard --> Qwen + Dashboard --> Whisper + Dashboard --> Kokoro + Dashboard --> AuditLog + Dashboard --> Memory + Dashboard --> Notifications + Dashboard --> Agents + Dashboard --> Config + + MCP --> Dashboard + + Heartbeat -. probes .-> Qwen + Heartbeat -. probes .-> Whisper + Heartbeat -. probes .-> Kokoro + Heartbeat --> AuditLog + + Observer --> AuditLog + Observer -. observation summaries .-> Storage + Observer --> AgentRunner + + AgentRunner --> Qwen + AgentRunner --> Agents + AgentRunner --> AuditLog + AgentRunner --> Notifications + + Dashboard -. dispatches .-> Skills + Dashboard -. lifecycle hooks .-> Plugins + + style AgentRunner fill:#a78bfa,color:#000 + style Phase3 fill:#1a0a3e,color:#fff + style Phase2 fill:#0e2a3e,color:#fff +``` + +--- + +## Key modules + their files + +```mermaid +graph LR + subgraph Phase1["Phase 1 — substrate"] + audit[codec_audit.py
schema:1 envelope] + hooks[codec_hooks.py
plugin lifecycle] + ask[codec_ask_user.py
blocking pause + strict-consent] + agents[codec_agents.py
Crew + ReAct + stuck detect] + end + + subgraph Phase2["Phase 2 — observation"] + observer[codec_observer.py
RingBuffer + injection contract] + triggers[codec_triggers.py
declarative SKILL_OBSERVATION_TRIGGER] + shift[skills/shift_report.py
end-of-day summary] + end + + subgraph Phase3["Phase 3 — autonomy"] + plan[codec_agent_plan.py
plan + permission contract] + runner[codec_agent_runner.py
daemon + permission gate] + messaging[codec_agent_messaging.py
post_message + 60s batching] + end + + subgraph Existing["Existing core"] + dashboard[codec_dashboard.py
FastAPI router] + dispatch[codec_dispatch.py
skill dispatch chokepoint] + registry[codec_skill_registry.py
AST-discovered skills] + identity[codec_identity.py
system prompts] + end + + audit --> hooks + audit --> ask + audit --> agents + audit --> observer + audit --> triggers + audit --> shift + audit --> plan + audit --> runner + audit --> messaging + + hooks --> dispatch + dispatch --> registry + + ask --> plan + ask --> runner + + plan --> runner + runner --> messaging + runner --> dispatch + observer --> triggers + observer --> shift + + dashboard --> dispatch + dashboard --> plan + dashboard --> runner + dashboard --> messaging + + style Phase1 fill:#0e2a3e,color:#fff + style Phase2 fill:#1a3e0a,color:#fff + style Phase3 fill:#3e0a3e,color:#fff +``` + +--- + +## Skill execution paths (3 distinct routes) + +A skill is a `.py` file in `skills/` (built-in) or `~/.codec/skills/` (user). Each declares `SKILL_NAME`, `SKILL_TRIGGERS`, `run(task, app="", ctx="")`. Three execution paths, all flowing through `codec_dispatch.run_skill`: + +```mermaid +sequenceDiagram + participant U as User + participant V as open-codec
(wake-word) + participant D as codec-dashboard
(chat HTTP) + participant M as codec-mcp-http
(claude.ai) + participant Disp as codec_dispatch.run_skill + participant H as codec_hooks
(plugin lifecycle) + participant Skill as skills/«name».py + participant Audit as ~/.codec/audit.log + + Note over U,V: Path A — Voice + U->>V: "hey codec, weather in Paris" + V->>Disp: dispatch(text) + + Note over U,D: Path B — PWA chat + U->>D: POST /api/command + D->>Disp: _try_skill(text) + + Note over U,M: Path C — MCP / claude.ai + U->>M: tool_call(skill_name, args) + M->>Disp: dispatch(skill, task) + + Disp->>H: pre_tool hook + H-->>Audit: hook_fired + Disp->>Skill: run(task) + Skill-->>Disp: result + Disp->>H: post_tool hook + H-->>Audit: hook_fired + Disp-->>Audit: tool_result +``` + +`run_with_hooks` wraps every skill call. Step 2 plugins (e.g., `self_improve`) observe via `pre_tool` / `post_tool` / `on_error` / `on_operation_*` hooks. + +--- + +## Phase 3 — drop-a-project pipeline + +```mermaid +sequenceDiagram + participant U as User
(/chat Project mode) + participant D as codec-dashboard + participant Plan as codec_agent_plan + participant Q as Qwen-3.6 + participant R as codec-agent-runner + participant Msg as codec_agent_messaging + participant Audit as audit.log + + U->>D: POST /api/agents
{title, description} + D->>Plan: create_agent(description) + Plan->>Q: draft plan + Q-->>Plan: JSON {goals, checkpoints, manifest} + Plan->>Plan: validate skills against registry + Plan->>Audit: agent_plan_drafted + Plan-->>D: agent_id + D-->>U: 200 {agent_id} + + U->>D: POST /api/agents/«id»/approve + D->>Plan: approve_plan(id) + Plan->>Plan: write grants.json + plan_hash + Plan->>Audit: agent_plan_approved + Plan-->>D: grants + + Note over R: 5s tick scans
~/.codec/agents/*/state.json + R->>R: status=approved → spawn thread + + loop per checkpoint + R->>Q: next_action(plan, checkpoint, history) + Q-->>R: Action {skill, task, is_destructive, ...} + R->>R: permission_gate(action, grants) + alt destructive + R->>U: strict_consent (verb-match) + end + R->>R: run_skill (Step 1+2 hooks fire) + R->>Msg: post_message(agent_update) + Msg->>Audit: agent_message_sent + end + + R->>Audit: agent_completed + R->>Msg: post_message(agent_done) +``` + +The permission gate enforces the **union of per-agent grants + global allowlist**. Destructive ops always hit Step 3 §1.7 strict-consent (universal floor). The plan hash is verified at run start (tamper detection per Q13). + +--- + +## Storage contract + +Every `~/.codec/*.json` write follows the **atomic tmp+rename pattern**: + +```python +import json +import os + +def _atomic_write_json(path, data): + # path is a pathlib.Path + tmp = path.with_suffix(path.suffix + ".tmp") + with open(tmp, "w") as f: + json.dump(data, f) + f.flush() + os.fsync(f.fileno()) + os.replace(tmp, path) +``` + +This is the contract: +- A reader sees either the OLD complete file or the NEW complete file, never a partial write +- Writers in different processes never expose a half-written file to readers; because the `.tmp` name is fixed, two writers targeting the same path at the same moment must still be serialized +- Power loss mid-write leaves the OLD file intact + +Helpers live in `codec_agent_plan._atomic_write_json` (Phase 3 Step 8), `skills/shift_report._atomic_write` (Phase 2 Step 7), and `codec_observer._atomic_write` (Phase 2 Step 5). All three are the same pattern. + +**Don't bypass.** Direct `open(path, "w").write(...)` is the canonical bug source and is flagged in `AGENTS.md §10` for every state file. 
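The helper shown above writes to a fixed `<name>.tmp` sibling. Where two processes may write the same file at the same moment, a variant with a per-writer temp name keeps them from sharing one temp file. The following is a minimal sketch under that assumption; `atomic_write_json_unique` is a hypothetical name, not one of the repo's helpers.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json_unique(path, data) -> None:
    """Write JSON atomically via a uniquely named temp file in the same directory."""
    path = Path(path)
    # Same directory means same filesystem, so os.replace stays atomic.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "state.json"
        atomic_write_json_unique(target, {"status": "running"})
        print(json.loads(target.read_text()))
```

With unique temp names, concurrent writers resolve as last-rename-wins; a read-modify-write cycle would still need a lock. Note that `mkstemp` creates the file readable and writable only by the owner, which may differ from the permissions a plain `open` would give.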
+ +--- + +## Audit envelope (`schema:1`) + +Every audit emit goes through `codec_audit.audit()` and produces a JSON line in `~/.codec/audit.log`: + +```json +{ + "ts": "2026-05-03T11:37:23.717+00:00", + "schema": 1, + "event": "agent_started", + "source": "codec-agent-runner", + "tool": "", + "outcome": "ok", + "level": "info", + "transport": "local", + "message": "agent started agent_xxx", + "extra": { + "agent_id": "agent_xxx", + "checkpoint_count": 3, + "starting_at": 0, + "correlation_id": "7f9369c04115" + } +} +``` + +Multi-emit operations (e.g., `agent_started` → `agent_checkpoint_started` → `agent_checkpoint_completed` → `agent_completed`) **share a single `correlation_id`** so they can be joined in analytics. This is the §1.4 contract from Phase 1 Step 1. + +Daily rotation, 30-day retention, append-only, thread-safe. + +--- + +## Where to read next + +| Topic | File | +|---|---| +| Why each Phase exists | `docs/PHASE1-COMPLETE.md`, `docs/PHASE2-COMPLETE.md`, `docs/PHASE3-COMPLETE.md` | +| Per-step design rationale | `docs/PHASE-STEP-DESIGN.md` and `docs/PHASE-STEP-PLAN.md` | +| What you must NOT touch | `AGENTS.md` §10 (don't-touch zones) | +| Audit event vocabulary | `AGENTS.md` §6 | +| Skill template | `skills/_template.py` | +| Plugin template | `plugins/_template.py` | + +--- + +*Architecture as of 2026-05-03. Last major change: Phase 3 backend (Steps 8 + 9 + 10) shipped, codec-agent-runner online.*
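The shared `correlation_id` described in the audit envelope section can be used to join the events of one operation. The sketch below assumes each log line is a standalone JSON object with `correlation_id` nested under `extra`, as in the example envelope; `group_by_correlation` is a hypothetical helper and the sample lines are invented for illustration.

```python
import json
from collections import defaultdict


def group_by_correlation(lines):
    """Group audit event names by extra.correlation_id, preserving log order."""
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        cid = record.get("extra", {}).get("correlation_id")
        if cid is not None:
            groups[cid].append(record["event"])
    return dict(groups)


if __name__ == "__main__":
    # Made-up sample lines in the schema:1 shape (most fields omitted).
    sample = [
        '{"schema": 1, "event": "agent_started", "extra": {"correlation_id": "7f9369c04115"}}',
        '{"schema": 1, "event": "agent_checkpoint_started", "extra": {"correlation_id": "7f9369c04115"}}',
        '{"schema": 1, "event": "hook_fired", "extra": {}}',
        '{"schema": 1, "event": "agent_completed", "extra": {"correlation_id": "7f9369c04115"}}',
    ]
    print(group_by_correlation(sample))
```

Events without a `correlation_id` are skipped here rather than bucketed, which keeps single-emit events out of the joined view.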