|
| 1 | +# Phase 2 — COMPLETE |
| 2 | + |
| 3 | +**Date:** 2026-05-02 20:53 CEST |
| 4 | +**Status:** All 3 steps merged + production-stable. |
| 5 | +**Phase 3 planning:** awaiting explicit go-ahead — not automatic. |
| 6 | + |
| 7 | +--- |
| 8 | + |
| 9 | +## Merge commits (chronological) |
| 10 | + |
| 11 | +| Step | PR | Merge SHA | Title | Sign-off | |
| 12 | +|---|---|---|---|---| |
| 13 | +| 5 | #9 | `824a52f` | feat(observer): Continuous Observation Loop (RingBuffer + injection contract) | T+0 ok; 65 `observation_tick` + 97 `observation_tick_slow` emits in `audit.log` | |
| 14 | +| 6 | #11 | `2d2ff3f` | feat(triggers): Trigger System (matcher + cooldown + consent) | T+0 ok; codec-observer `_eval_triggers(snapshot)` integrated; routes/triggers.py mounted | |
| 15 | +| 7 | #12 | `0e40687` | feat(shift_report): end-of-day shift report | T+0 ok; live `shift_report_started`/`_completed` paired emits at `2026-05-02T18:49:40Z` (`cid=5f188e5485e5`) | |
| 16 | + |
| 17 | +**Hotfix that landed during Step 5 deployment:** |
| 18 | + |
| 19 | +| PR | Merge SHA | What | |
| 20 | +|---|---|---| |
| 21 | +| #10 | `26e6add` | hotfix: `observer.ocr_enabled` config flag. Bypasses the `screencapture` popup storm when macOS Screen Recording permission has not yet been granted to `python3.13` and to `node` (PM2 parent). Default `true`; user runtime config patched to `false` until both permissions are explicitly granted. | |
| 22 | + |
| 23 | +**Main HEAD at Phase 2 close:** `0e40687` (Merge PR #12) → followed by Phase 2 sign-off + this doc commit. |
| 24 | + |
| 25 | +--- |
| 26 | + |
| 27 | +## What Phase 2 delivered |
| 28 | + |
| 29 | +### Step 5 — Continuous Observation Loop |
| 30 | +- `codec_observer.py` (~810 LOC NEW): PM2-managed daemon `codec-observer`. `RingBuffer` keeps the last 10 minutes of observation snapshots in RAM only (no disk persistence). Polls active window + clipboard digest + (optional) screenshot OCR every 5 s; lazy-imports Quartz with graceful non-mac fallback. |
| 31 | +- `§X Observation Injection Contract` (Q5 override): `maybe_inject_observation_summary()` always injects for `transport=local` (chat / voice). Cloud transports (`mcp`) gate on possessive pronoun OR continuation phrase OR explicit `SKILL_NEEDS_OBSERVATION` flag. |
| 32 | +- Integration: `codec_dashboard.py` chat handler (+59 LOC) and `codec_voice.py` voice handler (+18 LOC) call `maybe_inject_observation_summary()` before LLM dispatch. |
| 33 | +- Q5.1 OCR retry-once logic; Q5.2 image-redaction never logs raw pixels; Q5.3 stop-noun list filters trivial captures; Q5.4 audit cardinality (one tick per 5s); Q5.5 slow-poll degraded mode emits `observation_tick_slow`; Q5.6 `/api/observer/buffer?debug=1` PWA debug endpoint; Q5.7 forward-compat snapshot schema reserves keys for Step 6/7. |
| 34 | +- 5 audit events: `observation_tick`, `observation_tick_slow`, `observation_summary_injected`, `observer_started`, `observer_stopped`. |
| 35 | +- Kill switch: `pm2 stop codec-observer` plus `observer.enabled: false` in `~/.codec/config.json`. |
| 36 | +- 33 new passing tests (`tests/test_observer.py`, includes the 3 `ocr_enabled` config-flag tests added during the hotfix). |
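A time-windowed, RAM-only buffer of the kind described for `RingBuffer` can be sketched as follows. This is an illustration under assumptions, not the code in `codec_observer.py`: the method names `append` and `snapshots`, the `now` parameter, and the snapshot dict shape are all hypothetical.

```python
import time
from collections import deque
from typing import Any, Deque, Dict, List, Optional, Tuple


class RingBuffer:
    """Keep only the snapshots recorded within the last `window_seconds`."""

    def __init__(self, window_seconds: float = 600.0) -> None:
        self.window_seconds = window_seconds
        # (timestamp, snapshot) pairs, oldest first; nothing is written to disk.
        self._items: Deque[Tuple[float, Dict[str, Any]]] = deque()

    def append(self, snapshot: Dict[str, Any], now: Optional[float] = None) -> None:
        ts = time.time() if now is None else now
        self._items.append((ts, snapshot))
        self._evict(ts)

    def _evict(self, now: float) -> None:
        # Drop entries strictly older than the window.
        cutoff = now - self.window_seconds
        while self._items and self._items[0][0] < cutoff:
            self._items.popleft()

    def snapshots(self, now: Optional[float] = None) -> List[Dict[str, Any]]:
        self._evict(time.time() if now is None else now)
        return [snap for _, snap in self._items]
```

With a 5 s poll interval and a 10-minute window, a buffer like this holds on the order of 120 snapshots at steady state.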
| 37 | + |
| 38 | +### Step 6 — Trigger System |
| 39 | +- `codec_triggers.py` (~520 LOC NEW): declarative `SKILL_OBSERVATION_TRIGGER` per skill. 5 matcher types — `window_title_match`, `clipboard_pattern`, `file_change`, `time`, `compound`. Cooldowns held in RAM (per-skill, configurable seconds). Persistent kill state at `~/.codec/triggers_killed.json` (atomic tmp+rename). `evaluate(snapshot)` returns the list of triggers fired this tick; `dispatch()` calls `codec_dispatch.run_skill` (with optional `codec_ask_user.ask` confirmation gate). |
| 40 | +- Stable `sha8` key per `(skill_name, trigger_type, params_hash)` so `triggers_killed.json` survives skill rename without resurrecting an intentionally-killed trigger. |
| 41 | +- AST extraction: `codec_skill_registry.py` (+15 LOC) walks every skill module, extracts `SKILL_OBSERVATION_TRIGGER` + `SKILL_NEEDS_OBSERVATION` constants without importing the module. |
| 42 | +- Integration: `codec_observer.py` calls `_eval_triggers(snapshot)` after `_emit_observation_tick`, in `try/except` so a broken trigger never breaks the observation loop. |
| 43 | +- PWA endpoints: `routes/triggers.py` (NEW, ~95 LOC) — `GET /api/triggers` (list), `GET /api/triggers/{key}` (detail), `POST /api/triggers/{key}/kill` (toggle kill). |
| 44 | +- 3 audit events: `trigger_fired`, `trigger_skipped`, `trigger_killed`. |
| 45 | +- Kill switches: per-trigger via PWA POST, OR full-system via `TRIGGERS_ENABLED=false`, OR per-skill by simply not declaring `SKILL_OBSERVATION_TRIGGER`. |
| 46 | +- 35 new passing tests (`tests/test_triggers.py`, all mocking `codec_dispatch.run_skill`). |
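The "extract constants without importing the module" step can be illustrated with the standard `ast` module. This is a minimal sketch, not the code in `codec_skill_registry.py`: the function name `extract_skill_constants` and the dict shape used for the trigger in the usage note are assumptions.

```python
import ast
from typing import Any, Dict

WANTED = ("SKILL_OBSERVATION_TRIGGER", "SKILL_NEEDS_OBSERVATION")


def extract_skill_constants(source: str) -> Dict[str, Any]:
    """Return literal values of the wanted module-level constants.

    The source is parsed, never executed, so import-time side effects
    in a skill module cannot run.
    """
    found: Dict[str, Any] = {}
    for node in ast.parse(source).body:  # top-level statements only
        if not isinstance(node, ast.Assign):
            continue  # annotated assignments are ignored in this sketch
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id in WANTED:
                try:
                    found[target.id] = ast.literal_eval(node.value)
                except (ValueError, TypeError):
                    pass  # non-literal expression: skip rather than evaluate
    return found
```

Given a skill file containing `SKILL_NEEDS_OBSERVATION = True` and a literal dict assigned to `SKILL_OBSERVATION_TRIGGER`, the function returns both values; any other assignments, including ones that call functions, are left untouched and unevaluated.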
| 47 | + |
| 48 | +### Step 7 — Shift Report |
| 49 | +- `skills/shift_report.py` (~470 LOC NEW, 22151 bytes installed at `~/.codec/skills/shift_report.py`): assembles a 5-section markdown report with `## Completed tasks` / `## Blocked or stuck moments` / `## Observed work patterns` / `## Pending questions` / `## Tomorrow`. Per-day dedup via atomic state at `~/.codec/shift_report_state.json`: the `time` and `idle` paths produce at most one report per local date, while the `manual` path always fires. |
| 50 | +- 3 trigger paths: `manual` (chat or MCP invocation), `time` (daily-at-hour:minute), `idle` (continuous idle ≥ N minutes). Time + idle paths live inside `codec_observer.py:_maybe_fire_shift_report(idle_seconds)` called every observation tick. |
| 51 | +- Public API: `run(task)` (manual entrypoint, used by chat / MCP) and `run_with_trigger_kind(kind)` (used by observer for `time` / `idle`). |
| 52 | +- Skill metadata: `SKILL_NAME = "shift_report"`, `SKILL_TRIGGERS = ["shift report", "shift-report", "daily shift report", "what did i do today", "summarize my day", "today's summary", "end of day report", "eod report"]`, `SKILL_MCP_EXPOSE = True`. |
| 53 | +- 2 audit events: `shift_report_started`, `shift_report_completed` (paired `correlation_id`, `extra.trigger_kind` ∈ `{manual, time, idle}`, `extra.sections_included`, `extra.word_count`, `extra.audit_records_scanned`). |
| 54 | +- Kill switches: `SHIFT_REPORT_ENABLED=false`, OR `shift_report.enabled: false` in `~/.codec/config.json`, OR remove `~/.codec/skills/shift_report.py` (skill not discovered → not callable). |
| 55 | +- 20 new passing tests (`tests/test_shift_report.py`, all filesystem-mocked to `tmp_path`). |
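The per-day dedup with an atomically written state file might look roughly like this. It is a sketch under assumptions, not the logic in `skills/shift_report.py`: the function names, the `last_report_date` key, and the `today` parameter are hypothetical.

```python
import json
import os
import tempfile
from datetime import date
from pathlib import Path
from typing import Optional


def should_fire(state_path: Path, trigger_kind: str, today: Optional[date] = None) -> bool:
    """Return True if a report may fire now; record the fire for time/idle paths."""
    if trigger_kind == "manual":
        return True  # manual path bypasses dedup and never touches the state file
    day = (today or date.today()).isoformat()  # date.today() is the local date
    try:
        state = json.loads(state_path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        state = {}
    if state.get("last_report_date") == day:
        return False  # already reported today
    state["last_report_date"] = day
    _atomic_write_json(state_path, state)
    return True


def _atomic_write_json(path: Path, payload: dict) -> None:
    # Write to a temp file in the same directory, then rename over the target,
    # so readers never observe a half-written state file.
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(payload, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Because the manual path returns before the state file is read or written, a deployment that has only seen manual runs would have no state file at all, which matches the "absent" state reported later in this document.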
| 56 | + |
| 57 | +--- |
| 58 | + |
| 59 | +## Audit envelope `schema:1` — all Phase 2 events live in production |
| 60 | + |
| 61 | +Captured from `~/.codec/audit.log` at 2026-05-02 20:53 CEST: |
| 62 | + |
| 63 | +| Event | Source | Count | Step | Status | |
| 64 | +|---|---|---|---|---| |
| 65 | +| `observation_tick` | codec-observer | 65 | Step 5 | ✅ live | |
| 66 | +| `observation_tick_slow` | codec-observer | 97 | Step 5 | ✅ live (graceful-degraded path active because `ocr_enabled=false`) | |
| 67 | +| `observation_summary_injected` | codec-dashboard / codec-voice | 0 | Step 5 | dormant — needs chat-handler call to fire | |
| 68 | +| `trigger_fired` | codec-triggers | 0 | Step 6 | dormant — no `SKILL_OBSERVATION_TRIGGER` declared yet on any installed skill | |
| 69 | +| `trigger_skipped` | codec-triggers | 0 | Step 6 | dormant — same reason | |
| 70 | +| `trigger_killed` | codec-triggers | 0 | Step 6 | dormant — no kills issued | |
| 71 | +| **`shift_report_started`** | **codec-shift-report** | **7** | **Step 7** | ✅ live (paired) | |
| 72 | +| **`shift_report_completed`** | **codec-shift-report** | **7** | **Step 7** | ✅ live (paired) | |
| 73 | + |
| 74 | +(Step 5 + Step 7 events directly observable. Step 6 events are dormant by design — the trigger evaluator runs every observation tick, but no skill in the runtime currently declares a `SKILL_OBSERVATION_TRIGGER` constant. As soon as one is declared and AST-discovered at next PM2 restart, `trigger_fired` / `trigger_skipped` will populate.) |
| 75 | + |
| 76 | +**Most recent paired emit (live deployment proof):** |
| 77 | + |
| 78 | +``` |
| 79 | +2026-05-02T18:49:40.547+00:00 shift_report_started cid=5f188e5485e5 trigger_kind=manual |
| 80 | +2026-05-02T18:49:40.555+00:00 shift_report_completed cid=5f188e5485e5 sections_included=2 word_count=69 audit_records_scanned=305 duration_ms=8.28 |
| 81 | +``` |
| 82 | + |
| 83 | +--- |
| 84 | + |
| 85 | +## Skill `shift_report` — registered and observable |
| 86 | + |
| 87 | +```bash |
| 88 | +$ ls -la ~/.codec/skills/shift_report.py |
| 89 | +-rw-r--r--@ 1 mickaelfarina staff 22151 May 2 20:49 /Users/mickaelfarina/.codec/skills/shift_report.py |
| 90 | +``` |
| 91 | + |
| 92 | +```bash |
| 93 | +$ pm2 list | grep codec-observer |
| 94 | +│ 40 │ codec-observer │ default │ N/A │ fork │ 46482 │ 3m │ 3 │ online │ 0% │ 0b │ mickael… │ disabled │ |
| 95 | +``` |
| 96 | + |
| 97 | +```bash |
| 98 | +$ tail ~/.codec/notifications.json | grep shift_report |
| 99 | +{ |
| 100 | + "id": "notif_033ec308cd", |
| 101 | + "type": "shift_report", |
| 102 | + "title": "CODEC Shift Report — 2026-05-02", |
| 103 | + "body": "# CODEC Shift Report — 2026-05-02\n\n_Generated 20:49 via `manual` trigger. Window: last 24h._\n..." |
| 104 | +} |
| 105 | +``` |
| 106 | + |
| 107 | +Confirmed: |
| 108 | +1. Skill installed at `~/.codec/skills/shift_report.py` ✅ |
| 109 | +2. Public API `shift_report.run("shift report")` returns success ✅ (live test 20:49 CEST) |
| 110 | +3. Audit emits paired `started` + `completed` with shared `correlation_id` ✅ |
| 111 | +4. Notification posted with `type="shift_report"` and full markdown body ✅ |
| 112 | +5. State files clean: `~/.codec/shift_report_state.json` only created on first `time`/`idle` fire (manual path bypasses dedup by design) ✅ |
| 113 | + |
| 114 | +--- |
| 115 | + |
| 116 | +## Final test counts |
| 117 | + |
| 118 | +| Suite | Pass | Fail | Skip | |
| 119 | +|---|---|---|---| |
| 120 | +| Pre-Phase-2 baseline (Phase 1 close) | 732 | 20 | 73 | |
| 121 | +| After Step 5 (observer) | 765 | 20 | 73 | |
| 122 | +| After Step 6 (triggers) | 800 | 20 | 73 | |
| 123 | +| After Step 7 (shift_report) | 823 | 20 | 73 | |
| 124 | + |
| 125 | +**Net Phase 2 contribution: +91 passing tests, 0 new failures, 0 new skips.** |
| 126 | + |
| 127 | +The 20 baseline failures are the same pre-existing failures from Phase 1, all classified in `docs/PHASE1-STEP1-PREMERGE-AUDIT.md` and the Step 2 / Step 3 audits. None were caused by Phase 2 work, and none were resolved by it. They remain on the deferred-fix list in `docs/known-issues.md`. |
| 128 | + |
| 129 | +--- |
| 130 | + |
| 131 | +## PM2 services state at Phase 2 close |
| 132 | + |
| 133 | +| Service | Status | Notes | |
| 134 | +|---|---|---| |
| 135 | +| `codec-dashboard` | online | Phase 2 Step 5 + Step 6 routes mounted; chat-handler observation injection active | |
| 136 | +| `codec-mcp-http` | online | claude.ai connections live; shift_report exposed via `SKILL_MCP_EXPOSE=True` | |
| 137 | +| `codec-heartbeat` | online | 20-min daemon loop; all 5 service health checks ✅ | |
| 138 | +| **`codec-observer`** | **online** | **NEW — 5 s polling loop + trigger evaluator + `_maybe_fire_shift_report` time/idle scheduler** | |
| 139 | +| `codec-autopilot` | **stopped** | intentional, per user request | |
| 140 | + |
| 141 | +Other PM2 processes (cloudflared, kokoro-82m, qwen3.6, whisper-stt, lucy-*, ava-*, sentora-*, etc.) are pre-existing and unrelated to Phase 2. |
| 142 | + |
| 143 | +--- |
| 144 | + |
| 145 | +## State files clean |
| 146 | + |
| 147 | +| File | State | |
| 148 | +|---|---| |
| 149 | +| `~/.codec/skills/shift_report.py` | installed, 22151 bytes | |
| 150 | +| `~/.codec/plugins/self_improve.py` | installed, 17722 bytes (Phase 1 Step 4 — unchanged) | |
| 151 | +| `~/.codec/shift_report_state.json` | absent (manual path bypasses dedup; will be created on first `time` / `idle` fire) | |
| 152 | +| `~/.codec/triggers_killed.json` | absent (no kills issued) | |
| 153 | +| `~/.codec/pending_questions.json` | 0 entries | |
| 154 | +| `~/.codec/notifications.json` `type="shift_report"` | 1 (the live deployment fire test) | |
| 155 | +| `~/.codec/notifications.json` `type="question"` | 0 | |
| 156 | +| `/tmp/codec_*.txt` | 0 files | |
| 157 | +| Apple Reminders (incomplete) | 0 | |
| 158 | +| Apple Notes / Calendar entries created by Phase 2 | 0 | |
| 159 | + |
| 160 | +--- |
| 161 | + |
| 162 | +## Process improvements landed during Phase 2 |
| 163 | + |
| 164 | +1. **Observation injection contract is `transport`-aware**: `transport=local` always injects, `transport=mcp` requires a possessive pronoun OR a continuation phrase OR the explicit `SKILL_NEEDS_OBSERVATION` flag. This keeps stale local screenshot state from polluting cloud-side LLM context and prompting hallucinations. |
| 165 | + |
| 166 | +2. **`ocr_enabled` config-flag pattern**: when a feature requires a macOS TCC permission that may not yet be granted to a specific Python interpreter / PM2 parent process, ship the feature with a default-true config flag AND graceful-degraded code path. User can flip the flag to `false` before any popup storm starts. Pattern: prove the feature works in code, prove the popup storm in production, bisect to the offending call, add the flag, document the permission grant procedure. |
| 167 | + |
| 168 | +3. **`screencapture` popup-storm root cause documented**: `ThreadPoolExecutor`'s `with` block calls `shutdown(wait=True)` on exit and blocks until the worker thread finishes, so a "100ms timeout" inside the executor actually waited ~5 s while the popup was open. The retry then triggered a second popup. The fix (skip the executor entirely when `ocr_enabled=False`) is documented in the Step 5 hotfix postmortem (`PR #10` description). |
| 169 | + |
| 170 | +4. **Stable `sha8` keys for kill state**: `triggers_killed.json` keys on `sha8(skill_name + trigger_type + params_hash)` so the kill survives skill rename. Pattern reusable for any feature with persistent per-instance state. |
| 171 | + |
| 172 | +5. **Per-day dedup via atomic state file**: `shift_report_state.json` keyed by `local_date` (UTC offset honored). `time`/`idle` trigger paths early-exit if a report has already been generated today. `manual` path bypasses dedup so the user can always re-run on demand. Pattern reusable for any "one event per local day" scheduling. |
| 173 | + |
| 174 | +6. **`AGENTS.md` §10 don't-touch list extended**: `codec_observer.py`, `codec_triggers.py`, `skills/shift_report.py`, `routes/triggers.py` added to the "don't refactor without re-running the design doc gate" list. |
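The `ThreadPoolExecutor` behavior behind item 3 can be reproduced in isolation. In this sketch `slow_call` is a hypothetical stand-in for the blocked `screencapture` call, and `timed_without_wait` only demonstrates the difference in shutdown behavior; it is not the fix that shipped, which skips the executor entirely when `ocr_enabled=False`.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def slow_call(seconds: float) -> str:
    time.sleep(seconds)  # stand-in for a call blocked behind a permission popup
    return "done"


def timed_with_block(work_seconds: float, timeout: float) -> float:
    """Return wall-clock seconds spent, including the `with` block exit."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_call, work_seconds)
        try:
            future.result(timeout=timeout)
        except FutureTimeout:
            pass  # the timeout itself fires on schedule
    # Leaving the `with` block calls shutdown(wait=True),
    # which blocks until slow_call returns.
    return time.monotonic() - start


def timed_without_wait(work_seconds: float, timeout: float) -> float:
    """Same work, but shut down without waiting for the worker."""
    start = time.monotonic()
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(slow_call, work_seconds)
    try:
        future.result(timeout=timeout)
    except FutureTimeout:
        pass
    pool.shutdown(wait=False)  # returns immediately; the worker finishes in the background
    return time.monotonic() - start
```

Calling `timed_with_block(0.5, 0.05)` takes about as long as the work itself, while `timed_without_wait(0.5, 0.05)` returns shortly after the timeout. Note that `wait=False` does not cancel the running call; it only stops the caller from waiting on it.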
| 175 | + |
| 176 | +--- |
| 177 | + |
| 178 | +## Phase 3 — awaiting go-ahead |
| 179 | + |
| 180 | +Per user instruction (as with the Phase 1 → 2 transition): **Phase 3 planning begins only after the user's explicit go-ahead; it is not automatic.** |
| 181 | + |
| 182 | +Open follow-ups that the Phase 2 design / postmortem docs flagged for future work (none block Phase 3 itself): |
| 183 | + |
| 184 | +- **Re-enable Step 5 OCR**: requires explicit grant of macOS Screen Recording to BOTH `/opt/homebrew/opt/python@3.13/bin/python3.13` AND the PM2 parent `node` binary. Procedure documented in `docs/PHASE2-STEP5-DESIGN.md` §macOS-permissions. After grant, set `observer.ocr_enabled: true` in `~/.codec/config.json` and `pm2 restart codec-observer`. Verify by tailing `audit.log` for `observation_tick` (not `observation_tick_slow`). |
| 185 | +- **First real `SKILL_OBSERVATION_TRIGGER` declaration**: at least one installed skill should declare a trigger so `trigger_fired` audit events populate. Candidates: `chrome_open` (`window_title_match` on `^Slack`), `qr_generator` (`clipboard_pattern` URL detection). Phase 2 Step 6 ships the system; Phase 3 would ship the first real triggers. |
| 186 | +- **Step 7 dedup edge case at midnight**: `shift_report_state.json` keys on local date. If the user works through the 23:59 → 00:00 boundary, an `idle` fire at 00:01 will generate a fresh report for the new day, but the previous day's work won't be captured by an `idle` report if the user was non-idle through 23:50. This could be tightened by adding a `last_seen_active_at` field to the state file. |
| 187 | +- **MCP exposure for trigger management**: `routes/triggers.py` is PWA-only. A future PR could expose `triggers_list` / `triggers_kill` as MCP tools so claude.ai can disable a runaway trigger remotely. |
| 188 | +- **Old known-issues**: the 20 pre-existing failures in `docs/known-issues.md` are still deferred. Could be cleaned up in a separate housekeeping PR. |
| 189 | + |
| 190 | +--- |
| 191 | + |
| 192 | +*Phase 2 complete. Surfacing for user review. No automatic Phase 3 transition.* |