AVADSA25 · May 2, 2026 · May 1, 2026
diff --git a/AGENTS.md b/AGENTS.md
@@ -111,9 +111,29 @@ CODEC agents can pause and ask the user a structured question, self-detect when
 
 **Audit envelope**: all Step 3 events use `outcome="warning"`, `level="warning"`. They are NOT `outcome="error"` because each is an operational signal, not an operation failure (same Q4 tightening as Step 2's `hook_error`). `correlation_id` inherits from the wrapping operation per Step 1 §1.4.
 
-### Other known gaps (tracked for Phase 2)
+### Continuous Observation Loop (Phase 2 Step 5)
+
+CODEC has a background process (`codec-observer` PM2 service, `codec_observer.py`) that polls four cheap signals — frontmost window, screenshot OCR, clipboard delta, recent file changes — and keeps the last 10 minutes of state in a RAM-only ring buffer. On every chat / voice request, an injection helper decides whether to prepend a ≤200-token summary to the LLM's system prompt, gated per the §X "Observation injection contract":
+
+- **`transport="local"`** (local Qwen) → always inject. Cheap + private.
+- **`transport="mcp"`** → never inject. The MCP client (claude.ai, Claude.app) brings its own context.
+- **`transport in {"chat", "voice", "http"}`** → gated on cheap text-pattern checks: possessive-without-context (`"my X"`/`"this Y"` filtered against a stop-noun list), continuation language (`"continue"`, `"where was I"`), or skill-flag (`SKILL_NEEDS_OBSERVATION = True` on a resolved skill module).
+
+**Privacy contract**: 4 layers. (1) RAM only — `collections.deque` wiped on process restart. (2) Audit emits are METADATA-ONLY: lengths, counts, `content_type` tags, but NEVER raw window titles, OCR text, clipboard content, or file paths. (3) Cloud-transport injection gating per §X. (4) NO new system permissions — uses existing skills + primitives (osascript, pbpaste, Quartz, getmtime).
+
+**Cadence**: 60s when active (`CGEventSourceSecondsSinceLastEventType < 60s`); drops to 5min when idle. Long-idle reset wipes buffer at 30min idle.
+
+**Kill switch**: `OBSERVER_ENABLED=false` env var disables polling AND injection.
+
+**Audit events** (4 new): `observation_tick` (per poll, info), `observation_tick_slow` (poll > 150ms, warning), `observation_summary_injected` (gated inject fired, info, inherits cid), `observer_buffer_inspected` (debug-gated PWA read, info).
+
+**Forward-compat API for Steps 6 + 7**: `get_global_buffer()` exposes the live ring buffer (Step 6 Triggers reads `.snapshot()` for trigger evaluation); `persist_for_shift_report()` writes a summary to `~/.codec/observation_summaries/<ts>.md` (the only persistent observer output, called by Step 7 shift-report assembly).
+
+Implementation: `codec_observer.py` (RingBuffer + poll + injection helper + run_daemon), wired into `codec_dashboard.py:chat_completion` and `codec_voice.py:generate_response`. Debug PWA endpoint at `GET /api/observer/buffer?debug=1` returns a metadata-only summary (raw entries are never exposed, even to authed callers; emits `observer_buffer_inspected` per call).
+
+### Other known gaps (tracked for Phase 2 follow-on)
 - No formal teammate / sub-agent recursion — Crew is the only multi-agent primitive
-- Self-improve agent doesn't yet emit memory facts on Phase 1 events (Step 4 work)
+- Step 6 (Triggers) and Step 7 (Shift Report Crew) — Phase 2 Steps still pending
 
 ## 4. Skill system
 
@@ -230,6 +250,18 @@ Six new event names exported from `codec_audit.py` as module constants. All `out
 
 The constants are also exposed as frozensets for analyzer / introspection: `ASKUSER_EVENTS`, `STUCK_EVENTS`, `STEP3_EVENTS`. `audit_report.py` ingests them as additive event types — no schema bump.
 
+### Phase 2 Step 5 audit events (continuous observation)
+Four new event names exported from `codec_audit.py` for the Continuous Observation Loop. `correlation_id` handling follows §1.4: the inject event reuses the wrapping chat/voice op's cid, while the tick events generate per-poll cids.
+
+| Event | Source | level | extra fields |
+|---|---|---|---|
+| `observation_tick` | `codec-observer` | info | METADATA-ONLY: `active_app`, `active_title_len`, `ocr_chars`, `ocr_skipped`, `clipboard_changed`, `clipboard_kind`, `recent_files_count`, `idle_seconds`, `cadence_used_s`, `buffer_depth`, `poll_duration_ms` |
+| `observation_tick_slow` | `codec-observer` | warning | Same as `observation_tick` — emitted instead when `poll_duration_ms > poll_slow_threshold_ms` (default 150ms). Q5.5 flag for visibility, no behavior change. |
+| `observation_summary_injected` | `codec-observer` | info | `tokens_used`, `injection_reason` (`always_local`\|`possessive_match`\|`continuation_match`\|`skill_flag`), `buffer_entries_summarized`. `transport` is top-level (reserved). |
+| `observer_buffer_inspected` | `codec-dashboard` | info | `client_ip`, `buffer_entries_returned`. Q5.6 PWA `?debug=1` audit. |
+
+`PHASE2_STEP5_EVENTS` frozenset exposed for analyzer breakdown. `observation_tick` is METADATA-ONLY by design — no titles, no OCR text, no clipboard content, no file paths leak to `~/.codec/audit.log`.
+
 ### Notifications (`~/.codec/notifications.json`)
 Four sources can produce notifications: scheduler (crew completion), heartbeat (threshold alert), autopilot (ambient trigger), and Phase 1 Step 3's AskUserQuestion (`type="question"`). All write through `routes/_shared.py:51-127` except AskUserQuestion which writes via `codec_ask_user._write_question_notification`.
 
@@ -368,6 +400,9 @@ These zones break running infrastructure if changed without coordination. NEVER
 - `~/.codec/voice_session.json` (Phase 1 Step 3) — voice-session active-marker; `VoicePipeline.run` owns its lifecycle.
 - Phase 1 Step 3 feature-flag env vars — `ASKUSER_ENABLED`, `STUCK_DETECTION_ENABLED`, `STEP_BUDGET_ENABLED` (default true). Set to `false` to disable a feature in production; tests use these to bypass during isolated unit testing. Don't toggle them globally without coordinating — they alter agent behavior across all paths (chat / voice / crew / MCP).
 - `~/.codec/config.json:ask_user.{timeout_seconds, consent_strict_max_attempts}` and `:stuck.{window, repeat_threshold, escalation_action}` and `:step_budget.{chat, voice}` — Phase 1 Step 3 tunables. Bumping `step_budget.chat` to 8 or 10 is the documented "tune up before tuning out" pressure-relief valve, but don't touch the others without referencing the design doc rationale (§1.2 Q1, §1.7, §2.3, §3.2).
+- `~/.codec/observation_summaries/` (Phase 2 Step 5) — populated only by `codec_observer.persist_for_shift_report()`. Do not add files manually; the Step 7 shift-report assembly relies on the time-stamped naming convention. Safe to delete the whole directory if you want to wipe the persisted history.
+- `OBSERVER_ENABLED` env var (Phase 2 Step 5, default `true`). Setting it to `false` disables both the polling loop AND the prompt injection. There is no separate injection kill switch: while the observer is enabled the buffer is always populated, and only injection is gated.
+- `~/.codec/config.json:observer.{...}` — Phase 2 Step 5 tunables (cadence_active_s, cadence_idle_s, idle_threshold_s, buffer_depth_min, ocr_timeout_ms, ocr_retry_timeout_ms, reset_on_long_idle, reset_idle_threshold_s, summary_max_tokens, poll_slow_threshold_ms, stop_nouns). Don't tune the cadences below 30s without considering OCR cost.
 
 ## 11. Working with this repo as a coding agent
 

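The injection contract described in the AGENTS.md hunk above can be sketched as a small decision function. This is a minimal illustration, not the code in `codec_observer.py`: the function name `injection_reason`, the regexes, the contents of `STOP_NOUNS`, the order in which the three gates are checked, and the handling of unknown transports are all assumptions. Only the transport names, the `SKILL_NEEDS_OBSERVATION` flag, and the reason strings come from the diff.

```python
import re

# Hypothetical stop-noun list; the diff only says one exists (observer.stop_nouns).
STOP_NOUNS = frozenset({"name", "question", "friend", "time", "way"})

# Assumed patterns for the two text gates named in AGENTS.md.
CONTINUATION_RE = re.compile(r"\b(?:continue|where was i)\b")
POSSESSIVE_RE = re.compile(r"\b(?:my|this)\s+([a-z][a-z0-9_-]*)")


def injection_reason(user_prompt, transport, skill_module=None):
    """Return the injection reason string, or None when nothing is injected."""
    if transport == "local":
        return "always_local"          # local Qwen: always inject
    if transport == "mcp":
        return None                    # MCP clients bring their own context
    if transport not in {"chat", "voice", "http"}:
        return None                    # assumption: unknown transports never inject

    # Gate 1: the resolved skill module opts in explicitly.
    if getattr(skill_module, "SKILL_NEEDS_OBSERVATION", False):
        return "skill_flag"

    text = user_prompt.lower()

    # Gate 2: continuation language such as "continue" or "where was I".
    if CONTINUATION_RE.search(text):
        return "continuation_match"

    # Gate 3: "my X" / "this Y" where the noun is not in the stop list.
    for match in POSSESSIVE_RE.finditer(text):
        if match.group(1) not in STOP_NOUNS:
            return "possessive_match"
    return None
```

Returning the reason rather than a boolean lets the caller pass it straight into the `injection_reason` extra field of `observation_summary_injected`.
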
diff --git a/codec_audit.py b/codec_audit.py
@@ -153,6 +153,55 @@
 )
 
 
+# ── Phase 2 Step 5 event names (Continuous Observation Loop) ──────────────────
+# Per docs/PHASE2-STEP5-DESIGN.md §3. `observation_tick` is `level="info"`
+# (operational signal, fires once per poll cycle). `observation_summary_injected`
+# is `level="info"` and inherits the wrapping chat/voice operation's
+# correlation_id (per Step 1 §1.4 — this emit is part of that op, not new).
+# `observation_tick_slow` (Q5.5) is `level="warning"` to flag poll-overrun
+# without changing behavior. `observer_buffer_inspected` (Q5.6) audits any
+# debug-gated read of the live buffer state via the PWA endpoint.
+OBSERVATION_TICK              = "observation_tick"
+OBSERVATION_TICK_SLOW         = "observation_tick_slow"        # Q5.5
+OBSERVATION_SUMMARY_INJECTED  = "observation_summary_injected"
+OBSERVER_BUFFER_INSPECTED     = "observer_buffer_inspected"    # Q5.6
+
+PHASE2_STEP5_EVENTS = frozenset({
+    OBSERVATION_TICK, OBSERVATION_TICK_SLOW,
+    OBSERVATION_SUMMARY_INJECTED, OBSERVER_BUFFER_INSPECTED,
+})
+
+# Step 5 event-specific extra-field reservations.
+# observation_tick / observation_tick_slow are METADATA-ONLY by design —
+# no titles, no OCR text, no clipboard content, no file paths.
+# (See design §3 "What we deliberately do NOT emit".)
+OBSERVATION_TICK_EXTRA_FIELDS = (
+    "active_app",              # str — e.g. "Google Chrome"
+    "active_title_len",        # int — length only
+    "ocr_chars",               # int — length of OCR result
+    "ocr_skipped",             # bool — true if OCR timed out
+    "clipboard_changed",       # bool
+    "clipboard_kind",          # "url" | "text" | "code" | "json" | "image_blob_redacted"
+    "recent_files_count",      # int
+    "idle_seconds",            # float — at time of poll
+    "cadence_used_s",          # int — 60 or 300, selected per Q1
+    "buffer_depth",            # int — current ring buffer length
+    "poll_duration_ms",        # float — for OBSERVATION_TICK_SLOW threshold
+)
+
+OBSERVATION_INJECTION_EXTRA_FIELDS = (
+    "tokens_used",             # int
+    "injection_reason",        # "always_local" | "possessive_match" |
+                               # "continuation_match" | "skill_flag"
+    "buffer_entries_summarized",  # int
+)
+
+OBSERVER_BUFFER_INSPECT_EXTRA_FIELDS = (
+    "client_ip",               # str — who hit the debug endpoint
+    "buffer_entries_returned", # int
+)
+
+
 # ── Helpers ────────────────────────────────────────────────────────────────────
 def _truncate(s, max_len: int = _PREVIEW_MAX) -> str:
     """Truncate a string to `max_len` chars. None/non-str → ''. Never raises."""

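To make the METADATA-ONLY reservation above concrete, here is a hedged sketch of how a raw poll entry might be reduced to the reserved `observation_tick` fields before it reaches the audit log. The shape of the raw entry (`app`, `title`, `ocr_text`, `recent_files`, ...), the helper names `tick_extra` and `tick_event_name`, and the convention that `ocr_text is None` means OCR timed out are assumptions for illustration. The field names and the 150ms default threshold are taken from the diff.

```python
# Field names copied from OBSERVATION_TICK_EXTRA_FIELDS in the hunk above.
TICK_FIELDS = (
    "active_app", "active_title_len", "ocr_chars", "ocr_skipped",
    "clipboard_changed", "clipboard_kind", "recent_files_count",
    "idle_seconds", "cadence_used_s", "buffer_depth", "poll_duration_ms",
)


def tick_extra(raw, buffer_depth, poll_duration_ms, cadence_used_s):
    """Reduce a raw poll entry (hypothetical shape) to audit-safe metadata."""
    ocr_text = raw.get("ocr_text")  # assumption: None means OCR timed out
    extra = {
        "active_app": raw.get("app", ""),
        "active_title_len": len(raw.get("title", "")),   # length only
        "ocr_chars": len(ocr_text or ""),                 # length only
        "ocr_skipped": ocr_text is None,
        "clipboard_changed": bool(raw.get("clipboard_changed", False)),
        "clipboard_kind": raw.get("clipboard_kind", "text"),
        "recent_files_count": len(raw.get("recent_files", [])),  # count, no paths
        "idle_seconds": float(raw.get("idle_seconds", 0.0)),
        "cadence_used_s": cadence_used_s,
        "buffer_depth": buffer_depth,
        "poll_duration_ms": poll_duration_ms,
    }
    # Guard against drift between this builder and the reserved field list.
    if set(extra) != set(TICK_FIELDS):
        raise ValueError("tick extra fields out of sync with reservation")
    return extra


def tick_event_name(poll_duration_ms, slow_threshold_ms=150.0):
    """Pick the event name: the slow variant replaces the normal tick."""
    if poll_duration_ms > slow_threshold_ms:
        return "observation_tick_slow"
    return "observation_tick"
```

Building the payload from an explicit allow-list, rather than deleting sensitive keys from the raw entry, means a newly added raw field cannot leak into `~/.codec/audit.log` by accident.
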
diff --git a/codec_dashboard.py b/codec_dashboard.py
@@ -389,6 +389,50 @@ async def status():
     }
 
 
+# Phase 2 Step 5 §Q5.6 — debug-gated buffer-inspect endpoint.
+# Anyone with PWA auth can call this with `?debug=1`. Every call emits
+# an `observer_buffer_inspected` audit event so privileged reads are
+# observable in the audit log. NOT linked from the main UI.
+@app.get("/api/observer/buffer")
+async def observer_buffer(request: Request, debug: int = 0):
+    """Return a metadata-only view of the ring buffer (depth, redacted
+    summary, oldest/newest timestamps). Q5.6 design: debug-only, auth-gated
+    (via the dashboard's existing /api/* auth middleware), audit-emitting."""
+    if int(debug) != 1:
+        return {"error": "set ?debug=1 to read live observer buffer"}
+    try:
+        from codec_observer import get_global_buffer
+        from codec_audit import OBSERVER_BUFFER_INSPECTED, log_event as _le
+        buf = get_global_buffer()
+        snap = buf.snapshot()
+        try:
+            client_ip = request.client.host if request.client else "unknown"
+        except Exception:
+            client_ip = "unknown"
+        try:
+            _le(
+                OBSERVER_BUFFER_INSPECTED, "codec-dashboard",
+                "observer buffer inspected via /api/observer/buffer",
+                extra={
+                    "client_ip": client_ip,
+                    "buffer_entries_returned": len(snap),
+                },
+                outcome="ok", level="info",
+            )
+        except Exception:
+            pass
+        # Return only the metadata + a redacted summary, NOT the raw entries
+        # (raw entries contain titles + OCR text + clipboard content).
+        return {
+            "buffer_depth": len(snap),
+            "summary": buf.render_summary(),
+            "oldest_ts": snap[0].get("ts") if snap else None,
+            "newest_ts": snap[-1].get("ts") if snap else None,
+        }
+    except Exception as e:
+        return {"error": f"observer not available: {e}"}
+
+
 def _mask_sensitive(value: str) -> str:
     """Mask sensitive field values, showing only last 4 characters."""
     if not value or not isinstance(value, str):
@@ -2557,6 +2601,26 @@ async def _skill_stream():
                 "DO NOT emit [SKILL:...] tool-calling tags in this response — "
                 "the answer IS the rewritten text, no tools needed."
             )
+        # Phase 2 Step 5: observer summary injection (gated per §X).
+        # Local Qwen always injects. This chat path is local by default but
+        # may be cloud-routed by the user, so pass the detected transport
+        # tag; cloud transports gate on possessive / continuation /
+        # skill-flag patterns. The helper returns (summary_or_None, reason)
+        # and emits the audit event itself ONLY when the summary is non-None.
+        try:
+            from codec_observer import maybe_inject_observation_summary
+            _obs_transport = "local" if "localhost" in (config.get("llm_base_url") or "") else "chat"
+            _obs_summary, _obs_reason = maybe_inject_observation_summary(
+                user_prompt=last_user_text or "",
+                transport=_obs_transport,
+                skill_name=None,           # post-LLM tag path, no skill resolved yet
+                skill_module=None,
+            )
+            if _obs_summary:
+                sys_prompt += f"\n\n{_obs_summary}"
+        except Exception as _e:
+            log.debug(f"[observer] injection failed (non-fatal): {_e}")
+
         # Prepend system message (or replace existing one)
         if messages and messages[0].get("role") == "system":
             messages[0]["content"] = sys_prompt + "\n\n" + messages[0]["content"]