fix(tests): hotfix v2 — skip Terminal-popup skills (memory_search/clipboard/self_improve) by AVADSA25 · Pull Request #7 · AVADSA25/codec

AVADSA25 · 2026-05-01T13:39:14Z

Summary

Second incident escalation. At 15:35 CEST the user reported (with screenshot) that 3 Terminal windows had auto-opened showing codec_clipboard_*.txt and codec_memory_*.txt files. Investigation traced the cascade to my own pytest runs during the Step 3 rebase + audit work earlier this session: each full suite run queued a Terminal popup via memory_search.run("test") → osascript "tell Terminal to do script ...".

Root cause

tests/test_mcp_all_tools.py::CANONICAL_PROMPTS had:

"memory_search": "test" → mod.run("test") → writes /tmp/codec_memory_<hash>.txt and opens it in a fresh Terminal window via osascript

Same pattern (Terminal popup via subprocess.Popen(["osascript", ...])) in:

clipboard.py: writes /tmp/codec_clipboard_<hash>.txt and opens it in a Terminal window
self_improve.py: writes Qwen-drafted .md proposals to ~/.codec/skill_proposals/ (slow, plus audit-log noise)

I ran the full pytest suite ~5× during the Step 3 rebase + pre-merge audit work earlier in this conversation. Each run = 1 queued Terminal window. macOS delivered them slowly — hence the user's perception that windows "kept appearing out of nowhere" several minutes after my last test run.
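As a rough illustration of how a test run could refuse this spawn path outright, the sketch below patches `subprocess.Popen` so that any command whose executable is `osascript` raises instead of launching. The names `PopupBlocked`, `is_osascript`, and `block_osascript` are hypothetical and not part of the codec repo; the sketch also assumes the skill looks up `subprocess.Popen` at call time (a skill that did `from subprocess import Popen` at import would bypass it).

```python
import os
import subprocess
from contextlib import contextmanager
from unittest import mock


class PopupBlocked(RuntimeError):
    """Raised when code under test tries to launch osascript (hypothetical)."""


def is_osascript(args):
    """True if the command's executable is osascript (list or shell string)."""
    if isinstance(args, bytes):
        args = args.decode(errors="replace")
    if isinstance(args, os.PathLike):
        args = os.fspath(args)
    if isinstance(args, str):
        parts = args.split()
    else:
        parts = [str(a) for a in args]
    return bool(parts) and parts[0].rsplit("/", 1)[-1] == "osascript"


@contextmanager
def block_osascript():
    """Temporarily replace subprocess.Popen with a guard that rejects osascript."""
    real_popen = subprocess.Popen

    def guarded(*a, **kw):
        # The command is either the first positional arg or the args= keyword.
        cmd = a[0] if a else kw.get("args", ())
        if is_osascript(cmd):
            raise PopupBlocked(f"blocked GUI spawn: {cmd!r}")
        return real_popen(*a, **kw)

    with mock.patch.object(subprocess, "Popen", guarded):
        yield
```

Wrapping the skill-iteration test in `with block_osascript():` would turn a silent Terminal popup into an immediate test failure. `subprocess.run` is covered as well, because it resolves `Popen` from the `subprocess` module at call time.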

Verification

grep over the 24 skills that WILL still fire after this commit confirms NONE of them:

Open Terminal windows (Terminal.*do script)
Write /tmp/codec_*.txt files
Open browser tabs (webbrowser)
Spawn windows via subprocess.Popen.*open

Distribution after this PR: 39 skipped (was 36), 24 fired (was 27).
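The grep check above could be made repeatable with a small scanner along these lines. This is a sketch only: the regexes approximate the four patterns listed, and the function names plus the assumption that each skill is a single `<name>.py` file in one directory are illustrative.

```python
import re
from pathlib import Path

# Approximations of the grep patterns listed above (assumed, not the exact commands run).
RISKY_PATTERNS = {
    "terminal_popup": re.compile(r"Terminal.*do script"),
    "tmp_codec_file": re.compile(r"/tmp/codec_"),
    "browser_tab": re.compile(r"\bwebbrowser\b"),
    "popen_open": re.compile(r"subprocess\.Popen.*open"),
}


def scan_skill_source(source):
    """Return the sorted names of risky patterns found in one skill's source text."""
    hits = set()
    for line in source.splitlines():
        for name, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                hits.add(name)
    return sorted(hits)


def scan_skills(skills_dir, skip):
    """Scan every *.py skill not in `skip`; map skill name -> risky pattern names."""
    report = {}
    for path in sorted(Path(skills_dir).glob("*.py")):
        if path.stem in skip:
            continue
        hits = scan_skill_source(path.read_text(encoding="utf-8", errors="replace"))
        if hits:
            report[path.stem] = hits
    return report
```

An empty dict from `scan_skills(skills_dir, SKIP_SKILLS)` corresponds to the "grep clean" result; a line-based text match like this can miss commands assembled across several lines, so it is a screen rather than a proof.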

Cleanup performed

find /tmp -name "codec_*.txt" -delete (12 leftover files removed)
Closed any Terminal windows still showing those files via osascript
No PM2 restart, no _HTTP_BLOCKED change

Out of scope

Step 3 PR #5, feat(askuser): pause-and-ask + stuck detection + step budget (Phase 1 Step 3), is untouched (will rebase onto this once merged)
The same incident doc (docs/INCIDENT-2026-05-01-spurious-skill-fires.md) covers the analysis; permanent prevention plan item #5 (CI/pre-commit gate that fails if any test writes to ~/.codec/* OR spawns Terminal popups) becomes higher priority after this third hotfix layer.
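One possible shape for the "fails if any test writes to ~/.codec/*" half of that gate is a before/after snapshot of the directory. The sketch below is an assumption about how such a gate might work, not the plan's actual design; `snapshot` and `changed_paths` are hypothetical helpers.

```python
import os
from pathlib import Path


def snapshot(root):
    """Map each file under root (relative path) to (size, mtime_ns)."""
    root = Path(root)
    state = {}
    if not root.exists():
        return state
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = Path(dirpath) / name
            try:
                st = path.stat()
            except FileNotFoundError:
                continue  # file vanished between listing and stat
            state[str(path.relative_to(root))] = (st.st_size, st.st_mtime_ns)
    return state


def changed_paths(before, after):
    """Return sorted relative paths that were created, deleted, or modified."""
    return sorted(
        key for key in before.keys() | after.keys()
        if before.get(key) != after.get(key)
    )
```

A CI step could take `snapshot(Path.home() / ".codec")` before and after the suite and fail when `changed_paths` is non-empty. This only catches file writes; the Terminal-popup half of the gate would need a separate process-level check.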

Hotfix layering for the same incident

| PR | Skills added to SKIP_SKILLS | Reason |
|---|---|---|
| #6 (merged `fcbef2f`) | `reminders, notes, tts_say, generate_qr_code, qr_generator` | Apple state writes |
| Step 3 PR #5 | `ask_user, stuck` | Interactive blockers (threading.Event hangs) |
| THIS PR | `memory_search, clipboard, self_improve` | Terminal-popup-spawning |
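To make the layering concrete, here is a minimal sketch of how a skip set partitions the skill list into fired and skipped. `HOTFIX_SKIPS` holds only the ten names from the three layers above, not the full 39-entry SKIP_SKILLS in tests/test_mcp_all_tools.py, and `partition` is a hypothetical helper rather than the test's real logic.

```python
# Only the entries named in the three hotfix layers (the real set is larger).
HOTFIX_SKIPS = {
    # PR #6: Apple state writes
    "reminders", "notes", "tts_say", "generate_qr_code", "qr_generator",
    # Step 3 PR #5: interactive blockers
    "ask_user", "stuck",
    # This PR: Terminal-popup-spawning
    "memory_search", "clipboard", "self_improve",
}


def partition(skill_names, skip=HOTFIX_SKIPS):
    """Split skill names into (fired, skipped), each sorted for stable output."""
    fired = sorted(name for name in skill_names if name not in skip)
    skipped = sorted(name for name in skill_names if name in skip)
    return fired, skipped
```

The reported counts are consistent with this model: 36 skipped + 27 fired before and 39 skipped + 24 fired after both total 63 skills, so the three added names moved from fired to skipped.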

🤖 Generated with Claude Code

…prevention plan

User came back at 13:21 UTC (15:21 CEST) reporting CODEC was STILL firing every 5min: 5 windows / 5 same Notes. Investigation found a SECOND leak source distinct from the reminders one in the original incident doc.

Root cause #2: Step 3 AskUserQuestion test-fixture leak. When my test runs (test_ask_user.py, test_destructive_consent.py) executed today between 12:22 and 13:22 UTC, the `temp_askuser_paths` monkeypatch fixture did not stick in some test orderings (likely module-cache reentry on the full suite with worktree-aware path resolution). Result: 11 entries written to ~/.codec/pending_questions.json (7 pending + 4 timed_out) AND 11 type="question" notifications in ~/.codec/notifications.json. Dashboard PWA polls these every 8s and renders an inline answer panel for each → user saw "5 same windows" and "TestAgent is asking a question".

Root cause #3: same window saw 24 skill_proposal_staged emits because test_mcp_all_tools.py iterates EVERY MCP-exposed skill including self_improve. self_improve's run_once() calls Qwen and writes a .md proposal per gap, which explains the cascade.

Cleanup performed at 13:21 UTC (already done, documented here):
- pending_questions.json: 11 → 0 (backup preserved)
- notifications.json: 11 type=question removed (179 → 168, backup preserved)
- Quit auto-opened Notes / Reminders / TextEdit
- Killed NotificationCenter to clear stuck banners
- Updated user's runtime ~/.codec/skills/reminders.py to FIXED version (read-mode for "list reminders"; prevents future leaks creating real Apple Reminders if any test or LLM ever calls reminders again)

Permanent prevention plan added (6 items):
1. THIS hotfix already blocks reminders/notes/tts_say/qr in tests ✅
2. Tighten Step 3 fixture monkeypatch BEFORE merge
3. Add self_improve to SKIP_SKILLS
4. Stop using Apple Reminders for monitoring checkpoints (decide format after Step 3 lands, per user's existing instruction)
5. Optional CI/pre-commit gate: fail if any test writes to ~/.codec/*
6. Document test-isolation contract in AGENTS.md §10

What was NOT done (per user contract):
- No PM2 restart
- No killing Claude.app's codec_mcp.py children
- _HTTP_BLOCKED untouched
- Backups preserved for forensic record
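For prevention item 2, one way a path patch can be made to "stick" regardless of test ordering is to resolve the state directory at call time instead of caching it in a module-level constant. The sketch below illustrates that idea only; the `CODEC_HOME` environment variable and all three function names are assumptions, not the repo's actual mechanism or the real `temp_askuser_paths` fixture.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path
from unittest import mock


def codec_home():
    """Resolve the state dir on every call (CODEC_HOME override is hypothetical)."""
    override = os.environ.get("CODEC_HOME")
    return Path(override) if override else Path.home() / ".codec"


def pending_questions_path():
    """Path is computed lazily, so a later env patch is always honoured."""
    return codec_home() / "pending_questions.json"


@contextmanager
def isolated_codec_home():
    """Point CODEC_HOME at a throwaway directory for the duration of a test."""
    with tempfile.TemporaryDirectory() as tmp:
        with mock.patch.dict(os.environ, {"CODEC_HOME": tmp}):
            yield Path(tmp)
```

Because nothing is captured at import time, a module that was imported before the patch was applied still writes into the temporary directory, which is the failure mode suspected above.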

…clipboard, self_improve)

Second incident escalation at 15:35 CEST. User came back showing screenshot of THREE auto-opening Terminal windows displaying:
- codec_clipboard_aek3vj4c.txt (CODEC CLIPBOARD HISTORY ...)
- codec_clipboard_j5dehi5i.txt (same content, second copy)
- codec_memory_36ypw9hl.txt (CODEC MEMORY SEARCH: 'test' ...)

Investigation:
- /tmp/codec_memory_*.txt files: 10 of them, mtimes 14:01 → 15:27 CEST matching every full pytest suite I ran during this conversation
- Each file is a memory_search results dump, opened via: subprocess.Popen(["osascript","-e", 'tell app "Terminal" to do script "cat <tmp> && ... && read && rm"'])
- CANONICAL_PROMPTS["memory_search"] = "test" → mod.run("test") on every test suite invocation → opens 1 fresh Terminal window per run
- Same pattern in skills/clipboard.py (codec_clipboard_*.txt)
- self_improve writes Qwen-drafted .md proposals to ~/.codec/skill_proposals/ on every test run: slow, expensive, audit-log noise

I ran the full pytest suite ~5 times today doing the Step 3 rebase + the pre-merge audit. Each run queued 1 memory_search Terminal popup. macOS delivered them slowly, hence the user perceiving them "popping up out of nowhere" several minutes AFTER my last test run.

This commit:
- Adds memory_search, clipboard, self_improve to SKIP_SKILLS
- Result: 36 → 39 skipped, 27 → 24 fired
- Verified: NONE of the 24 remaining-fired skills open Terminal windows, write temp files via osascript, or open browser tabs (grep clean)

Cleanup performed at 15:36 CEST:
- find /tmp -name "codec_*.txt" -delete (12 files removed)
- closed any Terminal windows still showing those files
- (no PM2 restart, no _HTTP_BLOCKED change)

This is the third hotfix layer for the same incident:
- PR #6 (merged): reminders/notes/tts_say/qr_generator/generate_qr_code
- Step 3 PR #5: ask_user/stuck (interactive blockers)
- THIS: memory_search/clipboard/self_improve (Terminal popups)

Permanent fix already documented in docs/INCIDENT-2026-05-01-spurious-skill-fires.md prevention plan item #5: add a CI/pre-commit gate that fails if any test writes to ~/.codec/* OR spawns a `Terminal "do script"` subprocess.

Mikarina13 added 2 commits May 1, 2026 15:25

AVADSA25 merged commit 91c2d92 into main May 1, 2026
1 check passed

AVADSA25 merged 2 commits into main from hotfix/incident-spurious-skill-fires
