fix(tests): hotfix v2 — skip Terminal-popup skills (memory_search/clipboard/self_improve)#7

Merged
AVADSA25 merged 2 commits into main from
hotfix/incident-spurious-skill-fires
May 1, 2026
Conversation

@AVADSA25 AVADSA25 commented May 1, 2026

Summary

Second incident escalation. At 15:35 CEST the user reported (with screenshot) that 3 Terminal windows had auto-opened showing codec_clipboard_*.txt and codec_memory_*.txt files. Investigation traced the cascade to my own pytest runs during the Step 3 rebase + audit work earlier this session: each full suite run queued a Terminal popup via memory_search.run("test") → osascript "tell Terminal to do script ...".

Root cause

tests/test_mcp_all_tools.py::CANONICAL_PROMPTS had:

  • "memory_search": "test" → mod.run("test") → writes /tmp/codec_memory_<hash>.txt and opens it in a fresh Terminal window via osascript

Same pattern (Terminal popup via subprocess.Popen(["osascript", ...])) in:

  • clipboard.py: writes /tmp/codec_clipboard_<hash>.txt and opens it in a Terminal window the same way
  • self_improve.py — writes Qwen-drafted .md proposals to ~/.codec/skill_proposals/, slow + audit noise

I ran the full pytest suite ~5× during the Step 3 rebase + pre-merge audit work earlier in this conversation. Each run = 1 queued Terminal window. macOS delivered them slowly — hence the user's perception that windows "kept appearing out of nowhere" several minutes after my last test run.
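The skip mechanism this PR relies on can be sketched roughly as follows. Only the `"memory_search": "test"` entry and the three skipped names come from this PR; the other prompts, the `calculator` skill, and the helper name `skills_to_fire` are assumptions for illustration, not the actual contents of tests/test_mcp_all_tools.py.

```python
# Hypothetical stand-ins: the real CANONICAL_PROMPTS and SKIP_SKILLS live in
# tests/test_mcp_all_tools.py and cover many more skills.
CANONICAL_PROMPTS = {
    "memory_search": "test",            # from this PR
    "clipboard": "show history",        # assumed prompt
    "self_improve": "run once",         # assumed prompt
    "calculator": "2 + 2",              # assumed side-effect-free skill
}

# The three skills this PR adds to the skip set.
SKIP_SKILLS = {"memory_search", "clipboard", "self_improve"}


def skills_to_fire(prompts, skip):
    """Return (name, prompt) pairs for skills the suite may actually run.

    Anything in `skip` is filtered out before mod.run(prompt) is ever
    called, so its side effects (Terminal popups, temp files) cannot occur.
    """
    return [(name, prompt) for name, prompt in sorted(prompts.items())
            if name not in skip]


for name, prompt in skills_to_fire(CANONICAL_PROMPTS, SKIP_SKILLS):
    print(f"would run {name!r} with prompt {prompt!r}")
```

Filtering before the call, rather than mocking inside each skill, keeps the guarantee independent of how a skill produces its side effect.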

Verification

grep over the 24 skills that WILL still fire after this commit confirms NONE of them:

  • Open Terminal windows (Terminal.*do script)
  • Write /tmp/codec_*.txt files
  • Open browser tabs (webbrowser)
  • Spawn windows via subprocess.Popen.*open

Distribution after this PR: 39 skipped (was 36), 24 fired (was 27).
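A scripted equivalent of the grep check above might look like the sketch below. The regexes mirror the four patterns listed in this PR; the function name `audit_skills`, the one-file-per-skill layout (`<skills_dir>/<name>.py`), and the `/tmp/codec_` prefix match are assumptions.

```python
import re
from pathlib import Path

# Patterns taken from the verification list in this PR description.
RISKY_PATTERNS = {
    "terminal_popup": re.compile(r"Terminal.*do script"),
    "tmp_codec_file": re.compile(r"/tmp/codec_"),
    "browser_tab": re.compile(r"webbrowser"),
    "popen_open": re.compile(r"subprocess\.Popen.*open"),
}


def audit_skills(skills_dir, skill_names):
    """Map each skill name to the sorted list of risky patterns in its source.

    Assumes one file per skill at <skills_dir>/<name>.py (hypothetical
    layout). Skills with no matching file or no hits are omitted.
    """
    findings = {}
    for name in skill_names:
        path = Path(skills_dir) / f"{name}.py"
        if not path.exists():
            continue
        source = path.read_text(encoding="utf-8", errors="replace")
        hits = sorted(label for label, pattern in RISKY_PATTERNS.items()
                      if pattern.search(source))
        if hits:
            findings[name] = hits
    return findings
```

An empty result for the 24 still-firing skills would correspond to the "grep clean" claim. Like grep, this is a textual check and would miss a popup built from dynamically assembled strings.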

Cleanup performed

  • find /tmp -name "codec_*.txt" -delete (12 leftover files removed)
  • Closed any Terminal windows still showing those files via osascript
  • No PM2 restart, no _HTTP_BLOCKED change

Out of scope

Hotfix layering for the same incident

| PR | Skills added to SKIP_SKILLS | Reason |
| --- | --- | --- |
| #6 (merged fcbef2f) | reminders, notes, tts_say, generate_qr_code, qr_generator | Apple state writes |
| Step 3 PR #5 | ask_user, stuck | Interactive blockers (threading.Event hangs) |
| THIS PR | memory_search, clipboard, self_improve | Terminal-popup-spawning |

🤖 Generated with Claude Code

Mikarina13 added 2 commits May 1, 2026 15:25
…prevention plan

User came back at 13:21 UTC (15:21 CEST) reporting CODEC was STILL firing
every 5min — 5 windows / 5 same Notes. Investigation found a SECOND leak
source distinct from the reminders one in the original incident doc.

Root cause #2: Step 3 AskUserQuestion test-fixture leak. When my test
runs (test_ask_user.py, test_destructive_consent.py) executed today
between 12:22 and 13:22 UTC, the `temp_askuser_paths` monkeypatch fixture
did not stick in some test orderings (likely module-cache reentry on the
full suite with worktree-aware path resolution). Result: 11 entries
written to ~/.codec/pending_questions.json (7 pending + 4 timed_out)
AND 11 type="question" notifications in ~/.codec/notifications.json.
Dashboard PWA polls these every 8s and renders an inline answer panel
for each → user saw "5 same windows" and "TestAgent is asking a question".

Root cause #3: same window saw 24 skill_proposal_staged emits because
test_mcp_all_tools.py iterates EVERY MCP-exposed skill including
self_improve. self_improve's run_once() calls Qwen and writes a
.md proposal per gap — explains the cascade.

Cleanup performed at 13:21 UTC (already done, documented here):
- pending_questions.json: 11 → 0 (backup preserved)
- notifications.json: 11 type=question removed (179 → 168, backup preserved)
- Quit auto-opened Notes / Reminders / TextEdit
- Killed NotificationCenter to clear stuck banners
- Updated user's runtime ~/.codec/skills/reminders.py to FIXED version
  (read-mode for "list reminders" — prevents future leaks creating real
  Apple Reminders if any test or LLM ever calls reminders again)

Permanent prevention plan added (6 items):
1. THIS hotfix already blocks reminders/notes/tts_say/qr in tests ✅
2. Tighten Step 3 fixture monkeypatch BEFORE merge
3. Add self_improve to SKIP_SKILLS
4. Stop using Apple Reminders for monitoring checkpoints (decide format
   after Step 3 lands, per user's existing instruction)
5. Optional CI/pre-commit gate: fail if any test writes to ~/.codec/*
6. Document test-isolation contract in AGENTS.md §10

What was NOT done (per user contract):
- No PM2 restart
- No killing Claude.app's codec_mcp.py children
- _HTTP_BLOCKED untouched
- Backups preserved for forensic record
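
Prevention items 2 and 5 both come down to keeping test writes out of the real ~/.codec. One way to make a path override stick regardless of import order is to resolve paths at call time rather than caching them at module import. The sketch below is an assumption-laden illustration, not the Step 3 fixture: `pending_questions_path` and `isolated_home` are hypothetical names, and it relies on the POSIX `HOME` variable.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path


def pending_questions_path():
    """Resolve the path on every call (hypothetical helper).

    A module-level constant computed at import time would keep pointing at
    the real home directory even after a test redirects HOME.
    """
    home = os.environ.get("HOME", str(Path.home()))
    return Path(home) / ".codec" / "pending_questions.json"


@contextmanager
def isolated_home():
    """Point HOME at a throwaway directory for the duration of a test."""
    previous = os.environ.get("HOME")
    with tempfile.TemporaryDirectory() as tmp:
        os.environ["HOME"] = tmp
        try:
            yield Path(tmp)
        finally:
            # Restore the caller's environment exactly as it was.
            if previous is None:
                os.environ.pop("HOME", None)
            else:
                os.environ["HOME"] = previous


with isolated_home() as home:
    # Inside the block the resolved path lives under the temp directory.
    print(pending_questions_path().is_relative_to(home))
```

Whether the actual leak was caused by import-time caching is only the likely explanation given above, so treat this as one candidate hardening rather than a confirmed fix.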
…clipboard, self_improve)

Second incident escalation at 15:35 CEST. User came back showing screenshot
of THREE auto-opening Terminal windows displaying:
  - codec_clipboard_aek3vj4c.txt   (CODEC CLIPBOARD HISTORY ...)
  - codec_clipboard_j5dehi5i.txt   (same content, second copy)
  - codec_memory_36ypw9hl.txt      (CODEC MEMORY SEARCH: 'test' ...)

Investigation:
- /tmp/codec_memory_*.txt files: 10 of them, mtimes 14:01 → 15:27 CEST
  matching every full pytest suite I ran during this conversation
- Each file is a memory_search results dump, opened via:
    subprocess.Popen(["osascript","-e",
      'tell app "Terminal" to do script "cat <tmp> && ... && read && rm"'])
- CANONICAL_PROMPTS["memory_search"] = "test" → mod.run("test") on every
  test suite invocation → opens 1 fresh Terminal window per run
- Same pattern in skills/clipboard.py (codec_clipboard_*.txt)
- self_improve writes Qwen-drafted .md proposals to ~/.codec/skill_proposals/
  on every test run — slow, expensive, audit-log noise

I ran the full pytest suite ~5 times today doing the Step 3 rebase + the
pre-merge audit. Each run queued 1 memory_search Terminal popup. macOS
delivered them slowly, hence the user perceiving them "popping up out of
nowhere" several minutes AFTER my last test run.

This commit:
- Adds memory_search, clipboard, self_improve to SKIP_SKILLS
- Result: 36 → 39 skipped, 27 → 24 fired
- Verified: NONE of the 24 remaining-fired skills open Terminal windows,
  write temp files via osascript, or open browser tabs (grep clean)

Cleanup performed at 15:36 CEST:
- find /tmp -name "codec_*.txt" -delete   (12 files removed)
- closed any Terminal windows still showing those files
- (no PM2 restart, no _HTTP_BLOCKED change)

This is the third hotfix layer for the same incident:
  PR #6 (merged): reminders/notes/tts_say/qr_generator/generate_qr_code
  Step 3 PR #5:   ask_user/stuck (interactive blockers)
  THIS:           memory_search/clipboard/self_improve (Terminal popups)

Permanent fix already documented in
docs/INCIDENT-2026-05-01-spurious-skill-fires.md prevention plan item #5:
add a CI/pre-commit gate that fails if any test writes to ~/.codec/* OR
spawns a `Terminal "do script"` subprocess.
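
The subprocess half of that gate could be sketched as below, using only the standard library. The names `forbid_terminal_popups` and `ForbiddenSpawn` are hypothetical, and matching on the substrings "osascript" and "do script" is an assumption about what the gate should treat as a Terminal popup. It does not cover the ~/.codec/* write check.

```python
import subprocess
from contextlib import contextmanager
from unittest import mock


class ForbiddenSpawn(AssertionError):
    """Raised when code under test tries to open a Terminal window."""


def _as_text(args):
    # Popen accepts a string, bytes, or a sequence of arguments.
    if isinstance(args, bytes):
        return args.decode(errors="replace")
    if isinstance(args, str):
        return args
    return " ".join(str(a) for a in args)


@contextmanager
def forbid_terminal_popups():
    """Fail fast if subprocess.Popen is asked to run an osascript popup."""
    real_popen = subprocess.Popen

    def guarded(args, *rest, **kwargs):
        command = _as_text(args)
        if "osascript" in command and "do script" in command:
            # Raise before anything is spawned.
            raise ForbiddenSpawn(f"test tried to open Terminal: {command!r}")
        return real_popen(args, *rest, **kwargs)

    with mock.patch.object(subprocess, "Popen", guarded):
        yield
```

Wrapped around a test run (for example from an autouse fixture), this would have turned each memory_search popup into an immediate test failure. Code that did `from subprocess import Popen` before the patch is applied would bypass it, so a real gate may need a stricter hook.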
@AVADSA25 AVADSA25 merged commit 91c2d92 into main May 1, 2026
1 check passed