feat: auto-loop + continuity contract (ScheduleWakeup integration) #1

Open
zelinewang wants to merge 2 commits into main from feat/auto-loop-continuity

@zelinewang
Owner

Summary

Integrates Claude Code's built-in ScheduleWakeup tool (the /loop dynamic-mode primitive) into the /dev workflow so a single /dev invocation can run end-to-end — including waiting on CI, long builds, agent teams, or deploy health — without requiring the human to manually say "continue" between phases.

This PR bundles three logical change groups accumulated over ~5 weeks of daily use:

1. Month of local iteration (~72 net lines)

Small quality-of-life improvements that landed locally but never got pushed:

  • Language/framework detection table in P1 (TypeScript/Python/Go/Rust/Java/Kotlin/C++/Flutter indicator files → language-specific ECC reviewer)
  • Source Evaluation subsection in P1: mandatory credibility check on external info (web search / DeepWiki / LLM summaries) before a finding influences downstream decisions
  • Various phase-adaptation polish (proactive MCP use notes, database-reviewer on SQL tasks, architect agent for DEEP P3, etc.)
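
The indicator-file detection in the first bullet can be sketched roughly as follows. This is a hedged illustration, not the skill's actual logic (the skill is prompt/markdown): the function name `detect_languages` is hypothetical, and the rules shown are only the four that the later commit in this PR spells out (TypeScript, JavaScript, Kotlin, Java). The other languages are omitted because the PR does not name their indicator files. Treating `tsconfig.json` as taking priority over the JavaScript rule is an assumption.

```python
def detect_languages(files: set[str]) -> set[str]:
    """Map a set of repo-relative file paths to detected languages.

    Hypothetical sketch of the indicator-file rules named in this PR.
    """
    names = {path.rsplit("/", 1)[-1] for path in files}
    has_ts_source = any(path.endswith(".ts") for path in files)
    langs: set[str] = set()

    # TypeScript: tsconfig.json OR (.ts files AND package.json)
    if "tsconfig.json" in names or (has_ts_source and "package.json" in names):
        langs.add("typescript")
    # JavaScript: package.json without .ts files
    # (assumption: a tsconfig.json alone already routes to TypeScript)
    elif "package.json" in names:
        langs.add("javascript")

    # Kotlin: build.gradle.kts, checked by exact name so it cannot match Java
    if "build.gradle.kts" in names:
        langs.add("kotlin")
    # Java: pom.xml OR build.gradle (Groovy DSL, no .kts)
    if "pom.xml" in names or "build.gradle" in names:
        langs.add("java")

    return langs
```

Matching on exact file names is what keeps `build.gradle.kts` from being read as a Java project.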

2. LOOP INTEGRATION (~192 net lines) — new dedicated section

Authoritative reference for ScheduleWakeup usage inside /dev:

  • Core Contract: tool schema, sentinel distinction (<<autonomous-loop-dynamic>> vs CronCreate's <<autonomous-loop>>)
  • 4 Standard Patterns: CI Wait (P9 DEEP), Long Task (P7/P8), Agent Merge (P7 DEEP Agent Teams), Deploy Health (P9 CICD) — each with concrete signal source, delay, max iter, success/failure/timeout semantics, and reason-field template
  • Cost Budget table: documents the 300s prompt-cache TTL trap. Delays of 60–270s stay in the cache-hit regime; 300s is never used; 301–3600s is the cache-miss regime, reserved for idle waits
  • Exit Conditions Matrix (mandatory): every loop MUST declare success / failure / timeout signals before first ScheduleWakeup call
  • Anti-Patterns table (6 forbidden patterns): no exit conditions / 300s delay / no max_iter / cross-phase loop reuse / generic reason / retry without root-cause fix
  • CHECKPOINT interaction: do NOT ScheduleWakeup during human-decision checkpoints
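
The Cost Budget rule above can be expressed as a small delay picker. This is a minimal sketch under stated assumptions: `pick_delay_seconds` is a hypothetical helper, not part of ScheduleWakeup, and the choice to round the 271–300s band down to 270s is an illustrative policy. Only the regime boundaries (60–270s, never 300s, 301–3600s) come from the PR text.

```python
CACHE_HIT_MIN_S = 60      # lower bound of the cache-hit regime
CACHE_HIT_MAX_S = 270     # upper bound of the cache-hit regime
CACHE_MISS_MIN_S = 301    # lower bound of the cache-miss regime
CACHE_MISS_MAX_S = 3600   # upper bound of the cache-miss regime


def pick_delay_seconds(expected_wait_s: int) -> int:
    """Clamp a desired wait into one of the two regimes, never 300s."""
    if expected_wait_s <= CACHE_HIT_MAX_S:
        # Short waits stay inside the cache-hit window.
        return max(CACHE_HIT_MIN_S, expected_wait_s)
    if expected_wait_s < CACHE_MISS_MIN_S:
        # 271-300s: assumed policy is to round down and keep the cache warm.
        return CACHE_HIT_MAX_S
    # Long idle waits accept the cache miss, capped at one hour.
    return min(expected_wait_s, CACHE_MISS_MAX_S)
```

For example, a requested 300s wait comes back as 270s, while a requested two-hour wait is capped at 3600s.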

3. CONTINUITY CONTRACT (~64 net lines) — new top-level section

Formalizes the end-to-end execution discipline:

  • /dev owns the task from P0 to P10 archive, pauses ONLY at CHECKPOINT 1, CHECKPOINT 2, or loop-escalation
  • Auto-loop default on STANDARD/DEEP tiers for phases with external wait (CI, long build, agent teams, deploy health). User does NOT need --loop flag.
  • --no-loop opts out (blocking mode). --loop is redundant/explicit-only.
  • Includes end-to-end flow diagrams for DEEP-with-CI and STANDARD-without-CI.
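
The flag semantics in the bullets above can be summarized as a decision function. This is a hedged sketch: `should_auto_loop` is a hypothetical name, and two behaviours are assumptions not stated in the PR, namely that `--no-loop` wins when both flags are passed and that tiers other than STANDARD/DEEP loop only when `--loop` is given explicitly.

```python
AUTO_LOOP_TIERS = {"STANDARD", "DEEP"}


def should_auto_loop(tier: str, has_external_wait: bool, flags: set[str]) -> bool:
    """Decide whether a phase should wait via a loop or block."""
    if "--no-loop" in flags:
        # Explicit opt-out: blocking mode (assumed to override --loop).
        return False
    if not has_external_wait:
        # No CI, long build, agent team, or deploy health check to wait on.
        return False
    if tier in AUTO_LOOP_TIERS:
        # Default-on: --loop is redundant here.
        return True
    # Assumption: other tiers loop only when asked explicitly.
    return "--loop" in flags
```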

Why bundle these together?

The three groups are intertwined:

  • Auto-loop behavior (group 3) depends on the loop patterns defined in group 2
  • Loop patterns (group 2) reference the phase adaptations refined in group 1
  • Splitting into 3 PRs would force reviewers to hold partial state across reviews
  • Because this is a solo-maintained open-source plugin, bundling keeps the git history cleaner

Risk assessment

This skill is prompt/markdown — no executable code path. There is no crash, memory, or security risk. The only meaningful risks are:

  1. Cost drift — loop without exit condition → infinite wake-up cycles. Mitigated by mandatory Exit Conditions Matrix + max_iter cap on every pattern.
  2. Cache miss — picking 300s delay → full context reread every wake. Mitigated by explicit 300s-is-WORST callout in Cost Budget table; default is 270s.
  3. Semantic ambiguity — Claude picks a sub-optimal loop when user intended blocking mode. Mitigated by --no-loop escape hatch and per-wake status blocks giving user visibility.
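
The first two mitigations amount to a pre-flight check on each loop declaration. The sketch below is illustrative only: `LoopSpec` and `validate_loop_spec` are hypothetical names, and the skill enforces these rules through prompt text rather than code. It checks that success, failure, and timeout signals are declared, that `max_iter` is set, and that the delay is not 300s.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopSpec:
    """Hypothetical declaration a loop makes before its first wake-up."""
    success_signal: str
    failure_signal: str
    timeout_signal: str
    max_iter: int
    delay_s: int = 270  # the PR's stated default


def validate_loop_spec(spec: LoopSpec) -> list[str]:
    """Return a list of problems; an empty list means the spec is acceptable."""
    problems: list[str] = []
    for name in ("success_signal", "failure_signal", "timeout_signal"):
        if not getattr(spec, name).strip():
            problems.append(f"missing {name}")
    if spec.max_iter < 1:
        problems.append("max_iter must be at least 1")
    if spec.delay_s == 300:
        problems.append("delay_s must not be 300")
    return problems
```

A spec with an empty success signal, `max_iter=0`, and a 300s delay would report three problems, one per rule.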

Test plan

  • Review skills/dev-orchestrator/SKILL.md diff — focus on LOOP INTEGRATION section and CONTINUITY CONTRACT consistency
  • Verify all 4 loop patterns have concrete signals (not "check if done" handwaves)
  • Check internal references: every mention of "Pattern 1/2/3/4" resolves; every "LOOP INTEGRATION"/"CONTINUITY CONTRACT" cross-ref resolves
  • Verify --no-loop / --max-loop-iter / --loop-delay documented in OVERRIDE FLAGS table
  • Dry-run mental model: imagine /dev fix X + push PR + wait CI — does the flow make sense end-to-end?
  • No auto-merge — human review first

File stats

| File | Before | After | Change |
| --- | --- | --- | --- |
| skills/dev-orchestrator/SKILL.md | 311 | 639 | +328 net |
| commands/dev.md | 11 | 11 | +4 chars (description tweak + new flags in argument-hint) |

Three logical change groups bundled in one PR (see PR description for full
breakdown). High-level:

1. Month of local iteration (~72 lines): language/framework detection table,
   source-evaluation rules, phase-adaptation polish, description tweaks.

2. LOOP INTEGRATION (+192 lines): ScheduleWakeup (/loop dynamic) as first-class
   tool inside /dev. 4 standard patterns (CI wait, long task, agent merge,
   deploy health), exit-conditions matrix, cost budget (300s cache-TTL trap
   documented), anti-patterns, CHECKPOINT interaction.

3. CONTINUITY CONTRACT (+64 lines): /dev owns the task end-to-end; stops only
   at CHECKPOINT 1, CHECKPOINT 2, or loop-escalation. Auto-loop is ON by
   default for STANDARD/DEEP tiers on phases with external wait. --no-loop
   opts out. --loop becomes redundant.

File growth: skills/dev-orchestrator/SKILL.md  311 -> 639 lines (+328 net).
File growth: commands/dev.md                     adds --loop/--no-loop/--max-loop-iter/--loop-delay flags.

Risk assessment: skill is prompt/markdown, no executable code path — there is
no crash/security risk. The only real risks are semantic (loop without exit
condition → cost drift; 300s delay → cache miss). Both are mitigated by the
new Exit Conditions Matrix (mandatory) and cache-budget table with 300s
explicitly flagged as the worst choice.
… fixes

Evolves from v2.1 (2 mandatory checkpoints) to v3.0 (zero mandatory checkpoints)
per design roadmap from 2026-03-08.

Key changes:

1. NEW CONTINUITY CONTRACT v3 (zero-checkpoint):
   - /dev runs P0 → P10 without mandatory pauses
   - 3 legitimate pauses only: Requirement Clarification Gate (P3 prefix,
     one-shot), External Wait (auto-loop), Anomaly Escalation (AI confidence <70%)
   - Models a senior engineer's workflow — clarify once up-front, then execute
     autonomously

2. Requirement Clarification Gate:
   - Fires at P3 prefix if requirement has load-bearing ambiguity AI can't
     resolve via claudemem + codebase + web + DAO reasoning
   - Single AskUserQuestion with 2-4 pointed options, no subsequent pauses
   - Calibration examples for when to clarify vs when to just do it

3. CHECKPOINT 1 + CHECKPOINT 2 removed:
   - P5 no longer pauses after plan (plan written to disk as artifact)
   - P9 no longer pauses after PR (user reviews via PR/git log/wrapup)
   - Safety exception: push to master/main/production triggers anomaly-escalate

4. Loop fixes addressing codex-review findings:
   - Pattern 1 success: direct → P10 (no CHECKPOINT detour)
   - Pattern 1 failure: max fix-cycles: 3 outer cap (was unbounded)
   - Universal unknown-signal circuit breaker: ≥3 consecutive unknown wakes
     → exit-escalate (prevents 90min burn on probe failures)
   - LOOP WAKE status block: Elapsed/cache fields marked estimated
     (Claude can't observe wall-clock or cache state directly)
   - P9 CICD → Pattern 4 (Deploy Health), not Pattern 1 (PR CI)

5. Language detection fixes:
   - TypeScript: tsconfig.json OR (.ts files AND package.json)
   - JavaScript: package.json without .ts files
   - Kotlin: build.gradle.kts (no longer collides with Java)
   - Java: pom.xml OR build.gradle (Groovy DSL, no .kts)
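
The unknown-signal circuit breaker from item 4 can be sketched as a small loop driver. This is a hedged illustration: `run_loop`, the signal strings, and the result names other than `exit-escalate` are hypothetical, and resetting the streak on any recognized signal is an assumption about what "consecutive" means here.

```python
from itertools import islice
from typing import Iterable

MAX_CONSECUTIVE_UNKNOWN = 3  # threshold taken from the commit message


def run_loop(signals: Iterable[str], max_iter: int) -> str:
    """Consume one probe result per wake and return how the loop exits."""
    unknown_streak = 0
    for signal in islice(signals, max_iter):
        if signal == "success":
            return "exit-success"
        if signal == "failure":
            return "exit-failure"
        if signal == "pending":
            # A recognized signal resets the streak (assumed behaviour).
            unknown_streak = 0
            continue
        # Anything else counts as an unknown wake, e.g. a failed probe.
        unknown_streak += 1
        if unknown_streak >= MAX_CONSECUTIVE_UNKNOWN:
            return "exit-escalate"
    # max_iter wakes used up, or no more probe results were supplied.
    return "exit-timeout"
```

Escalating after three unknown wakes in a row stops the loop early instead of letting it run to `max_iter` on a probe that never returns a usable signal.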

File growth: skills/dev-orchestrator/SKILL.md 639 → 736 lines (+97 net).
@zelinewang
Owner Author

Pushed v3 refactor on feat/auto-loop-continuity (commit da57e4e).

Snapshot of pre-v3 state saved as tag v2.1-2checkpoints on commit a398a4b (current main HEAD).

Key changes in this update:

  • Removed CHECKPOINT 1 (after P5) and CHECKPOINT 2 (after P9) — now fully auto-run
  • Added Requirement Clarification Gate at P3 prefix (one-shot, only if needed)
  • Fixed 10 codex-review findings: Pattern 1 routing, fix-cycle cap, unknown-signal exit, LOOP WAKE fields, language detection ambiguities

Not auto-merging — human review first.

zelinewang added a commit that referenced this pull request Apr 27, 2026
Hardening of v5.2 feedback loop based on 3-Claude-agent independent review:
#1 NOTHING-filter separated (was AND-gated on length)
#2 Cooldown writes ONLY on save success (was pre-emptive — failed extract
   silently blocked retries for 30min)
#3 session-start reads latest retro by stored note ID (was relevance-
   ranked search — old notes outranked yesterday's lesson)
#4 TRANSCRIPT_PATH validated against $HOME/.claude/ prefix (was only
   existence check)
#5 Prompt injection defense: untrusted-block tags + minimal output sanitize
#6 session-start CONTEXT built in Python (was bash \n mixing)
#7 Strict alnum+dash branch sanitization (was just slash → dash)

Plus: atomic mkdir lock prevents concurrent extractor instances.

E2E verified all 7 fixes. See claude-code-config commit for full details.