/dev — AI Development Orchestrator

One command. Any task. 4-layer enforcement.

/dev turns Claude Code into an end-to-end development workflow — for code, research, and infrastructure tasks. Describe what you want → the orchestrator investigates, plans, executes, verifies, and ships. Zero mandatory human checkpoints. Deterministic hooks enforce quality gates automatically.

You:   /dev "add user authentication with JWT and refresh tokens"

AI:    ┌─ INVESTIGATE ───────────────────────────
       │ Status: DONE | Sources: memory, code, docs, git
       │ Next: PLAN
       └─────────────────────────────────────────

       ┌─ PLAN ─────────────────────────────────
       │ Intent: build | Depth: deep
       │ 8 subtasks, TDD per each
       │ Next: EXECUTE
       └─────────────────────────────────────────

       [TDD task 1/8 → ... → 8/8 ✓]
       [47/47 tests passing. 0 regressions.]

       ┌─ SHIP ─────────────────────────────────
       │ PR #142 created. CI green.
       │ Progress saved to dev-progress/feat-jwt-auth.json
       └─────────────────────────────────────────

Architecture: 4-Layer Enforcement

Designed from 19 primary research sources (Anthropic, OpenAI, Stripe, arXiv, Martin Fowler, Mitchell Hashimoto). Core insight: deterministic infrastructure > prompt-level guidance.

┌─────────────────────────────────────────────────────┐
│  Layer 4: File-Backed State (survives everything)    │
│  .claude/dev-progress/<branch>.json                  │
├─────────────────────────────────────────────────────┤
│  Layer 3: Hooks (100% deterministic, zero context)   │
│  SessionStart → inject progress                      │
│  PreToolUse → verify-on-commit gate                  │
│  PostToolUse → auto-lint Python                      │
│  Stop → persist state                                │
├─────────────────────────────────────────────────────┤
│  Layer 2: Rules + CLAUDE.md (~80%, compact-safe)     │
│  phases.md, deep-investigation.md, tdd-protocol.md   │
│  wrapup-retrospective.md                             │
├─────────────────────────────────────────────────────┤
│  Layer 1: /dev Skill (on-demand, full guidance)      │
│  739 lines — only loaded when /dev is invoked        │
└─────────────────────────────────────────────────────┘

Layer	Compliance	Survives compact?	Context cost
Hooks	100% (deterministic)	✅ always active	Zero
Rules + CLAUDE.md	~80% (probabilistic)	✅ re-injected	~900 tokens
/dev Skill	~70% (probabilistic)	❌ ephemeral	~3,000 tokens (on-demand)

Version History

Version	Key Change	Enforcement
v1.0	Core orchestrator, 3 tiers	Text suggestions (AI can skip)
v2.0	TDD Protocol + verify-dev.sh	Mandatory status blocks
v2.1	Task routing (DEVELOP/RESEARCH/CICD)	Phase status blocks (16% compliance measured)
v3.0	Zero-checkpoint continuity contract	AI self-enforces (aspirational)
v4.0	4-layer harness architecture	Hooks (100%) + Rules (80%) + Skill (on-demand)

How It Works

Intent Routing (from ReSpecV philosophy)

Route by intent, not file count:

Intent	When	Workflow
Build	New feature, architecture	Full: investigate → plan → execute → verify → ship
Fix	Bug, hotfix, incident	Evidence-first: logs → root-cause → TDD fix → verify
Research	Investigate, analyze, learn	Investigate → notes → report (no TDD/worktree)
Deploy	Pipeline, infra, config	Impact + rollback strategy → dry-run → apply → monitor
Trivial	Typo, comment, config	Skip to execute

Core Phases

Investigate — Search claudemem, read code, check docs (context7), read git log. For bugs: check production logs FIRST.
Plan — State intent + success criteria + affected files in one status block.
Execute — TDD per subtask: RED → GREEN → VERIFY → COMMIT. Language-aware reviewers.
Verify — Auto-enforced by pre-commit hook (exit 2 if verify-dev.sh fails). Manual: /verify.
Ship — Push feature branch, create PR, update dev-progress/<branch>.json.

Deep Mode (`--deep`)

Adds: agent teams for parallel subtasks, ScheduleWakeup for CI wait (4 patterns: CI-wait, long-build, agent-merge, deploy-health), language-aware ECC reviewers, OpenSpec integration.

Hooks (Deterministic Enforcement)

All hooks run on the host machine at zero context cost. They fire automatically — no invocation needed.

`dev-verify-on-commit.sh` (PreToolUse:Bash)

Two-tier gate on every git commit:

verify-dev.sh failure → exit 2 (hard block — cannot be auto-accepted)
ruff lint issues → additionalContext warning (visible but non-blocking)

`dev-quality-gate.sh` (PostToolUse:Edit|Write)

Auto-formats Python files with ruff format --quiet after every edit. Non-blocking. Production validated: 3/3 correct triggers, zero false positives.

`dev-session-start.sh` (SessionStart)

Reads branch-scoped dev-progress/<branch>.json and injects task state via additionalContext. Enables cross-session resume. Skips if no active task (>24h stale).

`dev-session-end.sh` (Stop)

Updates updated_at timestamp in progress file for continuity.

Rules (Compact-Safe Guidance)

Rules files in ~/.claude/rules/dev-workflow/ are loaded at every session start and survive context compaction. ~80% compliance (probabilistic but durable).

File	Purpose	Lines
`phases.md`	5 phases, intent routing, status blocks, escalation	35
`deep-investigation.md`	Logs-first for bugs, counter-hypothesis check	34
`tdd-protocol.md`	VERIFY+COMMIT steps, when-to-apply by intent	16
`wrapup-retrospective.md`	6-dimension session retrospective template	43

Cross-Session State

Branch-scoped JSON progress files at .claude/dev-progress/<branch>.json:

{
  "task": "Add JWT authentication",
  "intent": "build",
  "phase": "execute",
  "subtasks": [
    {"name": "Create auth middleware", "status": "done"},
    {"name": "Add token refresh", "status": "in_progress"}
  ]
}

Helper script: dev-progress-update.sh create|phase|subtask-done|subtask-add|done

Wrapup Retrospective

Every /wrapup includes a mandatory Phase 3.5 evaluating 6 dimensions:

A. Phase compliance (did I follow the workflow?)
B. Investigation quality (did I explore enough context?)
C. User corrections (count + root cause)
D. Tool utilization (used vs should-have-used)
E. Hook effectiveness (trigger table)
F. Workflow design feedback (design problem vs execution problem)

This turns every session into a test session for the dev workflow itself.

Installation

Option A: Via claude-code-config (recommended for existing users)

cd ~/claude-code-config && git pull
# Hooks, rules, and scripts are in global/
# Run install.sh to sync to ~/.claude/
bash install.sh

Option B: Manual Install

# Hooks (user-level, register in ~/.claude/settings.json)
cp hooks/dev-*.sh ~/.claude/hooks/
chmod +x ~/.claude/hooks/dev-*.sh

# Rules
mkdir -p ~/.claude/rules/dev-workflow
cp rules/dev-workflow/*.md ~/.claude/rules/dev-workflow/

# Scripts
cp scripts/dev-progress-update.sh ~/.claude/scripts/
cp scripts/verify-dev.sh ~/.claude/scripts/
chmod +x ~/.claude/scripts/dev-*.sh ~/.claude/scripts/verify-dev.sh

# Skill
mkdir -p ~/.claude/skills/dev-orchestrator
cp skills/dev-orchestrator/SKILL.md ~/.claude/skills/dev-orchestrator/

# Command
cp commands/dev.md ~/.claude/commands/

# Register hooks in ~/.claude/settings.json (see hooks/ for JSON config)

Requirements

Claude Code (Opus 4.6 or 4.7)
ruff installed on host (pip install ruff) — for quality-gate hook
superpowers plugin — TDD, debugging, planning skills
claudemem — cross-session memory (recommended)

Usage

/dev "add dark mode with system preference detection"
/dev --deep "redesign payment processing pipeline"
/dev "fix the 500 error in video upload endpoint"
/dev "investigate how our auth system handles token rotation"

The orchestrator auto-detects intent and depth. Use --deep to force agent teams and ScheduleWakeup loops.

Design Principles

Deterministic enforcement > prompt guidance (Microsoft AGT: 0% violation vs 27%)
Progressive enforcement: observe → measure → selectively enforce (ARMO)
Design for obsolescence: every hook has an explicit "when to remove" condition
Feedforward + feedback: rules guide behavior, hooks catch mistakes (Martin Fowler)
File-backed state > in-context state: JSON progress files survive everything

Research Basis

v4 was designed from 19 primary sources:

Anthropic: 4 engineering blogs (harnesses, managed agents, long-running apps, hook spec)
OpenAI: harness engineering blog
Academic: 5 arXiv papers (Dive into Claude Code, NLAH, Inside the Scaffold, VeRO, OpenDev)
Industry: Stripe Minions (1,000+ PRs/week), Herashchenko, Sourcegraph, Vercel
Experts: Hashimoto (coined "harness engineering"), Willison, Fowler
Governance: Microsoft AGT (0% violation), EU AI Act, NIST, Singapore IMDA

Roadmap

v1.0 — Core orchestrator, 3 tiers, 7 verification rules
v2.0 — TDD Protocol + verify-dev.sh enforcement
v2.1 — Task routing (DEVELOP/RESEARCH/CICD) + status blocks
v3.0 — Zero-checkpoint continuity contract + ScheduleWakeup loops
v4.0 — 4-layer harness architecture (hooks + rules + state + skill)
v4.1 — Compliance measurement baseline (skill-comply)
v4.2 — Instruction fade-out countermeasure (periodic additionalContext)
v4.3 — /dev skill ↔ dev-progress.json deep integration
v5.0 — Adaptive harness (progressive obsolescence as models improve)

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
.claude-plugin		.claude-plugin
.claude		.claude
commands		commands
hooks		hooks
openspec/changes/v2-self-enforcing-protocol		openspec/changes/v2-self-enforcing-protocol
rules/dev-workflow		rules/dev-workflow
scripts		scripts
skills/dev-orchestrator		skills/dev-orchestrator
templates		templates
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

/dev — AI Development Orchestrator

Architecture: 4-Layer Enforcement

Version History

How It Works

Intent Routing (from ReSpecV philosophy)

Core Phases

Deep Mode (`--deep`)

Hooks (Deterministic Enforcement)

`dev-verify-on-commit.sh` (PreToolUse:Bash)

`dev-quality-gate.sh` (PostToolUse:Edit|Write)

`dev-session-start.sh` (SessionStart)

`dev-session-end.sh` (Stop)

Rules (Compact-Safe Guidance)

Cross-Session State

Wrapup Retrospective

Installation

Option A: Via claude-code-config (recommended for existing users)

Option B: Manual Install

Requirements

Usage

Design Principles

Research Basis

Roadmap

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

/dev — AI Development Orchestrator

Architecture: 4-Layer Enforcement

Version History

How It Works

Intent Routing (from ReSpecV philosophy)

Core Phases

Deep Mode (--deep)

Hooks (Deterministic Enforcement)

dev-verify-on-commit.sh (PreToolUse:Bash)

dev-quality-gate.sh (PostToolUse:Edit|Write)

dev-session-start.sh (SessionStart)

dev-session-end.sh (Stop)

Rules (Compact-Safe Guidance)

Cross-Session State

Wrapup Retrospective

Installation

Option A: Via claude-code-config (recommended for existing users)

Option B: Manual Install

Requirements

Usage

Design Principles

Research Basis

Roadmap

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Deep Mode (`--deep`)

`dev-verify-on-commit.sh` (PreToolUse:Bash)

`dev-quality-gate.sh` (PostToolUse:Edit|Write)

`dev-session-start.sh` (SessionStart)

`dev-session-end.sh` (Stop)

Packages