feat(SYMPH-145): Wire analysis data and update pipeline #69
Open
ericlitman wants to merge 142 commits into OasAIStudio:main from
Three extensions to symphony-ts for multi-model autonomous pipeline:
1. Runner abstraction (Task 5.1): Extract runner interface from Codex client, add ClaudeCodeRunner and GeminiRunner via Vercel AI SDK providers. Per-state runner selection via YAML config.
2. State machine (Task 5.2): Multi-stage workflows with typed transitions (agent/gate/terminal stages), per-stage config overrides, rework loops with configurable limits. Backward compatible: no stages = flat dispatch.
3. Ensemble gate (Task 5.3): Gate stages spawn parallel review agents, collect two-layer verdicts (JSON gate + plain text feedback), aggregate results, post to Linear. Supports human and automated gates.
77 new tests (217 total), typecheck clean.
Pipeline configuration for the founder's autonomous dev pipeline:
- WORKFLOW.md with YAML frontmatter: Linear tracker, 5-stage state machine (investigate → implement → review → merge → done), per-stage runner/model overrides, ensemble gate with Codex + Gemini reviewers
- 6 LiquidJS prompt templates: global clauses (headless mode, scope discipline, design references, verify lines, $BASE_URL), investigate, implement, review-adversarial, review-security, merge
- Hook scripts: after-create (git clone + install), before-run (fetch + rebase)
- validate.sh: config validation (YAML parsing, file checks, stage flow)
- Gemini runner: replace static import with lazy dynamic import() for ESM-only ai-sdk-provider-gemini-cli (require() returns empty module)
- Claude Code runner: add model ID mapping (claude-sonnet-4-5 → sonnet) so YAML config can use standard Anthropic model names
- Claude Code runner: add AbortController to generateText() calls for subprocess cleanup on close()
- Add @ai-sdk/provider + transitive deps to package.json
- Add integration smoke test (skipped by default, RUN_INTEGRATION=1)
- 9 new unit tests for model mapping, abort, and provider behavior
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Claude Code runner was spawning without bypassPermissions, causing the agent to stall waiting for interactive permission approval in headless mode. Adding permissionMode: "bypassPermissions" fixes E2E dispatch. Also adds WORKFLOW-flat.md for flat (no state machine) E2E testing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…pace cleanup
Three blocking issues prevented multi-stage pipelines from completing:
1. Stage transitions never fired: advanceStage only runs in onWorkerExit after ALL turns complete. Added [STAGE_COMPLETE] sentinel detection in the turn loop for early exit when agents signal stage completion.
2. Continuation turns lost stage context: buildContinuationPrompt had no stageName parameter. Added stage-aware continuation prompts with per-stage constraints (investigate/implement/merge).
3. Stale workspaces from prior runs: afterCreate hook only fires on new directories. Added workspace cleanup on fresh dispatch (attempt=null) before createForIssue.
Also includes: stage-aware beforeRun hook (skip rebase on feature branches), runner/model overrides from stage config, Linear comment posting for investigation notes, and WORKFLOW-staged.md with full stage definitions.
234 tests passing, build clean.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use trimEnd().endsWith() instead of includes() to prevent false early exit when an agent mentions [STAGE_COMPLETE] conversationally rather than as the final output signal. Found by: Gemini (P1 → triaged as P2) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
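The difference between the two checks can be illustrated with a minimal sketch. The helper names below are hypothetical and are not taken from the symphony-ts source; only the `[STAGE_COMPLETE]` sentinel and the `trimEnd().endsWith()` / `includes()` calls come from the commit message.

```typescript
// Sentinel string as named in the commit message.
const STAGE_COMPLETE_SENTINEL = "[STAGE_COMPLETE]";

// Hypothetical helper: true only when the sentinel is the final output,
// ignoring trailing whitespace or newlines.
function endsWithStageComplete(agentOutput: string): boolean {
  return agentOutput.trimEnd().endsWith(STAGE_COMPLETE_SENTINEL);
}

// Hypothetical helper showing the older, looser check: true whenever the
// sentinel appears anywhere, including conversational mentions.
function mentionsStageComplete(agentOutput: string): boolean {
  return agentOutput.includes(STAGE_COMPLETE_SENTINEL);
}
```

With these helpers, an output such as "I will print [STAGE_COMPLETE] once the tests pass." matches the `includes()` check but not the `endsWith()` check, which is the false early exit the commit describes.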
Two P1 findings from Codex R2:
1. Workspace cleanup now only fires on initial stage (investigate) of staged pipelines. Flat dispatch and non-initial stages preserve existing workspaces, preventing data loss after service restarts.
2. Gate stages now claim the issue before firing ensemble review, preventing duplicate gate dispatch on subsequent poll ticks. Gate handler errors release the claim for retry.
Found by: Codex (2 P1s)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…L thresholds Reviewers were operating blind (only saw issue metadata, no code). Now gate-handler fetches `git diff origin/main...HEAD` and includes it in the reviewer prompt. Reviewer `prompt` field renders as inline Review Focus instructions. Both Gemini reviewers in WORKFLOW-staged.md now have explicit PASS/FAIL criteria to prevent overly strict rework loops. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ture failures
Gemini rate limits (429) were treated as code review FAILs, causing infinite rework loops even when reviewers that ran approved the code.
Changes:
- runSingleReviewer retries up to 3 times with exponential backoff
- Infrastructure failures return verdict "error" instead of "fail"
- aggregateVerdicts ignores "error" results (only counts real pass/fail)
- All-error still returns FAIL (can't skip review entirely)
- formatGateComment shows "ERROR" label for infrastructure failures
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
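The aggregation rule above might look roughly like the following sketch. It is not the actual `aggregateVerdicts` implementation: the function name, the `Verdict` type, and the assumption that any real "fail" fails the gate are illustrative. Only the handling of "error" results (ignored, but all-error still fails) comes from the commit message.

```typescript
type Verdict = "pass" | "fail" | "error";

// Hypothetical sketch of the aggregation rule described in the commit.
function aggregateVerdictsSketch(verdicts: Verdict[]): "pass" | "fail" {
  // Infrastructure failures are not counted as review outcomes.
  const real = verdicts.filter((v) => v !== "error");
  // If no reviewer produced a real verdict, the review cannot be skipped.
  if (real.length === 0) return "fail";
  // Assumption: a single real "fail" is enough to fail the gate.
  return real.every((v) => v === "pass") ? "pass" : "fail";
}
```

Under this sketch a rate-limited reviewer no longer drags an otherwise approved change into a rework loop, while a gate where every reviewer errored still fails.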
fix: stage machine execution — early exit, stage-aware prompts, workspace cleanup
B1: parseReviewerOutput now detects rate-limit text in 200 responses (e.g. "You have exhausted your capacity") and returns verdict "error" instead of "fail", preventing false rework loops. B4: handleEnsembleGate now posts a Linear comment when max rework attempts are exceeded, so the issue doesn't silently stall. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adversarial review P2: empty catch {} in escalation path swallowed
errors silently. Now logs console.warn with issue identifier and error.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: rate-limit text detection + escalation comment for ensemble gate
Add linear_state field to stage definitions and escalation_state to top-level config. The orchestrator now updates Linear issue states on stage dispatch (In Progress), gate entry (In Review), and escalation (Blocked). Includes WORKFLOW-staged.md config and active_states fix for In Review. 253 tests, typecheck clean. E2E validated: MOB-28 Todo→In Progress→In Review→Done. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…#4) * feat: workpad system — structured progress tracking on Linear tickets Phase 11: Add structured workpad comments to Linear issues with sync_workpad dynamic tool for token-efficient updates, fileUpload media flow, and stage-specific workpad behavior (investigate creates, implement updates, merge finalizes). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: R1 adversarial review — 1 P2 + test coverage gaps - Check commentUpdate.success in response (Codex finding) - Add 3 tests: missing comment field, empty id, update success=false (Sonnet finding) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: failure signal parsing — route agent failures by class (verify/review/spec/infra)
* fix: R1 adversarial review — escalation side effects + state corruption guard + empty message fallback
* fix: R1 adversarial review — 2 P1s + 1 P2
P1: Persist spec-failure escalations to tracker (updateIssueState + postComment)
P1: Persist review-escalation side effects when max rework exceeded
P2: Add test verifying reworkCount threading to spawnWorker
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: R2 adversarial review — prevent redispatch of escalated issues
* feat: max retry safety net + review rework routing (Wave 3)
- scheduleRetry bounded by maxRetryAttempts (default 5), escalates to Blocked
- Continuation retries exempt from limit (delayType: "continuation")
- handleReviewFailure routes through downstream gate's onRework
- onRework field on StageDefinition for YAML-driven rework targets
- Escalation fires side effects (updateIssueState + postComment)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
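The retry bound in the Wave 3 commit above can be sketched as a small decision function. This is an illustration under assumptions, not the real `scheduleRetry`: the function name, the `"failure"` delay type, and the exact boundary (escalating once the attempt number exceeds the limit) are hypothetical. The default of 5 and the continuation exemption come from the commit message.

```typescript
// "continuation" is named in the commit; "failure" is a hypothetical label
// for every other kind of retry.
type DelayType = "continuation" | "failure";

// Hypothetical sketch: decide whether a retry is scheduled or the issue
// is escalated (to Blocked, per the commit message).
function decideRetry(
  attempt: number,
  delayType: DelayType,
  maxRetryAttempts: number = 5,
): "retry" | "escalate" {
  // Continuation retries are exempt from the limit.
  if (delayType === "continuation") return "retry";
  // Assumption: escalate once the attempt count exceeds the limit.
  return attempt > maxRetryAttempts ? "escalate" : "retry";
}
```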
* fix: R1 adversarial review — 2 P1s + 1 P2
P1: Agent runner breaks early on [STAGE_FAILED: ...] signals (Codex finding)
- Without this, multi-turn agents could overwrite the failure signal,
and the orchestrator would never see it.
P1: completed set no longer permanently blocks resume (Codex finding)
- Issues in escalation state (Blocked) remain blocked.
- Issues moved to any other active state (Resume, Todo) get cleared
from completed and re-dispatched.
P2: lastCodexMessage empty string now filtered (Gemini finding)
- Both lastTurnMessage and lastCodexMessage check for empty strings
before being passed as agentMessage.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: R2 adversarial review — findDownstreamGate includes agent stages with onRework
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…nce) (#8) * feat: failure signal parsing — route agent failures by class (verify/review/spec/infra) * fix: R1 adversarial review — escalation side effects + state corruption guard + empty message fallback * fix: R1 adversarial review — 2 P1s + 1 P2 P1: Persist spec-failure escalations to tracker (updateIssueState + postComment) P1: Persist review-escalation side effects when max rework exceeded P2: Add test verifying reworkCount threading to spawnWorker Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: R2 adversarial review — prevent redispatch of escalated issues * feat: max retry safety net + review rework routing (Wave 3) - scheduleRetry bounded by maxRetryAttempts (default 5), escalates to Blocked - Continuation retries exempt from limit (delayType: "continuation") - handleReviewFailure routes through downstream gate's onRework - onRework field on StageDefinition for YAML-driven rework targets - Escalation fires side effects (updateIssueState + postComment) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: R1 adversarial review — 2 P1s + 1 P2 P1: Agent runner breaks early on [STAGE_FAILED: ...] signals (Codex finding) - Without this, multi-turn agents could overwrite the failure signal, and the orchestrator would never see it. P1: completed set no longer permanently blocks resume (Codex finding) - Issues in escalation state (Blocked) remain blocked. - Issues moved to any other active state (Resume, Todo) get cleared from completed and re-dispatched. P2: lastCodexMessage empty string now filtered (Gemini finding) - Both lastTurnMessage and lastCodexMessage check for empty strings before being passed as agentMessage. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: R2 adversarial review — findDownstreamGate includes agent stages with onRework Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: prevent completed issues from being re-dispatched after merge Council Review Run 3 found that merged issues get re-dispatched because: 1. The merge/done stages had no linear_state, so the issue stayed "In Review" on Linear after completing the pipeline 2. The resume logic cleared the completed flag for ANY non-escalation active state, including "In Review" Two fixes (defense in depth): - Tighten resume guard: only "Resume" and "Todo" states clear completed flag - Add linear_state: Done to the terminal stage so issues move to "Done" on Linear when the pipeline finishes - advanceStage now fires updateIssueState for terminal stages with linearState 7 new regression tests covering all resume-guard scenarios and terminal linearState behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: council R1 — add updateIssueState to dispatchIssue terminal path Council review found that the gate-to-terminal path in dispatchIssue() was missing the updateIssueState call, making the linear_state: Done config dead code for gate-based workflows. Every successfully merged issue hits this path (gate approval → continuation → dispatchIssue → terminal short-circuit) and would never update the tracker to "Done". 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: Phase 17 — stall timeout, heartbeat, turn_failed race, graceful shutdown (#7) * fix: Phase 17 — stall timeout, heartbeat, turn_failed race, graceful shutdown - Add stall_timeout_ms: 900000 (15min) to WORKFLOW-staged.md config - Add workspace file-change heartbeat to ClaudeCodeRunner (polls dir mtime every 5s, emits activity_heartbeat events to reset stall timer) - Fix turn_failed race in agent runner (check lastTurn.status after signal checks) - Fix graceful shutdown race in runtime-host (move resolveExit after waitForIdle, add pendingExitCode tracking, add agent_runner_starting/error diagnostic logs) 328 tests passing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: council R1 — heartbeat polls .git/index, guard catches all non-completed, fix test - Heartbeat: poll .git/index mtime instead of workspace root dir (detects git staging/commits, not just root-level file creation) - Guard: change `status === "failed"` to `status !== "completed"` to also catch `cancelled` turns - Test: fix misleading test that used `completed` status when claiming to test `failed` + STAGE_FAILED interaction Council review: 3 P2s found, 3 fixed. Cross-exam eliminated 3/10 findings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix: Phase 20 — stress test bugs (heartbeat, merge race, hook resilience) 1. Heartbeat blind spot: ClaudeCodeRunner now watches workspace dir mtime alongside .git/index so review agents that never touch git still emit heartbeats and avoid stall timeout kills. 2. Merge abort race: reconcileRunningIssues() now skips terminal_state stop requests for workers in the final active stage (whose onComplete target is terminal). Prevents killing merge agents mid-flight. 3. beforeRun hook resilience: git fetch retries 3x with git lock handling, rebase is best-effort with abort fallback. 
Hook no longer fails the stage on git contention. 4. Stall timeout bumped from 15min to 30min. 332 tests (4 new), typecheck clean. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Closes open PRs and deletes remote branches when symphony-ts removes a workspace, preventing orphaned branches. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…#9) Extend CodexUsage interface with optional cacheReadTokens, cacheWriteTokens, noCacheTokens, and reasoningTokens fields. Extract these from the AI SDK provider's inputTokenDetails/outputTokenDetails. Add the 4 new fields to LOG_FIELDS, emit them conditionally in structured logs, accumulate them in LiveSession and CodexTotals, and add tests verifying extraction, accumulation, and absence when the provider doesn't report them. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
… token accounting (#10) Emit a stage_completed log entry when a worker finishes, capturing accumulated LiveSession token counts (input, output, total, cache, reasoning), turn count, duration, and stage name. Adds stage_name and turns_used to LOG_FIELDS and stage_completed to ORCHESTRATOR_EVENTS. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
… 28) Restructured Linear into 7 teams (SYMPH, JONY, HSDATA, HSUI, HSMOB, STICK, HOUSE) with distinct issue prefixes per product. Each team has a Pipeline project with unique slugId. Added per-product WORKFLOW files, a WORKFLOW template for onboarding new products, and a launcher script. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rement (#11) - Add promptChars and estimatedPromptTokens fields to AgentRunnerEvent - Measure rendered prompt size in runner.ts turn loop before startSession/continueTurn - Add turn_number, prompt_chars, estimated_prompt_tokens to LOG_FIELDS - Log turn_number, prompt_chars, estimated_prompt_tokens in logAgentEvent (runtime-host.ts) - Add tests verifying prompt size fields are correct and turn 1 > turn 2 for long templates - Add tests verifying new fields appear in structured log entries Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…l shutdown with worker abort and bounded timeout (#12) Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…summary (#13) - Add 'shutdown_complete' to ORCHESTRATOR_EVENTS in src/domain/model.ts - Add 'workers_aborted' and 'timed_out' to LOG_FIELDS in src/logging/fields.ts - Change abortAllWorkers() to return the count of workers aborted - Track shutdownStart, workersAborted, and timedOut flag in shutdown() - Emit shutdown_complete log event after Promise.allSettled with workers_aborted, timed_out, and duration_ms fields - Add two new tests: shutdown_complete logged with correct fields, and timed_out=true when timeout fires Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…atch and reconciliation summary (#14) - Add poll_tick_completed to ORCHESTRATOR_EVENTS in src/domain/model.ts - Add dispatched_count, running_count, reconciled_stop_requests to LOG_FIELDS in src/logging/fields.ts - Extend PollTickResult with runningCount in src/orchestrator/core.ts - Add duration timing around pollOnce() in runPollCycle() - Emit poll_tick_completed info log with dispatched_count, running_count, reconciled_stop_requests, duration_ms - Add tests for poll_tick_completed event emission and dispatched_count accuracy Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…atch and reconciliation summary (#15) - poll_tick_completed already in ORCHESTRATOR_EVENTS (src/domain/model.ts) - dispatched_count, running_count, reconciled_stop_requests in LOG_FIELDS (src/logging/fields.ts) - logPollCycleResult() emits poll_tick_completed with dispatched_count, running_count, reconciled_stop_requests, duration_ms (src/orchestrator/runtime-host.ts) - Duration timing around pollOnce() in runPollCycle() - PollTickResult includes dispatchedIssueIds, runningCount, stopRequests (src/orchestrator/core.ts) - Tests: poll_tick_completed event logged after successful poll; dispatched_count reflects dispatched issues - Update conformance-test-matrix.md to document poll_tick_completed observability coverage Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…vent (#16) Add five new fields to the stage_completed structured log event that give operators a single-event view of total token cost per pipeline stage: - total_input_tokens: sum of per-turn input tokens across the stage - total_output_tokens: sum of per-turn output tokens across the stage - total_cache_read_tokens: accumulated cache-read tokens across all turns - total_cache_write_tokens: accumulated cache-write tokens across all turns - turn_count: number of turns executed in the stage New LiveSession fields codexTotalInputTokens and codexTotalOutputTokens accumulate turn-level deltas. Per-turn deltas are computed correctly by resetting lastReported* counters on session_started, so each turn's absolute counter starts from zero for delta computation. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
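The delta computation described above can be sketched as follows. The interface and function names are hypothetical, and only one counter is shown; the idea taken from the commit is that each turn reports an absolute counter, the last reported value is reset on session_started, and the stage total accumulates per-turn deltas.

```typescript
// Hypothetical, simplified accumulator for a single token counter.
interface TokenAccumulator {
  lastReportedInputTokens: number; // last absolute value seen in this turn
  totalInputTokens: number; // running stage total
}

// On session_started, the per-turn baseline returns to zero.
function onSessionStarted(acc: TokenAccumulator): void {
  acc.lastReportedInputTokens = 0;
}

// On each usage report, add only the increase since the last report.
function onUsageReported(acc: TokenAccumulator, absoluteInputTokens: number): void {
  const delta = Math.max(0, absoluteInputTokens - acc.lastReportedInputTokens);
  acc.totalInputTokens += delta;
  acc.lastReportedInputTokens = absoluteInputTokens;
}
```

Without the reset, a second turn whose counter restarts below the previous turn's final value would contribute a zero or understated delta.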
* feat(SYMPH-8): add accumulator fields and summation logic for stage-level token totals Add totalStageInputTokens, totalStageOutputTokens, totalStageTotalTokens, totalStageCacheReadTokens, and totalStageCacheWriteTokens to LiveSession. Accumulate turn deltas into these fields in applyCodexEventToSession(). Add single-turn, multi-turn, and zero-turn tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(SYMPH-8): add missing totalStage accumulator fields to runtime-host test fixtures TypeScript compilation was failing because LiveSession object literals in runtime-host tests were missing the new totalStage* fields added to the LiveSession interface. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…#19) Update total_input_tokens, total_output_tokens to use totalStage* accumulator fields. Add total_total_tokens. Make total_cache_read_tokens and total_cache_write_tokens conditional (omitted when zero) using totalStage* accumulators. Existing input_tokens, output_tokens, total_tokens preserve last-turn semantics. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…field (#20) Add StageRecord and ExecutionHistory interfaces to model.ts. Add issueExecutionHistory: Record<string, ExecutionHistory> to OrchestratorState and initialize it as {} in createInitialOrchestratorState. Add a thin scripts/test.mjs wrapper to translate --grep to vitest's -t flag so mocha-compatible verify commands work. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
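The argument translation performed by the wrapper could look something like this sketch. It is not the contents of scripts/test.mjs; the function name is hypothetical, and support for the `--grep=value` form is an assumption beyond what the commit states.

```typescript
// Hypothetical sketch: rewrite mocha-style --grep into vitest's -t flag,
// leaving every other argument untouched.
function translateGrepArgs(args: string[]): string[] {
  const out: string[] = [];
  for (const arg of args) {
    if (arg === "--grep") {
      // "--grep pattern" becomes "-t pattern"; the pattern follows as-is.
      out.push("-t");
    } else if (arg.startsWith("--grep=")) {
      // Assumption: the "--grep=pattern" form is also accepted.
      out.push("-t", arg.slice("--grep=".length));
    } else {
      out.push(arg);
    }
  }
  return out;
}
```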
sops couldn't find the age key on the server because the env var wasn't set and sops didn't check the default XDG path. Set the default to ~/.config/sops/age/keys.txt if not already provided.
…#111) xargs strips double quotes from values like CHANNEL_PROJECT_MAP JSON, causing "Expected property name" parse errors at runtime. Replace with bash-native whitespace trimming that preserves all characters. Affects both symphony-ctl and slack-bridge-ctl generate_env_dict(), plus the .env loading in symphony-ctl cmd_cleanup(). Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(SYMPH-116): Create symphony-onboard script Add ops/symphony-onboard script to automate onboarding new projects into the Symphony pipeline. Creates WORKFLOW file from template, generates CLAUDE.md in target repo, copies CI minimal workflow, and configures GitHub merge queue via Rulesets API. All steps are idempotent and support dry-run mode. Also adds two new template files: - pipeline-config/templates/ci-minimal.yml: generic CI workflow with package manager auto-detection - pipeline-config/templates/CLAUDE.md.tmpl: CLAUDE.md template with sed-friendly substitution placeholders Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(SYMPH-116): Fix Linear API query in symphony-onboard The `linear api` CLI expects a plain GraphQL query piped via stdin with --variables-json for variables, not a JSON body. The previous approach would fail with a GraphQL validation error. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claw Dilize <clawdilize@pro16.local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
#115) Add pipeline-config/ports.json as central port registry with unique per-product ports (4321-4328). Update all 8 WORKFLOW files to use their assigned ports instead of the shared 4321. Update run-pipeline.sh to read ports.json via jq and inject --port before user args. Co-authored-by: Claw Dilize <clawdilize@pro16.local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Replace stale Finesssee linear-cli references with correct schpet v1.11.1 syntax. Syncs SKILL.md, freeze-and-queue.sh, and adds missing gates/, references/, and script files from canonical source. - Binary: `linear` (not `linear-cli`) - API syntax: `linear api` (not `linear api query`) - Variable flag: `--variable key=val` (not `-v`) - No `-o json`, `--quiet`, `--compact` (Finesssee-only flags) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(dashboard): observability v3 — live activity, health, tokens - Broaden event tracking to 6 types (was only approval_auto_approved) - Stage-aware health thresholds (investigate=600s, implement=480s, review=600s, merge=300s) with graduated green/yellow/red - Fix token extraction: handle missing totalTokens, extract cache and reasoning token fields - Richer activity display: tool name + context + tokens + relative time - Expanded WorkerExitOutcome: failed_to_start, timed_out, error (replaces uninformative "abnormal" label) - Pipeline-level activity events: stage transitions, state changes, session starts — activity feed never empty for active sessions 12 files changed, 750 insertions(+), 56 deletions(-) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: council R1 — 0 P1s + 4 P2s - Tighten generic usage alias guard (require all 3 fields for input/output/total) - Fix stage_transition dead code (pass session to advanceStage directly) - Remove duplicate token display in turn events (let token badge handle it) - Add missing classifyExitOutcome tests (timed_out, error, passthrough) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat(notifications): add Slack pipeline notifications (D56) Add per-product Slack channel notifications for high-value pipeline events: issue completed, issue failed, stall killed, infra error, pipeline started/stopped. Fires from runtime-host (integration layer), never from OrchestratorCore (pure state machine). Best-effort delivery via @slack/web-api — failures logged and swallowed, never affect pipeline correctness. 
Key design: - Pre-captures execution history before onWorkerExit() (which deletes it) - Guards against completed-vs-continuation false positives - Priority ordering: failed > stall > infra > completed - Config: WORKFLOW `slack_notify_channel` field + SLACK_NOTIFY_CHANNEL env fallback Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: council R1 — 0 P1s + 4 P2s Fixes 4 P2 findings from council review on PR #116: 1. CLI now constructs PipelineNotifier when slackNotifyChannel and SLACK_BOT_TOKEN are both present. Previously feature was inert. 2. Re-read execution history after onWorkerExit() appends final stage record. Prevents missing last stage in notifications. 3. Compute durationMs using runAttempt.startedAt (normal case) or runningEntry.startedAt (stall timeout). Fixes 0ms duration for stall-killed workers. 4. Calculate retriesExhausted by comparing capturedRetryAttempt to maxRetryAttempts instead of hardcoding true. Correctly distinguishes spec failures from retry exhaustion. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: council R2 — 3 P1s (typecheck, mock interface, multi-stage guard) - Add slackNotifyChannel: null to 9 test config literals (TS2741) - Extract PipelineNotificationSink interface for mock compatibility (TS2739) - Fix multi-stage completion guard: use completed.has() + !hasContinuationRetry instead of delta-based preCompletedHas check (broken for continuations) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve Biome lint and format errors for CI Fix formatting issues flagged by Biome CI check, replace inline import() type expressions that Biome reformats incorrectly with proper top-level type imports, and replace non-null assertion with nullish coalescing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
… safety gate, and parent reference (#119) F1: Scenarios grouped under `### Feature:` headings now match when a task ref uses "<Feature Name> scenarios" pattern, while preserving direct name matching. F2: Tasks with scenario refs that match zero scenarios now cause a hard failure unless `--allow-empty-scenarios` is passed. The error lists unmatched task names. F3: Sub-issue bodies now start with a "Parent spec:" reference line containing the parent identifier/URL (Linear URL in live mode, spec title in dry-run). Co-authored-by: Claw Dilize <clawdilize@pro16.local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…122) * feat(SYMPH-122): rewrite CLI interface and add missing onboard steps Rewrite symphony-onboard CLI to use --product, --team-key, --team-name, --description (required) with --repo as optional override (default: mobilyze-llc/{product}). Remove --project-slug and --port as flags. Add 6 new steps for an 11-step flow: 1. Duplicate detection (check ports.json for existing product) 2. Linear team creation (idempotent via GraphQL check-before-create) 3. Linear project creation + team linking (idempotent) 4. Port auto-allocation (read ports.json, assign max+1, write entry) 5. run-pipeline.sh auto-registration (insert case entry before catch-all) 6-10. Existing steps (WORKFLOW gen, repo verify, CLAUDE.md, CI, merge queue) 11. Updated summary with all new fields Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(SYMPH-122): review fixes for symphony-onboard - Add missing linear CLI precondition check - Make duplicate detection run in dry-run mode (read-only check) - Remove dead LINK_QUERY code in step 3 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claw Dilize <clawdilize@pro16.local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
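Steps 1 and 4 of the flow above (duplicate detection and max+1 port allocation against ports.json) can be sketched as a pure function. The real script is a shell script; the function name, the in-memory registry type, and the fallback base port for an empty registry are assumptions made for illustration.

```typescript
// In-memory view of ports.json: product name mapped to port.
type PortRegistry = Record<string, number>;

// Hypothetical sketch: refuse duplicates, otherwise assign max + 1.
function allocatePort(
  registry: PortRegistry,
  product: string,
  basePort: number = 4321, // assumption: first port when the registry is empty
): PortRegistry {
  if (Object.prototype.hasOwnProperty.call(registry, product)) {
    throw new Error(`product already registered: ${product}`);
  }
  const ports = Object.values(registry);
  const next = ports.length === 0 ? basePort : Math.max(...ports) + 1;
  // Return a new registry so the caller decides when to write it back.
  return { ...registry, [product]: next };
}
```

Returning a new object rather than mutating the input keeps a dry-run mode simple: the caller can print the proposed entry without writing anything.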
…#123) - Rename WORKFLOW-TOYS.md → WORKFLOW-toys.md (git mv) - Change "TOYS": 4328 → "toys": 4328 in ports.json - Add toys) case entry to run-pipeline.sh with WORKFLOW path and DEFAULT_REPO_URL - Add toys to help text and error message product list Co-authored-by: Claw Dilize <clawdilize@pro16.local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…ons (#124) * feat(SYMPH-125): fix stale notification history for terminal completions advanceStage() deletes issueExecutionHistory synchronously during terminal transitions, so the postHistory re-read in runtime-host falls back to stale preHistory that's missing the final stage record. Fix: snapshot execution history in onWorkerExit after appending the stage record but before advanceStage deletes it. Expose via consumeExitHistorySnapshot() so runtime-host reads the correct history for terminal completion notifications. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(SYMPH-125): apply biome formatting to runtime-host.ts Collapse multi-line bracket access into single line to satisfy biome formatter. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claw Dilize <clawdilize@pro16.local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(SYMPH-126): fix pipeline_stopped completedCount overcounting Remove the erroneous `state.completed.add()` call on the continuation path in `handleWorkerExit()`. Previously, issues that were merely advancing between stages (continuations) were added to `completed`, inflating `pipeline_stopped.completedCount`. Now only terminally completed issues are added to `completed`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(SYMPH-126): fix lint error - use literal key for retryAttempts Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claw Dilize <clawdilize@pro16.local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add `env` to `StartCliHostInput` so `startCliHost` reads `SLACK_BOT_TOKEN` from the resolved env instead of `process.env`. This enables proper dependency injection in tests and custom environments. Co-authored-by: Claw Dilize <clawdilize@pro16.local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(SYMPH-129): token history extraction + JSONL persistence
Add core data pipeline for extracting stage_completed events from symphony.jsonl logs, enriching with Linear issue titles via CLI, and persisting to $SYMPHONY_HOME/data/token-history.jsonl.
- ops/token-report.sh: bash wrapper with lockfile, env defaults, dir tree
- ops/token-report.mjs: Node extract subcommand with HWM-based idempotency
- Inode-aware and truncation-aware HWM for log rotation support
- Partial line safety (discards incomplete trailing lines)
- Malformed JSONL lines skipped with stderr warnings
- Linear CLI title lookup with filesystem cache + graceful degradation
- Config hash snapshots to config-history.jsonl each run
- 9 tests covering all specified scenarios
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(SYMPH-129): fix biome lint errors (template literals + import ordering)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
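The high-water-mark (HWM) behaviour listed above can be sketched with two pure functions. These are not taken from ops/token-report.mjs: the type and function names are hypothetical, and offsets are counted in string characters for simplicity, whereas a real implementation would presumably track byte offsets.

```typescript
// Hypothetical persisted state: which file was read and how far.
interface HighWaterMark {
  inode: number;
  offset: number;
}

// Decide where to resume reading. A changed inode (rotation) or a file
// smaller than the stored offset (truncation) restarts from zero.
function resolveStartOffset(
  hwm: HighWaterMark | null,
  current: { inode: number; size: number },
): number {
  if (hwm === null) return 0;
  if (hwm.inode !== current.inode) return 0;
  if (current.size < hwm.offset) return 0;
  return hwm.offset;
}

// Keep only complete lines; an unterminated trailing line is left for the
// next run, so `consumed` stops just after the last newline.
function takeCompleteLines(chunk: string): { lines: string[]; consumed: number } {
  const lastNewline = chunk.lastIndexOf("\n");
  if (lastNewline === -1) return { lines: [], consumed: 0 };
  const lines = chunk.slice(0, lastNewline).split("\n").filter((l) => l.length > 0);
  return { lines, consumed: lastNewline + 1 };
}
```

Advancing the stored offset by `consumed` rather than by the chunk length is what makes a rerun idempotent: a half-written line is read again in full once the writer finishes it.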
…hed hypotheses (#128)
* feat(SYMPH-130): trend analysis + outlier detection with Linear-enriched hypotheses
Implement the `analyze` subcommand in ops/token-report.mjs that reads token-history.jsonl and config-history.jsonl to produce structured analysis JSON covering:
- Efficiency scorecard (cache efficiency, output ratio, wasted context, tokens/turn, first-pass rate, failure rate) with current/7d/30d trends
- WoW delta computation for executive summary KPIs
- Per-stage utilization trends with config-change markers
- Per-ticket cost trends (median + mean)
- Per-product token spend breakdown
- Inflection detection with ticket-mix and config-change attribution
- Outlier detection (>2σ) with Linear parent-spec hypothesis generation
- Cold-start graceful degradation (<7d, 7-29d, >=30d tiers)
Failed stages are included in spend but excluded from efficiency metrics. Linear CLI calls are cached and only made for outlier issues.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(SYMPH-130): apply biome formatting fixes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
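The >2σ outlier rule mentioned above could be sketched as below. This is an illustration only: the function name is hypothetical, and the use of the population standard deviation and a two-sided test (absolute deviation from the mean) are assumptions, since the commit message states only the 2σ threshold.

```typescript
// Hypothetical sketch: return the indices of values lying more than
// `sigmas` standard deviations from the mean.
function findOutlierIndices(values: number[], sigmas: number = 2): number[] {
  if (values.length === 0) return [];
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  // Assumption: population variance (divide by n, not n - 1).
  const variance =
    values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
  const sigma = Math.sqrt(variance);
  // With zero spread nothing can be an outlier.
  if (sigma === 0) return [];
  const outliers: number[] = [];
  values.forEach((v, i) => {
    if (Math.abs(v - mean) > sigmas * sigma) outliers.push(i);
  });
  return outliers;
}
```

Returning indices rather than values lets the caller map each outlier back to its issue record before making any Linear lookup.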
…aunchd (#129)

* feat(SYMPH-131): HTML report + Slack digest + daily orchestration + launchd

  Add four new subcommands to token-report tooling:

  - render: generates self-contained dark-theme HTML report with inline SVG charts (sparklines via <polyline>, multi-line trend charts, WCAG AA contrast)
  - slack: posts ≤15-line digest to $SLACK_WEBHOOK_URL with graceful degradation
  - rotate: compresses JSONL >7d, deletes >14d, removes HTML reports >90d (replaces newsyslog config), with mtime <2h safety guard
  - daily: orchestrates extract→analyze→render→slack→rotate with proper failure semantics (slack failure is non-fatal, rotate failure exits non-zero)

  Also adds:

  - com.symphony.token-report.plist: daily cron at 06:00 local (America/New_York)
  - com.symphony.report-server.plist: static file server with KeepAlive: true
  - Deprecation notice on com.symphony.newsyslog.conf
  - Concurrent execution guard via lockfile in token-report.sh
  - Refactored runAnalyze() into computeAnalysis() + runAnalyze() for reuse

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(lint): apply biome auto-fixes for style violations

  Fix lint/style/useTemplate, lint/style/useConst, lint/style/noDelete, and formatting issues in token-report.mjs and token-report.test.ts.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
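The retention thresholds listed for the rotate subcommand can be read as a small decision function. The sketch below is one possible interpretation, not the shipped logic: the `rotateAction` name, the `kind` parameter, and the reading that "deletes >14d" applies to JSONL files are assumptions.

```typescript
type RotateAction = "keep" | "compress" | "delete";

const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

// ageMs is assumed to be "now minus the file's mtime" in milliseconds.
function rotateAction(kind: "jsonl" | "html", ageMs: number): RotateAction {
  // Safety guard: never touch a file modified within the last 2 hours.
  if (ageMs < 2 * HOUR_MS) return "keep";
  if (kind === "html") return ageMs > 90 * DAY_MS ? "delete" : "keep";
  if (ageMs > 14 * DAY_MS) return "delete";
  if (ageMs > 7 * DAY_MS) return "compress";
  return "keep";
}
```

Keeping the policy as a pure function of file kind and age makes the thresholds easy to unit-test without touching the filesystem.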
…130)

* feat(SYMPH-134): remove synthetic noise and widen context extraction

  Remove synthetic session_start and state_change activity entries from orchestrator dispatch. Add unknown tool context extraction that shows the first string-valued argument truncated to 60 chars. Update dashboard empty state text to "Waiting for agent activity..." in both server-side and client-side copies.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(SYMPH-134): fix biome formatter for unknown tool test assertion

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
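The "first string-valued argument truncated to 60 chars" rule from SYMPH-134 might be implemented along these lines. The function name and the hard cut without an ellipsis are assumptions; the PR does not show the actual helper.

```typescript
// Returns a short context string for a tool call whose argument schema is unknown,
// or null when no argument is a string.
function unknownToolContext(args: Record<string, unknown>, maxLen = 60): string | null {
  for (const value of Object.values(args)) {
    if (typeof value === "string") {
      // Hard truncation; whether the real code appends an ellipsis is not stated.
      return value.length > maxLen ? value.slice(0, maxLen) : value;
    }
  }
  return null;
}
```

For example, `unknownToolContext({ retries: 3, query: "find open issues" })` skips the numeric argument and returns the query string unchanged because it is shorter than 60 characters.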
* feat(SYMPH-134): remove synthetic noise and widen context extraction

  Remove synthetic session_start and state_change activity entries from orchestrator dispatch. Add unknown tool context extraction that shows the first string-valued argument truncated to 60 chars. Update dashboard empty state text to "Waiting for agent activity..." in both server-side and client-side copies.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(SYMPH-134): fix biome formatter for unknown tool test assertion

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(SYMPH-135): show last tool call in dashboard activity column

  Add last_tool_call field to RuntimeSnapshotRunningRow, derived from the most recent entry in the recentActivity ring buffer. Update both server-side TypeScript and client-side JavaScript rendering to prefer last_tool_call over activity_summary in the activity column.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(SYMPH-135): address biome lint and format issues

  Replace non-null assertion with explicit undefined check in deriveLastToolCall, and fix line-length formatting for nullish coalescing chains in dashboard-render.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(SYMPH-135): fix biome formatting in runtime-snapshot test

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…132)

* feat(SYMPH-139): replace Slack message builder with narrative digest

  Replace the ≤15-line message builder in runSlack() with a 9-section narrative markdown digest that surfaces all analysis data:

  1. Title
  2. Executive Summary
  3. Tokens per Issue
  4. Efficiency Scorecard
  5. Per-Stage Spend
  6. Per-Product Breakdown
  7. Outliers
  8. Trend Inflections
  9. Report Link

  Additional changes:

  - Switch from webhook to Bot Token API (chat.postMessage)
  - Fix URL construction: single code path using BASE_URL with pro16.local:{port} fallback (no more redundant conditional)
  - Add DRY_RUN env var: logs digest to stderr instead of posting

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(SYMPH-139): apply Biome format fixes to narrative digest

  Fix lint errors flagged by Biome CI:

  - Merge template literal string concatenation into single template
  - Replace unused template literals (no interpolation) with string literals
  - Reformat .slice().map() chains per Biome style

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(SYMPH-141): Scaffold Vite+ React project for token report UI

  Set up ops/token-report-ui/ with Vite+ (vp create vite:application), configured for React + TypeScript. Build produces a single self-contained index.html with all assets inlined via vite-plugin-singlefile. Includes sample analysis.json with realistic data matching computeAnalysis() shape. All vp commands verified: dev, build, test, check.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(SYMPH-141): fix biome lint/format issues in token-report-ui scaffold

  Fix import ordering in App.test.tsx and vite.config.ts, reformat App.tsx and analysis.json to comply with biome line-width rules.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…iving (#133) Add optional inputTokens and outputTokens fields to StageRecord interface and populate them from the running entry's stage accumulators when a stage is archived in onWorkerExit. Update both server-side and client-side dashboard rendering to display the new columns. Co-authored-by: Claw Dilize <clawdilize@pro16.local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…nts (#136) Convert the 10 visual sections from renderHtml() in ops/token-report.mjs into individual JSX component files under pipeline-config/design-refs/ token-report-v2/. These serve as the design reference bundle that implementors will use to build the React-based report (replacing the HTML template string approach). Components: ReportHeader, ExecutiveSummary, EfficiencyScorecard, PerStageTrend, PerTicketCostTrend, OutlierAnalysis, IssueLeaderboard, StageEfficiency, PerProductBreakdown, ReportFooter + chartUtils + barrel. Co-authored-by: Claw Dilize <clawdilize@pro16.local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…ot and update dashboard (#137)

* feat(SYMPH-148): compute cumulative pipeline tokens in runtime snapshot and update dashboard

  Add per-type cumulative pipeline token fields (input, output, cache_read, cache_write) to the runtime snapshot, extending the existing total_pipeline_tokens pattern. Update both SSR and client-side dashboard rendering to display cumulative values in the token breakdown panel.

  - Add cacheReadTokens/cacheWriteTokens to StageRecord interface (optional for backward compat)
  - Populate cache token fields in StageRecord creation in core.ts
  - Add pipeline_tokens object to RuntimeSnapshotRunningRow with per-type cumulative sums
  - Update both SSR TypeScript and client-side JS token breakdown panels
  - Add tests for cumulative totals, stage transitions, first stage, and backward compat

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(SYMPH-148): apply biome formatter to pass CI lint check

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
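A minimal sketch of the per-type cumulative sum described in SYMPH-148 follows. The field names come from the commit message, but the `StageTokens` and `PipelineTokens` type names and the `sumPipelineTokens` helper are hypothetical; the real snapshot code is not shown here.

```typescript
// Subset of StageRecord relevant to token accounting; all fields optional for backward compat.
interface StageTokens {
  inputTokens?: number;
  outputTokens?: number;
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
}

// Assumed shape of the pipeline_tokens object on a running row.
interface PipelineTokens {
  input: number;
  output: number;
  cache_read: number;
  cache_write: number;
}

function sumPipelineTokens(archived: StageTokens[], current: StageTokens): PipelineTokens {
  const total: PipelineTokens = { input: 0, output: 0, cache_read: 0, cache_write: 0 };
  // Archived stages plus the in-flight stage; missing fields on old records count as 0.
  for (const stage of [...archived, current]) {
    total.input += stage.inputTokens ?? 0;
    total.output += stage.outputTokens ?? 0;
    total.cache_read += stage.cacheReadTokens ?? 0;
    total.cache_write += stage.cacheWriteTokens ?? 0;
  }
  return total;
}
```

Treating absent fields as zero is what keeps records written before the cache fields existed from producing NaN totals.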
* feat(SYMPH-149): add failure diagnostics to orchestrator and agent runner

  - Add failureReason field to RunningEntry interface for capturing dispatch errors
  - Enhance dispatchIssue catch block to log error message + stack trace at WARN level with issue identifier for correlation
  - Capture error in transient running entry failureReason before scheduling retry
  - Add workspace path diagnostic logging to agent runner initialization
  - Add CC process PID logging on session_started event in agent runner
  - Add tests for dispatch failure error capture and retry entry population
  - Add tests for agent runner workspace path and PID startup diagnostics

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(SYMPH-143): convert Paper JSX design refs into typed React components

  Convert the 10 design-reference JSX sections into TypeScript React components under src/components/. Add types.ts with interfaces matching the analysis.json shape, chartUtils.tsx with shared SVG utilities, and barrel export with designTokens and reportCSS. Update App.tsx to compose all 10 sections with data mapping from raw analysis.json. Extend tests to cover each section's render (17 tests total).

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(SYMPH-143): fix Biome lint errors (formatting, SVG a11y, array index keys)

  - Format all component files to pass biome formatter (line length, import sort)
  - Add aria-label/role="img" to SVG elements in chartUtils.tsx (a11y)
  - Replace array index keys with stable keys in OutlierAnalysis and PerStageTrend
  - Add biome-ignore for intentional dangerouslySetInnerHTML CSS injection in App.tsx

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(SYMPH-143): fix formatting in core.test.ts to pass biome lint

  Pre-existing formatting issue on branch: merge two lines per biome formatter.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
When the afterCreate hook threw (e.g. git clone race condition), the empty workspace directory persisted. On retry, ensureWorkspaceDirectory saw the directory existed and returned createdNow=false, causing the hook to be skipped entirely. The agent then ran against an empty workspace with no repo clone, burning tokens uselessly. Now the catch block inside the `if (createdNow)` branch removes the directory (best-effort) before re-throwing, so the next retry sees a missing directory, re-creates it, and re-runs the hook. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
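The cleanup-on-failure fix can be sketched as below. This is a simplified, synchronous model with an injected filesystem interface so it runs standalone; `WorkspaceFs` and `prepareWorkspace` are hypothetical names, and the real ensureWorkspaceDirectory and hook runner are likely asynchronous.

```typescript
// Minimal filesystem surface needed for the sketch (hypothetical, injected for testability).
interface WorkspaceFs {
  exists(path: string): boolean;
  mkdir(path: string): void;
  remove(path: string): void;
}

function prepareWorkspace(
  fs: WorkspaceFs,
  path: string,
  afterCreate: (workspacePath: string) => void,
): { createdNow: boolean } {
  const createdNow = !fs.exists(path);
  if (createdNow) {
    fs.mkdir(path);
    try {
      afterCreate(path); // e.g. git clone + install
    } catch (err) {
      // Best-effort removal so the next retry sees a missing directory and re-runs the hook.
      try {
        fs.remove(path);
      } catch {
        // Ignore cleanup errors; the original hook error is the one worth reporting.
      }
      throw err;
    }
  }
  return { createdNow };
}
```

Without the inner cleanup, a failed first attempt would leave an empty directory behind, the second attempt would report `createdNow: false`, and the hook would never run again.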
…139) createWorkspaceHookLogger received stdout and stderr from hook execution but never forwarded them to the StructuredLogger metadata object. This made it impossible to diagnose hook failures from logs since both fields were always null. Add conditional spread for stdout and stderr (only when non-empty) using the same pattern as other optional fields, and export the function for direct testability. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
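The "conditional spread" pattern mentioned above might look like this. The `HookResult` shape, the metadata keys, and the `hookLogMetadata` name are assumptions for illustration; only the idea of including stdout and stderr when non-empty comes from the commit message.

```typescript
// Hypothetical result of running a workspace hook.
interface HookResult {
  hook: string;
  exitCode: number;
  stdout?: string | null;
  stderr?: string | null;
}

function hookLogMetadata(result: HookResult): Record<string, unknown> {
  return {
    hook: result.hook,
    exit_code: result.exitCode,
    // Spread in each field only when it is a non-empty string, so logs stay free of null noise.
    ...(result.stdout ? { stdout: result.stdout } : {}),
    ...(result.stderr ? { stderr: result.stderr } : {}),
  };
}
```

With this shape, a failing clone that writes only to stderr produces metadata containing `stderr` but no `stdout` key at all.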
* feat(SYMPH-144): Build SVG chart components for stage utilization and ticket cost

  Add StageUtilizationChart (stacked area/line with date X-axis), TicketCostChart (line chart with median/mean reference lines), and shared chart-utils library. Pure SVG, no external charting dependencies.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(SYMPH-144): fix biome lint/format errors in chart components

  - Use Math.PI instead of approximate constant in round() tests
  - Fix import ordering (value imports before type imports)
  - Reformat long lines per biome line-length rules
  - Replace array index keys with value-based keys (noArrayIndexKey)
  - Expand inline JSX attributes to multi-line per formatter

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The afterCreate workspace hook requires REPO_URL to clone the target repo. When launched via symphony-ctl (launchd), env vars come from .env — which was missing REPO_URL, causing every hook invocation to fail silently with exit code 1. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- runAnalyze() now writes analysis.json to $SYMPHONY_HOME/data/ in addition to stdout
- Move fixture from src/analysis.json to src/data/analysis.json, update imports
- Add build_report() to token-report.sh: copies analysis.json → React src, runs pnpm build, copies dist/index.html to reports/
- Shell render subcommand now uses React build pipeline instead of legacy renderHtml()
- Daily pipeline uses build_report() instead of mjs render subcommand
- Deprecate renderHtml() with @deprecated JSDoc tag
- Make StageTrend.wow_delta optional, add config_changes to match real computeAnalysis() output

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move analysisData import after components/index.ts to satisfy biome organizeImports alphabetical ordering rule. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
- Wire `runAnalyze()` in `ops/token-report.mjs` to write `analysis.json` to `~/.symphony/data/`
- Move `src/analysis.json` → `src/data/analysis.json` in React app; update import paths
- Update `ops/token-report.sh` to add build step (copies analysis.json, runs `pnpm build`) and copy `dist/index.html` to reports directory in daily pipeline
- Deprecate `renderHtml()` with `@deprecated` JSDoc comment

Test plan
- `node ops/token-report.mjs analyze` → verify `~/.symphony/data/analysis.json` exists with keys `executive_summary`, `efficiency_scorecard`, `per_stage_spend`
- `bash ops/token-report.sh daily` → verify `~/.symphony/reports/$(date +%Y-%m-%d).html` is generated

Closes SYMPH-145
🤖 Generated with Claude Code