fix(SYMPH-125): fix stale notification history for terminal completions#64

Open
ericlitman wants to merge 124 commits into OasAIStudio:main from mobilyze-llc:symph-125/fix-stale-notification-history

Conversation

@ericlitman

Summary

  • Bug: When a stage completed with a terminal transition (e.g. implement → done), advanceStage() deleted issueExecutionHistory before the runtime-host could read the final stage record. Notifications fell back to stale preHistory, sending empty execution history in issue_completed events.
  • Fix: Added lastExitHistorySnapshot map in OrchestratorCore that captures execution history after the stage record push but before advanceStage() clears it. runtime-host now consumes this snapshot first, falling back to the live state map, then preHistory.
  • Tests: Added 2 new tests covering single-stage and multi-stage terminal completion history integrity (31/31 orchestrator tests pass).
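
The lookup order described in the Fix bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `StageRecord` shape, the `HistorySnapshots` class, the `resolveNotificationHistory` helper, and the consume-once behavior of `take()` are all assumptions made for the example.

```typescript
// Hypothetical shapes; the real StageRecord and ExecutionHistory live in model.ts.
interface StageRecord {
  stageName: string;
  outcome: string;
}
type ExecutionHistory = StageRecord[];

class HistorySnapshots {
  private readonly lastExitHistorySnapshot = new Map<string, ExecutionHistory>();

  // Call after the stage record is pushed and before advanceStage() clears live history.
  capture(issueId: string, history: ExecutionHistory): void {
    this.lastExitHistorySnapshot.set(issueId, [...history]);
  }

  // Assumed consume-once read: the snapshot is dropped after it is handed out.
  take(issueId: string): ExecutionHistory | undefined {
    const snapshot = this.lastExitHistorySnapshot.get(issueId);
    this.lastExitHistorySnapshot.delete(issueId);
    return snapshot;
  }
}

// Snapshot first, then the live state map, then the pre-run history.
function resolveNotificationHistory(
  issueId: string,
  snapshots: HistorySnapshots,
  liveHistory: Record<string, ExecutionHistory>,
  preHistory: ExecutionHistory,
): ExecutionHistory {
  return snapshots.take(issueId) ?? liveHistory[issueId] ?? preHistory;
}
```

Copying the array in `capture()` keeps the snapshot intact even if the live history is emptied in place afterwards.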

Test plan

  • npx vitest run tests/orchestrator/runtime-host.test.ts — 31/31 pass
  • npx tsc --noEmit — clean
  • Pre-existing codex test failures are unrelated

🤖 Generated with Claude Code

ericlitman and others added 30 commits March 16, 2026 20:29
Three extensions to symphony-ts for multi-model autonomous pipeline:

1. Runner abstraction (Task 5.1): Extract runner interface from Codex client,
   add ClaudeCodeRunner and GeminiRunner via Vercel AI SDK providers.
   Per-state runner selection via YAML config.

2. State machine (Task 5.2): Multi-stage workflows with typed transitions
   (agent/gate/terminal stages), per-stage config overrides, rework loops
   with configurable limits. Backward compatible — no stages = flat dispatch.

3. Ensemble gate (Task 5.3): Gate stages spawn parallel review agents,
   collect two-layer verdicts (JSON gate + plain text feedback), aggregate
   results, post to Linear. Supports human and automated gates.

77 new tests (217 total), typecheck clean.
Pipeline configuration for the founder's autonomous dev pipeline:

- WORKFLOW.md with YAML frontmatter: Linear tracker, 5-stage state machine
  (investigate → implement → review → merge → done), per-stage runner/model
  overrides, ensemble gate with Codex + Gemini reviewers
- 6 LiquidJS prompt templates: global clauses (headless mode, scope
  discipline, design references, verify lines, $BASE_URL), investigate,
  implement, review-adversarial, review-security, merge
- Hook scripts: after-create (git clone + install), before-run (fetch + rebase)
- validate.sh: config validation (YAML parsing, file checks, stage flow)
- Gemini runner: replace static import with lazy dynamic import() for
  ESM-only ai-sdk-provider-gemini-cli (require() returns empty module)
- Claude Code runner: add model ID mapping (claude-sonnet-4-5 → sonnet)
  so YAML config can use standard Anthropic model names
- Claude Code runner: add AbortController to generateText() calls for
  subprocess cleanup on close()
- Add @ai-sdk/provider + transitive deps to package.json
- Add integration smoke test (skipped by default, RUN_INTEGRATION=1)
- 9 new unit tests for model mapping, abort, and provider behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Claude Code runner was spawning without bypassPermissions, causing the
agent to stall waiting for interactive permission approval in headless
mode. Adding permissionMode: "bypassPermissions" fixes E2E dispatch.

Also adds WORKFLOW-flat.md for flat (no state machine) E2E testing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…pace cleanup

Three blocking issues prevented multi-stage pipelines from completing:

1. Stage transitions never fired: advanceStage only runs in onWorkerExit
   after ALL turns complete. Added [STAGE_COMPLETE] sentinel detection in
   the turn loop for early exit when agents signal stage completion.

2. Continuation turns lost stage context: buildContinuationPrompt had no
   stageName parameter. Added stage-aware continuation prompts with
   per-stage constraints (investigate/implement/merge).

3. Stale workspaces from prior runs: afterCreate hook only fires on new
   directories. Added workspace cleanup on fresh dispatch (attempt=null)
   before createForIssue.

Also includes: stage-aware beforeRun hook (skip rebase on feature branches),
runner/model overrides from stage config, Linear comment posting for
investigation notes, and WORKFLOW-staged.md with full stage definitions.

234 tests passing, build clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use trimEnd().endsWith() instead of includes() to prevent false early exit
when an agent mentions [STAGE_COMPLETE] conversationally rather than as the
final output signal.
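
The difference between the two checks is easy to see in a small sketch. The function name `signalsStageComplete` is hypothetical; only the `trimEnd().endsWith()` approach and the `[STAGE_COMPLETE]` sentinel come from the commit message.

```typescript
const STAGE_COMPLETE = "[STAGE_COMPLETE]";

// True only when the sentinel is the last non-whitespace text in the output.
function signalsStageComplete(agentOutput: string): boolean {
  return agentOutput.trimEnd().endsWith(STAGE_COMPLETE);
}

// A conversational mention no longer triggers an early exit.
const mention = "I will print [STAGE_COMPLETE] once the tests pass.";
const finalSignal = "All checks green.\n[STAGE_COMPLETE]\n";
console.log(signalsStageComplete(mention), signalsStageComplete(finalSignal));
```

With `includes()` both strings would have matched; with the suffix check only the second one does.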

Found by: Gemini (P1 → triaged as P2)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two P1 findings from Codex R2:

1. Workspace cleanup now only fires on initial stage (investigate) of
   staged pipelines. Flat dispatch and non-initial stages preserve
   existing workspaces, preventing data loss after service restarts.

2. Gate stages now claim the issue before firing ensemble review,
   preventing duplicate gate dispatch on subsequent poll ticks.
   Gate handler errors release the claim for retry.

Found by: Codex (2 P1s)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…L thresholds

Reviewers were operating blind (only saw issue metadata, no code). Now
gate-handler fetches `git diff origin/main...HEAD` and includes it in
the reviewer prompt. Reviewer `prompt` field renders as inline Review
Focus instructions. Both Gemini reviewers in WORKFLOW-staged.md now have
explicit PASS/FAIL criteria to prevent overly strict rework loops.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ture failures

Gemini rate limits (429) were treated as code review FAILs, causing
infinite rework loops even when reviewers that ran approved the code.

Changes:
- runSingleReviewer retries up to 3 times with exponential backoff
- Infrastructure failures return verdict "error" instead of "fail"
- aggregateVerdicts ignores "error" results (only counts real pass/fail)
- All-error still returns FAIL (can't skip review entirely)
- formatGateComment shows "ERROR" label for infrastructure failures
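
A rough sketch of the aggregation rule listed above, under the assumption that the gate requires every reviewer that actually ran to pass. The commit message does not state the pass threshold, so the unanimity rule here is an assumption, as is the `Verdict` type.

```typescript
// Assumed verdict vocabulary: "error" marks an infrastructure failure, not a review result.
type Verdict = "pass" | "fail" | "error";

function aggregateVerdicts(verdicts: Verdict[]): "pass" | "fail" {
  // Infrastructure errors are ignored when counting real review outcomes.
  const real = verdicts.filter((v) => v !== "error");
  // If nothing but errors came back, the review cannot be skipped, so fail.
  if (real.length === 0) return "fail";
  // Assumption: every real verdict must be a pass.
  return real.every((v) => v === "pass") ? "pass" : "fail";
}
```

Under this rule a rate-limited reviewer neither blocks nor approves the change; the remaining reviewers decide.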

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: stage machine execution — early exit, stage-aware prompts, workspace cleanup
B1: parseReviewerOutput now detects rate-limit text in 200 responses
(e.g. "You have exhausted your capacity") and returns verdict "error"
instead of "fail", preventing false rework loops.
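
One way the B1 text check could look, as a sketch only. The helper name `looksRateLimited` and the second pattern are assumptions; the commit message quotes only the "exhausted your capacity" phrase.

```typescript
const RATE_LIMIT_MARKERS: RegExp[] = [
  /exhausted your capacity/i, // phrase quoted in the commit message
  /rate limit/i, // assumed extra marker, not confirmed by the source
];

// True when a successful (HTTP 200) response body is really a rate-limit notice.
function looksRateLimited(responseText: string): boolean {
  return RATE_LIMIT_MARKERS.some((marker) => marker.test(responseText));
}
```

A caller would map a `true` result to the "error" verdict rather than "fail", so the gate does not start a rework loop over a quota problem.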

B4: handleEnsembleGate now posts a Linear comment when max rework
attempts are exceeded, so the issue doesn't silently stall.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adversarial review P2: empty catch {} in escalation path swallowed
errors silently. Now logs console.warn with issue identifier and error.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix: rate-limit text detection + escalation comment for ensemble gate
Add linear_state field to stage definitions and escalation_state to top-level
config. The orchestrator now updates Linear issue states on stage dispatch
(In Progress), gate entry (In Review), and escalation (Blocked). Includes
WORKFLOW-staged.md config and active_states fix for In Review.

253 tests, typecheck clean. E2E validated: MOB-28 Todo→In Progress→In Review→Done.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…#4)

* feat: workpad system — structured progress tracking on Linear tickets

Phase 11: Add structured workpad comments to Linear issues with sync_workpad
dynamic tool for token-efficient updates, fileUpload media flow, and
stage-specific workpad behavior (investigate creates, implement updates,
merge finalizes).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: R1 adversarial review — 1 P2 + test coverage gaps

- Check commentUpdate.success in response (Codex finding)
- Add 3 tests: missing comment field, empty id, update success=false (Sonnet finding)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: failure signal parsing — route agent failures by class (verify/review/spec/infra)

* fix: R1 adversarial review — escalation side effects + state corruption guard + empty message fallback

* fix: R1 adversarial review — 2 P1s + 1 P2

P1: Persist spec-failure escalations to tracker (updateIssueState + postComment)
P1: Persist review-escalation side effects when max rework exceeded
P2: Add test verifying reworkCount threading to spawnWorker

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: R2 adversarial review — prevent redispatch of escalated issues

* feat: max retry safety net + review rework routing (Wave 3)

- scheduleRetry bounded by maxRetryAttempts (default 5), escalates to Blocked
- Continuation retries exempt from limit (delayType: "continuation")
- handleReviewFailure routes through downstream gate's onRework
- onRework field on StageDefinition for YAML-driven rework targets
- Escalation fires side effects (updateIssueState + postComment)
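
The retry bound in the first two bullets can be sketched as a small decision function. The name `decideRetry`, the "failure" delay type, and the choice to escalate only once the attempt number exceeds the limit are assumptions; the source confirms only the default of 5 and the "continuation" exemption.

```typescript
// "failure" is a hypothetical label; the source only names "continuation".
type DelayType = "failure" | "continuation";

function decideRetry(
  attempt: number,
  delayType: DelayType,
  maxRetryAttempts: number = 5,
): "retry" | "escalate" {
  // Continuation retries are exempt from the limit.
  if (delayType === "continuation") return "retry";
  // Assumption: escalate once the attempt number goes past the configured maximum.
  return attempt > maxRetryAttempts ? "escalate" : "retry";
}
```

On "escalate" the orchestrator would then fire the side effects listed above (updateIssueState and postComment).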

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: R1 adversarial review — 2 P1s + 1 P2

P1: Agent runner breaks early on [STAGE_FAILED: ...] signals (Codex finding)
  - Without this, multi-turn agents could overwrite the failure signal,
    and the orchestrator would never see it.

P1: completed set no longer permanently blocks resume (Codex finding)
  - Issues in escalation state (Blocked) remain blocked.
  - Issues moved to any other active state (Resume, Todo) get cleared
    from completed and re-dispatched.

P2: lastCodexMessage empty string now filtered (Gemini finding)
  - Both lastTurnMessage and lastCodexMessage check for empty strings
    before being passed as agentMessage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: R2 adversarial review — findDownstreamGate includes agent stages with onRework

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…nce) (#8)

* feat: failure signal parsing — route agent failures by class (verify/review/spec/infra)

* fix: R1 adversarial review — escalation side effects + state corruption guard + empty message fallback

* fix: R1 adversarial review — 2 P1s + 1 P2

P1: Persist spec-failure escalations to tracker (updateIssueState + postComment)
P1: Persist review-escalation side effects when max rework exceeded
P2: Add test verifying reworkCount threading to spawnWorker

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: R2 adversarial review — prevent redispatch of escalated issues

* feat: max retry safety net + review rework routing (Wave 3)

- scheduleRetry bounded by maxRetryAttempts (default 5), escalates to Blocked
- Continuation retries exempt from limit (delayType: "continuation")
- handleReviewFailure routes through downstream gate's onRework
- onRework field on StageDefinition for YAML-driven rework targets
- Escalation fires side effects (updateIssueState + postComment)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: R1 adversarial review — 2 P1s + 1 P2

P1: Agent runner breaks early on [STAGE_FAILED: ...] signals (Codex finding)
  - Without this, multi-turn agents could overwrite the failure signal,
    and the orchestrator would never see it.

P1: completed set no longer permanently blocks resume (Codex finding)
  - Issues in escalation state (Blocked) remain blocked.
  - Issues moved to any other active state (Resume, Todo) get cleared
    from completed and re-dispatched.

P2: lastCodexMessage empty string now filtered (Gemini finding)
  - Both lastTurnMessage and lastCodexMessage check for empty strings
    before being passed as agentMessage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: R2 adversarial review — findDownstreamGate includes agent stages with onRework

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: prevent completed issues from being re-dispatched after merge

Council Review Run 3 found that merged issues get re-dispatched because:
1. The merge/done stages had no linear_state, so the issue stayed "In Review"
   on Linear after completing the pipeline
2. The resume logic cleared the completed flag for ANY non-escalation active
   state, including "In Review"

Two fixes (defense in depth):
- Tighten resume guard: only "Resume" and "Todo" states clear completed flag
- Add linear_state: Done to the terminal stage so issues move to "Done" on
  Linear when the pipeline finishes
- advanceStage now fires updateIssueState for terminal stages with linearState
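
The tightened resume guard amounts to an allow-list check. This sketch assumes exact, case-sensitive state names and a hypothetical helper name `shouldClearCompleted`.

```typescript
// Only these tracker states are treated as an explicit request to resume work.
const RESUME_STATES: ReadonlySet<string> = new Set(["Resume", "Todo"]);

// True when a completed issue should be cleared and become dispatchable again.
function shouldClearCompleted(linearState: string): boolean {
  return RESUME_STATES.has(linearState);
}
```

With this guard an issue that is still "In Review" after the pipeline finishes stays in the completed set instead of being dispatched a second time.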

7 new regression tests covering all resume-guard scenarios and terminal
linearState behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: council R1 — add updateIssueState to dispatchIssue terminal path

Council review found that the gate-to-terminal path in dispatchIssue()
was missing the updateIssueState call, making the linear_state: Done
config dead code for gate-based workflows. Every successfully merged
issue hits this path (gate approval → continuation → dispatchIssue →
terminal short-circuit) and would never update the tracker to "Done".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: Phase 17 — stall timeout, heartbeat, turn_failed race, graceful shutdown (#7)

* fix: Phase 17 — stall timeout, heartbeat, turn_failed race, graceful shutdown

- Add stall_timeout_ms: 900000 (15min) to WORKFLOW-staged.md config
- Add workspace file-change heartbeat to ClaudeCodeRunner (polls dir mtime every 5s, emits activity_heartbeat events to reset stall timer)
- Fix turn_failed race in agent runner (check lastTurn.status after signal checks)
- Fix graceful shutdown race in runtime-host (move resolveExit after waitForIdle, add pendingExitCode tracking, add agent_runner_starting/error diagnostic logs)

328 tests passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: council R1 — heartbeat polls .git/index, guard catches all non-completed, fix test

- Heartbeat: poll .git/index mtime instead of workspace root dir (detects
  git staging/commits, not just root-level file creation)
- Guard: change `status === "failed"` to `status !== "completed"` to also
  catch `cancelled` turns
- Test: fix misleading test that used `completed` status when claiming to
  test `failed` + STAGE_FAILED interaction

Council review: 3 P2s found, 3 fixed. Cross-exam eliminated 3/10 findings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: Phase 20 — stress test bugs (heartbeat, merge race, hook resilience)

1. Heartbeat blind spot: ClaudeCodeRunner now watches workspace dir mtime
   alongside .git/index so review agents that never touch git still emit
   heartbeats and avoid stall timeout kills.

2. Merge abort race: reconcileRunningIssues() now skips terminal_state
   stop requests for workers in the final active stage (whose onComplete
   target is terminal). Prevents killing merge agents mid-flight.

3. beforeRun hook resilience: git fetch retries 3x with git lock
   handling, rebase is best-effort with abort fallback. Hook no longer
   fails the stage on git contention.

4. Stall timeout bumped from 15min to 30min.

332 tests (4 new), typecheck clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Closes open PRs and deletes remote branches when symphony-ts
removes a workspace, preventing orphaned branches.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…#9)

Extend CodexUsage interface with optional cacheReadTokens, cacheWriteTokens,
noCacheTokens, and reasoningTokens fields. Extract these from the AI SDK
provider's inputTokenDetails/outputTokenDetails. Add the 4 new fields to
LOG_FIELDS, emit them conditionally in structured logs, accumulate them in
LiveSession and CodexTotals, and add tests verifying extraction, accumulation,
and absence when the provider doesn't report them.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
… token accounting (#10)

Emit a stage_completed log entry when a worker finishes, capturing
accumulated LiveSession token counts (input, output, total, cache, reasoning),
turn count, duration, and stage name. Adds stage_name and turns_used to
LOG_FIELDS and stage_completed to ORCHESTRATOR_EVENTS.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
… 28)

Restructured Linear into 7 teams (SYMPH, JONY, HSDATA, HSUI, HSMOB, STICK,
HOUSE) with distinct issue prefixes per product. Each team has a Pipeline
project with unique slugId. Added per-product WORKFLOW files, a WORKFLOW
template for onboarding new products, and a launcher script.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rement (#11)

- Add promptChars and estimatedPromptTokens fields to AgentRunnerEvent
- Measure rendered prompt size in runner.ts turn loop before startSession/continueTurn
- Add turn_number, prompt_chars, estimated_prompt_tokens to LOG_FIELDS
- Log turn_number, prompt_chars, estimated_prompt_tokens in logAgentEvent (runtime-host.ts)
- Add tests verifying prompt size fields are correct and turn 1 > turn 2 for long templates
- Add tests verifying new fields appear in structured log entries

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…l shutdown with worker abort and bounded timeout (#12)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…summary (#13)

- Add 'shutdown_complete' to ORCHESTRATOR_EVENTS in src/domain/model.ts
- Add 'workers_aborted' and 'timed_out' to LOG_FIELDS in src/logging/fields.ts
- Change abortAllWorkers() to return the count of workers aborted
- Track shutdownStart, workersAborted, and timedOut flag in shutdown()
- Emit shutdown_complete log event after Promise.allSettled with workers_aborted, timed_out, and duration_ms fields
- Add two new tests: shutdown_complete logged with correct fields, and timed_out=true when timeout fires

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…atch and reconciliation summary (#14)

- Add poll_tick_completed to ORCHESTRATOR_EVENTS in src/domain/model.ts
- Add dispatched_count, running_count, reconciled_stop_requests to LOG_FIELDS in src/logging/fields.ts
- Extend PollTickResult with runningCount in src/orchestrator/core.ts
- Add duration timing around pollOnce() in runPollCycle()
- Emit poll_tick_completed info log with dispatched_count, running_count, reconciled_stop_requests, duration_ms
- Add tests for poll_tick_completed event emission and dispatched_count accuracy

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…atch and reconciliation summary (#15)

- poll_tick_completed already in ORCHESTRATOR_EVENTS (src/domain/model.ts)
- dispatched_count, running_count, reconciled_stop_requests in LOG_FIELDS (src/logging/fields.ts)
- logPollCycleResult() emits poll_tick_completed with dispatched_count, running_count,
  reconciled_stop_requests, duration_ms (src/orchestrator/runtime-host.ts)
- Duration timing around pollOnce() in runPollCycle()
- PollTickResult includes dispatchedIssueIds, runningCount, stopRequests (src/orchestrator/core.ts)
- Tests: poll_tick_completed event logged after successful poll; dispatched_count reflects dispatched issues
- Update conformance-test-matrix.md to document poll_tick_completed observability coverage

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…vent (#16)

Add five new fields to the stage_completed structured log event that give
operators a single-event view of total token cost per pipeline stage:

- total_input_tokens: sum of per-turn input tokens across the stage
- total_output_tokens: sum of per-turn output tokens across the stage
- total_cache_read_tokens: accumulated cache-read tokens across all turns
- total_cache_write_tokens: accumulated cache-write tokens across all turns
- turn_count: number of turns executed in the stage

New LiveSession fields codexTotalInputTokens and codexTotalOutputTokens
accumulate turn-level deltas. Per-turn deltas are computed correctly by
resetting lastReported* counters on session_started, so each turn's
absolute counter starts from zero for delta computation.
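
A minimal sketch of the delta accounting described above, for input tokens only. The `TokenSession` shape, the handler names, and the clamp to zero are assumptions; `lastReportedInputTokens` is a guess at one of the `lastReported*` counters.

```typescript
// Hypothetical slice of LiveSession, reduced to the fields this example needs.
interface TokenSession {
  lastReportedInputTokens: number;
  codexTotalInputTokens: number;
}

// A new turn reports absolute counts from zero again, so reset the baseline.
function onSessionStarted(session: TokenSession): void {
  session.lastReportedInputTokens = 0;
}

// Convert an absolute per-turn counter into a delta and add it to the stage total.
function onUsageReported(session: TokenSession, absoluteInputTokens: number): void {
  const delta = Math.max(0, absoluteInputTokens - session.lastReportedInputTokens);
  session.codexTotalInputTokens += delta;
  session.lastReportedInputTokens = absoluteInputTokens;
}
```

Without the reset, the first report of a second turn would be compared against the previous turn's final count and the delta would come out too small.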

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(SYMPH-8): add accumulator fields and summation logic for stage-level token totals

Add totalStageInputTokens, totalStageOutputTokens, totalStageTotalTokens,
totalStageCacheReadTokens, and totalStageCacheWriteTokens to LiveSession.
Accumulate turn deltas into these fields in applyCodexEventToSession().
Add single-turn, multi-turn, and zero-turn tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(SYMPH-8): add missing totalStage accumulator fields to runtime-host test fixtures

TypeScript compilation was failing because LiveSession object literals
in runtime-host tests were missing the new totalStage* fields added
to the LiveSession interface.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…#19)

Update total_input_tokens, total_output_tokens to use totalStage* accumulator
fields. Add total_total_tokens. Make total_cache_read_tokens and
total_cache_write_tokens conditional (omitted when zero) using totalStage*
accumulators. Existing input_tokens, output_tokens, total_tokens preserve
last-turn semantics.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…field (#20)

Add StageRecord and ExecutionHistory interfaces to model.ts. Add
issueExecutionHistory: Record<string, ExecutionHistory> to OrchestratorState
and initialize it as {} in createInitialOrchestratorState. Add a thin
scripts/test.mjs wrapper to translate --grep to vitest's -t flag so
mocha-compatible verify commands work.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
ericlitman and others added 28 commits March 22, 2026 22:33
* feat(SYMPH-81): add CI auto-bump calver workflow

Add a calver version bump step to the post-merge-gate workflow that
runs after all gate checks pass. The version format is YYYY.MM.DD.SEQ
where SEQ increments within the same UTC day and resets to 1 on a new
day. Bump commits include [skip ci] to prevent infinite CI loops.

Disable the release.yml workflow with `if: false` since calver replaces
the tag-based release flow.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(SYMPH-81): fix calver date format to use zero-padded month/day

Use %m and %d instead of %-m and %-d so version format matches spec
(e.g., 2026.03.22.1 instead of 2026.3.22.1).
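
The version rule can be expressed as a small pure function. This is a TypeScript sketch of the YYYY.MM.DD.SEQ rule only; the actual bump runs as a CI workflow step, and the name `nextCalver` is hypothetical.

```typescript
// Compute the next calver string from the current version and the current time (UTC).
function nextCalver(current: string, now: Date): string {
  const year = now.getUTCFullYear();
  const month = String(now.getUTCMonth() + 1).padStart(2, "0");
  const day = String(now.getUTCDate()).padStart(2, "0");
  const today = `${year}.${month}.${day}`;

  // SEQ increments within the same UTC day and resets to 1 otherwise.
  const match = /^(\d{4}\.\d{2}\.\d{2})\.(\d+)$/.exec(current);
  const seq = match !== null && match[1] === today ? Number(match[2]) + 1 : 1;
  return `${today}.${seq}`;
}
```

A non-calver input such as a semver string simply fails the pattern and starts the day at sequence 1.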

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(SYMPH-81): harden calver bump with retry loop, permissions, and lockfile safety

- Add `permissions: contents: write` so GITHUB_TOKEN can push to main
- Replace `npm version` with direct node file write to avoid lockfile side-effects
- Add 3-attempt retry loop with pull --rebase to handle concurrent merge races
- Move git config before retry loop to avoid redundant calls

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The default WORKFLOW_PATH pointed at the root WORKFLOW.md (a stale
legacy file) instead of the per-product config in pipeline-config/.
Now defaults to pipeline-config/workflows/WORKFLOW-${SYMPHONY_PROJECT}.md,
matching the naming convention used by all product workflows.

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(SYMPH-74): add session continuity and channel-project mapping

Add session continuity so thread replies resume the existing Claude Code
session via the `resume` option, while new top-level messages start fresh.
Add runtime `/project set <path>` slash command to update the in-memory
channel-to-project mapping. All state is in-memory (Map) for v1.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(SYMPH-74): apply biome formatting fixes to pass CI lint

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Replace the old mergeability-check + polling flow with:
- Merge Queue Context section explaining GitHub merge queue behavior
- Direct gh pr merge --squash --delete-branch (no pre-check needed)
- gh pr checks --watch --required --fail-fast (blocks efficiently
  instead of sleep/poll loops)
- Explicit DO NOT list to prevent agents from retrying or bypassing

Reduces merge stage token consumption by eliminating polling waste.
Applied directly to main — pipeline convergence failure on this ticket
(modifies same WORKFLOW files other PRs touch).

Co-authored-by: Eric Litman <eric@mobilyze.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add unified deploy script (ops/symphony-deploy) for pulling,
building, and restarting symphony-ts on the server. Supports
--dry-run, --no-restart, --symphony, --config flags.

Remove scripts/deploy-skills.sh — skills now deploy via
claude-config git repo + deploy.sh symlinks.

Add --version flag to run-pipeline.sh.
Update all CSS custom properties in DASHBOARD_STYLES from light to dark
color scheme. Keeps teal/green accent identity. No layout or JS changes.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…d daemon packaging (#100)

Extract inline chunking, reactions, and streaming logic from handler.ts
into dedicated testable modules (src/chunking.ts, src/reactions.ts,
src/streaming.ts). Add 39K character limit enforcement for Slack message
chunking with paragraph-boundary splitting. Create daemon packaging files
for the Slack bridge service (ops/slack-bridge-ctl, ops/com.slack-bridge.plist).
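
Paragraph-boundary chunking under a character limit might look like the sketch below. It is not the contents of src/chunking.ts: the function name, the blank-line paragraph separator, and the hard split for oversized paragraphs are assumptions; only the 39K limit comes from the commit message.

```typescript
const SLACK_CHUNK_LIMIT = 39_000;

// Split text into chunks no longer than `limit`, preferring paragraph boundaries.
function chunkMessage(text: string, limit: number = SLACK_CHUNK_LIMIT): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const paragraph of text.split("\n\n")) {
    const candidate = current === "" ? paragraph : `${current}\n\n${paragraph}`;
    if (candidate.length <= limit) {
      current = candidate;
      continue;
    }
    // The paragraph does not fit: flush what has been collected so far.
    if (current !== "") chunks.push(current);
    // Assumed fallback: hard-split a single paragraph that exceeds the limit.
    let rest = paragraph;
    while (rest.length > limit) {
      chunks.push(rest.slice(0, limit));
      rest = rest.slice(limit);
    }
    current = rest;
  }
  if (current !== "") chunks.push(current);
  return chunks;
}
```

Keeping the function pure (string in, string array out) is what makes it straightforward to unit test apart from the Slack handler.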

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…tead of fork (#99)

* feat(SYMPH-85): add --repo flag to gh pr create/merge in all prompts and workflows

Prevents pipeline PRs from targeting the upstream fork parent repo
by explicitly specifying the repo via gh repo view resolution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(SYMPH-85): use git remote URL instead of gh repo view for --repo flag

gh repo view also follows the fork graph upstream, so use
git remote get-url origin to reliably derive the current repo.
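
Turning the output of `git remote get-url origin` into an owner/repo value for `--repo` could be done along these lines. The helper name `repoFromRemoteUrl` is hypothetical, the example URLs are placeholders, and the sketch assumes a GitHub-hosted remote in either SSH or HTTPS form.

```typescript
// Extract "owner/repo" from a GitHub remote URL, or null if the URL is not recognized.
function repoFromRemoteUrl(remoteUrl: string): string | null {
  const match = /github\.com[:/]([^/\s]+)\/([^/\s]+?)(?:\.git)?\/?$/.exec(remoteUrl.trim());
  if (match === null) return null;
  return `${match[1]}/${match[2]}`;
}

// Placeholder remotes in both common forms.
console.log(repoFromRemoteUrl("git@github.com:example-org/example-repo.git"));
console.log(repoFromRemoteUrl("https://github.com/example-org/example-repo"));
```

Because the value is derived from the local remote rather than from GitHub's fork graph, it points at the fork itself and not at the upstream parent.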

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…ing (#101)

* feat(SYMPH-85): add --repo flag to gh pr create/merge in all prompts and workflows

Prevents pipeline PRs from targeting the upstream fork parent repo
by explicitly specifying the repo via gh repo view resolution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(SYMPH-85): use git remote URL instead of gh repo view for --repo flag

gh repo view also follows the fork graph upstream, so use
git remote get-url origin to reliably derive the current repo.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(SYMPH-87): create webhook server entry point and update ops tooling

Add src/slack-bot/server.ts as a standalone HTTP server entry point for
the Slack bot webhook receiver using node:http. Includes loadSlackBotConfig(),
createSlackBotServer(), and startSlackBotServer() exports. Routes POST
/api/webhooks/slack to Chat SDK and GET /health for liveness checks.

Update ops/slack-bridge-ctl to point to the new entry point and add
SLACK_BOT_PORT to .env.example.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(SYMPH-87): fix biome lint/format violations in server and test files

Remove non-null assertions (noNonNullAssertion) by using type cast after
validation guard. Auto-format long lines to satisfy biome line width rules.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Pipeline agents leave the server repo on feature branches after work
completes. GitHub deletes the remote ref on merge, but the local branch
remains. symphony-deploy's git pull then fails trying to fetch a deleted
remote tracking ref. Fix: always checkout main before pulling.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
#103)

- Update package.json version from 0.1.8 to 2026.03.23.1 (calver)
- Add state.failed Set to track failed issues separately from completed
- Wire completed/failed counts in runtime snapshot using Set sizes
  instead of iterating deleted issueExecutionHistory entries
- Remove "JSON details" links and inline pipeline stage from issue rows
- Allow issue titles to wrap instead of truncating with ellipsis
- Update 16 test assertions across 6 test files

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(SYMPH-91): replace Chat SDK with @slack/bolt Socket Mode

Remove chat, @chat-adapter/slack, and @chat-adapter/state-memory dependencies
and replace with @slack/bolt for Socket Mode connectivity. This eliminates the
webhook-based HTTP server in favor of WebSocket-based Socket Mode.

Key changes:
- src/slack-bot/index.ts: Replace Chat class with Bolt App (socketMode: true),
  export createSlackBoltApp() and startSlackBot()
- src/slack-bot/handler.ts: Rewrite handler from (thread, message) to Bolt's
  ({ message, say, client }) signature with bot message filtering
- src/reactions.ts: Change from adapter.addReaction/removeReaction to
  client.reactions.add/remove with { channel, timestamp, name }
- src/slack-bot/types.ts: Replace signingSecret with appToken in SlackBotConfig
- src/slack-bot/server.ts: Standalone entry point reading SLACK_APP_TOKEN
  instead of SLACK_SIGNING_SECRET, no HTTP server needed
- Updated all consuming tests to use new Bolt handler signature

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: fix biome formatting in slack-bot files

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(SYMPH-92): Add progressive streaming and markdown formatting

Replace collect-then-post pattern with progressive streaming via Slack's
ChatStreamer API. Add StreamConsumer wrapper with lazy initialization,
39K overflow handling, and error cleanup. Add markdownToMrkdwn converter
for non-streamed content (errors, slash commands, warnings). Add
setStatus "is thinking..." indicator before streaming starts.

- New: src/slack-bot/stream-consumer.ts — StreamConsumer class
- New: src/slack-bot/format.ts — markdownToMrkdwn converter
- Updated: src/slack-bot/handler.ts — progressive streaming + setStatus
- Updated: src/slack-bot/index.ts — export new modules
- Updated: test files for new streaming behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(SYMPH-92): fix biome lint formatting and noNonNullAssertion errors

- Auto-fix indentation (tabs→spaces) across new and modified files
- Replace non-null assertions with nullish coalescing in format.ts
- Reorder imports per biome organizeImports in stream-consumer.test.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(SYMPH-93): Update env config and add slack-bot test coverage

- Update .env.example: add SLACK_APP_TOKEN, remove BASE_URL,
  SLACK_SIGNING_SECRET, and SLACK_BOT_PORT (now using Socket Mode)
- Rewrite reactions.test.ts as direct unit tests of src/reactions.ts
- Add handler tests: session continuity (resume), fresh thread, /project set
- Create index.test.ts: socketMode construction, env var validation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(SYMPH-93): apply biome formatting to test files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add src/test-alpha.ts as the test foundation file and
tests/test-alpha.test.ts with basic validation tests.
This is the first issue in the chain — nothing blocks it.

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix: dashboard bugs — calver version, completed counts, title wrapping

- Update package.json version from 0.1.8 to 2026.03.23.1 (calver)
- Add state.failed Set to track failed issues separately from completed
- Wire completed/failed counts in runtime snapshot using Set sizes
  instead of iterating deleted issueExecutionHistory entries
- Remove "JSON details" links and inline pipeline stage from issue rows
- Allow issue titles to wrap instead of truncating with ellipsis
- Update 16 test assertions across 6 test files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: update encrypted secrets

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…rmat (#109)

- Add section 3b to symphony-deploy that restarts the slack bridge
  service alongside symphony-ctl on every deploy. Uses the same
  env-change vs code-change logic (reinstall if .env changed, restart
  if code changed, skip if neither).
- Fix sops decrypt to specify --input-type dotenv --output-type dotenv,
  since .env.enc uses dotenv format with inline ENC[] markers but the
  .enc extension causes sops to default to JSON parsing.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
sops couldn't find the age key on the server because the key-file env
var wasn't set and sops didn't check the default XDG path. Default the
key file path to ~/.config/sops/age/keys.txt when it is not already provided.
…#111)

xargs strips double quotes from values like CHANNEL_PROJECT_MAP JSON,
causing "Expected property name" parse errors at runtime. Replace with
bash-native whitespace trimming that preserves all characters.

Affects both symphony-ctl and slack-bridge-ctl generate_env_dict(),
plus the .env loading in symphony-ctl cmd_cleanup().

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(SYMPH-116): Create symphony-onboard script

Add ops/symphony-onboard script to automate onboarding new projects into
the Symphony pipeline. Creates WORKFLOW file from template, generates
CLAUDE.md in target repo, copies CI minimal workflow, and configures
GitHub merge queue via Rulesets API. All steps are idempotent and support
dry-run mode.

Also adds two new template files:
- pipeline-config/templates/ci-minimal.yml: generic CI workflow with
  package manager auto-detection
- pipeline-config/templates/CLAUDE.md.tmpl: CLAUDE.md template with
  sed-friendly substitution placeholders

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(SYMPH-116): Fix Linear API query in symphony-onboard

The `linear api` CLI expects a plain GraphQL query piped via stdin with
--variables-json for variables, not a JSON body. The previous approach
would fail with a GraphQL validation error.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
#115)

Add pipeline-config/ports.json as central port registry with unique
per-product ports (4321-4328). Update all 8 WORKFLOW files to use their
assigned ports instead of the shared 4321. Update run-pipeline.sh to
read ports.json via jq and inject --port before user args.

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Replace stale Finesssee linear-cli references with correct schpet
v1.11.1 syntax. Syncs SKILL.md, freeze-and-queue.sh, and adds
missing gates/, references/, and script files from canonical source.

- Binary: `linear` (not `linear-cli`)
- API syntax: `linear api` (not `linear api query`)
- Variable flag: `--variable key=val` (not `-v`)
- No `-o json`, `--quiet`, `--compact` (Finesssee-only flags)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(dashboard): observability v3 — live activity, health, tokens

- Broaden event tracking to 6 types (was only approval_auto_approved)
- Stage-aware health thresholds (investigate=600s, implement=480s,
  review=600s, merge=300s) with graduated green/yellow/red
- Fix token extraction: handle missing totalTokens, extract cache
  and reasoning token fields
- Richer activity display: tool name + context + tokens + relative time
- Expanded WorkerExitOutcome: failed_to_start, timed_out, error
  (replaces uninformative "abnormal" label)
- Pipeline-level activity events: stage transitions, state changes,
  session starts — activity feed never empty for active sessions
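
A sketch of how the stage-aware thresholds could map to a graduated status. The per-stage values come from the list above; the yellow cutoff at half the threshold, the fallback for unknown stages, and the names `classifyHealth` and `idleSeconds` are assumptions made for illustration.

```typescript
type Health = "green" | "yellow" | "red";

// Per-stage thresholds in seconds, as listed in the commit message.
const STAGE_THRESHOLD_SECONDS: Record<string, number> = {
  investigate: 600,
  implement: 480,
  review: 600,
  merge: 300,
};

// Assumption: stages without an entry fall back to 600s.
const FALLBACK_THRESHOLD_SECONDS = 600;

// Assumption: red at or beyond the threshold, yellow from half of it.
function classifyHealth(stage: string, idleSeconds: number): Health {
  const limit = STAGE_THRESHOLD_SECONDS[stage] ?? FALLBACK_THRESHOLD_SECONDS;
  if (idleSeconds >= limit) return "red";
  if (idleSeconds >= limit / 2) return "yellow";
  return "green";
}
```

Under these assumptions a merge worker idle for 150s is already yellow, while an implement worker stays green until 240s.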

12 files changed, 750 insertions(+), 56 deletions(-)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: council R1 — 0 P1s + 4 P2s

- Tighten generic usage alias guard (require all 3 fields for input/output/total)
- Fix stage_transition dead code (pass session to advanceStage directly)
- Remove duplicate token display in turn events (let token badge handle it)
- Add missing classifyExitOutcome tests (timed_out, error, passthrough)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(notifications): add Slack pipeline notifications (D56)

Add per-product Slack channel notifications for high-value pipeline
events: issue completed, issue failed, stall killed, infra error,
pipeline started/stopped. Fires from runtime-host (integration layer),
never from OrchestratorCore (pure state machine). Best-effort delivery
via @slack/web-api — failures logged and swallowed, never affect
pipeline correctness.

Key design:
- Pre-captures execution history before onWorkerExit() (which deletes it)
- Guards against completed-vs-continuation false positives
- Priority ordering: failed > stall > infra > completed
- Config: WORKFLOW `slack_notify_channel` field + SLACK_NOTIFY_CHANNEL env fallback

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: council R1 — 0 P1s + 4 P2s

Fixes 4 P2 findings from council review on PR #116:

1. CLI now constructs PipelineNotifier when slackNotifyChannel and
   SLACK_BOT_TOKEN are both present. Previously feature was inert.

2. Re-read execution history after onWorkerExit() appends final stage
   record. Prevents missing last stage in notifications.

3. Compute durationMs using runAttempt.startedAt (normal case) or
   runningEntry.startedAt (stall timeout). Fixes 0ms duration for
   stall-killed workers.

4. Calculate retriesExhausted by comparing capturedRetryAttempt to
   maxRetryAttempts instead of hardcoding true. Correctly distinguishes
   spec failures from retry exhaustion.
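
Findings 3 and 4 can be sketched as two small helpers. The field and variable names (`runAttempt.startedAt`, `runningEntry.startedAt`, `capturedRetryAttempt`, `maxRetryAttempts`) come from the text above; representing timestamps as epoch milliseconds, the `>=` comparison, and the helper names are assumptions.

```typescript
// Assumption: timestamps are epoch milliseconds; the real fields may differ.
interface Timed {
  startedAt: number;
}

// Prefer the run attempt's start time; fall back to the running entry,
// which is what remains available when a worker is killed for stalling.
function computeDurationMs(
  now: number,
  runAttempt: Timed | null,
  runningEntry: Timed | null,
): number {
  const startedAt = runAttempt?.startedAt ?? runningEntry?.startedAt;
  return startedAt === undefined ? 0 : Math.max(0, now - startedAt);
}

// Assumption: retries count as exhausted once the captured attempt
// reaches the configured maximum.
function isRetriesExhausted(
  capturedRetryAttempt: number,
  maxRetryAttempts: number,
): boolean {
  return capturedRetryAttempt >= maxRetryAttempts;
}
```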

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: council R2 — 3 P1s (typecheck, mock interface, multi-stage guard)

- Add slackNotifyChannel: null to 9 test config literals (TS2741)
- Extract PipelineNotificationSink interface for mock compatibility (TS2739)
- Fix multi-stage completion guard: use completed.has() + !hasContinuationRetry
  instead of delta-based preCompletedHas check (broken for continuations)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve Biome lint and format errors for CI

Fix formatting issues flagged by Biome CI check, replace inline
import() type expressions that Biome reformats incorrectly with
proper top-level type imports, and replace non-null assertion with
nullish coalescing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
… safety gate, and parent reference (#119)

F1: Scenarios grouped under `### Feature:` headings now match when a task ref
uses "<Feature Name> scenarios" pattern, while preserving direct name matching.

F2: Tasks with scenario refs that match zero scenarios now cause a hard failure
unless `--allow-empty-scenarios` is passed. The error lists unmatched task names.

F3: Sub-issue bodies now start with a "Parent spec:" reference line containing
the parent identifier/URL (Linear URL in live mode, spec title in dry-run).
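
F1 and F2 together can be sketched roughly as follows. The types, function names, and the case-insensitive feature comparison are hypothetical; only the "<Feature Name> scenarios" pattern, direct name matching, and the `--allow-empty-scenarios` escape hatch come from the description above.

```typescript
// Hypothetical shapes for parsed spec content.
interface Scenario {
  name: string;
  feature: string | null; // heading text after "### Feature:", if any
}

interface Task {
  name: string;
  scenarioRef: string;
}

// F1: direct name match first, then the "<Feature Name> scenarios" pattern.
function matchScenarios(ref: string, scenarios: Scenario[]): Scenario[] {
  const direct = scenarios.filter((s) => s.name === ref);
  if (direct.length > 0) return direct;
  const feature = /^(.+) scenarios$/i.exec(ref.trim())?.[1]?.toLowerCase();
  if (feature === undefined) return [];
  return scenarios.filter((s) => s.feature?.toLowerCase() === feature);
}

// F2: fail hard when any task matches nothing, unless explicitly allowed.
function resolveTaskScenarios(
  tasks: Task[],
  scenarios: Scenario[],
  allowEmptyScenarios = false,
): Map<string, Scenario[]> {
  const resolved = new Map<string, Scenario[]>();
  const unmatched: string[] = [];
  for (const task of tasks) {
    const matches = matchScenarios(task.scenarioRef, scenarios);
    if (matches.length === 0) unmatched.push(task.name);
    resolved.set(task.name, matches);
  }
  if (unmatched.length > 0 && !allowEmptyScenarios) {
    throw new Error(`Tasks with no matching scenarios: ${unmatched.join(", ")}`);
  }
  return resolved;
}
```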

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…122)

* feat(SYMPH-122): rewrite CLI interface and add missing onboard steps

Rewrite symphony-onboard CLI to use --product, --team-key, --team-name,
--description (required) with --repo as optional override (default:
mobilyze-llc/{product}). Remove --project-slug and --port as flags.

Add 6 new steps for an 11-step flow:
1. Duplicate detection (check ports.json for existing product)
2. Linear team creation (idempotent via GraphQL check-before-create)
3. Linear project creation + team linking (idempotent)
4. Port auto-allocation (read ports.json, assign max+1, write entry)
5. run-pipeline.sh auto-registration (insert case entry before catch-all)
6-10. Existing steps (WORKFLOW gen, repo verify, CLAUDE.md, CI, merge queue)
11. Updated summary with all new fields
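
Steps 1 and 4 (duplicate detection and max+1 port allocation) can be sketched as a pure function. This is an illustration in TypeScript, not the onboard script itself; it assumes ports.json is a flat product-to-port map, and the function name, the error message, and the base port used for an empty registry are hypothetical.

```typescript
// Assumption: ports.json is a flat { "<product>": <port> } object.
type PortRegistry = Record<string, number>;

// Returns a new registry with the product assigned the next free port.
function allocatePort(
  registry: PortRegistry,
  product: string,
  basePort = 4321, // assumption: first port when the registry is empty
): PortRegistry {
  // Step 1: refuse to register a product twice.
  if (Object.prototype.hasOwnProperty.call(registry, product)) {
    throw new Error(`product "${product}" is already registered`);
  }
  // Step 4: assign max + 1 over the existing ports.
  const ports = Object.values(registry);
  const next = ports.length === 0 ? basePort : Math.max(...ports) + 1;
  return { ...registry, [product]: next };
}
```

Returning a new object rather than mutating the input keeps the dry-run path trivial: the caller can simply skip the write.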

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(SYMPH-122): review fixes for symphony-onboard

- Add missing linear CLI precondition check
- Make duplicate detection run in dry-run mode (read-only check)
- Remove dead LINK_QUERY code in step 3

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…#123)

- Rename WORKFLOW-TOYS.md → WORKFLOW-toys.md (git mv)
- Change "TOYS": 4328 → "toys": 4328 in ports.json
- Add toys) case entry to run-pipeline.sh with WORKFLOW path and DEFAULT_REPO_URL
- Add toys to help text and error message product list

Co-authored-by: Claw Dilize <clawdilize@pro16.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
advanceStage() deletes issueExecutionHistory synchronously during
terminal transitions, so the postHistory re-read in runtime-host
falls back to stale preHistory that's missing the final stage record.

Fix: snapshot execution history in onWorkerExit after appending the
stage record but before advanceStage deletes it. Expose via
consumeExitHistorySnapshot() so runtime-host reads the correct
history for terminal completion notifications.
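
A simplified sketch of the snapshot-before-delete pattern, assuming a much smaller core than the real one. `issueExecutionHistory`, `onWorkerExit`, `advanceStage`, `consumeExitHistorySnapshot`, `lastExitHistorySnapshot`, and `preHistory` are names from this PR; the `StageRecord` shape, the method signatures, and `resolveNotificationHistory` are hypothetical.

```typescript
// Hypothetical record shape; the real stage record has more fields.
interface StageRecord {
  stage: string;
  outcome: string;
}

class OrchestratorCoreSketch {
  readonly issueExecutionHistory = new Map<string, StageRecord[]>();
  private readonly lastExitHistorySnapshot = new Map<string, StageRecord[]>();

  onWorkerExit(issueId: string, record: StageRecord, terminal: boolean): void {
    const history = this.issueExecutionHistory.get(issueId) ?? [];
    history.push(record);
    this.issueExecutionHistory.set(issueId, history);
    // Copy after the push, before advanceStage can delete the live entry.
    this.lastExitHistorySnapshot.set(issueId, [...history]);
    this.advanceStage(issueId, terminal);
  }

  private advanceStage(issueId: string, terminal: boolean): void {
    // Terminal transitions clear the live history synchronously.
    if (terminal) this.issueExecutionHistory.delete(issueId);
  }

  // Read-once accessor: returns the snapshot and clears it.
  consumeExitHistorySnapshot(issueId: string): StageRecord[] | undefined {
    const snapshot = this.lastExitHistorySnapshot.get(issueId);
    this.lastExitHistorySnapshot.delete(issueId);
    return snapshot;
  }
}

// Host-side lookup order: snapshot, then live state, then preHistory.
function resolveNotificationHistory(
  core: OrchestratorCoreSketch,
  issueId: string,
  preHistory: StageRecord[],
): StageRecord[] {
  return (
    core.consumeExitHistorySnapshot(issueId) ??
    core.issueExecutionHistory.get(issueId) ??
    preHistory
  );
}
```

Making the accessor read-once keeps the snapshot map from growing and prevents a later exit from reading an earlier exit's history.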

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ericlitman ericlitman force-pushed the symph-125/fix-stale-notification-history branch from 87f7dfa to ec88e23 Compare March 24, 2026 17:53
Collapse multi-line bracket access into single line to satisfy biome formatter.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>