Skip to content

feat: use interrupt() as graceful timeout alternative#353

Merged
sjnims merged 2 commits intomainfrom
feature/interrupt-timeout-strategy
Jan 31, 2026
Merged

feat: use interrupt() as graceful timeout alternative#353
sjnims merged 2 commits intomainfrom
feature/interrupt-timeout-strategy

Conversation

@sjnims
Copy link
Copy Markdown
Owner

@sjnims sjnims commented Jan 31, 2026

Description

Replace single-tier AbortController.abort() timeout with a two-tier mechanism:

  1. Soft timeout at timeout_ms: calls query.interrupt() for graceful stop
  2. Hard timeout at timeout_ms + interrupt_grace_ms: calls controller.abort() as fallback

New config fields timeout_strategy and interrupt_grace_ms control behavior. Default: "interrupt_first" with 10s grace period. "abort_only" preserves legacy behavior.

Key design: A QueryHolder pattern ({ query: Query | undefined }) allows timeout callbacks to lazily access the query object, which doesn't exist when timers start. If query isn't created when soft timeout fires, falls back to hard abort.

Type of Change

  • New feature (non-breaking change that adds functionality)

Component(s) Affected

Pipeline Stages

  • Stage 3: Execution (src/stages/3-execution/)
  • Stage 4: Evaluation (src/stages/4-evaluation/)

Core Infrastructure

  • Configuration (src/config/)
  • Types (src/types/)

Other

  • Tests (tests/)
  • Documentation (CLAUDE.md, README.md)

Motivation and Context

When a scenario times out, the current behavior is a hard AbortController.abort() which loses any partial results the agent may have produced. Using query.interrupt() first gives the SDK a chance to gracefully wind down and return partial results before the hard abort fires as a safety net.

Fixes #347

How Has This Been Tested?

Test Configuration:

  • Node.js version: 22
  • OS: macOS (Darwin 25.2.0)

Test Steps:

  1. npm run build — clean compilation
  2. npm run typecheck — type safety verified
  3. npm run lint — no lint errors
  4. npm run format:check — formatting clean
  5. npm run knip — no dead exports
  6. npm test — 1605 tests passed, 5 skipped

Checklist

General

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings or errors

TypeScript / Code Quality

  • All functions have explicit return types
  • Strict TypeScript checks pass (npm run typecheck)
  • ESM import/export patterns used correctly
  • Unused parameters prefixed with _
  • No any types without justification

Documentation

  • I have updated CLAUDE.md if behavior or commands changed
  • I have updated inline JSDoc comments where applicable
  • I have verified all links work correctly

Linting

  • I have run npm run lint and fixed all issues
  • I have run npm run format:check

Testing

  • I have run npm test and all tests pass
  • I have added tests for new functionality
  • Test coverage meets thresholds (78% lines, 75% functions, 65% branches)

Stage-Specific Checks

Stage 3: Execution (click to expand)
  • Claude Agent SDK integration works correctly
  • Tool capture via PreToolUse hooks functions properly
  • Timeout handling works as expected
  • Session isolation prevents cross-contamination
  • Permission bypass works for automated execution

Files Changed

File Change
src/types/transcript.ts Added "interrupted" to TranscriptErrorType and TerminationType
src/types/config.ts Added TimeoutStrategy type, timeout_strategy/interrupt_grace_ms to ExecutionConfig
src/types/index.ts Re-exported TimeoutStrategy
src/config/schema.ts Added TimeoutStrategySchema, new fields to ExecutionConfigSchema
src/config/defaults.ts Added defaults: timeout_strategy: "interrupt_first", interrupt_grace_ms: 10000
src/stages/3-execution/timeout-strategy.ts NewcreateTimeout() helper with QueryHolder pattern
src/stages/3-execution/transcript-builder.ts Added createInterruptedError() helper
src/stages/3-execution/agent-executor.ts Updated ExecutionContext, prepareExecutionContext, both execute functions, deriveTerminationType
src/stages/3-execution/session-batching.ts Updated executeBatch and executeScenarioWithRetry to use two-tier timeout
src/stages/4-evaluation/metrics.ts Added interrupted: 0 to error counts record
tests/mocks/sdk-mock.ts Added shouldInterrupt, interruptRequested flag, functional interrupt()
tests/unit/stages/3-execution/timeout-strategy.test.ts New — Tests for both strategies and edge cases
tests/unit/stages/3-execution/agent-executor.test.ts Added deriveTerminationType tests for "interrupted"
tests/unit/stages/3-execution/session-batching.test.ts Added batch tests for both timeout strategies

Reviewer Notes

Areas that need special attention:

  • The QueryHolder pattern in timeout-strategy.ts — verify the lazy access approach is sound
  • Priority ordering in deriveTerminationType: stopReceived > timeout > interrupted > error > clean

Known limitations or trade-offs:

  • If interrupt() succeeds but the for-await loop hangs, the hard abort fires after grace period as a safety net
  • The shouldInterrupt mock simulates interrupt by yielding the result message and returning early, which is a simplification of real SDK behavior

🤖 Generated with Claude Code

Replace single-tier AbortController.abort() timeout with a two-tier
mechanism: soft timeout calls query.interrupt() for graceful stop,
hard timeout calls controller.abort() as fallback after grace period.

New config fields timeout_strategy and interrupt_grace_ms control
behavior. Default: "interrupt_first" with 10s grace period.
"abort_only" preserves legacy behavior.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@sjnims sjnims self-assigned this Jan 31, 2026
@claude
Copy link
Copy Markdown
Contributor

claude bot commented Jan 31, 2026

Claude finished @sjnims's task in 1m 23s —— View job


Review: PR #353 — feat: use interrupt() as graceful timeout alternative

  • Run quality checks
  • Review diff and analyze changes
  • Post inline comments for specific issues
  • Post summary review

Verdict: ✅ Looks good

Well-structured feature that cleanly adds a two-tier timeout mechanism. The QueryHolder pattern is a pragmatic solution to the timing problem, and the fallback logic is sound.

Quality Checks

Check Result
npm run typecheck ✅ Pass
npm run lint ✅ Pass
npm run format:check ✅ Pass
npm run knip ✅ Pass

What's Good

  • Clean abstraction: timeout-strategy.ts is well-isolated with a single responsibility — easy to test, easy to reason about
  • Backward compatible: "abort_only" strategy preserves legacy behavior; ?? "interrupt_first" fallbacks ensure existing configs continue working
  • Thorough edge case handling: Falls through to hard abort when query isn't created yet, swallows interrupt() errors, cleanup cancels both timers
  • Good test coverage: Tests cover both strategies, the no-query fallback, error swallowing, and cleanup of both timer tiers
  • Priority ordering in deriveTerminationType is correct: stopReceived > timeout > interrupted > error > clean

Suggestions

  1. Stale JSDoc (see inline comment): The JSDoc on deriveTerminationType still lists 4 priority levels (missing "interrupted"), and the TerminationType JSDoc in transcript.ts doesn't document the new variant. Minor, but worth keeping in sync.

  2. Duplicated interrupt-error logic: The "if interrupted but no timeout, push interrupted error" block appears in three places (executeScenario, executeScenarioWithCheckpoint, executeBatch). Consider extracting a small helper like addInterruptErrorIfNeeded(ctx) to reduce the duplication. Not blocking — it's only 5 lines each, and the current approach is explicit.

  3. TwoTierTimeout.queryHolder field: The returned TwoTierTimeout includes queryHolder, but callers already have their own reference to queryHolder (they pass it in). The field on the return type is unused by consumers. Consider removing it from the return interface to keep the API minimal.


@codecov
Copy link
Copy Markdown

codecov bot commented Jan 31, 2026

Codecov Report

❌ Patch coverage is 91.83673% with 4 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/stages/3-execution/timeout-strategy.ts 86.36% 1 Missing and 2 partials ⚠️
src/stages/3-execution/transcript-builder.ts 0.00% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

@claude claude bot added enhancement New feature or request effort:medium 1-4 hours stage:execution Stage 3: Agent SDK execution with tool capture stage:evaluation Stage 4: Programmatic detection and LLM judge component:config Configuration loading (src/config/) component:types TypeScript interfaces (src/types/) sdk:agent Claude Agent SDK usage (Stage 3) test:unit Unit test changes (tests/unit/) labels Jan 31, 2026
- Update JSDoc on deriveTerminationType to include "interrupted" priority
- Update JSDoc on TerminationType to document "interrupted" variant
- Extract addInterruptErrorIfNeeded helper to reduce duplication
- Remove unused queryHolder field from TwoTierTimeout return interface

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@sjnims
Copy link
Copy Markdown
Owner Author

sjnims commented Jan 31, 2026

Addressed all three suggestions in 37ef25f:

  1. Stale JSDoc — Updated JSDoc on both deriveTerminationType (now lists 5 priority levels including "interrupted") and TerminationType (documents the new variant).

  2. Duplicated interrupt-error logic — Extracted addInterruptErrorIfNeeded(interrupted, errors) helper in timeout-strategy.ts, used in all three call sites.

  3. TwoTierTimeout.queryHolder field — Removed from the interface and return value since callers already hold their own reference.

@claude
Copy link
Copy Markdown
Contributor

claude bot commented Jan 31, 2026

Claude finished @sjnims's task in 2m 23s —— View job


Re-review: PR #353 — feat: use interrupt() as graceful timeout alternative

  • Run quality checks
  • Review updated diff and verify fixes
  • Post inline comments for any issues
  • Post summary review

Verdict: ✅ Looks good

All three suggestions from the previous review have been properly addressed in 37ef25f.

Quality Checks

Check Result
npm run typecheck ✅ Pass
npm run lint ✅ Pass
npm run format:check ✅ Pass
npm run knip ✅ Pass
npm test ✅ 1605 passed, 5 skipped

Fixes Verified

  1. Stale JSDocderiveTerminationType JSDoc (agent-executor.ts:352-360) now correctly lists all 5 priority levels including "interrupted". TerminationType JSDoc (transcript.ts:188-193) documents the new "interrupted" variant.

  2. Duplicated interrupt-error logic — Extracted addInterruptErrorIfNeeded() helper in timeout-strategy.ts:95-104, used in all three call sites (agent-executor.ts:599, agent-executor.ts:699, session-batching.ts:777). Clean extraction.

  3. TwoTierTimeout.queryHolder field — Removed from the TwoTierTimeout interface (timeout-strategy.ts:23-28). Return value at line 88 only includes cleanup and interrupted. Callers manage their own queryHolder reference.

No new issues found. The code is clean and ready to merge.


@sjnims sjnims merged commit d88927a into main Jan 31, 2026
15 checks passed
@sjnims sjnims deleted the feature/interrupt-timeout-strategy branch January 31, 2026 03:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component:config Configuration loading (src/config/) component:types TypeScript interfaces (src/types/) effort:medium 1-4 hours enhancement New feature or request sdk:agent Claude Agent SDK usage (Stage 3) stage:evaluation Stage 4: Programmatic detection and LLM judge stage:execution Stage 3: Agent SDK execution with tool capture test:unit Unit test changes (tests/unit/)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature]: Use interrupt() method as graceful timeout alternative

1 participant