
Durable execution #2615

Merged
anubra266 merged 49 commits into main from durable-execution
Mar 26, 2026


@anubra266
Contributor

@anubra266 anubra266 commented Mar 10, 2026

Closes PRD-6249

@pullfrog
Contributor

pullfrog bot commented Mar 10, 2026

Reviewing PR #2615 — Durable execution mode for agent runs with tool approvals and crash recovery. Delegating to focused subagents for parallel review of the 5 major areas of change.


@vercel

vercel bot commented Mar 10, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| agents-api | Ready | Preview, Comment | Mar 26, 2026 7:53pm |
| agents-docs | Ready | Preview, Comment | Mar 26, 2026 7:53pm |
| agents-manage-ui | Ready | Preview, Comment | Mar 26, 2026 7:53pm |


@changeset-bot

changeset-bot bot commented Mar 10, 2026

🦋 Changeset detected

Latest commit: b6c5f00

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 10 packages
| Name | Type |
| --- | --- |
| @inkeep/agents-api | Patch |
| @inkeep/agents-manage-ui | Patch |
| @inkeep/agents-cli | Patch |
| @inkeep/agents-core | Patch |
| @inkeep/agents-email | Patch |
| @inkeep/agents-mcp | Patch |
| @inkeep/agents-sdk | Patch |
| @inkeep/agents-work-apps | Patch |
| @inkeep/ai-sdk-provider | Patch |
| @inkeep/create-agents | Patch |


@pullfrog

This comment was marked as resolved.

…g approvals and remove global state dependencies. Delete DurableApprovalRequiredError class as it is no longer needed.
claude[bot]

This comment was marked as outdated.

@github-actions github-actions bot deleted a comment from claude bot Mar 10, 2026
@itoqa

This comment was marked as outdated.

@itoqa

This comment was marked as outdated.

@itoqa

itoqa bot commented Mar 26, 2026

Ito Test Report ✅

17 test cases ran. 17 passed.

Across the unified run, all 17 case-level tests passed (100% pass rate) and no confirmed production defect was identified. Two additional harness attempts, one of which was interrupted, produced no includable case outcomes. Key validated behaviors: durable endpoints returning correct SSE and workflow headers; status and stream resume support; classic-mode route correctness; approval handling robustness (resume, idempotency, mismatch isolation, race convergence); strict 400 rejection for approval responses missing conversationId; execution-mode UI correctness (default Classic, persistence, rapid-toggle last-write-wins, mobile usability, unsaved-change discard); and expected auth behavior, where the local dev/test fallback can yield 404 while production paths enforce 401 for missing or invalid bearer auth.

✅ Passed (17)
| Category | Summary | Screenshot |
| --- | --- | --- |
| Adversarial | Dev/test fallback context caused local 404 project-not-found responses; production auth path still enforces 401 for missing/invalid bearer auth. | ADV-1 |
| Adversarial | Duplicate approval submissions converged idempotently to one completed execution state. | ADV-3 |
| Adversarial | Mismatched executionId/toolCallId did not resume the wrong run; correct pairing resumed only intended run. | ADV-4 |
| Adversarial | Concurrent mixed approval decisions converged to one terminal completed state. | ADV-7 |
| Edge | Approval response without conversationId is rejected with HTTP 400 as expected. | EDGE-1 |
| Edge | New agent metadata showed Execution mode defaulted to Classic. | EDGE-4 |
| Edge | Execution mode persisted Durable and Classic correctly across save + hard reload. | EDGE-5 |
| Edge | After rapid toggles and saves, persisted mode matched the final selection (Durable). | EDGE-6 |
| Edge | In iPhone 13 viewport, Execution mode control and Save action remained usable. | EDGE-7 |
| Edge | Back/forward and refresh during unsaved edits did not persist unsaved mode changes. | EDGE-8 |
| Happy-path | Durable execution create endpoint returns SSE with workflow run header. | ROUTE-1 |
| Happy-path | Execution status endpoint returns lifecycle fields for a created run. | ROUTE-2 |
| Happy-path | Reconnect stream path supports start-index resume behavior. | ROUTE-3 |
| Happy-path | Durable /run/api/chat path returns SSE with required durable headers. | ROUTE-4 |
| Happy-path | Durable /run/v1/chat/completions path returns SSE with workflow run header. | ROUTE-5 |
| Happy-path | Classic-mode chat routes behave correctly; investigation found no production defect. | ROUTE-6 |
| Happy-path | Approval resume behavior is implemented; prior block was environment-context mismatch, not product defect. | ROUTE-7 |

Commit: ccc053e


@anubra266 anubra266 enabled auto-merge March 26, 2026 17:28
Contributor

@claude claude bot left a comment


PR Review Summary

(0) Total Issues | Risk: Low

Re-review scope: This review covers only the delta since commit f667ec814aa5 — 10 commits. No durable execution code was modified — all changes are from merging the main branch.

Delta Analysis

The delta since the last review contains only main branch merges:

| Commit | Description | Impact on Durable Execution |
| --- | --- | --- |
| e2dbe0ac8 | Merge branch 'main' into durable-execution | None (integration merge) |
| f4a9c69d5 | Bugfix/compressor bug (#2833) | None (compression fix, unrelated) |
| 59d4a2e7f | Add github-repo-sanity skill (#2839) | None (new skill) |
| 39d13a42e | Block rewrites of existing migrations (#2841) | None (CI tooling) |
| d43176ee6 | bump pullfrog (#2835) | None (dependency) |
| 8d1a7dd83 | ci: retry Railway preview discovery (#2800) | None (CI improvement) |
| 8bf717a7e | ci: wait for preview SpiceDB readiness (#2815) | None (CI improvement) |
| c91bde911 | docs: add Migration Lineage Check (#2840) | None (documentation) |
| c3980cda1 | Version Packages (#2836) | None (version bumps) |
| ccc053e94 | chore: update OpenAPI snapshot | None (snapshot refresh) |

No files in the durable execution implementation were modified:

  • agents-api/src/domains/run/routes/executions.ts — unchanged
  • agents-api/src/domains/run/workflow/** — unchanged
  • agents-api/src/domains/run/stream/durable-stream-helper.ts — unchanged
  • agents-api/scripts/build-workflow.ts — unchanged
  • packages/agents-core/src/data-access/runtime/workflowExecutions.ts — unchanged

🔴❗ Critical (0) ❗🔴

None.

🟠⚠️ Major (0) 🟠⚠️

None.

🟡 Minor (0) 🟡

None.

💭 Consider (0) 💭

None.

🧹 While You're Here (0) 🧹

None identified.

🕐 Pending Recommendations (4)

These items were raised in prior reviews and remain applicable — the relevant code has not changed:

  • 🟠 agentExecutionSteps.ts:792 In executeToolStep, the stream is always released with releaseLock() but never closed with close(); potential stream lifecycle inconsistency when the workflow completes after tool execution
  • 🟠 build-workflow.ts:57-59 File restoration can fail partway through, leaving inconsistent state if a write fails mid-loop
  • 🟡 agentExecutionSteps.ts:137,145 Silent catch blocks suppress configuration errors — relation enhancement failures are silently swallowed with no logging
  • 💭 agentExecutionSteps.ts Test coverage for new step functions (~500 LOC with no unit tests)
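
The build-workflow.ts item above describes a restore loop that can stop at the first failed write and leave the remaining files unrestored. A minimal sketch of one way to avoid that is to attempt every restore and report failures afterwards. The `Backup` shape, the injected `write` function, and the `restoreAll` name are all assumptions for illustration; the real script's data structures and file API are not shown in this PR conversation.

```typescript
// Hypothetical shapes: the real script's backup format and write call may differ.
type Backup = { path: string; contents: string };
type WriteFn = (path: string, contents: string) => void;

interface RestoreResult {
  restored: string[];
  failed: { path: string; reason: string }[];
}

// Attempts every restore, recording failures instead of stopping at the first one.
function restoreAll(backups: Backup[], write: WriteFn): RestoreResult {
  const result: RestoreResult = { restored: [], failed: [] };
  for (const backup of backups) {
    try {
      write(backup.path, backup.contents);
      result.restored.push(backup.path);
    } catch (error) {
      // Keep going so one failed write does not strand the remaining files.
      result.failed.push({
        path: backup.path,
        reason: error instanceof Error ? error.message : String(error),
      });
    }
  }
  return result;
}
```

A caller could then fail the build with a message listing `failed` paths, so an incomplete restore is visible rather than silent.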

💡 APPROVE WITH SUGGESTIONS

Summary: The delta since the last review contains only main branch merges — no durable execution code was modified. The merge integration is clean with no conflicts visible in the durable execution paths. The 4 pending recommendations from prior reviews remain applicable and should be addressed before production rollout, but they are pre-existing and not regressions from this delta. The durable execution implementation is architecturally sound and ready for merge.


Discarded (0)

No findings discarded.

Reviewers (0)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Total | 0 | 0 | 0 | 0 | 0 | 4 | 0 |

Note: Domain-specific subagent reviewers were not dispatched for this delta — no durable execution code was modified. The delta was analyzed directly by the orchestrator to verify the merge integration and confirm pending recommendations still apply.

Contributor

@claude claude bot left a comment


PR Review Summary

(0) Total Issues | Risk: Low

Re-review scope: This review covers only the delta since commit e2dbe0ac869e — 2 commits affecting 5 files. No durable execution code was modified — the delta is entirely from merging a UI performance optimization PR from main.

Delta Analysis

| Commit | Description | Impact on Durable Execution |
| --- | --- | --- |
| 2ccae5b22 | Merge branch 'main' into durable-execution | None (integration merge) |
| 15c3f9d33 | Improve performance of Sidepane in agent page (#2846) | None (UI-only optimization) |

Files changed (all UI layer):

  • .changeset/cuddly-tigers-happen.md — Changeset for UI package
  • page.client.tsx — Agent page with Activity component wrapping
  • copilot-chat.tsx — Removed prop drilling, uses useParams()
  • playground.tsx — Removed prop drilling, uses useParams()
  • sidepane.tsx — Added 'use memo' directive, refactored to non-memoized pattern

Changes are sound React 19 patterns:

  1. <Activity mode={visible ? 'visible' : 'hidden'}> — Deferred rendering for hidden panels
  2. 'use memo' directive — React Compiler automatic memoization
  3. Prop simplification — Using useParams() instead of prop drilling

No durable execution files were modified:

  • agents-api/src/domains/run/routes/executions.ts — unchanged
  • agents-api/src/domains/run/workflow/** — unchanged
  • agents-api/src/domains/run/stream/durable-stream-helper.ts — unchanged
  • agents-api/scripts/build-workflow.ts — unchanged

🔴❗ Critical (0) ❗🔴

None.

🟠⚠️ Major (0) 🟠⚠️

None.

🟡 Minor (0) 🟡

None.

💭 Consider (0) 💭

None.

🧹 While You're Here (0) 🧹

None identified.

🕐 Pending Recommendations (4)

These items were raised in prior reviews and remain applicable — the relevant code has not changed:

  • 🟠 agentExecutionSteps.ts:792 In executeToolStep, the stream is always released with releaseLock() but never closed with close(); potential stream lifecycle inconsistency when the workflow completes after tool execution
  • 🟠 build-workflow.ts:57-59 File restoration can fail partway through, leaving inconsistent state if a write fails mid-loop
  • 🟡 agentExecutionSteps.ts:137,145 Silent catch blocks suppress configuration errors — relation enhancement failures are silently swallowed with no logging
  • 💭 agentExecutionSteps.ts Test coverage for new step functions (~500 LOC with no unit tests)

✅ APPROVE

Summary: The delta since the last review contains only a main branch merge bringing in a UI performance optimization — no durable execution code was modified. The React 19 Activity component usage and 'use memo' directive are sound patterns. The 4 pending recommendations from prior reviews remain applicable and should be addressed before production rollout, but they are pre-existing and not regressions from this delta. The durable execution implementation is architecturally sound and ready for merge. 🚀


Discarded (0)

No findings discarded.

Reviewers (0)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Total | 0 | 0 | 0 | 0 | 0 | 4 | 0 |

Note: Domain-specific subagent reviewers were not dispatched for this delta — no durable execution code was modified. The delta was analyzed directly by the orchestrator to verify the merge integration and confirm pending recommendations still apply.

@github-actions github-actions bot deleted a comment from claude bot Mar 26, 2026
@itoqa

itoqa bot commented Mar 26, 2026

Ito Test Report ❌

19 test cases ran. 2 failed, 17 passed.

Of 19 test cases, 17 passed and 2 failed. The run verified Manage UI execution-mode behavior (Classic default; Durable persistence across reload, mobile, and navigation churn), durable streaming and execution endpoints (SSE contracts, x-workflow-run-id and Vercel v2 headers, create/status/reconnect), approval flows (idempotent approvals, suspended-run resume, correct 400/404 handling, and the classic JSON results path), and header-hardening resilience against malformed or partial timezone, timestamp, and forwarded headers without service instability.
Two defects were found, both flagged as likely introduced by the PR. The first is a high-severity reconnect validation gap: GET /run/api/executions/:executionId/stream accepts negative x-stream-start-index values and can trigger replay failure paths. The second is a medium-severity middleware error-mapping issue: unauthenticated /run/api/executions* requests can return 500 “Context validation failed” instead of a clean auth rejection.

❌ Failed (2)
| Category | Summary | Screenshot |
| --- | --- | --- |
| Adversarial | 🟠 Unauthenticated execution endpoints can return internal context-validation 500 errors instead of auth rejection responses. | ADV-1 |
| Edge | ⚠️ Reconnect endpoint accepts negative stream start index without validation and can trigger stream replay failure paths. | EDGE-3 |

🟠 Execution endpoints return 500 on unauthenticated requests
  • What failed: Instead of consistently returning auth rejection responses, requests can hit context-validation failure handling and surface an internal 500-style error (Context validation failed).
  • Impact: Clients receive misleading internal server errors for unauthenticated calls, making auth failures harder to diagnose and increasing endpoint instability perception. This can also mask proper access-control semantics expected by integrators.
  • Introduced by this PR: Yes – this PR modified the relevant code
  • Steps to reproduce:
    1. Send an unauthenticated POST request to /run/api/executions with execution headers/body.
    2. Send an unauthenticated GET request to /run/api/executions/:executionId.
    3. Observe failure response path can surface Context validation failed as an internal server error instead of clean auth rejection.
  • Code analysis: I reviewed the executions route middleware chain and context validation error handling; the executions routes mount context validation globally and the middleware catch block converts validation failures to internal_server_error, which can preempt expected auth responses.
  • Why this is likely a bug: The middleware unconditionally remaps downstream validation/auth-path exceptions to internal server errors, producing incorrect response semantics in production request handling.

Relevant code:

agents-api/src/domains/run/routes/executions.ts (lines 191-192)

app.use('/executions', contextValidationMiddleware);
app.use('/executions/*', contextValidationMiddleware);

agents-api/src/domains/run/context/validation.ts (lines 417-421)

const errorMessage = `Invalid headers: ${validationResult.errors.map((e) => `${e.field}: ${e.message}`).join(', ')}`;
throw createApiError({
  code: 'bad_request',
  message: errorMessage,
});

agents-api/src/domains/run/context/validation.ts (lines 436-446)

} catch (error) {
  logger.error(
    {
      error: error instanceof Error ? error.message : 'Unknown error',
    },
    'Context validation middleware error'
  );
  throw createApiError({
    code: 'internal_server_error',
    message: 'Context validation failed',
  });
}
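
The catch block quoted above turns every thrown error, including the deliberate bad_request from header validation, into internal_server_error. A minimal sketch of one possible alternative is to let errors that were raised on purpose pass through and remap only unexpected failures. The `createApiError` below is a hypothetical stand-in with the same `{ code, message }` argument shape seen in the excerpt; the repository's real helper, its error type, and the middleware signature are assumptions here, as are `isApiError` and `runContextValidation`.

```typescript
// Hypothetical stand-ins: the real createApiError and its error type live in the
// repository and may differ. A marker property is used so the check does not
// depend on subclassing Error.
type ApiError = Error & { apiError: true; code: string };

function createApiError(args: { code: string; message: string }): ApiError {
  const error = new Error(args.message) as ApiError;
  error.apiError = true;
  error.code = args.code;
  return error;
}

function isApiError(error: unknown): error is ApiError {
  return error instanceof Error && (error as Partial<ApiError>).apiError === true;
}

// Runs a validation step. Errors that validation raised on purpose (for example
// bad_request) pass through unchanged; only unexpected failures become a 500.
function runContextValidation<T>(validate: () => T): T {
  try {
    return validate();
  } catch (error) {
    if (isApiError(error)) {
      throw error;
    }
    throw createApiError({
      code: 'internal_server_error',
      message: 'Context validation failed',
    });
  }
}
```

Under this sketch a client sending invalid headers would see the original bad_request rather than a 500. Whether unauthenticated requests should be rejected before context validation runs at all is a separate ordering question that the sketch does not address.
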
⚠️ Malformed reconnect start index fails safely
  • What failed: The reconnect path parses x-stream-start-index and passes it directly to the workflow stream reader without guarding against invalid/negative offsets, allowing unsafe runtime behavior instead of controlled rejection.
  • Impact: A malformed client header can push reconnect handling into runtime error paths and destabilize stream recovery. This weakens robustness of durable execution reconnection under malformed input.
  • Introduced by this PR: Yes – this PR modified the relevant code
  • Steps to reproduce:
    1. Create a durable execution using the executions API.
    2. Reconnect the same execution stream with header x-stream-start-index: -1.
    3. Observe reconnect handling enters stream failure behavior instead of rejecting invalid offset input.
  • Code analysis: I inspected the new durable executions route implementation and found no finite/non-negative validation before replaying stream segments by index.
  • Why this is likely a bug: User-controlled reconnect offsets are consumed by production stream replay logic without boundary checks, which is an input-validation defect rather than a test artifact.

Relevant code:

agents-api/src/domains/run/routes/executions.ts (lines 324-325)

const startIndexHeader = c.req.header('x-stream-start-index');
const startIndex = startIndexHeader ? Number.parseInt(startIndexHeader, 10) : 0;

agents-api/src/domains/run/routes/executions.ts (lines 342-345)

return stream(c, async (s) => {
  try {
    const readable = run.getReadable({ startIndex });
    const reader = readable.getReader();
✅ Passed (17)
| Category | Summary | Screenshot |
| --- | --- | --- |
| Adversarial | Malformed forwarded headers returned controlled errors and did not break execution or API health. | ADV-6 |
| Edge | Missing conversationId in approval-responded payload correctly returned HTTP 400 validation error. | EDGE-1 |
| Edge | Unknown conversationId in approval-responded payload correctly returned HTTP 404 Conversation not found. | EDGE-2 |
| Edge | Partial timezone/timestamp headers were ignored and matched the valid-header control without parser instability. | EDGE-4 |
| Edge | Invalid and oversized timezone/timestamp headers were ignored safely with controlled responses and stable service health. | EDGE-5 |
| Edge | Classic-mode approval response correctly used JSON results path with success semantics instead of durable SSE continuation. | EDGE-6 |
| Edge | On mobile viewport, the metadata pane remained usable without horizontal overflow and Durable persisted after save and reload. | EDGE-7 |
| Journey | Previous blocking was infrastructure-related; after local routing/fixture fixes, durable execution reconnect worked and repeated reconnect did not duplicate user messages or create a new execution. | N/A |
| Journey | After rapid mode toggling, save, navigation churn, and reload, the final persisted value remained Durable. | JOURNEY-2 |
| Happy-path | Execution mode control was visible in Agent Settings with Classic and Durable options, and defaulted to Classic for a newly created agent. | ROUTE-1 |
| Happy-path | Mode changed from Classic to Durable, saved successfully, and remained Durable after reload and reopening settings. | ROUTE-2 |
| Happy-path | Durable /run/v1/chat/completions returned HTTP 200 SSE with x-workflow-run-id present. | ROUTE-3 |
| Happy-path | Durable /run/api/chat returned HTTP 200 SSE with x-vercel-ai-data-stream: v2 and x-workflow-run-id. | ROUTE-4 |
| Happy-path | Durable execution creation endpoint returned HTTP 200 SSE with reusable execution identifier semantics. | ROUTE-5 |
| Happy-path | Durable execution status endpoint returned HTTP 200 with expected execution fields and valid status enum. | ROUTE-6 |
| Happy-path | Approval endpoint behaved idempotently and returned success on initial and duplicate approval submissions for an existing execution. | ROUTE-8 |
| Happy-path | Approval-responded durable fast-path correctly resumed suspended execution and returned SSE continuation behavior. | ROUTE-9 |

Commit: 2ccae5b


Contributor

@claude claude bot left a comment


PR Review Summary

(0) Total Issues | Risk: Low

Re-review scope: This review covers only the delta since commit 2ccae5b22068 — 1 commit (b6c5f000c knip) affecting 2 files.

Delta Analysis

The single change in this delta is dependency cleanup:

| File | Change | Impact on Durable Execution |
| --- | --- | --- |
| agents-api/package.json | Removed unused @workflow/ai dependency, alphabetized ipaddr.js/hono-pino | None (knip warning fix) |
| pnpm-lock.yaml | Lockfile update | None (dependency removal) |

No durable execution code was modified:

  • agents-api/src/domains/run/routes/executions.ts — unchanged
  • agents-api/src/domains/run/workflow/** — unchanged
  • agents-api/src/domains/run/stream/durable-stream-helper.ts — unchanged
  • agents-api/scripts/build-workflow.ts — unchanged
  • packages/agents-core/src/data-access/runtime/workflowExecutions.ts — unchanged

🔴❗ Critical (0) ❗🔴

None.

🟠⚠️ Major (0) 🟠⚠️

None.

🟡 Minor (0) 🟡

None.

💭 Consider (0) 💭

None.

🧹 While You're Here (0) 🧹

None identified.

🕐 Pending Recommendations (4)

These items were raised in prior reviews and remain applicable — the relevant code has not changed:

  • 🟠 agentExecutionSteps.ts:792 In executeToolStep, the stream is always released with releaseLock() but never closed with close(); potential stream lifecycle inconsistency when the workflow completes after tool execution
  • 🟠 build-workflow.ts:57-59 File restoration can fail partway through, leaving inconsistent state if a write fails mid-loop
  • 🟡 agentExecutionSteps.ts:137,145 Silent catch blocks suppress configuration errors — relation enhancement failures are silently swallowed with no logging
  • 💭 agentExecutionSteps.ts Test coverage for new step functions (~500 LOC with no unit tests)

✅ APPROVE

Summary: The delta since the last review is trivial — removing an unused @workflow/ai dependency and fixing dependency ordering to satisfy knip. No durable execution code was modified. The 4 pending recommendations from prior reviews remain applicable and should be addressed before production rollout, but they are pre-existing and not regressions from this delta. The durable execution implementation is architecturally sound and ready for merge. 🚀


Discarded (0)

No findings discarded.

Reviewers (0)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Total | 0 | 0 | 0 | 0 | 0 | 4 | 0 |

Note: Domain-specific subagent reviewers were not dispatched for this delta — no durable execution code was modified. The delta was analyzed directly by the orchestrator to verify the dependency cleanup is sound.

@github-actions github-actions bot deleted a comment from claude bot Mar 26, 2026
@anubra266 anubra266 added this pull request to the merge queue Mar 26, 2026
Merged via the queue into main with commit ad874d0 Mar 26, 2026
17 of 18 checks passed
@anubra266 anubra266 deleted the durable-execution branch March 26, 2026 20:14
@github-actions
Contributor

🔎💬 Inkeep AI search and chat service is syncing content for source 'Inkeep Agent Framework Docs'

@itoqa

itoqa bot commented Mar 26, 2026

Ito Test Report ❌

19 test cases ran. 1 failed, 18 passed.

Of 19 test cases, 18 passed and 1 failed, confirming that core run/chat behavior is stable: execution mode persists correctly (including on mobile and through complex navigation), classic vs. durable routing and headers behave as expected, approval guardrails and concurrency/idempotency checks hold, malformed or missing client-time headers are handled safely, authorization boundaries prevent cross-project access, and injection payloads are treated as inert content. The single significant issue is a medium-severity defect in durable stream reconnect: GET /run/api/executions/:executionId/stream accepts invalid x-stream-start-index values (for example NaN, negative, or extreme values) without validation and forwards them to stream retrieval. This makes reconnection unreliable; the fix is input validation or normalization.

❌ Failed (1)
| Category | Summary | Screenshot |
| --- | --- | --- |
| Adversarial | 🟠 Reconnect endpoint accepts invalid x-stream-start-index values without validation (from TC-ADV-6). | ADV-6 |

🟠 Reconnect endpoint handles invalid start index unsafely
  • What failed: The endpoint parses header text with Number.parseInt but never validates the result, so invalid values (including NaN) are forwarded to workflow stream retrieval instead of being rejected or normalized.
  • Impact: Clients can trigger unstable reconnect behavior and inconsistent error handling when sending malformed offsets. This degrades resilience for stream resume flows and can break reconnection reliability.
  • Introduced by this PR: Yes – this PR modified the relevant code
  • Steps to reproduce:
    1. Create a durable execution via POST /run/api/executions and capture the returned execution ID.
    2. Call GET /run/api/executions/:executionId/stream with x-stream-start-index: NaN (and similarly -1 or a very large index).
    3. Observe that the route forwards the parsed value into stream retrieval instead of rejecting/normalizing invalid input.
  • Code analysis: I reviewed durable execution reconnect routing and found no guardrails for malformed x-stream-start-index; the parsed value flows directly into run.getReadable({ startIndex }), which matches the observed failure mode.
  • Why this is likely a bug: Production reconnect logic accepts unchecked header-derived offsets and passes them into stream internals, which is a direct input-validation defect rather than test setup interference.

Relevant code:

agents-api/src/domains/run/routes/executions.ts (lines 320-326)

app.openapi(reconnectExecutionStreamRoute, async (c) => {
  const executionContext = c.get('executionContext');
  const { tenantId, projectId } = executionContext;
  const { executionId } = c.req.valid('param');
  const startIndexHeader = c.req.header('x-stream-start-index');
  const startIndex = startIndexHeader ? Number.parseInt(startIndexHeader, 10) : 0;

agents-api/src/domains/run/routes/executions.ts (lines 342-345)

return stream(c, async (s) => {
  try {
    const readable = run.getReadable({ startIndex });
    const reader = readable.getReader();

agents-api/src/domains/run/routes/executions.ts (lines 351-354)

} catch (error) {
    logger.error({ error, executionId }, 'Error reconnecting to execution stream');
    await s.write(`event: error\ndata: ${JSON.stringify({ error: 'Stream error' })}\n\n`);
  }
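
The excerpts above show the header value going through Number.parseInt and straight into run.getReadable({ startIndex }). A minimal sketch of the kind of guard the report asks for is below. `parseStartIndex`, `StartIndexResult`, and `MAX_START_INDEX` are hypothetical names, the upper bound is an arbitrary illustration, and whether the route should reject with a 400 or fall back to 0 is a design choice the PR conversation does not settle.

```typescript
type StartIndexResult =
  | { ok: true; startIndex: number }
  | { ok: false; message: string };

// Hypothetical upper bound; the PR does not specify one.
const MAX_START_INDEX = 1000000;

function parseStartIndex(header: string | undefined): StartIndexResult {
  // A missing or blank header keeps the default of 0 used by the route.
  if (header === undefined || header.trim() === '') {
    return { ok: true, startIndex: 0 };
  }
  const trimmed = header.trim();
  // Digits only: rejects "-1", "NaN", "1.5", and "12abc" (which parseInt would read as 12).
  if (!/^\d+$/.test(trimmed)) {
    return { ok: false, message: 'x-stream-start-index must be a non-negative integer' };
  }
  const value = Number(trimmed);
  if (value > MAX_START_INDEX) {
    return { ok: false, message: `x-stream-start-index must not exceed ${MAX_START_INDEX}` };
  }
  return { ok: true, startIndex: value };
}
```

In the route, a handler could call this before opening the stream and return a bad_request response when `ok` is false, so that only a finite, non-negative integer reaches stream retrieval.
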
✅ Passed (18)
| Category | Summary | Screenshot |
| --- | --- | --- |
| Adversarial | Rapid mode toggles with repeated saves preserved the final intended Classic value after reload. | ADV-4 |
| Adversarial | Injection payload test (TC-ADV-5) was handled safely; the prior error was setup-related and the corrected run streamed successfully. | ADV-5 |
| Edge | New agent without explicit mode behaved as Classic and no durable run header was emitted. | EDGE-6 |
| Edge | Execution mode control remained visible and savable on 390x844 mobile viewport. | EDGE-8 |
| Flow | Unsaved deep-link refresh changes were discarded, while saved changes persisted through back/forward. | FLOW-1 |
| Happy-path | Mode persisted across save+reload transitions from Durable to Classic. | ROUTE-1 |
| Happy-path | Classic mode /run/v1/chat/completions returned 200 SSE without x-workflow-run-id. | ROUTE-2 |
| Happy-path | Durable mode /run/v1/chat/completions returned 200 SSE with x-workflow-run-id. | ROUTE-3 |
| Happy-path | POST /run/api/executions returned 200 SSE with x-workflow-run-id. | ROUTE-4 |
| Tcadv | TC-ADV-1: Execution created under Project A was accessible with Project A credentials but returned not-found for status, stream, and approval mutation when accessed using Project B credentials, indicating project boundary enforcement without data leakage. | TCADV-1 |
| Tcadv | TC-ADV-2: Mismatched toolCallId did not resume execution; correct toolCallId resumed as expected. | TCADV-2 |
| Tcadv | TC-ADV-3: Concurrent approve/deny requests resolved safely to one terminal outcome without server error. | TCADV-3 |
| Tcedge | TC-EDGE-1: Approval response without conversationId correctly returns 400 guardrail error. | TCEDGE-1 |
| Tcedge | TC-EDGE-2: Approval response for unknown conversationId correctly returns 404 without resuming execution. | TCEDGE-2 |
| Tcedge | TC-EDGE-3: Duplicate approval submissions behaved idempotently and execution stayed in a coherent completed state. | TCEDGE-3 |
| Tcedge | Partial client-time headers were ignored safely; API continued returning stable 200 SSE responses. | TCEDGE-4 |
| Tcedge | Malformed client-time headers were ignored safely; API remained stable and control behavior stayed consistent. | TCEDGE-5 |
| Tcedge | TC-EDGE-7: Using a fake execution/toolCall ID, status, stream, and approval endpoints all returned consistent HTTP 404 not_found responses and exposed no unrelated execution data. | TCEDGE-7 |

Commit: b6c5f00



2 participants