
feat: stream resumption for interrupted conversations#2710

Open
anubra266 wants to merge 2 commits into durable-execution from feat/stream-resumption

Conversation

@anubra266
Contributor

@anubra266 anubra266 commented Mar 16, 2026

Summary

Implements the Vercel AI SDK resume pattern using an in-memory stream buffer, so users who disconnect mid-stream can return and receive the response seamlessly.

  • stream-buffer-registry: new module that buffers encoded SSE chunks per conversationId in memory using a string[] + EventEmitter. On resume, replays historical chunks then forwards live ones.
  • chatDataStream: tees the encoded SSE stream — one copy to the HTTP client, one to the buffer registry. Cleans up automatically when the stream completes.
  • GET /run/v1/conversations/:conversationId/stream: new endpoint the Vercel AI SDK useChat calls on mount. Returns the buffered stream if one is active, or 204 if not (causing the SDK to fall back to loaded DB messages).
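Sketched very roughly, the buffer registry described above could look like this (illustrative only — the class and method names approximate the PR description, not the actual stream-buffer-registry module):

```typescript
import { EventEmitter } from "node:events";

// Minimal sketch of the per-conversation buffer described above.
// Names are illustrative, not the real implementation.
interface BufferEntry {
  chunks: string[];        // encoded SSE chunks, in arrival order
  emitter: EventEmitter;   // notifies live tails of new chunks
  done: boolean;           // set once the producing stream completes
}

class StreamBufferRegistry {
  private buffers = new Map<string, BufferEntry>();

  register(conversationId: string): void {
    this.buffers.set(conversationId, {
      chunks: [],
      emitter: new EventEmitter(),
      done: false,
    });
  }

  push(conversationId: string, chunk: string): void {
    const entry = this.buffers.get(conversationId);
    if (!entry) return;
    entry.chunks.push(chunk);
    entry.emitter.emit("chunk", chunk);
  }

  complete(conversationId: string): void {
    const entry = this.buffers.get(conversationId);
    if (!entry) return;
    entry.done = true;
    entry.emitter.emit("done");
  }

  // Replay history; a live tail would additionally subscribe to "chunk".
  replay(conversationId: string): string[] | undefined {
    return this.buffers.get(conversationId)?.chunks.slice();
  }
}
```

A resuming client would first receive everything from replay(), then keep listening on the emitter until "done".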

Closes PRD-6312

@changeset-bot

changeset-bot bot commented Mar 16, 2026

⚠️ No Changeset found

Latest commit: 03d9b64

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@vercel

vercel bot commented Mar 16, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
|---|---|---|---|
| agents-api | Ready | Preview, Comment | Mar 25, 2026 0:27am |
| agents-docs | Error | | Mar 25, 2026 0:27am |
| agents-manage-ui | Ready | Preview, Comment | Mar 25, 2026 0:27am |


@pullfrog
Contributor

pullfrog bot commented Mar 16, 2026

TL;DR — Adds two complementary stream-resumption mechanisms for interrupted conversations: an in-memory buffer (classic mode) that tees encoded SSE chunks and replays them on reconnect via GET /run/v1/conversations/:conversationId/stream, and a durable execution engine (new executionMode: 'durable') backed by WDK workflows that persists execution state across crashes, supports tool-approval suspend/resume, and exposes a full /run/api/executions REST surface.

Key changes

  • stream-buffer-registry — New in-memory per-conversation Uint8Array[] buffer with EventEmitter-based live tailing and 60 s post-completion TTL, enabling classic-mode stream replay.
  • GET /conversations/:conversationId/stream — New endpoint that replays the buffered stream (200) or returns 204 when no active stream exists, implementing the Vercel AI SDK resume pattern.
  • chatDataStream tee — The Vercel data stream is now tee()'d: one copy goes to the HTTP client, the other feeds the buffer registry; the [DONE] sentinel is removed from SSEStreamHelper.complete().
  • stream-registry → globalThis — The StreamHelper registry is moved from a module-scoped Map to a globalThis-backed singleton so the WDK workflow bundle and the main Hono app share the same instance.
  • Durable execution workflow — New agentExecutionWorkflow with 'use workflow' / 'use step' directives: initializeTask → callLlm → (transfer | tool_calls | completion) loop, with defineHook-based tool-approval suspend/resume.
  • /run/api/executions routes — New POST /executions (start), GET /executions/:id (status), GET /executions/:id/stream (reconnect), and POST /executions/:id/approvals/:toolCallId (approve/deny) endpoints.
  • durable-stream-helper — WritableBackedHonoSSEStream and WritableBackedVercelWriter adapters that write SSE events to WDK WritableStream<Uint8Array> for durable persistence.
  • agentExecutionSteps — 880-line step library (callLlmStep, executeToolStep, markWorkflow{Running,Suspended,Resuming,Complete,Failed}Step) that reconstructs the full agent context per step.
  • tool-approval durable path — When durableWorkflowRunId is set, approval returns { approved: 'pending' } instead of blocking, sets ctx.pendingDurableApproval, and the workflow suspends via defineHook.
  • tool-wrapper effectiveToolCallId — Pre-approved tool calls reuse the original toolCallId for stream continuity across suspend/resume cycles.
  • executionMode schema + UI — New execution_mode column on agents (manage DB), workflow_executions table (runtime DB), Zod schemas, DAL functions, and a Select dropdown in the metadata editor.
  • build-workflow.ts ?raw inlining — Pre-processes Vite ?raw imports into string literals before WDK esbuild bundling, then restores the originals.

Summary | 50 files | 12 commits | base: main ← feat/stream-resumption


In-memory stream-buffer-registry

Before: Disconnecting mid-stream meant losing the entire response; the client had to wait for the next DB-persisted message load.
After: Encoded SSE chunks are buffered per conversationId in memory. Reconnecting clients receive a full replay of historical chunks followed by live-tailed new ones, with automatic cleanup 60 s after stream completion.

The registry uses EventEmitter for live chunk forwarding and a done flag to handle the race between replay and completion. In chatDataStream, the encoded stream is tee()'d — one branch serves the HTTP client, the other feeds the buffer asynchronously. The writeDone() / [DONE] sentinel was removed from SSEStreamHelper.complete() since the Vercel AI SDK data stream format doesn't use it and it interfered with clean stream termination.
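The tee-and-drain wiring might look roughly like this (a sketch under assumptions — teeToBuffer is a hypothetical helper, not the actual chatDataStream code):

```typescript
// Illustrative tee of an encoded SSE stream: one branch serves the client,
// the other is drained into a buffer. Names here are hypothetical.
function teeToBuffer(
  encoded: ReadableStream<Uint8Array>,
  onChunk: (chunk: Uint8Array) => void,
  onDone: () => void,
): ReadableStream<Uint8Array> {
  const [clientBranch, bufferBranch] = encoded.tee();

  // Drain the buffer branch asynchronously; errors here must never
  // propagate to the client branch.
  (async () => {
    const reader = bufferBranch.getReader();
    try {
      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        onChunk(value);
      }
    } finally {
      onDone();
    }
  })().catch(() => { /* log at warn; never throw */ });

  return clientBranch;
}
```

Because tee() buffers internally, a slow buffer consumer does not stall the client branch; the trade-off is that unread chunks accumulate in memory until both branches are drained.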

stream-buffer-registry.ts · chatDataStream.ts · conversations.ts · stream-helpers.ts


Durable execution workflow engine

Before: Agent execution was ephemeral — a server crash or process restart lost all in-flight state, and tool approvals required the original HTTP connection to remain open.
After: Agents with executionMode: 'durable' run inside a WDK workflow with checkpointed steps. The workflow suspends on tool-approval requests via defineHook, persists state to the workflow_executions table, and resumes when the approval endpoint is called.

The workflow loop (agentExecutionWorkflow) iterates through LLM calls and tool executions with transfer support. Each step reconstructs the full agent context (project, sub-agent relations, tools, credentials) via buildAgentForStep. Stream output is written to WDK WritableStream via adapter classes, allowing clients to reconnect at any point via getRun(id).getReadable().

How does tool approval suspend/resume work?

When a tool requires approval in durable mode, waitForToolApproval sets ctx.pendingDurableApproval and returns { approved: 'pending' } instead of blocking on PendingToolApprovalManager. The workflow then calls markWorkflowSuspendedStep (updating the DB status to suspended with a continuationStreamNamespace), creates a defineHook with token tool-approval:{conversationId}:{runId}:{toolCallId}, and awaits it. When the client calls POST /executions/:id/approvals/:toolCallId (or sends an approval via the /chat endpoint for suspended durable executions), toolApprovalHook.resume(token, decision) unblocks the workflow, which then calls executeToolStep with the pre-approved decision and continues the LLM loop on a new stream namespace.
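The token-keyed handshake can be modeled with a toy promise registry (this is NOT the WDK defineHook API — just an illustration of the suspend/resume flow; the token format is taken from the description above):

```typescript
// Toy model of the token-keyed suspend/resume handshake.
type Decision = { approved: boolean; reason?: string };

class ToyHook {
  private waiters = new Map<string, (d: Decision) => void>();

  // Workflow side: suspend until someone resumes this token.
  wait(token: string): Promise<Decision> {
    return new Promise((resolve) => this.waiters.set(token, resolve));
  }

  // Approval-endpoint side: unblock the waiting workflow exactly once.
  resume(token: string, decision: Decision): boolean {
    const resolve = this.waiters.get(token);
    if (!resolve) return false;
    this.waiters.delete(token);
    resolve(decision);
    return true;
  }
}

// Token shape from the PR: tool-approval:{conversationId}:{runId}:{toolCallId}
const approvalToken = (conv: string, run: string, call: string) =>
  `tool-approval:${conv}:${run}:${call}`;
```

In the real workflow the "wait" side survives process restarts because WDK persists the hook; the promise here only models the in-process control flow.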

agentExecution.ts · agentExecutionSteps.ts · durable-stream-helper.ts · executions.ts


Tool approval durable path and effectiveToolCallId

Before: Tool approval was purely in-memory via PendingToolApprovalManager — the approving client had to be connected to the same process that initiated the tool call.
After: In durable mode, approval state is externalized through WDK hooks. The tool-wrapper maps pre-approved tool calls back to their original toolCallId so stream events (input start/delta, output, errors) maintain continuity across suspend/resume boundaries.

The parseAndCheckApproval return type gains a pendingApproval?: true discriminant. handleStopWhenConditions short-circuits when pendingDurableApproval is set. The PendingToolApprovalManager itself gets a minor refactor (local pendingApprovals variable to avoid repeated this. access).
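One way to model that pendingApproval discriminant (a sketch; the real types in agent-types.ts may differ):

```typescript
// Illustrative discriminated return shape for parseAndCheckApproval in
// durable mode: "pending" is its own variant instead of a magic string
// inside a boolean union.
type ApprovalResult =
  | { approved: true; reason?: string }
  | { approved: false; reason?: string }
  | { pendingApproval: true; toolCallId: string };

function isPending(
  r: ApprovalResult,
): r is Extract<ApprovalResult, { pendingApproval: true }> {
  return "pendingApproval" in r;
}
```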

tool-approval.ts · tool-wrapper.ts · Agent.ts · agent-types.ts


Schema, DAL, and UI changes

Before: No executionMode concept existed; workflow execution state was not tracked in the database.
After: agents.execution_mode column (manage DB, default 'classic'), workflow_executions table (runtime DB) with composite PK and conversation index, full CRUD DAL, Zod schemas, and a Select dropdown in the agent metadata editor.

| Layer | Change |
|---|---|
| Manage schema | execution_mode varchar(50) NOT NULL DEFAULT 'classic' on agents |
| Runtime schema | New workflow_executions table with status enum (running/suspended/completed/failed) |
| Validation | WorkflowExecutionSelectSchema, InsertSchema, UpdateSchema; AgentInsertSchema gains executionMode |
| DAL | createWorkflowExecution, getWorkflowExecution, getWorkflowExecutionByConversation, updateWorkflowExecutionStatus |
| UI | executionMode field in AgentMetadata, Select component in MetadataEditor, serialization in serialize.ts |

manage-schema.ts · runtime-schema.ts · workflowExecutions.ts · metadata-editor.tsx


@pullfrog pullfrog bot left a comment

Thorough review of the durable execution mode and stream resumption feature. This is a well-structured addition, but there are several issues that need attention before merge — ranging from a credential-handling blocker to race conditions and type-safety concerns.

Critical (blocks merge):

  • buildAgentForStep passes credentialStoreRegistry: undefined and userId: undefined, which breaks any MCP tool requiring authenticated credentials
  • Orphan migration file 0022_futuristic_ozymandias.sql (identical to 0023) will cause confusion and potential conflicts
  • Race condition in createReadable() — TOCTOU gap between replaying and subscribing

High:

  • [DONE] sentinel removal from SSEStreamHelper.complete() breaks the agents-sdk which uses it as a stream-termination signal
  • WritableBackedVercelWriter.write() fire-and-forget async pattern violates WritableStream contract
  • Unguarded JSON.parse in generateTaskHandler can crash the task handler on malformed metadata

Medium:

  • Changeset should be minor (new column, new table, new endpoints)
  • approved: 'pending' string in a boolean union is fragile
  • Metadata replacement vs merge in updateWorkflowExecutionStatus


Comment on lines +157 to +158
credentialStoreRegistry: undefined,
userId: undefined,

credentialStoreRegistry: undefined and userId: undefined are passed to getMcpToolById. In the core data-access layer, discoverToolsFromServer throws when a tool has a credential reference and the registry is undefined. This will break any MCP tool requiring credentials in the durable path, while the non-durable path (generateTaskHandler.ts) passes the real registry from the request context.

Since workflow steps can't access the Hono request context, you'll likely need to serialize credential store config into the workflow payload or use a service-level credential store that doesn't depend on the request.

Comment on lines 381 to 385
async complete(finishReason = 'stop'): Promise<void> {
  await this.flushQueuedOperations();

  await this.writeCompletion(finishReason);
  await this.writeDone();
}

Removing the [DONE] sentinel from SSEStreamHelper.complete() is a breaking change. The agents-sdk (packages/agents-sdk/src/agent.ts) uses data: [DONE] as a break signal to stop reading the stream. Without it, the SDK will hang waiting for the stream to close via TCP FIN instead of cleanly terminating.

If durable mode doesn't need [DONE], consider keeping it in SSEStreamHelper and omitting it only in the durable stream adapters.
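That suggestion could be as simple as a flag on completion (illustrative; completeStream is a hypothetical stand-in for SSEStreamHelper.complete()):

```typescript
// Keep the [DONE] sentinel for SDK consumers while letting durable stream
// adapters opt out. Hypothetical sketch, not the actual helper.
function completeStream(
  write: (s: string) => void,
  opts: { emitDoneSentinel: boolean },
): void {
  write(`data: ${JSON.stringify({ type: "finish" })}\n\n`);
  if (opts.emitDoneSentinel) {
    write("data: [DONE]\n\n"); // agents-sdk uses this as its break signal
  }
}
```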

Comment on lines +57 to +61
write(chunk: unknown): void {
  const bytes = encoder.encode(`data: ${JSON.stringify(chunk)}\n\n`);
  this.writer.write(bytes).catch((err) => {
    logger.warn({ err }, 'Failed to write to durable stream');
  });

write() is synchronous but calls this.writer.write(bytes).catch(...) without awaiting it. If two write() calls happen in rapid succession, the second writer.write() is called before the first resolves. This violates the WritableStream contract (must not call write() while a previous write is pending) and can silently reorder chunks or throw.

Either make write() async and await the write, or chain writes via a promise queue.
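A promise-queue version might look like this (a sketch, not the actual WritableBackedVercelWriter; QueuedWriter is hypothetical):

```typescript
// Promise-queue pattern: serialize writes so a second sink write never
// starts before the previous one settles, keeping chunk order stable.
class QueuedWriter {
  private tail: Promise<void> = Promise.resolve();

  constructor(private sink: (chunk: Uint8Array) => Promise<void>) {}

  // Still synchronous for callers; each write is chained after the last.
  write(chunk: Uint8Array): void {
    this.tail = this.tail
      .then(() => this.sink(chunk))
      .catch(() => { /* log at warn; keep the chain alive */ });
  }

  // Await all queued writes (useful before closing the stream).
  flush(): Promise<void> {
    return this.tail;
  }
}
```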

Comment on lines +380 to +383
const message = error instanceof Error ? error.message : String(error);
if (!message.includes('not found') && !message.includes('already')) {
  throw error;
}

Swallowing errors whose message includes('not found') or includes('already') is fragile string-matching that could mask real errors (e.g. a database 'connection not found' error). Consider catching a specific error type or checking for an exact error code from the WDK hook resumption API.

@@ -0,0 +1,5 @@
---
"@inkeep/agents-api": patch

This changeset is patch, but the PR adds a new column (execution_mode) to the agents table, a new workflow_executions table, and new API endpoints (/executions). Per the project's semver guidance, schema changes requiring migration should be minor.

Comment on lines +79 to +86
const updateData: Record<string, unknown> = {
  status: params.status,
  updatedAt: now,
};

if (params.metadata !== undefined) {
  updateData.metadata = params.metadata;
}

updateWorkflowExecutionStatus fully replaces the metadata column when params.metadata is provided, rather than merging with existing metadata. This means markWorkflowSuspendedStep (which sets { continuationStreamNamespace }) followed by markWorkflowFailedStep (which sets { error }) will erase the namespace. If any code later needs the namespace after a failure, it will be lost. Consider using a JSON merge or documenting that metadata is intentionally replaced per-status-transition.
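The merge alternative is nearly a one-liner (illustrative; mergeMetadata is hypothetical, and a deep merge may be preferable if metadata nests):

```typescript
// Shallow JSON merge so a later status transition does not erase keys set
// by an earlier one. Not the actual DAL code.
type Metadata = Record<string, unknown>;

function mergeMetadata(existing: Metadata | null, incoming?: Metadata): Metadata {
  return { ...(existing ?? {}), ...(incoming ?? {}) };
}
```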

Comment on lines +408 to +410
const firstArtifactData = (messageResponse.result as any)?.artifacts?.[0]?.parts?.[0]
  ?.data as { type?: string; toolCallId?: string; toolName?: string; args?: unknown };
if (firstArtifactData?.type === 'durable-approval-required') {

Reaching into (messageResponse.result as any)?.artifacts?.[0]?.parts?.[0]?.data with a chain of any casts and checking a magic string 'durable-approval-required' is fragile. If the artifact shape changes in generateTaskHandler.ts, this silently breaks. Consider extracting a shared type/constant for the durable-approval-required artifact shape.

Comment on lines +90 to +101
await executeToolStep({
  payload,
  currentSubAgentId,
  toolCallId: toolCall.toolCallId,
  toolName: toolCall.toolName,
  args: toolCall.args,
  workflowRunId,
  streamNamespace: `r${approvalRound}`,
  taskId,
  preApproved: approvalResult.approved,
  approvalReason: approvalResult.reason,
});

The return value of executeToolStep is discarded — { type: 'needs_approval' } is silently ignored. If the tool executed inside executeToolStep itself triggers another tool that requires approval (nested tool calling), the pending approval is lost and the workflow continues as if the tool completed successfully.

@claude claude bot left a comment

PR Review Summary

(12) Total Issues | Risk: High

🔴❗ Critical (3) ❗🔴

🔴 1) stream-buffer-registry.ts Module-scoped singleton not shared across bundle boundaries

Issue: StreamBufferRegistry uses a module-level singleton (export const streamBufferRegistry), but the existing stream-registry.ts uses globalThis with a key to ensure the registry is shared across module boundaries (e.g., WDK workflow bundle and main Hono app bundle).

Why: The comment in stream-registry.ts explicitly states this pattern is needed so both bundles resolve to the same Map. Without this, the buffer registered by chatDataStream (main bundle) won't be found by the resume endpoint or workflow steps (potentially different bundle), causing stream resumption to silently fail with 204 responses.

Fix: Follow the globalThis pattern from stream-registry.ts:

const REGISTRY_KEY = '__inkeep_streamBufferRegistry';

function getBufferRegistry(): Map<string, BufferEntry> {
  const g = globalThis as Record<string, unknown>;
  if (!g[REGISTRY_KEY]) {
    g[REGISTRY_KEY] = new Map<string, BufferEntry>();
  }
  return g[REGISTRY_KEY] as Map<string, BufferEntry>;
}


🔴 2) 0022_futuristic_ozymandias.sql Duplicate migration file

Issue: This migration file is byte-identical to 0023_futuristic_ozymandias.sql — both create the same workflow_executions table. The journal references 0022_superb_micromacro at index 22, but this orphaned file is not registered.

Why: If accidentally included in a migration run, it will fail with 'table already exists'. This indicates possible migration state corruption from rebasing/merging that could cause deployment failures.

Fix: Remove this orphaned file using pnpm db:drop or manually delete since it's NOT in the journal. Per AGENTS.md: "NEVER manually delete migration files - use pnpm db:drop".


🔴 3) system PR title/description misrepresents actual scope

Issue: The PR is titled "feat: in-memory stream resumption" but the actual diff is +2460/-145 lines implementing comprehensive durable execution infrastructure. The stream resumption feature is ~260 lines; durable execution infrastructure is ~2200+ lines.

Why: The changeset itself says "Add durable execution mode for agent runs with tool approvals and crash recovery." Future maintainers reviewing this PR will be misled about the scope of changes introduced.

Fix: Update PR title to accurately reflect scope: "feat: durable execution mode for agent runs with tool approvals, crash recovery, and stream resumption". Lead with durable execution architecture in the description.

Inline Comments:

  • 🔴 Critical: stream-buffer-registry.ts:25 Module-scoped singleton pattern
  • 🔴 Critical: chatDataStream.ts:656-667 Fire-and-forget async IIFE swallows errors
  • 🔴 Critical: 0022_futuristic_ozymandias.sql:1 Duplicate migration file

🟠⚠️ Major (5) 🟠⚠️

🟠 1) chatDataStream.ts:448-491 Durable execution path doesn't register with stream buffer

Issue: The durable execution path streams directly from run.readable without registering with streamBufferRegistry. Stream resumption via GET /conversations/:conversationId/stream won't work for durable executions.

Why: This is the use case where crash recovery is most needed, yet the resume endpoint returns 204 because the buffer was never registered.

Fix: Either tee run.readable and register with buffer, update the resume endpoint to check durable executions, or document the limitation clearly.


🟠 2) stream-buffer-registry.ts:15-25 Overwriting existing buffer leaves active consumers hanging

Issue: If a new chat request arrives for a conversation with an active buffer, register() overwrites without emitting 'done' to existing subscribers, leaving them hanging indefinitely.

Why: Previous resume clients will wait forever for a 'done' event that never comes.

Fix: Emit 'done' on existing emitter before overwriting (see inline comment for code).

🟠 3) stream-buffer-registry.ts:49-69 Race condition between done check and event listener setup

Issue: After checking if (entry.done) and before attaching the 'done' listener, the stream could complete, causing the 'done' event to be missed.

Why: This can cause the resume stream to hang indefinitely in edge cases.

Fix: Subscribe to events FIRST, then replay buffered chunks (see inline comment for code).
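The subscribe-first ordering can be sketched as follows (illustrative — the real BufferEntry shape in stream-buffer-registry.ts differs):

```typescript
import { EventEmitter } from "node:events";

// Attach listeners BEFORE replaying history, so a 'done' that fires around
// the replay is never missed. Entry shape is illustrative.
function createReadable(entry: {
  chunks: string[];
  emitter: EventEmitter;
  done: boolean;
}): ReadableStream<string> {
  return new ReadableStream<string>({
    start(controller) {
      const onChunk = (c: string) => controller.enqueue(c);
      const onDone = () => {
        entry.emitter.off("chunk", onChunk);
        controller.close();
      };
      // 1) subscribe first — no window where events can be dropped
      entry.emitter.on("chunk", onChunk);
      entry.emitter.once("done", onDone);
      // 2) then replay history
      for (const c of entry.chunks) controller.enqueue(c);
      // 3) finally honor a completion that happened before we subscribed
      if (entry.done) onDone();
    },
  });
}
```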

🟠 4) multi-file Missing test coverage for critical new functionality

Issue: The StreamBufferRegistry class (core of stream resumption), the resume endpoint, and durable execution workflow steps have zero test coverage. Per AGENTS.md: "ALL new work MUST include these three components - NO EXCEPTIONS: 1. Unit Tests".

Why: Without tests, race conditions, memory leaks, and cleanup failures could slip through undetected. The existing generateTaskHandler.test.ts adds mocks for durable methods but no tests exercise them.

Fix: Add dedicated tests for:

  • StreamBufferRegistry: registration, push, replay, cleanup timeouts, concurrent access
  • Resume endpoint: 204 responses, authorization, SSE headers
  • Durable execution: workflow context propagation, approval queue consumption


🟠 5) multi-file Missing changeset for @inkeep/agents-core schema changes

Issue: Schema changes (new execution_mode column, new workflow_executions table) are in @inkeep/agents-core but the changeset only bumps @inkeep/agents-api.

Why: Downstream consumers of @inkeep/agents-core who depend on Drizzle schema types won't receive proper semver signaling.

Fix: Add @inkeep/agents-core: minor to the changeset per data-model-changes skill guidance.

Inline Comments:

  • 🟠 Major: stream-buffer-registry.ts:15-25 Overwriting existing buffer
  • 🟠 Major: stream-buffer-registry.ts:49-69 Race condition in createReadable
  • 🟠 Major: chatDataStream.ts:448-491 Durable path missing buffer registration

🟡 Minor (2) 🟡

🟡 1) stream-buffer-registry.ts:10 Hardcoded cleanup delay without configurability

Issue: The 60-second cleanup delay is hardcoded with no environment variable override and no observability into buffer lifecycle.
Why: Self-hosted deployments may need different windows; debugging buffer state during incidents is difficult.
Fix: Move to execution-limits constants; add logging for buffer events.

🟡 2) multi-file executionMode naming is opaque to customers

Issue: 'classic' implies 'old/legacy'; 'durable' is infrastructure jargon that doesn't communicate customer benefit.
Why: Customers choosing between modes won't understand tradeoffs.
Fix: Consider renaming to 'ephemeral' | 'resumable' or document clearly.

Inline Comments:

  • 🟡 Minor: stream-buffer-registry.ts:10 Hardcoded cleanup delay

💭 Consider (2) 💭

💭 1) executions.ts:100-101 Path parameter style inconsistency

Issue: Uses :executionId Express-style while conversations.ts uses {conversationId} OpenAPI-style.
Fix: Standardize on OpenAPI-style {param} format for consistency.

💭 2) stream-buffer-registry.ts No graceful shutdown handling

Issue: Active setTimeout handles will keep the process alive during shutdown. Active streams may be left inconsistent.
Fix: Add a shutdown() method that clears timeouts and emits 'done' to all active buffers.
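A shutdown() along those lines might look like this (a sketch with a hypothetical registry shape, not the actual module):

```typescript
import { EventEmitter } from "node:events";

// Clear pending cleanup timers and release any live tails so the process
// can exit cleanly. Entry shape is illustrative.
interface Entry {
  emitter: EventEmitter;
  timeoutId?: ReturnType<typeof setTimeout>;
}

class ShutdownAwareRegistry {
  private buffers = new Map<string, Entry>();

  register(id: string): Entry {
    const entry: Entry = { emitter: new EventEmitter() };
    this.buffers.set(id, entry);
    return entry;
  }

  complete(id: string, ttlMs = 60_000): void {
    const entry = this.buffers.get(id);
    if (!entry) return;
    entry.emitter.emit("done");
    entry.timeoutId = setTimeout(() => this.buffers.delete(id), ttlMs);
  }

  shutdown(): void {
    for (const entry of this.buffers.values()) {
      if (entry.timeoutId !== undefined) clearTimeout(entry.timeoutId);
      entry.emitter.emit("done"); // unblock any attached resume streams
    }
    this.buffers.clear();
  }

  size(): number {
    return this.buffers.size;
  }
}
```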


🚫 REQUEST CHANGES

Summary: This PR introduces valuable durable execution infrastructure, but has several critical issues that need addressing before merge:

  1. The stream buffer registry uses a module-scoped singleton that won't work across bundle boundaries — this will cause stream resumption to silently fail in production.
  2. Duplicate migration file indicates migration state corruption that could fail deployments.
  3. Missing error handling in the background buffering task will swallow errors silently.
  4. Durable mode doesn't actually support stream resumption via the documented endpoint.
  5. Critical functionality lacks test coverage per AGENTS.md requirements.

The architecture is sound and the feature is valuable, but these issues need resolution to avoid production incidents.

Discarded (8)

  • tool-approval.ts — Cross-tenant tool approval bypass via unvalidated toolCallId. Discarded: further investigation shows the conversationId is validated before approval; toolCallId is internally generated and not user-controllable in the classic path.
  • agentExecutionSteps.ts — Code duplication with generateTaskHandler. Discarded: WDK workflow steps have serialization constraints requiring separate implementations; documented architectural decision.
  • workflowExecutions.ts — Missing optimistic locking on status updates. Discarded: WDK handles step sequencing; concurrent updates are expected to fail-fast at the workflow level.
  • PendingToolApprovalManager.ts — Variable extraction refactoring unrelated to feature. Discarded: incidental cleanup that doesn't change behavior; not blocking.
  • stream-helpers.ts — Removal of writeDone() [DONE] marker. Discarded: OpenAI spec doesn't require this; Vercel AI SDK handles stream termination via connection close.
  • conversations.ts — Authorization check returns 204 same as no-stream. Discarded: this is intentional information-leakage protection; logging suggestion is valid but optional.
  • durable-stream-helper.ts — Silent write failures in write(). Discarded: errors are logged at warn level; stream write failures are inherently non-recoverable.
  • agent-types.ts — Record type allows undefined access. Discarded: current code uses optional chaining correctly; runtime safety is preserved.
Reviewers (11)

| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
| pr-review-standards | 3 | 1 | 0 | 0 | 2 | 0 | 0 |
| pr-review-sre | 7 | 2 | 1 | 0 | 2 | 0 | 2 |
| pr-review-architecture | 8 | 2 | 0 | 0 | 1 | 0 | 5 |
| pr-review-tests | 7 | 1 | 0 | 0 | 0 | 0 | 6 |
| pr-review-errors | 7 | 0 | 0 | 0 | 1 | 0 | 6 |
| pr-review-breaking-changes | 5 | 2 | 0 | 0 | 1 | 0 | 2 |
| pr-review-precision | 6 | 1 | 0 | 0 | 0 | 0 | 5 |
| pr-review-product | 5 | 0 | 1 | 0 | 0 | 0 | 4 |
| pr-review-consistency | 6 | 0 | 1 | 0 | 0 | 0 | 5 |
| pr-review-types | 5 | 0 | 0 | 0 | 0 | 0 | 5 |
| pr-review-security-iam | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| Total | 60 | 9 | 3 | 0 | 7 | 0 | 41 |

Note: Many findings were deduplicated across reviewers (e.g., multiple reviewers flagged the singleton pattern and error handling issues).

@@ -0,0 +1,15 @@
CREATE TABLE "workflow_executions" (

🔴 CRITICAL: Duplicate migration file

Issue: This migration file (0022_futuristic_ozymandias.sql) is byte-identical to 0023_futuristic_ozymandias.sql — both create the same workflow_executions table. The journal (drizzle/runtime/meta/_journal.json) references 0022_superb_micromacro at index 22, but this file is not registered and appears to be a duplicate.

Why: If this orphaned file is accidentally included in a migration run, it will fail with 'table already exists'. This indicates possible migration state corruption that could cause deployment failures.

Fix: Remove this orphaned file using pnpm db:drop to manage migration state properly, or manually delete ONLY this file since it's NOT in the journal. The journaled migrations should remain.


Comment on lines +448 to +491
if (agent.executionMode === 'durable') {
  const requestId = `chatds-${Date.now()}`;
  const run = await start(agentExecutionWorkflow, [
    {
      tenantId,
      projectId,
      agentId,
      conversationId,
      userMessage: userText,
      messageParts: messageParts.length > 0 ? messageParts : undefined,
      requestId,
      resolvedRef: executionContext.resolvedRef,
      forwardedHeaders:
        Object.keys(forwardedHeaders).length > 0 ? forwardedHeaders : undefined,
      outputFormat: 'vercel',
    },
  ]);
  logger.info(
    { runId: run.runId, conversationId, agentId },
    'Durable execution started via /chat'
  );
  c.header('x-workflow-run-id', run.runId);
  c.header('content-type', 'text/event-stream');
  c.header('cache-control', 'no-cache');
  c.header('connection', 'keep-alive');
  c.header('x-vercel-ai-data-stream', 'v2');
  c.header('x-accel-buffering', 'no');
  return stream(c, async (s) => {
    try {
      const reader = run.readable.getReader();
      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        await s.write(value);
      }
    } catch (error) {
      logger.error(
        { error, runId: run.runId },
        'Error streaming durable execution via /chat'
      );
      await s.write(`event: error\ndata: ${JSON.stringify({ error: 'Stream error' })}\n\n`);
    }
  });
}

🟠 MAJOR: Durable execution path doesn't register with stream buffer

Issue: The durable execution path (executionMode === 'durable') streams directly from run.readable without registering with streamBufferRegistry. This means stream resumption via GET /conversations/:conversationId/stream won't work for durable executions — the endpoint returns 204 because the buffer was never registered.

Why: The PR describes stream resumption as a key feature, but it only works for 'classic' mode. This is the use case where crash recovery is most needed.

Fix: Either:

  1. Tee run.readable and register with streamBufferRegistry similar to the non-durable path, OR
  2. Update the resume endpoint to also check for active durable executions via getRun().getReadable(), OR
  3. Document that stream resumption via GET /conversations/:id/stream only works for classic mode, while durable mode uses the separate GET /executions/:id/stream endpoint


  timeoutId?: ReturnType<typeof setTimeout>;
}

const CLEANUP_DELAY_MS = 60_000;

🟡 Minor: Hardcoded cleanup delay without configurability

Issue: The 60-second cleanup delay is hardcoded and not configurable via environment variable. Additionally, there's no observability into buffer lifecycle events (registration, completion, cleanup).

Why: During incident debugging, it's difficult to understand buffer state. Self-hosted deployments may need different cleanup windows. Under load, completed streams accumulate in memory for 60s.

Fix: Consider:

  1. Move CLEANUP_DELAY_MS to execution-limits constants for environment override support
  2. Add trace spans or log statements for buffer registration, completion, and cleanup events
  3. Expose a getStats() method for health checks: { activeBuffers: number, totalBytes: number }


@github-actions github-actions bot deleted a comment from claude bot Mar 16, 2026
@anubra266 anubra266 changed the base branch from main to durable-execution March 16, 2026 22:09
@anubra266 anubra266 changed the base branch from durable-execution to main March 16, 2026 22:10
@anubra266 anubra266 changed the base branch from main to durable-execution March 16, 2026 22:10
@anubra266 anubra266 changed the base branch from durable-execution to main March 16, 2026 22:12
@anubra266 anubra266 changed the base branch from main to durable-execution March 16, 2026 22:12
@anubra266 anubra266 force-pushed the feat/stream-resumption branch from c63a846 to 42077a3 Compare March 16, 2026 22:15
@vercel vercel bot temporarily deployed to Preview – agents-docs March 16, 2026 22:15 Inactive
@claude claude bot left a comment

PR Review Summary

(7) Total Issues | Risk: High

🔴❗ Critical (2) ❗🔴

Inline Comments:

  • 🔴 Critical: stream-buffer-registry.ts:13 Module-scoped singleton not shared across bundle boundaries
  • 🔴 Critical: chatDataStream.ts:656-667 Fire-and-forget async IIFE lacks error handling

🟠⚠️ Major (4) 🟠⚠️

🟠 1) stream-buffer-registry.ts Missing test coverage for critical new functionality

files: stream-buffer-registry.ts, conversations.ts:376-405

Issue: The StreamBufferRegistry class (79 lines) and the new resume endpoint have zero test coverage. Per AGENTS.md: "ALL new work MUST include... Unit Tests - NO EXCEPTIONS".

Why: Without tests, these regressions could go undetected:

  • Memory leaks if cleanup timer fails to delete entries after 60s
  • Race condition if createReadable() called between chunk replay and event subscription
  • EventEmitter listener leaks if onDone handler fails to clean up
  • Authorization bypass in resume endpoint (userId check untested)

Fix: Create agents-api/src/domains/run/stream/__tests__/stream-buffer-registry.test.ts with test cases for:

  • register() creates new buffer entry
  • push() adds chunks and emits events
  • complete() marks done and schedules 60s cleanup
  • createReadable() replays buffered chunks then forwards live ones
  • Race condition scenarios
  • Resume endpoint authorization (204 vs 200 responses)

Refs:

Inline Comments:

  • 🟠 Major: stream-buffer-registry.ts:48-70 Race condition between chunk replay and event subscription
  • 🟠 Major: conversations.ts:387-389 Error response pattern inconsistency — 204 for unauthorized vs 404
  • 🟠 Major: conversations.ts:404 Resume endpoint lacks error handling around stream piping

🟡 Minor (1) 🟡

🟡 1) stream-buffer-registry.ts:10 Hardcoded cleanup delay without observability

Issue: The 60-second cleanup delay (CLEANUP_DELAY_MS) is hardcoded with no environment variable override and no logging/metrics for buffer lifecycle events.
Why: During incident debugging, understanding buffer state is difficult. Self-hosted deployments may need different cleanup windows.
Fix: Consider moving to execution-limits constants; add debug logging for register/complete/cleanup events.


🚫 REQUEST CHANGES

Summary: This PR implements the Vercel AI SDK resume pattern for stream reconnection — a valuable feature. However, there are several issues that need addressing before merge:

  1. The stream buffer registry uses a module-scoped singleton that won't work across bundle boundaries — this will cause stream resumption to silently fail with 204 responses. Follow the globalThis pattern from the existing stream-registry.ts.

  2. Fire-and-forget async IIFE in chatDataStream.ts has no error handling — errors become unhandled promise rejections with no logging, making debugging impossible.

  3. Missing test coverage for critical new functionality violates AGENTS.md requirements.

  4. Race condition in createReadable() between replaying buffered chunks and subscribing to new chunks — chunks can be permanently lost.

The architecture is sound and the feature is well-scoped. These issues are straightforward to fix.

Discarded (6)
| Location | Issue | Reason Discarded |
| --- | --- | --- |
| stream-buffer-registry.ts:3-8 | Unbounded memory growth from Uint8Array[] accumulation | Valid concern but applies to the existing stream-registry pattern too; acceptable for in-memory architecture with 60s cleanup |
| stream-buffer-registry.ts:39-41 | setTimeout keeps process alive during shutdown | Minor operational concern; .unref() is nice-to-have, not blocking |
| stream-buffer-registry.ts:34-42 | No error state distinction in complete() | The SSE stream protocol handles success/error via event content, not buffer state |
| stream-buffer-registry.ts:27-32 | push() silently ignores missing entries | Intentional defensive behavior; logging would be noisy |
| chatDataStream.ts:653 | tee() backpressure coupling | Expected behavior for resumption; buffer reader completing is the goal |
| conversations.ts:396 | Debug-level logging insufficient | Valid but a nitpick; info level would be better but not blocking |
Reviewers (5)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pr-review-standards | 1 | 0 | 0 | 0 | 1 | 0 | 0 |
| pr-review-sre | 6 | 1 | 0 | 0 | 1 | 0 | 4 |
| pr-review-tests | 4 | 1 | 0 | 0 | 0 | 0 | 3 |
| pr-review-consistency | 2 | 0 | 0 | 0 | 1 | 0 | 1 |
| pr-review-errors | 5 | 0 | 0 | 0 | 2 | 0 | 3 |
| Total | 18 | 2 | 0 | 0 | 5 | 0 | 11 |

Note: Many findings were deduplicated across reviewers (e.g., error handling in IIFE flagged by both standards and errors reviewers).

Comment on lines +387 to +389
if (!conversation || conversation.userId !== endUserId) {
  return new Response(null, { status: 204 });
}

🟠 MAJOR: Error response pattern inconsistency — 204 for unauthorized vs established 404 pattern

Issue: Returns HTTP 204 when conversation is not found or doesn't belong to the authenticated user, but the sibling endpoint in this same file (get conversation, lines 287-291) throws createApiError({ code: 'not_found' }) for the identical check.

Why: This creates two different error semantics for the same authorization failure condition within the same route file. The OpenAPI spec declares 204 only for "No active stream", not for authorization failures.

Fix: Distinguish between the two 204 cases:

Suggested change
- if (!conversation || conversation.userId !== endUserId) {
-   return new Response(null, { status: 204 });
- }
+ if (!conversation || conversation.userId !== endUserId) {
+   throw createApiError({ code: 'not_found', message: 'Conversation not found' });
+ }

Refs:

c.header('x-vercel-ai-data-stream', 'v2');
c.header('x-accel-buffering', 'no');

return stream(c, (s) => s.pipe(readable));

🟠 MAJOR: Resume endpoint lacks error handling around stream piping

Issue: The stream pipe has no try-catch, unlike every other streaming endpoint in this codebase (see executions.ts:276-289, chatDataStream.ts:475-490) which wrap stream operations in try-catch blocks with error logging and error event transmission.

Why: If the readable stream errors during transmission (e.g., EventEmitter error, client disconnect), the error propagates as an unhandled exception with no logging. No error event is written to the SSE stream, so clients see an abrupt disconnect with no indication of what went wrong.

Fix: Follow the established pattern:

Suggested change
- return stream(c, (s) => s.pipe(readable));
+ return stream(c, async (s) => {
+   try {
+     const reader = readable.getReader();
+     while (true) {
+       const { done, value } = await reader.read();
+       if (done) break;
+       await s.write(value);
+     }
+   } catch (error) {
+     logger.error({ error, conversationId }, 'Error resuming conversation stream');
+     await s.write(`event: error\ndata: ${JSON.stringify({ error: 'Stream error' })}\n\n`);
+   }
+ });

Refs:

@claude claude bot left a comment

PR Review Summary

(4) Total Issues | Risk: Medium

Delta Review: This review is scoped to changes in commit 329e1ec9b which addresses prior review feedback. The delta is +42/-17 lines across 2 files.

✅ Prior Critical Issues — ADDRESSED

The following critical issues from prior reviews have been addressed in this delta:

| Issue | Status | How Addressed |
| --- | --- | --- |
| Module-scoped singleton not shared across bundle boundaries | ✅ Fixed | Now uses globalThis pattern matching stream-registry.ts (lines 11-19) |
| Fire-and-forget async IIFE lacks error handling | ✅ Fixed | Added try/catch with error logging (lines 664-666) |
| Race condition between chunk replay and event subscription | ✅ Fixed | Now subscribes to emitter events BEFORE replaying buffered chunks (lines 77-82) |
| Missing cancel() handler for listener cleanup | ✅ Fixed | Added cancel() handler that removes event listeners (lines 88-93) |
| Overwriting buffer leaves existing consumers hanging | ✅ Fixed | Now emits 'done' to existing subscribers before overwriting (lines 30-33) |

💭 Consider (1) 💭

Inline Comments:

  • 💭 Consider: chatDataStream.ts:664-667 Error logged but buffer not marked as errored — resumed streams won't know if source terminated due to error vs completion

🕐 Pending Recommendations (3)

These issues were raised in prior reviews and remain unaddressed:


💡 APPROVE WITH SUGGESTIONS

Summary: The delta commit addresses all the critical issues raised in prior reviews — excellent work! The globalThis singleton pattern, race condition fix, cancel handler, and error handling are all properly implemented. The remaining pending recommendations from prior reviews (test coverage, error response consistency, stream piping error handling) are still valid but are not blocking. The "Consider" item about distinguishing error vs normal completion in the buffer is a minor enhancement opportunity. This PR is in good shape for merge once the pending major items (particularly test coverage) are addressed or explicitly deferred.

Discarded (0)

No findings were discarded in this delta review.

Reviewers (3)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pr-review-standards | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| pr-review-sre | 1 | 0 | 0 | 0 | 1 | 0 | 0 |
| pr-review-tests | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 1 | 0 | 0 | 0 | 1 | 0 | 0 |

Note: This was a focused delta review. Prior review findings (test coverage, etc.) were confirmed as still pending but not re-raised per review scope guidelines.

Comment on lines +664 to +667
} catch (error) {
  logger.error({ error, conversationId }, 'Error buffering stream for resumption');
} finally {
  streamBufferRegistry.complete(conversationId);

💭 Consider: Error logged but stream buffer not marked as errored

Issue: The catch block logs the error but completes the buffer normally via finally. Consumers resuming via createReadable() will receive all buffered chunks followed by a clean close, with no indication the stream terminated due to error.

Why: If the source stream throws after buffering partial content, the resume endpoint serves truncated data as if it were complete. The client has no signal that the response is incomplete. This is a minor concern since the SSE stream format itself typically includes terminal events that indicate completion status.

Fix: Consider adding an error(conversationId) method to StreamBufferRegistry that marks the entry as errored and emits an error event, allowing resumed streams to relay the error condition to clients. Alternatively, the Vercel AI SDK data stream format may already handle this via error events in the stream content.
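A self-contained sketch of what such a method could look like (all names here are hypothetical, and the entry shape only loosely mirrors the registry in this PR):

```typescript
import { EventEmitter } from 'node:events';

type Entry = { done: boolean; error?: unknown; emitter: EventEmitter };
const buffers = new Map<string, Entry>();

// Hypothetical error() counterpart to complete(): marks the entry as
// errored so resumed readers can distinguish failure from a clean close.
function markErrored(conversationId: string, err: unknown): void {
  const entry = buffers.get(conversationId);
  if (!entry) return;
  entry.done = true;
  entry.error = err;
  // A custom event name avoids EventEmitter's throw-on-unhandled 'error'
  // semantics; resumed readers can listen for it and relay an SSE error
  // event instead of closing as if the response were complete.
  entry.emitter.emit('stream-error', err);
}
```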

Refs:

@github-actions github-actions bot deleted a comment from claude bot Mar 16, 2026
@vercel vercel bot temporarily deployed to Preview – agents-docs March 16, 2026 23:34 Inactive
@pullfrog

pullfrog bot commented Mar 16, 2026

TL;DR — Adds in-memory stream buffering so clients that disconnect mid-response can reconnect and replay the full SSE stream from the beginning. Implements the Vercel AI SDK resume pattern via a new StreamBufferRegistry singleton and a GET /run/v1/conversations/:conversationId/stream endpoint.

Key changes

  • stream-buffer-registry.ts — New in-memory registry that buffers encoded SSE chunks per conversationId and replays them to late-joining readers.
  • chatDataStream.ts — Tees the encoded SSE stream so one copy goes to the HTTP client and the other feeds the buffer registry.
  • conversations.ts — New GET /{conversationId}/stream route that returns the buffered stream (200) or signals "no active stream" (204).
  • runAuth.ts — Skips Proof-of-Work verification for GET requests so stream-resume calls from web_client apps aren't blocked.
  • app-credential-auth.test.ts — Existing PoW tests updated to use POST; new test covers GET-skip behavior.

Summary | 6 files | 3 commits | base: durable-execution ← feat/stream-resumption


StreamBufferRegistry — in-memory SSE replay buffer

Before: No mechanism to recover a dropped SSE connection; a disconnected client lost all streamed content.
After: Every active conversation stream is buffered in memory. Late joiners receive all historical chunks, then live-tail new ones until the stream completes.

The registry stores Uint8Array chunks per conversationId and exposes an EventEmitter-based createReadable() that replays buffered data then forwards live events. Entries are cleaned up 60 seconds after stream completion. A globalThis-pinned Map ensures the singleton survives module reloads in dev.

What happens when a new stream starts for the same conversation? register() checks for an existing entry — if found, it marks it done (emitting 'done' to any connected readers) and clears its cleanup timeout before replacing it with a fresh buffer.
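That lifecycle can be sketched roughly as follows (a minimal, illustrative sketch, not the shipped module: the entry shape follows the timeoutId/CLEANUP_DELAY_MS fragment quoted in review, while the globalThis key name and other details are assumptions):

```typescript
import { EventEmitter } from 'node:events';

interface BufferEntry {
  chunks: Uint8Array[];
  done: boolean;
  emitter: EventEmitter;
  timeoutId?: ReturnType<typeof setTimeout>;
}

const CLEANUP_DELAY_MS = 60_000;

class StreamBufferRegistry {
  private buffers = new Map<string, BufferEntry>();

  register(conversationId: string): void {
    const existing = this.buffers.get(conversationId);
    if (existing) {
      // Release readers still tailing the old stream before replacing it.
      existing.done = true;
      existing.emitter.emit('done');
      if (existing.timeoutId) clearTimeout(existing.timeoutId);
    }
    this.buffers.set(conversationId, { chunks: [], done: false, emitter: new EventEmitter() });
  }

  push(conversationId: string, chunk: Uint8Array): void {
    const entry = this.buffers.get(conversationId);
    if (!entry) return; // defensive: the entry may already be cleaned up
    entry.chunks.push(chunk);
    entry.emitter.emit('chunk', chunk);
  }

  complete(conversationId: string): void {
    const entry = this.buffers.get(conversationId);
    if (!entry) return;
    entry.done = true;
    entry.emitter.emit('done');
    entry.timeoutId = setTimeout(() => this.buffers.delete(conversationId), CLEANUP_DELAY_MS);
    // unref() so a pending cleanup timer cannot hold the process open
    // (flagged as a nice-to-have in review, so the shipped code may differ).
    (entry.timeoutId as { unref?: () => void }).unref?.();
  }

  createReadable(conversationId: string): ReadableStream<Uint8Array> | null {
    const entry = this.buffers.get(conversationId);
    if (!entry) return null;
    let onChunk = (_c: Uint8Array) => {};
    let onDone = () => {};
    return new ReadableStream<Uint8Array>({
      start(controller) {
        let closed = false;
        onChunk = (c) => controller.enqueue(c);
        onDone = () => {
          if (closed) return;
          closed = true;
          entry.emitter.off('chunk', onChunk);
          controller.close();
        };
        // Subscribe before replaying so no chunk can slip between the two.
        entry.emitter.on('chunk', onChunk);
        entry.emitter.once('done', onDone);
        for (const c of entry.chunks) controller.enqueue(c);
        if (entry.done) onDone();
      },
      cancel() {
        // Detach listeners when a resumed client disconnects.
        entry.emitter.off('chunk', onChunk);
        entry.emitter.off('done', onDone);
      },
    });
  }
}

// Pin the instance on globalThis (key name assumed) so dev module reloads
// keep sharing one registry.
const streamBufferRegistry: StreamBufferRegistry =
  ((globalThis as any).__streamBufferRegistry ??= new StreamBufferRegistry());
```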

stream-buffer-registry.ts


chatDataStream.ts — stream tee for buffering

Before: The encoded SSE stream was piped directly to the HTTP response.
After: The stream is tee'd — one branch pipes to the client, the other feeds streamBufferRegistry.push() in a fire-and-forget async loop that calls complete() when done.
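A self-contained sketch of that tee-and-buffer wiring (the registry is replaced by a plain array here; the real code calls streamBufferRegistry.push/complete, and the SSE payloads below are made up):

```typescript
const encoder = new TextEncoder();

// Stand-in for the encoded SSE stream the chat handler produces.
const sse = new ReadableStream<Uint8Array>({
  start(controller) {
    controller.enqueue(encoder.encode('data: {"delta":"Hel"}\n\n'));
    controller.enqueue(encoder.encode('data: {"delta":"lo"}\n\n'));
    controller.close();
  },
});

// One branch for the HTTP client, one for the buffer registry.
const [clientBranch, bufferBranch] = sse.tee();

const buffered: Uint8Array[] = []; // stand-in for streamBufferRegistry
let completed = false;

// Fire-and-forget buffering loop, wrapped in try/catch/finally so errors
// are logged instead of becoming unhandled rejections (per review feedback).
const buffering = (async () => {
  const reader = bufferBranch.getReader();
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      buffered.push(value); // real code: streamBufferRegistry.push(id, value)
    }
  } catch (error) {
    console.error('Error buffering stream for resumption', error);
  } finally {
    completed = true; // real code: streamBufferRegistry.complete(id)
  }
})();

// Meanwhile the client branch is consumed independently.
const clientChunks: Uint8Array[] = [];
const clientReader = clientBranch.getReader();
while (true) {
  const { done, value } = await clientReader.read();
  if (done) break;
  clientChunks.push(value);
}
await buffering;
```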

chatDataStream.ts


GET /{conversationId}/stream — resume endpoint

Before: No endpoint for stream reconnection.
After: GET /run/v1/conversations/:conversationId/stream returns the buffered stream (200 with SSE headers) or 204 if no active stream exists, matching the contract the Vercel AI SDK useChat expects on mount.

The handler validates conversation ownership via endUserId before exposing the stream.
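A minimal sketch of the handler's control flow under those rules (framework details are stripped: getConversation and the registry are stubbed parameters, and the real handler uses Hono's stream() helper rather than returning the ReadableStream directly):

```typescript
type Conversation = { userId: string } | null;

async function resumeConversationStream(
  conversationId: string,
  endUserId: string,
  registry: { createReadable(id: string): ReadableStream<Uint8Array> | null },
  getConversation: (id: string) => Promise<Conversation>,
): Promise<Response> {
  // Registry first: avoids a DB round-trip when no stream is active.
  const readable = registry.createReadable(conversationId);
  if (!readable) {
    // 204 tells useChat to fall back to messages loaded from the DB.
    return new Response(null, { status: 204 });
  }
  const conversation = await getConversation(conversationId);
  // Deny when the row is missing OR owned by someone else; the
  // `!conversation ||` half is exactly what the critical finding asked for.
  if (!conversation || conversation.userId !== endUserId) {
    return new Response(null, { status: 204 });
  }
  return new Response(readable, {
    status: 200,
    headers: {
      'content-type': 'text/event-stream',
      'x-vercel-ai-data-stream': 'v2',
      'x-accel-buffering': 'no',
    },
  });
}
```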

conversations.ts


runAuth.ts — PoW bypass for GET

Before: Proof-of-Work was verified on every request from web_client apps, including GETs.
After: PoW check is skipped for GET requests, since they carry no mutating payload and the resume endpoint must work without a PoW token.
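The check itself is tiny; a sketch (the function name is illustrative, and the real condition lives inside tryAppCredentialAuth):

```typescript
// Illustrative only: models the method-based PoW gate described above.
function requiresProofOfWork(method: string): boolean {
  // GETs carry no mutating payload, and useChat calls the resume endpoint
  // on mount, before any PoW challenge could be solved.
  return method !== 'GET';
}
```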

runAuth.ts · app-credential-auth.test.ts

Pullfrog | View workflow run | Using Claude Code | Triggered by Pullfrog · pullfrog.com

@claude claude bot left a comment

PR Review Summary

(0) Total Issues | Risk: Low

Delta Review: This review is scoped to changes in commit 5ca41d078 since the last automated review. The delta is +44/-8 lines across 2 files.

✅ Delta Changes — APPROVED

The delta adds a PoW (Proof of Work) check bypass for GET requests in web_client app authentication. This is a sound security decision:

| Aspect | Analysis |
| --- | --- |
| Why skip PoW for GET? | The stream resume endpoint (GET /conversations/:conversationId/stream) needs to work without PoW since the Vercel AI SDK's useChat hook calls it on mount — before any user interaction that could trigger PoW solving. |
| Security preserved | GET requests still require: valid x-inkeep-app-id, valid Origin matching allowedDomains, valid anonymous JWT (signed, not expired), and JWT app claim matching request app ID. |
| PoW intent maintained | Per the PoW spec, PoW prevents "unlimited chat messages" and "mass identity creation" — both POST operations. GET requests don't create identities, send messages, or consume LLM tokens. |
| Test coverage | New test `should skip PoW for GET requests (e.g. stream resume)` verifies the behavior. Existing PoW tests updated to use POST method appropriately. |

💭 Consider (1) 💭

💭 1) runAuth.ts:515 HEAD requests also safe but not included in skip

Issue: The condition method !== 'GET' means HEAD requests (another safe HTTP method) still require PoW. The codebase has precedent for treating GET and HEAD together (branchScopedDb.ts:97).

Why: Very low risk — HEAD requests aren't used by the Vercel AI SDK or this feature. If HEAD requests were ever made to web_client authenticated endpoints, they'd unexpectedly require PoW.

Fix: Optional: ['GET', 'HEAD'].includes(method) for consistency. Not blocking since HEAD isn't in scope.

🕐 Pending Recommendations (3)

These issues were raised in prior reviews and remain unaddressed (outside delta scope):


💡 APPROVE WITH SUGGESTIONS

Summary: The delta commit correctly implements PoW bypass for GET requests, enabling the stream resume endpoint to work with the Vercel AI SDK. The security analysis confirms this change aligns with PoW's intended purpose (preventing POST-based abuse) while maintaining all other authentication requirements for GET requests. The pending recommendations from prior reviews remain valid but are outside this delta's scope. 🎉

Discarded (0)

No findings were discarded in this delta review.

Reviewers (3)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pr-review-security-iam | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| pr-review-standards | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| pr-review-tests | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| Total | 1 | 0 | 1 | 0 | 0 | 0 | 0 |

Note: Security-IAM reviewer confirmed the change is appropriate. Tests reviewer noted a minor consideration about HEAD requests.

@claude claude bot left a comment

PR Review Summary

(1) Total Issues | Risk: High

Delta Review: This review is scoped to changes in commit abff65491 which reorders the authorization logic in the resume endpoint. The delta is +24/-14 lines in conversations.ts.

🔴❗ Critical (1) ❗🔴

Inline Comments:

  • 🔴 Critical: conversations.ts:390-392 Authorization bypass when conversation not found in database

💭 Consider (1) 💭

💭 1) conversations.ts:382-392 Orphaned EventEmitter listeners on failed authorization

Issue: The reordering calls createReadable(conversationId) at line 382 before validating ownership. When authorization fails at line 391, the ReadableStream's EventEmitter listeners (attached in stream-buffer-registry.ts:77-78) remain attached because cancel() is not called on the abandoned stream.

Why: Minor memory pressure for adversarial traffic (repeated unauthorized resume attempts on active streams). Impact is bounded: listeners are cleaned up automatically when the stream completes or the 60-second cleanup timer fires. The tradeoff appears intentional to optimize the happy path.

Fix: Optional — consider checking ownership with streamBufferRegistry.has() before calling createReadable(), or explicitly cancel the readable when ownership fails.

🕐 Pending Recommendations (3)

These issues were raised in prior reviews and remain unaddressed:


🚫 REQUEST CHANGES

Summary: The delta introduces a critical authorization bypass — the condition at line 390 allows streaming when the conversation is not found in the database. The fix is a one-liner: change conversation && to !conversation ||. The prior critical issues from earlier reviews have been addressed, but this new delta introduces a regression. The "Consider" item about listener leakage is minor and bounded by the 60-second cleanup.

Discarded (0)

No findings were discarded in this delta review.

Reviewers (3)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pr-review-security-iam | 1 | 0 | 0 | 0 | 1 | 0 | 0 |
| pr-review-standards | 1 | 0 | 0 | 0 | 1 | 0 | 0 |
| pr-review-sre | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| Total | 3 | 0 | 1 | 0 | 1 | 0 | 0 |

Note: All three reviewers independently identified the same critical authorization issue — high convergence indicates high confidence.

Comment on lines +390 to +392
if (conversation && conversation.userId !== endUserId) {
  return new Response(null, { status: 204 });
}

🔴 CRITICAL: Authorization bypass when conversation not found in database

Issue: The condition conversation && conversation.userId !== endUserId returns 204 only when the conversation exists AND belongs to a different user. If conversation is null (not found in DB), the condition evaluates to false and execution proceeds to return the stream without authorization.

Why: This allows an attacker who knows a conversationId with an active stream buffer to resume the stream even if the conversation record is missing (e.g., deleted, race condition, or DB replication lag). Compare to the correct pattern at lines 410-411 which handles the null case: if (!conversation || conversation.userId !== endUserId).

Fix: (1-click apply)

Suggested change
- if (conversation && conversation.userId !== endUserId) {
-   return new Response(null, { status: 204 });
- }
+ if (!conversation || conversation.userId !== endUserId) {
+   return new Response(null, { status: 204 });
+ }

Refs:

@github-actions github-actions bot deleted a comment from claude bot Mar 17, 2026
@vercel vercel bot temporarily deployed to Preview – agents-docs March 17, 2026 01:10 Inactive
@claude claude bot left a comment

PR Review Summary

(5) Total Issues | Risk: High

🔴❗ Critical (1) ❗🔴

🔴 1) conversations.ts:390-392 Authorization bypass when conversation not found in database

Issue: The authorization check uses conversation && conversation.userId !== endUserId which only denies access when the conversation IS found AND belongs to a different user. When the conversation is NOT found in the database (conversation is null/undefined), the condition evaluates to false and execution continues to line 402, returning the stream to the caller.

Why: This enables cross-user data access: if User A starts a chat (stream buffered under conversationId), and User B calls the resume endpoint with that conversationId before the conversation is persisted to the database (or if the conversation was deleted), User B receives User A's stream. The stream buffer registry is keyed by conversationId alone without tenant isolation, amplifying the risk.

Fix:

// Change line 390 from:
if (conversation && conversation.userId !== endUserId) {
// to:
if (!conversation || conversation.userId !== endUserId) {

This matches the correct authorization pattern used at line 410 in the same file.

Refs:

🟠⚠️ Major (2) 🟠⚠️

🟠 1) stream-buffer-registry.ts Missing test coverage for critical new functionality

files: stream-buffer-registry.ts (106 lines), conversations.ts:376-415

Issue: The StreamBufferRegistry class and the new resume endpoint have zero test coverage. Per AGENTS.md: "ALL new work MUST include... Unit Tests - NO EXCEPTIONS".

Why: Without tests, these regressions could go undetected:

  • Memory leaks if cleanup timer fails to delete entries after 60s
  • EventEmitter listener leaks if onDone handler fails to clean up
  • Authorization bypass in resume endpoint (the critical bug at line 390 would have been caught by an authorization test)

Fix: Create agents-api/src/domains/run/stream/__tests__/stream-buffer-registry.test.ts with test cases for register, push, complete, createReadable, and cleanup timeout behavior.

Refs:

🟠 2) conversations.ts:402 Resume endpoint lacks error handling around stream piping

Issue: The stream(c, (s) => s.pipe(readable)) call has no try-catch. If the readable stream errors during transmission, the error propagates as an unhandled exception with no logging, and no error event is written to the SSE stream.

Why: Clients will see an abrupt connection termination with no indication of what went wrong. Compare to chatDataStream.ts:487-492 which logs errors and writes an SSE error event.

Fix:

return stream(c, async (s) => {
  try {
    const reader = readable.getReader();
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      await s.write(value);
    }
  } catch (error) {
    logger.error({ error, conversationId }, 'Error resuming conversation stream');
    await s.write(`event: error\ndata: ${JSON.stringify({ error: 'Stream error' })}\n\n`);
  }
});

Refs:

🟡 Minor (1) 🟡

🟡 1) stream-buffer-registry.ts:40 No EventEmitter maxListeners configuration

Issue: Each BufferEntry creates a new EventEmitter() with the default maxListeners limit of 10. If multiple clients simultaneously resume the same conversation stream (user opens multiple tabs), exceeding 10 concurrent resume attempts will trigger Node.js MaxListenersExceededWarning.

Why: This can mask real listener leaks in production logs.

Fix: Call emitter.setMaxListeners(0) to disable the warning, or set an explicit reasonable limit like emitter.setMaxListeners(50).

💭 Consider (1) 💭

💭 1) conversations.ts:382-392 Orphaned EventEmitter listeners on failed authorization

Issue: The code calls createReadable(conversationId) at line 382 before validating ownership. When authorization fails at line 391, the ReadableStream's EventEmitter listeners remain attached because cancel() is not called on the abandoned stream.

Why: Minor memory pressure for adversarial traffic. Impact is bounded by the 60-second cleanup timer.

Fix: Consider calling streamBufferRegistry.has(conversationId) before createReadable(), or explicitly cancel the readable when ownership fails.


🚫 REQUEST CHANGES

Summary: This PR implements a valuable stream resumption feature following the Vercel AI SDK pattern. However, there is a critical authorization bypass at line 390 that must be fixed before merge — the condition allows streaming to any user when the conversation is not found in the database. Additionally, the new functionality lacks test coverage per AGENTS.md requirements, and the stream piping lacks error handling.

The architecture is sound (globalThis singleton pattern, EventEmitter for live tailing, 60s cleanup), and prior feedback on race conditions and error handling in the buffer IIFE has been addressed. The remaining issues are straightforward to fix.

Discarded (4)
| Location | Issue | Reason Discarded |
| --- | --- | --- |
| stream-buffer-registry.ts | Unbounded memory growth in chunks array | Valid concern but consistent with existing stream-registry pattern; acceptable for in-memory architecture with 60s cleanup. Would require significant redesign. |
| conversations.ts | Response pattern inconsistency (204 vs 404 for auth failures) | Intentional per PR description — Vercel AI SDK expects 204 to fall back to DB messages. Different semantic than GET conversation endpoint. |
| conversations.ts | SSE header casing inconsistency | Pre-existing split-world issue in codebase; not introduced by this PR. |
| runAuth.ts:515 | HEAD requests not included in PoW skip | HEAD requests not used by this feature or the Vercel AI SDK. Very low risk. |
Reviewers (6)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
| --- | --- | --- | --- | --- | --- | --- | --- |
| pr-review-standards | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| pr-review-security-iam | 2 | 1 | 0 | 0 | 0 | 0 | 1 |
| pr-review-sre | 6 | 1 | 1 | 0 | 0 | 0 | 4 |
| pr-review-tests | 3 | 1 | 0 | 0 | 0 | 0 | 2 |
| pr-review-errors | 3 | 1 | 0 | 0 | 0 | 0 | 2 |
| pr-review-consistency | 4 | 0 | 0 | 0 | 0 | 0 | 4 |
| Total | 18 | 4 | 1 | 0 | 0 | 0 | 13 |

Note: The authorization bypass (Critical) was independently identified by security-iam and errors reviewers with HIGH confidence convergence.

@github-actions github-actions bot deleted a comment from claude bot Mar 17, 2026
@itoqa

itoqa bot commented Mar 17, 2026

Ito Test Report ❌

6 test cases ran. 5 passed, 1 failed.

✅ Most core stream-behavior checks in this report passed, including SSE startup headers, deterministic 204 behavior for non-active resume paths, cleanup-window behavior, and conversationId fuzz handling. 🔍 Verification found one confirmed security bug in resume authorization: while a stream is active in memory, a second user can receive its streamed content when the conversation ownership lookup returns no row. This is a production-code issue.

✅ Passed (5)
| Test Case | Summary | Timestamp | Screenshot |
| --- | --- | --- | --- |
| ROUTE-1 | Chat stream returned 200 SSE with resumable headers and chunked data stream payload. | 11:51 | ROUTE-1_11-51.png |
| ROUTE-3 | Unknown pre-initialized conversation resume returned deterministic 204 with empty body. | 11:51 | ROUTE-3_11-51.png |
| ROUTE-4 | Immediate and post-65s resume checks for conv-core-1 both returned 204, confirming no stale stream remained available after completion window. | 14:50 | ROUTE-4_14-50.png |
| EDGE-6 | Direct deep-link resume request before chat initialization returned deterministic 204 behavior. | 11:52 | EDGE-6_11-52.png |
| ADV-4 | Traversal-like and oversized conversationId inputs were handled safely with HTTP 204 responses and no 500/server-trace leakage. | 1:18:56 | ADV-4_1-18-56.png |
❌ Failed (1)
| Test Case | Summary | Timestamp | Screenshot |
| --- | --- | --- | --- |
| ADV-1 | Token B resumed conv-userA-1 and received SSE payload with streamed content (HTTP 200), indicating cross-user isolation failure. | 1:18:56 | ADV-1_1-18-56.png |
Cross-user conversationId theft attempt – Failed
  • Where: GET /run/v1/conversations/{conversationId}/stream resume endpoint (run conversations route)

  • Steps to reproduce: Start an active stream as User A, then call the resume endpoint for the same conversationId as User B while the in-memory stream buffer still exists.

  • What failed: User B received 200 text/event-stream and streamed payload for User A's conversation instead of being denied.

  • Code analysis: The resume route creates a readable stream from in-memory registry first, then only denies access when a conversation row is found and owned by a different user. If no conversation row is returned at that moment, the code still proceeds to stream data, creating an ownership-check gap.

  • Relevant code:

    agents-api/src/domains/run/routes/conversations.ts (lines 382-402)

    const readable = streamBufferRegistry.createReadable(conversationId);
    
    if (readable) {
      const conversation = await getConversation(runDbClient)({
        scopes: { tenantId, projectId },
        conversationId,
      });
    
      if (conversation && conversation.userId !== endUserId) {
        return new Response(null, { status: 204 });
      }
    
      return stream(c, (s) => s.pipe(readable));
    }

    agents-api/src/domains/run/stream/stream-buffer-registry.ts (lines 62-66)

    createReadable(conversationId: string): ReadableStream<Uint8Array> | null {
      const entry = this.buffers.get(conversationId);
      if (!entry) return null;

    agents-api/src/domains/run/routes/chatDataStream.ts (lines 475-485)

    streamBufferRegistry.register(conversationId);
    return stream(c, async (s) => {
      const reader = run.readable.getReader();
      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        streamBufferRegistry.push(conversationId, encoded);
        await s.write(value);
      }
    });
  • Why this is likely a bug: The authorization gate permits streaming when the conversation lookup is null, so ownership is not strictly enforced for active buffered streams.

  • Introduced by this PR: Yes - this PR modified the relevant code.

  • Timestamp: 1:18:56

📋 View Recording

@anubra266 anubra266 changed the title feat: in-memory stream resumption for interrupted conversations feat: stream resumption for interrupted conversations Mar 18, 2026
@pullfrog

pullfrog bot commented Mar 25, 2026

TL;DR — Adds an in-memory stream buffer that captures SSE chunks during active conversations, enabling clients to reconnect and replay the full stream via a new GET /run/v1/conversations/{conversationId}/stream endpoint. Proof-of-Work verification is skipped for GET requests to allow unauthenticated stream resumption from web clients.

Key changes

  • Add StreamBufferRegistry for in-memory stream capture — A new process-global registry buffers Uint8Array chunks per conversation, exposes a ReadableStream for replay, and auto-cleans entries 60 seconds after stream completion.
  • Add GET /conversations/{conversationId}/stream resume endpoint — Returns the buffered SSE stream for an active conversation (200) or signals no active stream (204), with end-user ownership validation.
  • Wire buffer registration into chatDataStream — Both the durable-execution streaming path and the standard AI-SDK streaming path now register, push chunks to, and complete the buffer alongside the primary client stream.
  • Skip Proof-of-Work for GET requests — tryAppCredentialAuth now bypasses PoW verification on GET so web clients can resume streams without a challenge.
  • Update PoW tests to use POST — Existing PoW test cases changed from GET to POST to match the new semantics, plus a new test asserting PoW is skipped on GET.

Summary | 6 files | 2 commits | base: durable-execution ← feat/stream-resumption


In-memory stream buffer registry

Before: Stream chunks were written directly to the client SSE connection with no retention — if a client disconnected, the stream data was lost.
After: Every active stream registers with StreamBufferRegistry, which accumulates chunks in memory and can produce a ReadableStream that replays buffered + live data to a new consumer.

The registry is stored on globalThis to survive module reloads in dev. Each entry tracks a Uint8Array[] chunk array, a done flag, and a Node EventEmitter for live-tail subscriptions. createReadable first enqueues all existing chunks, then subscribes to live chunk/done events — giving reconnecting clients a seamless replay. Entries are garbage-collected 60 seconds after the stream completes.

How does cleanup work for long-running or abandoned streams? On complete(), a 60-second setTimeout removes the buffer entry. If a new stream starts for the same conversation before cleanup fires, register() cancels the pending timeout, marks the old entry as done, and replaces it with a fresh buffer.

stream-buffer-registry.ts


Stream resume endpoint

Before: No API existed to reconnect to an in-progress conversation stream.
After: GET /conversations/{conversationId}/stream returns the live SSE stream (200) if a buffer exists and the end-user owns the conversation, or 204 otherwise.

The handler checks the buffer registry first to avoid a database round-trip when no stream is active. Ownership is validated by comparing the authenticated endUserId against conversation.userId.
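The ordering described above (registry check first, DB ownership check second) can be sketched as a pure decision function. The names `resolveResume` and `findConversation` are hypothetical — the real handler in conversations.ts inlines this logic:

```typescript
interface Conversation { userId: string }

type ResumeDecision =
  | { status: 204 }                          // no active stream, or not authorized
  | { status: 200; conversationId: string }; // caller pipes the buffered stream

async function resolveResume(
  conversationId: string,
  endUserId: string,
  hasActiveBuffer: (id: string) => boolean,
  findConversation: (id: string) => Promise<Conversation | null>
): Promise<ResumeDecision> {
  // 1. Registry check first: avoids the DB round-trip when no stream is active.
  if (!hasActiveBuffer(conversationId)) return { status: 204 };

  // 2. Ownership check: deny on a missing row OR a mismatched user.
  //    (The `conversation &&` form flagged in review lets missing rows through.)
  const conversation = await findConversation(conversationId);
  if (!conversation || conversation.userId !== endUserId) return { status: 204 };

  return { status: 200, conversationId };
}
```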

conversations.ts · chatDataStream.ts


PoW bypass for GET requests

Before: All web-client requests (including GET) required Proof-of-Work verification.
After: tryAppCredentialAuth skips PoW when request.method === 'GET', since GET endpoints like stream resume are read-only and don't create server-side work that PoW is designed to rate-limit.

Existing PoW tests were updated to use POST to reflect the narrowed scope, and a new should skip PoW for GET requests test validates the bypass.

runAuth.ts · app-credential-auth.test.ts


@claude claude bot left a comment (Contributor)

PR Review Summary

(4) Total Issues | Risk: High

Re-review: This review is scoped to the delta since the last automated review and validates prior findings.

🔴❗ Critical (1) ❗🔴

Inline Comments:

  • 🔴 Critical: conversations.ts:438-439 Authorization bypass when conversation not found in database — STILL UNRESOLVED

🕐 Pending Recommendations (3)

These issues were raised in prior reviews and remain unaddressed:

  • 🟠 stream-buffer-registry.ts Missing test coverage for StreamBufferRegistry class — per AGENTS.md: "ALL new work MUST include... Unit Tests"
  • 🟠 conversations.ts:450 Resume endpoint lacks error handling around stream piping — compare to chatDataStream.ts patterns that wrap streams in try-catch
  • 🟡 stream-buffer-registry.ts:11 Hardcoded 60s cleanup delay without observability

✅ Prior Critical Issues — ADDRESSED

The following critical issues from prior reviews have been addressed:

Issue Status
Module-scoped singleton not shared across bundle boundaries ✅ Fixed — now uses globalThis pattern (lines 12, 15-21)
Fire-and-forget async IIFE lacks error handling ✅ Fixed — added try/catch with error logging (chatDataStream.ts:633-634)
Race condition between chunk replay and event subscription ✅ Fixed — subscribes to events BEFORE replaying buffered chunks (lines 81-82)
Missing cancel() handler for listener cleanup ✅ Fixed — added cancel() handler (lines 92-96)
Overwriting buffer leaves existing consumers hanging ✅ Fixed — emits 'done' to existing subscribers (lines 32-35)

💭 Consider (1) 💭

💭 1) runAuth.ts:523 HEAD requests also safe but not included in PoW skip

Issue: The condition method !== 'GET' means HEAD requests still require PoW. The codebase has precedent for treating GET and HEAD together.

Why: Very low risk — HEAD isn't used by this feature. Optional enhancement for consistency.


🚫 REQUEST CHANGES

Summary: The critical authorization bypass at line 438 remains unresolved despite being flagged in 6 prior automated reviews and confirmed by the itoqa testing bot. This is a one-line fix: change conversation && to !conversation ||. The other prior critical issues (singleton pattern, race conditions, error handling) have all been properly addressed — nice work on those!

The remaining pending recommendations (test coverage, stream error handling) are also valid but less urgent than the auth bypass. Fix the authorization logic and this PR is in good shape.

Discarded (2)
Location Issue Reason Discarded
conversations.ts:450 Using s.pipe() instead of manual reader loop Both patterns are valid; pipe() is more concise and the error handling concern was already raised in pending recommendations
stream-buffer-registry.ts Unbounded memory growth in chunks array Valid concern but consistent with existing stream-registry pattern; acceptable for in-memory architecture with 60s cleanup
Reviewers (1)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
orchestrator 5 0 1 0 1 3 2
Total 5 0 1 0 1 3 2

Note: This was a focused re-review validating prior findings. The authorization bypass (Critical) was already identified by prior reviews but remains unfixed.

});

if (conversation && conversation.userId !== endUserId) {
return new Response(null, { status: 204 });
🔴 CRITICAL: Authorization bypass when conversation not found in database

Issue: The condition conversation && conversation.userId !== endUserId only denies access when the conversation IS found AND belongs to a different user. When conversation is null/undefined (not found in DB), the condition evaluates to false and execution proceeds to line 450, returning the stream.

Why: This enables cross-user data leakage: if User A starts a chat (stream buffered under conversationId), and User B calls the resume endpoint with that conversationId before the conversation is persisted to the database (or after deletion), User B receives User A's stream. The itoqa test bot confirmed this vulnerability (ADV-1 test failed).

Fix: (1-click apply)

Suggested change
- if (conversation && conversation.userId !== endUserId) {
+ if (!conversation || conversation.userId !== endUserId) {
    return new Response(null, { status: 204 });
  }

This matches the correct authorization pattern used at line 458 in the same file.

Refs:

@itoqa

itoqa bot commented Mar 25, 2026

Ito Test Report ❌

14 test cases ran. 4 failed, 10 passed.

Across 14 test cases, 10 passed and 4 failed. Coverage confirmed that run-stream resumption behaves correctly in core and edge scenarios: replay plus live tailing, concurrent consumers, immediate post-completion replay, TTL-based 204 after buffer cleanup, and reconnect-burst stability. GET conversation listing without PoW, origin enforcement on resume, and conversationId input hardening also worked as intended. The most important issues were two High-severity backend defects — replayed PoW challenge solutions being accepted on repeated POST /run/api/chat requests, and context validation masking client header errors as 500 internal_server_error — and two Medium-severity playground continuity bugs in which conversation IDs are regenerated or reset on reload/remount, causing lost resume/history in the mobile-reload and rapid dual-tab navigation stress flows.

❌ Failed (4)
Category Summary Screenshot
Adversarial ⚠️ Context header validation errors are converted into internal_server_error responses, blocking expected stream setup. ADV-1
Adversarial 🟠 Rapid dual-tab back/forward/reload loops leave resume/history behavior incoherent with repeated error fallback states. ADV-7
Edge 🟠 Mobile reload during in-flight chat resets the panel to initial state instead of showing resumed/fallback response content. EDGE-6
Happy-path ⚠️ Replayed valid PoW challenge solutions are accepted on repeated non-GET chat requests. ROUTE-3
⚠️ Context validation middleware masks bad requests as 500 errors
  • What failed: The middleware creates a bad_request error for invalid headers, but then catches and rewrites it to internal_server_error (Context validation failed) instead of returning the client-correct 4xx response.
  • Impact: Valid user flows can fail with 500 responses during chat initialization, preventing conversation/stream creation and blocking downstream resume-security scenarios. This also misclassifies client/input issues as server faults, making remediation and monitoring less reliable.
  • Introduced by this PR: No – pre-existing bug (code not changed in this PR)
  • Steps to reproduce:
    1. Send POST /run/api/chat with headers/context that make validateHeaders return valid: false.
    2. Observe that contextValidationMiddleware enters the invalid-header branch and constructs a bad_request API error.
    3. Observe that the surrounding catch rewrites the response to internal_server_error Context validation failed instead of preserving a 4xx validation error.
  • Code analysis: I reviewed the run context validation middleware and the chat streaming route. The middleware explicitly constructs a bad_request on header validation failure, but the broad catch unconditionally throws internal_server_error; since /chat is wrapped by this middleware, affected requests fail before stream creation.
  • Why this is likely a bug: Production code contains contradictory error handling (intentional bad_request immediately masked as internal_server_error), which deterministically produces incorrect API behavior independent of test harness setup.

Relevant code:

agents-api/src/domains/run/context/validation.ts (lines 408-421)

if (!validationResult.valid) {
  logger.warn(
    {
      tenantId,
      agentId,
      errors: validationResult.errors,
    },
    'Headers validation failed'
  );
  const errorMessage = `Invalid headers: ${validationResult.errors.map((e) => `${e.field}: ${e.message}`).join(', ')}`;
  throw createApiError({
    code: 'bad_request',
    message: errorMessage,
  });
}

agents-api/src/domains/run/context/validation.ts (lines 436-446)

} catch (error) {
  logger.error(
    {
      error: error instanceof Error ? error.message : 'Unknown error',
    },
    'Context validation middleware error'
  );
  throw createApiError({
    code: 'internal_server_error',
    message: 'Context validation failed',
  });
}

agents-api/src/domains/run/routes/chatDataStream.ts (lines 88-92)

// Apply context validation middleware
app.use('/chat', contextValidationMiddleware);

app.openapi(chatDataStreamRoute, async (c) => {
  try {
🟠 Unusual navigation loop does not corrupt resume behavior
  • What failed: UI stayed responsive, but resume/history state degraded into fallback/error behavior instead of coherent continuation of the active conversation.
  • Impact: Heavy navigation interruptions can break continuity guarantees and leave users with inconsistent conversation recovery. This can cause perceived message loss during unstable browsing conditions.
  • Introduced by this PR: No – pre-existing bug (code not changed in this PR)
  • Steps to reproduce:
    1. Open the agent playground and start a long streaming response.
    2. Duplicate into a second tab on the same agent page.
    3. Run repeated back, forward, and reload actions across both tabs for about 10 seconds.
    4. Inspect the final chat state and confirm whether the same conversation resumes coherently with complete history.
  • Code analysis: I inspected the same resume path end-to-end for this stress case. The API side keeps a stream buffer per conversation, but the playground state model reinitializes playgroundConversationId and also resets it on unmount, so tab reload/history churn shifts the client to new conversation IDs that no longer map to the original active stream/history.
  • Why this is likely a bug: The reconnection feature depends on reusing the same conversation ID, but navigation/remount behavior rotates that ID and disconnects the UI from the stream/history it should resume.

Relevant code:

agents-manage-ui/src/features/agent/state/use-agent-store.ts (lines 97-123)

const initialAgentState: AgentStateData = {
  nodes: [],
  edges: [],
  metadata: {
    id: undefined,
    name: '',
    description: '',
    contextConfig: {
      contextVariables: '',
      headersSchema: '',
    },
    models: undefined,
    stopWhen: undefined,
    prompt: undefined,
    statusUpdates: undefined,
  },
  playgroundConversationId: generateId(),
};

agents-manage-ui/src/features/agent/state/use-agent-store.ts (lines 328-335)

persist(agentState, {
  name: 'inkeep:agent',
  partialize(state) {
    return {
      jsonSchemaMode: state.jsonSchemaMode,
      isSidebarPinnedOpen: state.isSidebarPinnedOpen,
      hasTextWrap: state.hasTextWrap,
    };
  },
})

agents-manage-ui/src/components/agent/playground/playground.tsx (lines 48-51)

useEffect(() => {
  // when the playground is closed the chat widget is unmounted so we need to reset the conversation id
  return () => resetPlaygroundConversationId();
}, []);
🟠 Mobile viewport reconnect flow
  • What failed: The panel returned to an initial chat state without resumed stream content or persisted fallback response for the interrupted turn.
  • Impact: Users who reload during streaming can lose conversational continuity and fail to recover the in-flight response. This weakens confidence in reconnect behavior for interruption-prone sessions.
  • Introduced by this PR: No – pre-existing bug (code not changed in this PR)
  • Steps to reproduce:
    1. Open the agent playground and click Try it.
    2. Set viewport to mobile size (390x844) and send a long streaming prompt.
    3. Reload the page while the assistant response is still streaming.
    4. Reopen the chat panel and check whether the interrupted response resumes or appears via persisted fallback history.
  • Code analysis: I reviewed the run stream-resume backend (conversations and chatDataStream routes) and the manage UI playground state wiring. Backend routes expose replay/fallback behavior, but the playground client generates a fresh conversation ID on reload/remount and does not persist it across page reloads, so reconnect requests target a different conversation.
  • Why this is likely a bug: Resume requires a stable conversation identifier, but the playground resets and re-generates that identifier on remount/reload, preventing reconnection to the original stream.

Relevant code:

agents-manage-ui/src/features/agent/state/use-agent-store.ts (lines 97-123)

const initialAgentState: AgentStateData = {
  nodes: [],
  edges: [],
  metadata: {
    id: undefined,
    name: '',
    description: '',
    contextConfig: {
      contextVariables: '',
      headersSchema: '',
    },
    models: undefined,
    stopWhen: undefined,
    prompt: undefined,
    statusUpdates: undefined,
  },
  playgroundConversationId: generateId(),
};

agents-manage-ui/src/features/agent/state/use-agent-store.ts (lines 328-335)

persist(agentState, {
  name: 'inkeep:agent',
  partialize(state) {
    return {
      jsonSchemaMode: state.jsonSchemaMode,
      isSidebarPinnedOpen: state.isSidebarPinnedOpen,
      hasTextWrap: state.hasTextWrap,
    };
  },
})

agents-manage-ui/src/components/agent/playground/playground.tsx (lines 48-51)

useEffect(() => {
  // when the playground is closed the chat widget is unmounted so we need to reset the conversation id
  return () => resetPlaygroundConversationId();
}, []);
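A minimal sketch of one possible fix for the continuity bugs above: include `playgroundConversationId` in the persisted slice so a reload reuses the same id. The state shape is an assumption based on the `partialize` excerpt shown; the function is extracted here as a pure function for illustration:

```typescript
interface AgentUiState {
  jsonSchemaMode: boolean;
  isSidebarPinnedOpen: boolean;
  hasTextWrap: boolean;
  playgroundConversationId: string;
}

// Sketch of a partialize that survives reloads with the active conversation id.
function partialize(state: AgentUiState) {
  return {
    jsonSchemaMode: state.jsonSchemaMode,
    isSidebarPinnedOpen: state.isSidebarPinnedOpen,
    hasTextWrap: state.hasTextWrap,
    // Persist the conversation id so a reload can resume the active stream.
    playgroundConversationId: state.playgroundConversationId,
  };
}
```

The unconditional `resetPlaygroundConversationId()` on unmount would also need to be scoped (e.g. reset only on explicit close) for this to help.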
⚠️ PoW challenge solutions can be replayed on non-GET chat requests
  • What failed: A reused valid PoW solution should be rejected as one-time or nonce-bound, but the same solution can be accepted again on a subsequent non-GET request.
  • Impact: Attackers can amortize PoW cost by replaying solved challenges, reducing abuse resistance for protected non-GET endpoints. This weakens request-throttling intent and makes automated abuse cheaper.
  • Introduced by this PR: No – pre-existing bug (code not changed in this PR)
  • Steps to reproduce:
    1. Request a challenge solution via GET /run/auth/pow/challenge and compute a valid X-Inkeep-Challenge-Solution value.
    2. Send POST /run/api/chat with valid app credential headers and the computed challenge solution.
    3. Repeat another POST /run/api/chat reusing the exact same previously valid challenge solution.
    4. Observe the replayed solution is accepted instead of being rejected as already used.
  • Code analysis: I inspected the non-GET auth gate and PoW verification utility. The route correctly calls PoW verification for non-GET requests, but the verifier only checks existence, expiry, and cryptographic validity of the submitted solution and never records/invalidates previously used solutions.
  • Why this is likely a bug: The production PoW verifier has no replay-state check (nonce/store/used-token invalidation), so reused valid solutions can pass indefinitely until expiration.

Relevant code:

agents-api/src/middleware/runAuth.ts (lines 523-528)

if (reqData.request.method !== 'GET') {
  const pow = await verifyPoW(reqData.request, env.INKEEP_POW_HMAC_SECRET);
  if (!pow.ok) {
    throw new HTTPException(400, { message: getPoWErrorMessage(pow.error) });
  }
}

packages/agents-core/src/utils/pow.ts (lines 43-61)

const challengeHeader = request.headers.get('x-inkeep-challenge-solution');
if (!challengeHeader) {
  return { ok: false, error: 'pow_required' };
}

if (isChallengeExpired(challengeHeader)) {
  return { ok: false, error: 'pow_expired' };
}

try {
  const valid = await verifySolution(challengeHeader, hmacSecret, false);
  if (!valid) {
    return { ok: false, error: 'pow_invalid' };
  }
} catch {
  return { ok: false, error: 'pow_invalid' };
}

return { ok: true };
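A sketch of the missing replay-state check: record each accepted solution and reject reuse until it would have expired anyway. A single-process in-memory `Map` is an assumption here — a multi-instance deployment would need a shared store such as Redis:

```typescript
// solution -> expiry timestamp (ms since epoch); hypothetical helper, not the real verifier
const usedSolutions = new Map<string, number>();

function markUsedOnce(solution: string, ttlMs: number, now = Date.now()): boolean {
  // Evict expired entries so the map does not grow unboundedly.
  for (const [key, expiry] of usedSolutions) {
    if (expiry <= now) usedSolutions.delete(key);
  }
  if (usedSolutions.has(solution)) return false; // replay detected
  usedSolutions.set(solution, now + ttlMs);
  return true;
}
```

`verifyPoW` would call this after the cryptographic check passes, with `ttlMs` matching the challenge expiry so entries never outlive the window in which `isChallengeExpired` would reject them anyway.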
✅ Passed (10)
Category Summary Screenshot
Adversarial Origin validation denied missing/disallowed origins and allowed approved origin on resume GET. ADV-4
Adversarial ConversationId injection payloads returned safe non-5xx responses without error disclosure. ADV-5
Edge Re-run confirmed concurrent resume consumers both received full ordered replay with terminal completion. EDGE-1
Edge Immediate resume after completion returned replay data on a valid conversation stream. EDGE-3
Edge Completion-based TTL validation matched intended cleanup behavior (resume returned empty after delay). EDGE-4
Edge Rapid reconnect burst completed without 5xx errors and final resume/detail checks succeeded. EDGE-5
Happy-path Resume stream replayed pre-disconnect frames and tailed to completion as expected. ROUTE-1
Happy-path Resume endpoint returned 204 for a conversation with no active stream buffer. ROUTE-2
Happy-path Conversation list GET succeeded without PoW header, matching GET auth behavior. ROUTE-4
Happy-path After TTL expiry, resume returned 204 while conversation detail still returned persisted assistant content. ROUTE-5

Commit: 03d9b64

