Fix Vercel AI SDK adapter bugs from code review #132
Fix four bugs, add configurability, document limitations.

Bugs fixed:
- Iterator-error path now closes open text/reasoning spans before emitting error/finish-step/finish. Previously a network error mid-stream left text-start without a matching text-end.
- Reasoning span IDs now increment on close, so reasoning->text->reasoning produces distinct IDs instead of reusing "reasoning-0".
- Text span IDs now increment in StreamModel (not just StreamModelWithTools). Tool calls and stream resets advance the counter. Removed the non-advancing endText() entirely.
- writeReasoningDelta now closes any open text span first, mirroring how writeTextDelta already closed reasoning spans.
- generateMessageID fallback is now 16 chars, matching the documented contract and test assertion.
- maxTurns exhaustion now emits finish-step before finish, consistent with normal turn endings.

New options: WithMaxBodyBytes (default 1MB), WithMaxTurns (default 10).

Unmarshal failures for tool input/output/result now log at Warn level instead of being silently swallowed.

Package doc notes that inbound tool-call history is not yet reconstructed and that interceptor plugins must be wired at the model level.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
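The advance-on-close span-ID behavior this commit describes can be sketched in isolation (the `spanTracker` type and its methods are illustrative stand-ins, not the adapter's actual streamWriter API):

```go
package main

import "fmt"

// spanTracker is a hypothetical sketch of the counter pattern the fix
// describes: every close advances the counter, so a reasoning->text->reasoning
// sequence yields distinct span IDs instead of reusing "reasoning-0".
type spanTracker struct {
	reasoningCounter int
	reasoningOpen    bool
}

// openReasoning returns the current span ID, marking the span open.
func (t *spanTracker) openReasoning() string {
	t.reasoningOpen = true
	return fmt.Sprintf("reasoning-%d", t.reasoningCounter)
}

// endReasoningAndAdvance closes the open span and bumps the counter,
// so the next span gets a fresh ID.
func (t *spanTracker) endReasoningAndAdvance() {
	if t.reasoningOpen {
		t.reasoningOpen = false
		t.reasoningCounter++
	}
}

func main() {
	var t spanTracker
	first := t.openReasoning() // first reasoning span
	t.endReasoningAndAdvance()
	second := t.openReasoning() // distinct ID after the close
	fmt.Println(first, second)
}
```

The non-advancing variant (the removed endText()) would have returned the same ID for both spans, which is exactly the reuse bug the commit fixes.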
Claude Code Review
This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.
Tip: disable this comment in your organization's Code Review settings.
@claude review
Claude Code Review

Reviewed the changes addressing review items from PR #131. Most fixes look correct and well-tested. I found one remaining bug from the same class of issues being fixed, and one new bug introduced by the fixes.

Issues

1. (bug)
handleContentPart now calls endReasoning() before endTextAndAdvance() on PartToolRequest, matching handleToolTurnPart. Without this, a reasoning->tool-request sequence in StreamModel left the reasoning span open until stream end.

Remove the extra finish-step on maxTurns exhaustion -- writeToolTurnEnd already emits finish-step for the last turn, so the post-loop code was emitting a duplicate with no matching start-step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
streamToolTurn already writes finish+[DONE] on iterator error, but the caller in StreamModelWithTools unconditionally wrote another finish+[DONE] because len(toolRequests)==0 was true. The client would see two finish chunks and two [DONE] sentinels.

Fix: check for finishReasonError (or empty, for context cancellation) and return early before the caller's normal termination path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
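The caller-side guard this commit describes can be sketched in isolation (the function name is illustrative; only the finishReasonError constant and the two guarded cases come from the commit message):

```go
package main

import "fmt"

const finishReasonError = "error"

// shouldWriteTerminal sketches the early-return guard: the caller skips its
// own finish+[DONE] when streamToolTurn already wrote them on an iterator
// error (finishReasonError), or when the stream was cut short by context
// cancellation / write failure (empty reason, nothing more to write).
func shouldWriteTerminal(finishReason string) bool {
	return finishReason != finishReasonError && finishReason != ""
}

func main() {
	fmt.Println(shouldWriteTerminal("stop"))            // normal turn: caller terminates
	fmt.Println(shouldWriteTerminal(finishReasonError)) // already terminated downstream
	fmt.Println(shouldWriteTerminal(""))                // ctx cancel: skip terminal chunks
}
```

Note that a later commit in this PR narrows the guard, so this sketch reflects the behavior at this point in the conversation only.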
@claude review
Claude Code Review

All three review-round-2 blockers are now fixed correctly. Remaining concerns are test-coverage gaps around the new fixes and a pre-existing test that doesn't actually exercise what it claims.
Previously-flagged blockers — all addressed

1.
2. Duplicate finish+[DONE]:

```go
// streamToolTurn already wrote finish+[DONE] on error.
if finishReason == finishReasonError || finishReason == "" {
	return
}
```

The early-return guard catches both.

3. Unpaired

Issues

1. (test gap) No test for the iterator-error fix in tool-calling mode
The fix in commit … is not covered by a test. A regression here would be silent (still produces output, just a malformed terminal sequence) and could easily be reintroduced. Worth adding a test.

2. (test gap)
Add tests for:
- StreamModelWithTools happy path (tool call -> executor -> second turn)
- maxTurns exhaustion (paired start-step/finish-step, single finish)
- Iterator error in tool-calling mode (no duplicate finish+[DONE])

Fix TestHandler_WithMaxBodyBytes to send valid JSON exceeding the limit instead of invalid JSON that fails decoding regardless.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Additional findings (outside current diff — PR may have been updated during review):
- 🔴 adapter/vercelaisdk/uimessagestream/handler.go:553-555 — `streamToolTurn`'s event switch (handler.go:566-574) has cases only for `ContentPartEvent` and `StreamEndEvent` — `llm.StreamResetEvent` silently falls through, while `StreamModel` correctly handles it at handler.go:450-457 (a branch this PR even tightened from `endText()` to `endTextAndAdvance()`). When a retry interceptor wrapped around the model emits `StreamResetEvent` mid-turn, `textStarted`/`reasoningStarted` stay true with stale IDs (fusing discarded-attempt deltas with retry deltas under the same span ID) and any `toolRequests` collected before the reset remain in the local slice and get executed by `executeTools`, double-firing tools from the discarded attempt. Fix: add `case llm.StreamResetEvent:` mirroring `StreamModel` — `sw.endTextAndAdvance()`, `sw.endReasoning()`, and `toolRequests = nil`.

Extended reasoning...
What the bug is
In `streamToolTurn` (handler.go:548-575), the event-dispatch switch has only two cases:

```go
switch e := event.(type) {
case llm.ContentPartEvent:
	if abort := handleToolTurnPart(e, sw, &toolRequests); abort {
		return "", nil
	}
case llm.StreamEndEvent:
	finishReason = writeToolTurnEnd(e, sw, ew, logger)
}
```
There is no `case llm.StreamResetEvent:` branch, so the event silently falls through with no action taken. Compare with `StreamModel` at handler.go:450-457, which correctly closes both spans:

```go
case llm.StreamResetEvent:
	if err := sw.endTextAndAdvance(); err != nil {
		return
	}
	if err := sw.endReasoning(); err != nil {
		return
	}
```
This PR even tightened that branch (the diff changes `endText()` to `endTextAndAdvance()` for the StreamModel reset case), which makes the asymmetry with `streamToolTurn` more glaring.

Why this path is reachable
The `StreamResetEvent` contract in llm/event.go:119-126 is explicit: "Consumers should discard any accumulated content from the previous attempt and prepare to receive events from a fresh generation". The event is emitted by retry interceptors. The package doc the PR adds (handler.go:17-21) states: "interceptor plugins (retry, OTel) must be wired at the model level" — confirming that the model passed into `streamToolTurn` can yield `StreamResetEvent` whenever a retry interceptor wraps it. This is not a hypothetical path; it is documented, supported usage.

Step-by-step proof — span ID fusion
Consider a model wrapped by a retry interceptor, with tools enabled, that yields:
`[Text("attempt1"), StreamResetEvent, Text("attempt2"), StreamEnd]`

1. `ContentPartEvent{Text("attempt1")}` → `handleToolTurnPart` → `writeTextDelta` opens a span: emits `text-start id=text-0`, `text-delta id=text-0 delta=attempt1`. State: `textStarted=true`, `textID=text-0`.
2. `StreamResetEvent` → falls through the switch with no action. State unchanged: `textStarted=true`, `textID=text-0`.
3. `ContentPartEvent{Text("attempt2")}` → `writeTextDelta` sees `textStarted==true` and just emits `text-delta id=text-0 delta=attempt2`. No new `text-start`.
4. `StreamEndEvent` → `writeToolTurnEnd` → `closeSpans()` → `text-end id=text-0`, `finish-step`.
Wire output:
`text-start id=text-0, text-delta "attempt1", text-delta "attempt2", text-end id=text-0`. The discarded attempt and the retry are fused under the same span ID, contradicting the `StreamResetEvent` contract. The same applies if reasoning was open: `reasoningStarted` stays true and the stale `reasoningID` continues to be used.

Step-by-step proof — tool double-fire
With a model that yields:
`[ToolRequest(call-1), StreamResetEvent, ToolRequest(call-2), StreamEnd(tool-calls)]`

1. `ContentPartEvent{ToolRequest(call-1)}` → `handleToolTurnPart` PartToolRequest case → appends `call-1` to the local `toolRequests` slice → emits `tool-input-start`, `tool-input-available` for `call-1`.
2. `StreamResetEvent` → falls through. `toolRequests` slice still contains `call-1`.
3. `ContentPartEvent{ToolRequest(call-2)}` → appends `call-2` to the slice → emits tool chunks for `call-2`.
4. `StreamEndEvent{FinishReason=ToolCalls}` → `writeToolTurnEnd` returns `"tool-calls"`. Returns to `StreamModelWithTools` with `toolRequests=[call-1, call-2]`. `executeTools` is invoked with both `call-1` (discarded) and `call-2` (retry) — executing the tool for the discarded attempt and feeding its result back to the model in the next turn.
For non-idempotent tools (database writes, payment APIs, email sending) this is a correctness bug, not just a wire-protocol artifact.
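The double-fire above reduces to a small sketch of the accumulation loop (the event and request types here are simplified stand-ins for the adapter's real ones, not its API):

```go
package main

import "fmt"

// Simplified stand-ins for the adapter's event types.
type event interface{}
type toolRequest struct{ ID string }
type streamReset struct{}

// collectToolRequests sketches both behaviors: without reset handling,
// requests from a discarded attempt survive into the slice that later
// drives tool execution; with it, the slice is cleared on reset so only
// the retry attempt's tools run.
func collectToolRequests(events []event, handleReset bool) []toolRequest {
	var toolRequests []toolRequest
	for _, e := range events {
		switch e := e.(type) {
		case toolRequest:
			toolRequests = append(toolRequests, e)
		case streamReset:
			if handleReset {
				toolRequests = nil // discard the failed attempt's requests
			}
		}
	}
	return toolRequests
}

func main() {
	events := []event{toolRequest{ID: "call-1"}, streamReset{}, toolRequest{ID: "call-2"}}
	fmt.Println(collectToolRequests(events, false)) // both requests survive: double-fire
	fmt.Println(collectToolRequests(events, true))  // only the retry attempt's request
}
```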
Impact
- Span ID protocol violation: span IDs persist across the reset boundary, fusing discarded and retry content under the same `text-N`/`reasoning-N` id on the wire. UI clients keyed off span IDs will display the discarded attempt's text concatenated with the retry attempt, defeating the whole purpose of retry semantics.
- Tool double-execution: tool requests from a discarded attempt remain in the local slice and are executed by `executeTools` after the turn ends; their `tool-output-available` chunks are sent to the client, and their results are appended to `messages` for the next turn — effectively poisoning the conversation with phantom tool calls from a stream that was supposed to be discarded.
- Asymmetry with `StreamModel`: this is exactly the class of bug the PR set out to fix — `handleContentPart` vs `handleToolTurnPart`, the `closeSpans()` symmetry between text and reasoning, etc. The existing tests `TestStreamModel_StreamResetEvent` (and `TestStreamModel_StreamResetEventWithReasoning`) cover the non-tools path; no equivalent exists for the tools path.
How to fix
Add a `case llm.StreamResetEvent:` branch to `streamToolTurn`'s switch mirroring `StreamModel`, and also reset the local `toolRequests` slice:

```go
case llm.StreamResetEvent:
	if err := sw.endTextAndAdvance(); err != nil {
		return "", nil
	}
	if err := sw.endReasoning(); err != nil {
		return "", nil
	}
	toolRequests = nil
```
A companion test `TestStreamModelWithTools_StreamResetEvent` mirroring the existing `TestStreamModel_StreamResetEvent` would also be valuable.
```diff
  messages = append(messages, llm.Message{Role: llm.RoleAssistant, Content: assistantParts})

- if err := executeTools(ctx, toolRequests, &messages, ew, executor); err != nil {
+ if err := executeTools(ctx, toolRequests, &messages, ew, logger, executor); err != nil {
```
🟡 Pre-existing bug (from #131, not introduced by this PR): In StreamModelWithTools, text and reasoning emitted alongside a tool call in the same model turn are streamed to the client but not preserved in the conversation history passed to the next model turn. The assistant message at handler.go:509-514 is built solely from toolRequests — any preamble like "Let me check the weather..." or reasoning the model just generated is invisible to it on the next call. Fix: accumulate text/reasoning parts in handleToolTurnPart and include them in assistantParts alongside the tool requests. Since this PR adds a Known limitations doc block covering inbound-history reconstruction, it's a natural moment to either fix this symmetric outbound gap or document it.
Extended reasoning...
What the bug is
In handleToolTurnPart (handler.go:580-617), the three content-part cases behave asymmetrically with respect to next-turn history:
```go
case llm.PartText:
	if err := sw.writeTextDelta(e.Part.Text); err != nil { ... } // streamed only
case llm.PartReasoning:
	if err := sw.writeReasoningDelta(e.Part.ReasoningTrace); err != nil { ... } // streamed only
case llm.PartToolRequest:
	...
	*toolRequests = append(*toolRequests, tr) // ← accumulated
	if err := sw.writeToolRequest(tr); err != nil { ... }
```

Only PartToolRequest accumulates into the slice. Then in StreamModelWithTools (handler.go:509-514), the assistant message is constructed exclusively from tool requests:
```go
assistantParts := make([]*llm.Part, 0, len(toolRequests))
for _, tr := range toolRequests {
	assistantParts = append(assistantParts, llm.NewToolRequestPart(tr))
}
messages = append(messages, llm.Message{Role: llm.RoleAssistant, Content: assistantParts})
```

So the assistant turn the model sees on iteration N+1 contains only its tool_use blocks — its own preceding text and reasoning are gone.
Specific code path that triggers it
A real model commonly emits something like [Text("Let me look that up..."), ToolRequest(getWeather, {...})] in a single turn. After streamToolTurn returns, executeTools appends a RoleUser message containing only tool responses. The next call to model.GenerateEvents sees [user("weather?"), assistant(tool_use only), user(tool_responses)] — no preamble text, no reasoning trace.
Why existing code doesn't prevent it
The new TestStreamModelWithTools_HappyPath added in this PR uses a model that emits Text("calling tool") before the tool call, but only inspects the SSE chunk sequence — it never asserts on the inter-turn messages slice that gets passed back to GenerateEvents. So the regression silently passes. The conformance suite likewise focuses on wire chunks, not on history accumulation across turns.
Step-by-step proof
Consider a 2-turn agentic interaction with a model that emits text alongside its tool call. Initial req.Messages = [user("weather?")].
- Turn 1 begins. `streamToolTurn` is called with `messages = [user("weather?")]`.
- Model yields `ContentPartEvent{Text("Let me check.")}` → `writeTextDelta` emits `text-start, text-delta, text-end` to the client. `toolRequests` remains empty.
- Model yields `ContentPartEvent{ToolRequest{ID:"call-1", Name:"getWeather", Args:{...}}}` → tool span chunks emitted, `toolRequests = [tr1]`.
- Model yields `StreamEndEvent{FinishReason: tool-calls}` → `writeToolTurnEnd` emits `finish-step`.
- Control returns to `StreamModelWithTools`. `assistantParts = [ToolRequestPart(tr1)]` — text is dropped. `messages = [user("weather?"), assistant(tool_use:tr1)]`. `executeTools` appends `user(tool_response:call-1="72F sunny")`.
- Turn 2 begins. Model is called with `messages = [user("weather?"), assistant(tool_use:tr1), user(tool_response)]`. The assistant's prior "Let me check." preamble is invisible on this call. The model may now repeat its preamble ("Let me check the weather..."), lose continuity with the reasoning it just produced, or behave inconsistently with its prior statements.
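The text-dropping step in the walkthrough can be sketched with simplified message types (`part`, `message`, and `buildAssistantTurn` are illustrative stand-ins, not the package's actual API):

```go
package main

import "fmt"

// Simplified stand-ins for llm.Part / llm.Message.
type part struct{ kind, value string }
type message struct {
	role    string
	content []part
}

// buildAssistantTurn sketches the current behavior described above: the
// assistant message is built only from tool-request parts, so any text or
// reasoning streamed in the same turn is dropped from next-turn history.
func buildAssistantTurn(streamed []part) message {
	var content []part
	for _, p := range streamed {
		if p.kind == "tool_use" { // text/reasoning parts are skipped
			content = append(content, p)
		}
	}
	return message{role: "assistant", content: content}
}

func main() {
	streamed := []part{
		{"text", "Let me check."},
		{"tool_use", "getWeather"},
	}
	m := buildAssistantTurn(streamed)
	fmt.Println(len(m.content)) // only the tool_use part survives; the preamble is gone
}
```

The proposed fix is the inverse filter: keep every streamed part in the assistant turn, not just the tool requests.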
Impact
- Anthropic prompt-cache continuity breaks. Anthropic retains assistant text alongside tool_use in their content array; reconstructing the assistant turn without the preceding text invalidates cache prefix hits and increases token cost on subsequent turns of long agentic chains.
- Strict content validators can reject. Some provider SDKs reject assistant messages that contain tool_use blocks with no associated text content when text was originally produced, depending on validation strictness.
- Multi-turn agentic accuracy degrades. The new `WithMaxTurns` option (added in this PR, default 10) increases the blast radius — longer agentic chains amplify the divergence between what the model said and what it sees it said.
- Reasoning trace continuity is also lost. For reasoning-capable models (e.g. Claude with thinking, OpenAI o-series), the inter-tool reasoning the model emits is silently dropped before the next turn, breaking the model's ability to build on its own chain of thought.
How to fix
Accumulate text and reasoning parts in handleToolTurnPart alongside the tool requests, and include them in the assistant Content. Sketch:
```go
// In streamToolTurn:
var assistantContent []*llm.Part
// pass &assistantContent into handleToolTurnPart

// In handleToolTurnPart:
case llm.PartText:
	if err := sw.writeTextDelta(e.Part.Text); err != nil { return true }
	*assistantContent = append(*assistantContent, e.Part)
case llm.PartReasoning:
	if err := sw.writeReasoningDelta(e.Part.ReasoningTrace); err != nil { return true }
	*assistantContent = append(*assistantContent, e.Part)
case llm.PartToolRequest:
	...
	*toolRequests = append(*toolRequests, tr)
	*assistantContent = append(*assistantContent, e.Part)
...

// In StreamModelWithTools, replace the assistantParts loop with:
messages = append(messages, llm.Message{Role: llm.RoleAssistant, Content: assistantContent})
```

Note: text deltas from the same span should ideally be concatenated into a single PartText per logical span (or the llm.Message format should permit multiple text parts, which it does).
Why this PR is the natural moment
This PR substantially modifies StreamModelWithTools (new maxTurns parameter, signature change, early-return guards on error, post-loop finish-step drop) and adds a package-level Known limitations block that explicitly enumerates the inbound tool-history gap ("only text parts are forwarded to the model"). The symmetric outbound gap — what we send back to the model after a tool turn — is not mentioned. Either fix it here or extend the doc block to cover it.
- Remove no-op textID/reasoningStarted reset in streamToolTurn; closeSpans already keeps these in sync between turns.
- Clarify the early-return comment in StreamModelWithTools to distinguish the error path from the ctx-cancel/write-failure path.
- Convert all raw t.Fatalf/t.Errorf assertions to testify assert/require, matching the project convention in CLAUDE.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude review
Claude Code Review

Latest commits (…)

Issues

1. 🔴 (regression — blocking) Early-return guard breaks
…mode
The early-return guard for iterator errors in StreamModelWithTools
over-triggered: it also caught StreamEndEvent{Error: ...} from
writeToolTurnEnd, which only writes error+finish-step (not
finish+[DONE]). Result: stream hangs with no finish chunk.
Fix: streamToolTurn's iterator-error branch no longer writes
finish+[DONE] itself. Both error paths (iterator error and
StreamEndEvent.Error) now return a finishReason, and the caller
writes finish+[DONE] uniformly. The guard only catches "" (ctx
cancel / write failure), not "error".
Also:
- Add StreamResetEvent handling to streamToolTurn, mirroring
StreamModel. Closes open spans and clears collected tool requests
to prevent double-firing tools from a discarded attempt.
- Close open spans before PartToolResponse in handleContentPart,
symmetric with PartToolRequest.
Every fix has a corresponding test:
- TestStreamModelWithTools_StreamEndEventWithError
- TestStreamModelWithTools_StreamResetEvent (asserts executor not
called for discarded tool requests)
- TestStreamModel_ToolResponseClosesSpans
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Additional findings (outside current diff — PR may have been updated during review):
- 🟣 adapter/vercelaisdk/uimessagestream/handler.go:542 — Pre-existing from #131, but flagging since this PR substantially modifies `StreamModelWithTools`: `streamToolTurn` constructs its per-turn `iterReq` (handler.go:542-546) by copying only `Messages`, `Tools`, and `ToolChoice` from the caller's `req` — silently dropping `Options`, `ResponseFormat`, and `Metadata` on every iteration. Direct callers of the exported `StreamModelWithTools` API who set `Options` (temperature/max_tokens/top_p/reasoning level) silently get provider defaults on every turn, including the first; `Metadata` loss breaks OTel/tracing correlation mid-chain. One-line fix: `iterReq := *req; iterReq.Messages = messages`.

Extended reasoning...
What goes wrong
`streamToolTurn` at adapter/vercelaisdk/uimessagestream/handler.go:542-546 constructs the per-turn request as:

```go
iterReq := &llm.Request{
	Messages:   messages,
	Tools:      req.Tools,
	ToolChoice: req.ToolChoice,
}
```
But `llm.Request` (llm/request.go:22-47) has six fields: `Messages`, `Tools`, `ToolChoice`, `ResponseFormat`, `Options`, `Metadata`. The latter three are silently dropped on every iteration of the agentic tool-calling loop.

Impact
- `Options` is the most consequential drop. Documented at llm/request.go:39-42 as "provider-specific configuration: temperature, max_tokens, top_p, etc." Production callers nearly always set this. Multi-turn tool-calling silently reverts to provider defaults — any temperature/max_tokens/reasoning-level tuning is lost.
- `Metadata` loss breaks OTel correlation. Per llm/request.go:44-46, Metadata "flows through for tracing, logging, and debugging." Dropping it breaks observability tags mid-conversation, making it impossible to correlate spans across an agentic chain.
- `ResponseFormat` loss means a JSON schema set on the first turn won't constrain the final-answer turn's output.
Contrast with the single-turn path
`StreamModel` (handler.go:415-417) passes `req` directly to `GenerateEvents` and preserves everything.
Consider a direct caller invoking `StreamModelWithTools` with:

```go
req := &llm.Request{
	Messages: msgs,
	Tools:    tools,
	Options:  openai.Options{Temperature: 0.2, MaxTokens: 4096},
	Metadata: map[string]string{"trace_id": "abc-123"},
}
StreamModelWithTools(ctx, model, req, ew, logger, executor, 0)
```
- Turn 1: `streamToolTurn` builds `iterReq = {Messages: msgs, Tools, ToolChoice}` — `Options` and `Metadata` are nil. The provider sees default temperature, default max_tokens, no trace_id. (Even on turn 1 the caller's Options/Metadata are dropped.)
- Model returns a tool call. `executeTools` appends the response.
- Turn 2: same thing — `iterReq` again has nil Options/Metadata.
- Every subsequent turn: same drop.
The caller's tuned config is honored zero times, not even on the first turn.
Scope
The HTTP `Handler` path is invisible to this bug because `chatRequest` (handler.go:111-116) doesn't decode these fields and the constructed `req` at handler.go:181-184 only sets `Messages` and `Tools`. The bug only manifests for direct callers of the exported `StreamModelWithTools` from custom handlers — but that's exactly the audience this PR adds `TestStreamModelWithTools_*` coverage for, establishing it as a supported public API surface.

Why this PR is the natural moment
Pre-existing from #131, but the PR substantially modifies `StreamModelWithTools` (adds a `maxTurns` parameter, breaks the signature, refactors error paths, adds early-return guards) and introduces a package-level Known limitations block that already documents the inbound-history gap. The symmetric outbound-config gap deserves either a fix or a doc note alongside it. The signature is already breaking, so propagating the missing fields is essentially free.

How to fix
A one-line shallow clone preserves everything and is robust to future `llm.Request` field additions:

```go
iterReq := *req
iterReq.Messages = messages
```
(Or copy the three missing fields explicitly. The shallow-clone form is preferred — any future Request field is propagated automatically.)
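The shallow-clone pattern generalizes beyond this struct; a minimal sketch with an abbreviated stand-in for llm.Request (field set reduced for illustration):

```go
package main

import "fmt"

// Abbreviated stand-in for llm.Request.
type request struct {
	Messages []string
	Tools    []string
	Options  map[string]any
	Metadata map[string]string
}

// perTurnRequest sketches the fix: copy the whole struct by value, then
// override only Messages, so Options/Metadata (and any field added to the
// struct later) survive into turns 2..N automatically.
func perTurnRequest(req *request, messages []string) *request {
	iterReq := *req
	iterReq.Messages = messages
	return &iterReq
}

func main() {
	req := &request{
		Messages: []string{"weather?"},
		Options:  map[string]any{"temperature": 0.2},
		Metadata: map[string]string{"trace_id": "abc-123"},
	}
	turn2 := perTurnRequest(req, []string{"weather?", "tool_response"})
	fmt.Println(turn2.Options["temperature"], turn2.Metadata["trace_id"])
	// Options and Metadata carry over; only Messages changed.
}
```

Note the clone is shallow: the map and slice fields still alias the caller's values, which is fine here because each turn only replaces Messages.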
- 🔴 adapter/vercelaisdk/uimessagestream/handler.go:555-558 — `streamToolTurn`'s event switch (handler.go:568-576) only handles ContentPartEvent and StreamEndEvent, silently dropping ErrorEvent and StreamResetEvent — both of which StreamModel handles correctly at handler.go:443-457. This is asymmetric, and the StreamResetEvent drop is the severe case: when a retry interceptor is wrapped around the model (a configuration the new Known Limitations doc block explicitly endorses), the reset is silently ignored, leaving text/reasoning spans open across the boundary AND leaving the local toolRequests slice populated with stale requests from the failed attempt. executeTools then runs both the stale and fresh tool calls, and both land in the next-turn assistant message. Fix is to mirror StreamModel's two missing arms in streamToolTurn's switch.

Extended reasoning...
What the bug is
`streamToolTurn` and `StreamModel` consume the same `iter.Seq2[llm.Event, error]` stream, but their type-switches are not symmetric:

- `StreamModel` (handler.go:437-461) handles `ContentPartEvent`, `ErrorEvent` (logs + emits a sanitized error chunk), `StreamResetEvent` (calls `endTextAndAdvance` + `endReasoning` to close spans across the retry boundary), and `StreamEndEvent`.
- `streamToolTurn` (handler.go:568-576) handles only `ContentPartEvent` and `StreamEndEvent`. The other two cases fall through silently.
The PR is the natural moment to flag this because (a) the diff touches the `StreamResetEvent` arm in `StreamModel` directly (line 451: `endText()` → `endTextAndAdvance()`), (b) the diff adds `sw.closeSpans()` immediately above the deficient `streamToolTurn` switch, and (c) the new package doc block explicitly legitimizes retry-interceptor-at-model-level as the supported way to use retries with this handler — which is exactly the configuration that exposes the `StreamResetEvent` drop.

Why the StreamResetEvent drop is the severe case
plugins/retry/retry.go:170-175 confirms `StreamResetEvent` is emitted by the retry interceptor on every retry attempt:

```go
if !yield(llm.StreamResetEvent{Attempt: attempt, Reason: lastErr.Error()}, nil) {
	return
}
```

llm/event.go documents that consumers must "discard any accumulated content from the previous attempt and prepare to receive events from a fresh generation." `streamToolTurn` keeps two pieces of accumulated state across iterations of its `for event, err := range model.GenerateEvents(...)` loop:

- The `streamWriter` span state (`textStarted`, `reasoningStarted`, `textID`, `reasoningID`).
- The local `toolRequests []*llm.ToolRequest` slice.
Dropping `StreamResetEvent` leaves both of those carrying failed-attempt state into the retry attempt.

Step-by-step proof (`StreamModelWithTools` with retry-wrapped model + executor)

Attempt 1 starts, the model emits some content plus a tool request, then the underlying provider blips and the retry interceptor kicks in.
1. `ContentPartEvent{Text("partial ")}` → `handleToolTurnPart` → `writeTextDelta` → emits `text-start` (id=text-0), `text-delta`. State: `textStarted=true`, `textID="text-0"`.
2. `ContentPartEvent{ToolRequest{ID:"call-stale", Name:"getWeather", ...}}` → `handleToolTurnPart` PartToolRequest arm → closes text (`endTextAndAdvance` → `text-end`, `textCounter++`, `textID="text-1"`, `textStarted=false`), appends to `toolRequests` (now `[call-stale]`), emits `tool-input-start`/`tool-input-available` for `call-stale`.
3. `StreamResetEvent{Attempt:1, Reason:"..."}` → falls through the switch silently. State unchanged: `toolRequests=[call-stale]`. (No spans currently open, but they would be on a different interleaving — e.g. if step 2 hadn't happened yet, `textStarted` would still be true and step 4's deltas would land on the failed attempt's span.)
4. Retry yields `ContentPartEvent{Text("fresh answer")}` → emits `text-start` (id=text-1), `text-delta`.
5. Retry yields `ContentPartEvent{ToolRequest{ID:"call-fresh", ...}}` → appends → `toolRequests=[call-stale, call-fresh]`. Emits tool spans for `call-fresh`.
6. `StreamEndEvent{FinishReason: tool-calls}` → `writeToolTurnEnd` closes spans, emits `finish-step`, returns `"tool-calls"`.
Damage on return to `StreamModelWithTools`:

- handler.go:509-514 builds an assistant message from both tool requests: `assistantParts = [ToolRequestPart(call-stale), ToolRequestPart(call-fresh)]`. The next-turn `messages` slice now carries a phantom tool_use for `call-stale` that the model never actually emitted on the successful attempt.
- handler.go:517 calls `executeTools`, which runs both `call-stale` and `call-fresh`. The stale tool may perform side effects (write to a DB, send an email) or simply waste a tool budget; the corresponding `tool-output-available` chunks go on the wire for both, confusing the UI; tool_response parts for both go into the next-turn user message.
On the next model turn, the conversation history contains a fabricated assistant message + tool responses that have no relationship to the actual exchange the model produced — model behavior on that turn is unpredictable, and Anthropic in particular will reject the request if the tool_use ids don't match what it just emitted.
Step-by-step proof (ErrorEvent drop)
A provider yields `ErrorEvent{Message:"rate-limit warning"}` mid-tool-turn:

- `StreamModel` (handler.go:443-448): logs and emits `{"type":"error","errorText":"An error occurred"}` so the UI can render a soft warning. The stream continues.
- `streamToolTurn`: the event silently disappears. The protocol surface is asymmetric depending only on whether a `ToolExecutor` is configured.
Why existing tests don't catch it
`TestStreamModel_StreamResetEvent` and `TestStreamModel_StreamResetEventWithReasoning` exercise only the non-tool path (`StreamModel`). The new tool-mode tests (`TestStreamModelWithTools_HappyPath`, `TestStreamModelWithTools_MaxTurnsExhaustion`, `TestStreamModelWithTools_IteratorErrorNoDuplicateTerminal`) never feed a `StreamResetEvent` or `ErrorEvent` into the tool-calling iterator. There is no `TestStreamModelWithTools_*Reset*` or `*ErrorEvent*` test.

Fix
Mirror `StreamModel`'s switch arms in `streamToolTurn`. The `toolRequests = nil` step is the subtle part — without it, stale tool calls from a failed attempt persist into the next-turn assistant message:

```go
case llm.ErrorEvent:
	logger.Warn("recoverable LLM error", "message", e.Message)
	if err := ew.WriteChunk(Chunk{"type": "error", "errorText": "An error occurred"}); err != nil {
		return "", nil
	}
case llm.StreamResetEvent:
	if err := sw.endTextAndAdvance(); err != nil {
		return "", nil
	}
	if err := sw.endReasoning(); err != nil {
		return "", nil
	}
	toolRequests = nil
```
…quest fields

Three issues from review round 5:

StreamResetEvent in streamToolTurn now emits tool-output-error for each accumulated tool request before clearing the slice. Without this, tool inputs already written to the wire had no matching output, leaving useChat clients with perpetually-pending tool-call cards.

Add ErrorEvent case to streamToolTurn, mirroring StreamModel. Without this, recoverable provider warnings (emitted by OpenAI, Google) were silently dropped in tool-calling mode.

Fix iterReq construction in streamToolTurn to copy the full request (Options, ResponseFormat, Metadata) instead of cherry-picking three fields. Previously, provider-specific options like MaxTokens and Temperature were silently dropped on turns 2..N.

Every fix has a test:
- TestStreamModelWithTools_StreamResetEvent (updated: expects tool-output-error for discarded request)
- TestStreamModelWithTools_ErrorEventNonTerminal
- TestStreamModelWithTools_RequestFieldsForwarded

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When an iterator error occurs after tool-input-start/tool-input-available have been written to the wire, emit tool-output-error for each pending tool request before the error/finish-step sequence. Without this, the useChat client sees a perpetually-pending tool-call card.

Mirrors the StreamResetEvent fix from the previous commit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
```diff
  case llm.StreamResetEvent:
-     if err := sw.endText(); err != nil {
+     if err := sw.endTextAndAdvance(); err != nil {
```
🟣 Pre-existing (from #131): StreamModel (the non-executor path) emits tool-input-start/tool-input-available via handleContentPart's PartToolRequest case (handler.go:343-352) but tracks no pending tool requests — so on iterator-error (handler.go:429-442), StreamResetEvent (handler.go:458-465), or StreamEndEvent.Error via writeStreamEnd (handler.go:388-407), any previously-emitted tool-input-available is left unpaired on the wire, violating the protocol invariant just enforced in streamToolTurn. Reachable through the public API as Handler(model, WithTools(tools, nil)) (WithTools accepts a nil executor; ServeHTTP sets req.Tools regardless; StreamModelWithTools delegates to StreamModel when executor==nil at handler.go:481-484) or via direct calls to the exported StreamModel with req.Tools populated. The fix template is identical to what this PR applied in streamToolTurn: track pending tool-call IDs and emit tool-output-error for each before the terminal chunks; alternatively, reject WithTools(tools, nil) at config time.
Extended reasoning...
What the bug is
StreamModel (the non-executor path) emits tool-input-start and tool-input-available chunks for any PartToolRequest content event via handleContentPart (handler.go:343-352) → writeToolRequest (handler.go:292-313). But unlike streamToolTurn — which maintains a local toolRequests []*llm.ToolRequest slice — StreamModel does not track these pending tool calls anywhere.
When any of three terminal events fires after such chunks have been written:
- Iterator-error at handler.go:429-442: `sw.closeSpans()` only closes text/reasoning spans; writes `error`/`finish-step`/`finish`. No `tool-output-error` for orphaned tool calls.
- StreamResetEvent at handler.go:458-465: `endTextAndAdvance` + `endReasoning` only. No `tool-output-error`.
- StreamEndEvent.Error via `writeStreamEnd` at handler.go:388-407: `error` + `closeSpans` + `finish-step` + `finish`. No `tool-output-error`.
All three leave the previously-emitted tool-input-available unpaired on the wire — the exact pairing invariant the PR just enforced in streamToolTurn at handler.go:555-568 (iterator-error) and handler.go:588-597 (StreamResetEvent).
Why this PR is the natural moment
This PR's diff explicitly:
- Added `sw.closeSpans()` to `StreamModel`'s iterator-error branch at handler.go:436.
- Changed `endText()` to `endTextAndAdvance()` in `StreamModel`'s StreamResetEvent case at handler.go:459.
- Applied the symmetric `tool-output-error` pairing fix in `streamToolTurn` for both iter-error and StreamResetEvent.
So the same hunks were touched, the same template applied to the executor sibling — but StreamModel's lack of a pending-IDs slice meant the symmetric fix needs a small additional structural change (track IDs) rather than just looping over an existing slice.
Step-by-step proof (StreamResetEvent path)
Provider yields [ContentPartEvent{ToolRequest{ID: "call-1"}}, StreamResetEvent, ContentPartEvent{Text("hi")}, StreamEndEvent{Stop}] to StreamModel:
1. `start`, `start-step` written.
2. `PartToolRequest` → `handleContentPart` → `endReasoning` (no-op), `endTextAndAdvance` (no-op), `writeToolRequest` → emits `tool-input-start`, `tool-input-available` for call-1.
3. `StreamResetEvent` → handler.go:458-465 → `endTextAndAdvance` (no-op), `endReasoning` (no-op). No `tool-output-error` for call-1.
4. `PartText("hi")` → `text-start`, `text-delta`.
5. `StreamEndEvent{Stop}` → `writeStreamEnd` → `text-end`, `finish-step`, `finish`.
6. `[DONE]`.
Wire output:
```
start, start-step,
tool-input-start, tool-input-available,   <-- call-1 (orphaned)
text-start, text-delta, text-end,
finish-step, finish, [DONE]
```
call-1 enters pending on tool-input-available and never receives a paired tool-output-*. Same orphan pattern for iterator-error and StreamEndEvent.Error.
Reachability
Two routes through the public API:
1. `Handler(model, WithTools(tools, nil))`: `WithTools` (handler.go:87-92) accepts a `ToolExecutor` function with no nil check. `ServeHTTP` always sets `req.Tools = h.cfg.tools` (handler.go:170). `StreamModelWithTools` sees `executor == nil` and delegates to `StreamModel` at handler.go:481-484, passing `req` with `Tools` populated. The model receives tools and may emit `PartToolRequest`. This configuration corresponds to client-side tool execution.
2. Direct calls to the exported `StreamModel` with `req.Tools` set by external callers (`StreamModel` is exported at handler.go:411).
Addressing the refutation
The refutation argues the no-executor path operates under a different contract: tool-input-available is the terminal tool chunk and the client handles resolution in a subsequent request. Two points in response:
1. The package doc explicitly states "Inbound tool call history from multi-turn conversations is not yet reconstructed; only text parts are forwarded to the model" (handler.go:18). So full client-side execution isn't really supported in either direction. The `tool-input-available` chunks are server-side state the client must somehow drop on its own when the stream errors.
2. Even granting the client-side interpretation, on the error terminal paths (iterator error, StreamEndEvent.Error, StreamResetEvent for a discarded attempt) the protocol invariant remains: those tool calls did not complete and won't complete, so `tool-output-error` is the correct signal. The PR's own `streamToolTurn` fix codifies this for the same three error paths.
Granted, reachability via `WithTools(tools, nil)` is genuinely undocumented; but `StreamModel` is exported, the nil-executor delegation is intentional code (not just a defensive check), and the symmetric fix template is right there in the PR's diff.
Severity rationale
pre_existing. The orphan was there before this PR, the PR didn't introduce the bug, and the typical Handler(model, WithTools(tools, executor)) configuration routes through streamToolTurn (now fixed). But worth flagging here because the symmetric fix pattern is what the PR just applied next door.
How to fix
- Option 1 (mirror `streamToolTurn`): track pending tool IDs in `StreamModel` and emit `tool-output-error` for each on the three terminal paths before the terminal chunks. Requires propagating a small slice down through `handleContentPart` or a thin wrapper.
- Option 2 (smaller diff): reject `WithTools(tools, nil)` at config time, or document that tools-without-executor is unsupported. This closes route #1 but not direct `StreamModel` calls.
@claude review |
Claude Code Review

Latest commits address most of the round 5 findings (orphan tool-input pairing on iterator-error and StreamResetEvent, the missing

Completed
Issues

1. 🔴 (carried forward from round 5, partially fixed)
Third and final symmetric error path: when StreamEndEvent carries an error after tool-input chunks have been written, emit tool-output-error for each pending request. Completes the set alongside iterator-error (b93c88f) and StreamResetEvent (06f0a7a). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude review |
Claude Code Review

Latest commit

Completed
Round 6 blocker — addressed
```go
case llm.StreamEndEvent:
	finishReason = writeToolTurnEnd(e, sw, ew, logger)
	if e.Error != nil {
		for _, tr := range toolRequests {
			_ = ew.WriteChunk(Chunk{
				"type": "tool-output-error", "toolCallId": tr.ID,
				"errorText": "stream error; tool call discarded",
			})
		}
		toolRequests = nil
	}
```

All three symmetric error paths in

Issues

1. 🟡 (consistency nit)
```go
if len(toolRequests) == 0 || finishReason != "tool-calls" {
	_ = ew.WriteChunk(Chunk{"type": "finish", "finishReason": finishReason})
	_ = ew.WriteDone()
```
🔴 Tool-input chunks emitted before a StreamEndEvent with FinishReason length/content-filter/other are orphaned: streamToolTurn returns ("length", [pending]) and StreamModelWithTools at handler.go:510-515 writes finish+[DONE] without iterating toolRequests, leaving every Vercel useChat tool-call card with no matching tool-output-* chunk. The PR added this cleanup for the iterator-error (lines 564-569) and StreamResetEvent (lines 599-604) paths but missed the third symmetric path. Fix: emit tool-output-error for each pending tool request before the finish chunk, mirroring the two existing loops.
Extended reasoning...
What the bug is
The early-return at handler.go:510-515 fires whenever streamToolTurn returns with finishReason != "tool-calls", but does not iterate the accumulated toolRequests slice to pair each previously-emitted tool-input-available with a tool-output-* chunk. Two of the three symmetric paths in streamToolTurn were just fixed for exactly this orphan-pairing problem (lines 564-569 for iterator-error and lines 599-604 for StreamResetEvent), but the StreamEndEvent path was missed for the e.Error == nil case.
Specific code path that triggers it
Inside streamToolTurn, StreamEndEvent delegates to writeToolTurnEnd (handler.go:653-672). When e.Error == nil and e.Response.FinishReason is anything other than FinishReasonToolCalls, mapFinishReason translates it to "stop", "length", "content-filter", or "other". None of those equal "tool-calls", so the caller's early-return triggers with toolRequests still populated.
Why existing code doesn't prevent it
The iterator-error and StreamResetEvent branches both have explicit cleanup loops added in this PR. The StreamEndEvent path returns the mapped reason and the caller fall-through assumed toolRequests is empty when finishReason isn't "tool-calls" — which is true when the model finishes cleanly without tool calls, but not when length/content-filter cuts off mid-tool-emission.
Reachability is well-documented across every in-repo provider
This is the documented, intentional mapping for the in-repo providers:
- providers/anthropic/response_mapper_test.go:92 — "max_tokens with tool calls stays Length"
- providers/anthropic/response_mapper_test.go:104 — "refusal with tool calls stays ContentFilter"
- providers/anthropic/stream_partial_test.go:139 — "max_tokens must propagate as FinishReasonLength even when tool calls are present"
- providers/bedrock/finish_reason_test.go:65,71 — Length/ContentFilter preserved with tool_use
- providers/openai/response_mapper_test.go:100,107 — Length/ContentFilter with tool calls
- providers/google/finish_reason_test.go:42-45 — same
- providers/openaicompat/finish_reason_test.go:43-45 — same
So the moment any model truncates mid-tool-emission (long arguments, large schemas, content filter cutoff), this fires.
Step-by-step proof
Provider yields [ContentPartEvent{PartToolRequest{ID:"call-1"}}, StreamEndEvent{Response: {FinishReason: FinishReasonLength}}] to StreamModelWithTools:
1. `StreamModelWithTools` writes `start`, enters the loop, calls `streamToolTurn`.
2. `streamToolTurn` writes `start-step`.
3. `ContentPartEvent{PartToolRequest{call-1}}` → `handleToolTurnPart` PartToolRequest case → appends call-1 to `toolRequests`, `writeToolRequest` emits `tool-input-start` + `tool-input-available`. State: `toolRequests=[call-1]`.
4. `StreamEndEvent{Response:{FinishReason:Length}}` (e.Error nil) → `writeToolTurnEnd` emits `finish-step`, returns `"length"` via `mapFinishReason`. No `tool-output-*` for call-1.
5. `streamToolTurn` returns `("length", [call-1])`.
6. Caller: `len(toolRequests) == 0` is false, but `finishReason != "tool-calls"` is true → enters the if block at handler.go:510. Writes `finish` with reason `"length"`, `[DONE]`, returns.
Wire output:
```
start, start-step,
tool-input-start, tool-input-available,   ← call-1 orphaned
finish-step,
finish (length), [DONE]
```
The tool-input-available for call-1 has no matching tool-output-available or tool-output-error. Vercel useChat tracks tool-call state by toolCallId — call-1 enters pending and never leaves.
Impact
UI clients render the tool-call card as perpetually pending: permanent loading spinner, unresolved error overlay, or a stale entry in messages[] that contradicts the assistant's truncated response. The same pairing invariant that the PR just enforced for the other two error paths.
Fix
Mirror the cleanup loops already present at lines 564-569 and 599-604. In StreamModelWithTools at handler.go:510, before writing the finish chunk:
```go
if len(toolRequests) == 0 || finishReason != "tool-calls" {
	for _, tr := range toolRequests {
		_ = ew.WriteChunk(Chunk{
			"type": "tool-output-error", "toolCallId": tr.ID,
			"errorText": "stream ended without tool resolution; tool call discarded",
		})
	}
	_ = ew.WriteChunk(Chunk{"type": "finish", "finishReason": finishReason})
	_ = ew.WriteDone()
	return
}
```

A companion test `TestStreamModelWithTools_FinishLengthPairsToolChunks` that feeds `PartToolRequest` → `StreamEndEvent{FinishReason: Length}` and asserts a `tool-output-error` precedes the `finish` would lock this in.
What
Address all actionable findings from the code review on PR #131: four span-tracking bugs, two missing options, silent unmarshal failures, and undocumented limitations.
Why
The bugs would surface in production with any non-trivial model behavior: network errors mid-stream leave unclosed spans, reasoning/text IDs collide when the model interleaves them, and maxTurns exhaustion emits an inconsistent chunk sequence.
Implementation details
Span lifecycle bugs (review items #1, #3, #4)
The core problem was asymmetric transitions:
`writeTextDelta` closed reasoning spans, but `writeReasoningDelta` did not close text spans. Similarly, iterator-error paths skipped `closeSpans()` entirely. Fixed by:

- Calling `sw.closeSpans()` before error/finish-step/finish in both `StreamModel` and `streamToolTurn` iterator-error paths.
- Making `endReasoning()` increment `reasoningCounter` and update `reasoningID`, matching how `endTextAndAdvance()` already worked for text.
- Making `writeReasoningDelta()` call `endTextAndAdvance()` first, mirroring `writeTextDelta()`'s call to `endReasoning()`.
- Removing `endText()` (the non-advancing variant). All callers now use `endTextAndAdvance()`.
- Making the `handleContentPart` tool-request case and the `StreamResetEvent` handler both advance text IDs.

New options (review items #5 partial, #11)
- `WithMaxTurns(n)` -- configurable tool-calling turn limit (default 10). `StreamModelWithTools` accepts a `maxTurns int` parameter; defaults to 10 if <= 0.
- `WithMaxBodyBytes(n)` -- configurable request body limit (default 1MB).

Other fixes
- `generateMessageID` fallback shortened to 16 chars to match documented contract.
- `maxTurns` exhaustion now emits `finish-step` before `finish`.
- `json.Unmarshal` failures for tool input/output/result now log at Warn.
- `Trigger` field from `chatRequest`.

References
Review comment: #131 (comment)