fix(langchain): keep tool call / AIMessage pairings when summarizing #34609
Conversation
Port of langchain-ai/langchain#34609. When `SummarizationMiddleware` triggers and the cutoff lands on a `ToolMessage`, the middleware now searches backward for the `AIMessage` with matching `tool_calls` and includes it in the preserved messages. This prevents "orphaned" tool responses that cause API errors like: "No tool call found for function call output with call_id..."
ccurme left a comment
What is the difference between this and the logic we used to have here: https://github.com/langchain-ai/langchain/pull/34195/files
```diff
- while cutoff_index < len(messages) and isinstance(messages[cutoff_index], ToolMessage):
-     cutoff_index += 1
- return cutoff_index
+ if cutoff_index >= len(messages) or not isinstance(messages[cutoff_index], ToolMessage):
```
I think we also don't want to land on an AIMessage with tool calls, right?
The previous logic advanced forward past `ToolMessage` objects (aggressive summarization). This change takes the opposite approach, searching backward to include the `AIMessage` that requested the tools (in other words, it preserves more context for the sake of atomicity).
If the cutoff lands on an `AIMessage` with `tool_calls`, the corresponding `ToolMessage` responses would be in the summarized portion, creating the reverse orphaning problem: tool call requests without their responses. Landing on an `AIMessage` with `tool_calls` is safe because the `ToolMessage` objects come after it and will be preserved together.
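The invariant being defended can be written as a small validity check. This is an illustrative sketch (the function name and dict shapes are not from the codebase): every tool response must be preceded by an AI message that requested its `tool_call_id`.

```python
def history_is_valid(messages: list[dict]) -> bool:
    """Check that every tool response has a preceding AI request
    with a matching tool_call_id (illustrative pairing check)."""
    pending: set[str] = set()
    for m in messages:
        if m["role"] == "ai":
            pending.update(m.get("tool_calls", []))
        elif m["role"] == "tool":
            if m["tool_call_id"] not in pending:
                return False  # orphaned tool response
    return True


history = [
    {"role": "ai", "tool_calls": ["call_1"]},
    {"role": "tool", "tool_call_id": "call_1"},
]
print(history_is_valid(history))      # -> True
print(history_is_valid(history[1:]))  # -> False: response without its request
```

Dropping the AI message (as the old cutoff could) makes the check fail, which is exactly the "orphaned tool response" error mode the PR fixes.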
ccurme left a comment
Are we changing the convention (summarizing less vs. more), or fixing an error mode?
If we are fixing, could you add a test that fails on master and passes here (or identify that test if it exists now)? Thanks.
```python
assert middleware._find_safe_cutoff_point(messages, len(messages) + 5) == len(messages) + 5


def test_summarization_middleware_find_safe_cutoff_point_orphan_tool() -> None:
```
This test passes on master
We're changing the convention (summarizing less vs. more)
I've added a test that fails on master
Fixes #34282
Before: when using agents with tools (like file reading, web search, etc.), the conversation interleaves tool-call requests from the AI with tool responses. When the conversation gets too long, `SummarizationMiddleware` kicks in to compress older messages. The problem was: if you asked to keep the last 6 messages, the AI's original request to read the files (the `[AI]` message with `tool_calls`) was summarized away, but the tool responses remained. This caused the "No tool call found for function call output with call_id..." error: many APIs require that every tool response has a matching tool request. Without the AI message, the tool responses are "orphaned."
The fix
Now when the cutoff lands on tool messages, we move backward to include the AI message that requested those tools. Same scenario, keeping the last 6 messages: the AI message is preserved along with its tool responses, keeping them paired together.
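As a concrete illustration (reconstructed, since the original example blocks did not survive the page export; the message contents and file names are hypothetical):

```python
# Hypothetical 8-message history; summarization asks to keep the last 6.
history = [
    "[User] 'Summarize these files'",
    "[AI]   tool_calls: read(a.txt), read(b.txt)",  # index 1: the tool request
    "[Tool] contents of a.txt",                     # index 2: naive cutoff lands here
    "[Tool] contents of b.txt",
    "[AI]   'Here is the summary...'",
    "[User] 'Now compare them'",
    "[AI]   tool_calls: read(c.txt)",
    "[Tool] contents of c.txt",
]

# Before the fix: keeping history[2:] orphans the tool results at indices 2-3.
# After the fix: the cutoff walks back from index 2 to index 1, keeping
# history[1:], so the AI request and its tool responses stay paired.
```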
Practical examples
Example 1: Parallel tool calls
Scenario: Agent reads 10 files in parallel, summarization triggers (see above)
Example 2: Mixed conversation
Scenario: User asks question, AI uses tools, user says thanks
Keeping last 2 messages: `[User] "Thanks!"` is kept, and the preserved window expands so that `[AI] + [Tool] + [AI] + [User]` are all kept.
Example 3: Multiple tool sequences
Keeping last 3 messages: if the cutoff lands on `[Tool] "Results for Y"`, we now include `[AI] [tool_call: search]` to keep the pair together.
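Example 3 can be modeled with the same backward-walk idea. Self-contained sketch with illustrative names and payloads (not the middleware's real API): with multiple tool sequences in the history, the walk stops at the nearest AI message whose `tool_calls` match the stranded response.

```python
# (role, payload): AI messages list the call ids they request;
# tool messages carry the single id they answer.
history = [
    ("ai",   ["search_x"]),   # [AI] [tool_call: search] for X
    ("tool", "search_x"),     # [Tool] "Results for X"
    ("ai",   ["search_y"]),   # [AI] [tool_call: search] for Y
    ("tool", "search_y"),     # [Tool] "Results for Y"  <- cutoff lands here
    ("ai",   None),           # [AI] final answer
]


def adjust(history: list[tuple], cutoff: int) -> int:
    """Walk back from a tool response to the AI message that requested it."""
    role, payload = history[cutoff]
    if role != "tool":
        return cutoff
    for i in range(cutoff - 1, -1, -1):
        r, p = history[i]
        if r == "ai" and p and payload in p:
            return i
    return cutoff


print(adjust(history, 3))  # -> 2: backs up to Y's request, not all the way to X's
```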