@mdrxy mdrxy commented Jan 5, 2026

Fixes #34517

Supersedes #34557, #34570

Fixes token inflation in SummarizationMiddleware that caused context window overflow during summarization.

Root cause: Formatting messages into the summary prompt implicitly called str(messages), whose output includes every Pydantic field on each message (usage_metadata, response_metadata, additional_kwargs, etc.). As a result, the stringified representation used ~2.5x more tokens than count_tokens_approximately estimated.
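
The mechanism can be reproduced without LangChain. The sketch below is illustrative only: `FakeMessage` and `TEMPLATE` are hypothetical stand-ins, not the middleware's actual classes or prompt. It shows that interpolating a list into a string calls `str()` on the list, which in turn uses the `repr` of every element, metadata fields included.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMessage:
    """Hypothetical stand-in for a chat message; not a LangChain class."""

    content: str
    additional_kwargs: dict = field(default_factory=dict)
    response_metadata: dict = field(default_factory=dict)


# Hypothetical summary prompt template, for illustration only.
TEMPLATE = "Summarize the conversation:\n{messages}"

messages = [FakeMessage("What's the weather?")]

# str.format() on a list falls back to str(list), and str(list) uses
# repr() on each element, so every field ends up in the prompt text.
prompt = TEMPLATE.format(messages=messages)
print(prompt)
```

The empty `additional_kwargs={}` and `response_metadata={}` fields appear in the prompt even though they carry no information for the summarizer.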

Problem:

  • Summarization triggers at 85% of the context window, measured with count_tokens_approximately
  • But str(messages) in the prompt uses ~2.5x more tokens than that estimate
  • The summarization call therefore fails with ContextLengthExceeded
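
To make the arithmetic concrete, here is a rough sketch. The 128,000-token window is an assumed example value, and the inflation factor is read as "2.5 times the estimate"; only the 85% trigger and the ~2.5x figure come from the description above.

```python
# Assumed example value; not taken from the PR.
context_window = 128_000      # model limit, in tokens

# Figures from the description above.
trigger_fraction = 0.85       # summarization triggers at 85% of the window
inflation = 2.5               # str(messages) tokens relative to the estimate

# Estimated token count at the moment summarization triggers.
estimated_at_trigger = round(context_window * trigger_fraction)

# Approximate size of the same messages once stringified into the prompt.
actual_prompt_tokens = round(estimated_at_trigger * inflation)

print(estimated_at_trigger)
print(actual_prompt_tokens)
print(actual_prompt_tokens > context_window)
```

Under these assumptions, the verbose form would already exceed the window once the estimate passes roughly 40% of it (1 / 2.5), well before the 85% trigger is reached.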

Fix: Use get_buffer_string() to format messages, which produces compact output:

Human: What's the weather?
AI: Let me check...[tool_calls]
Tool: 72°F and sunny

Instead of verbose Pydantic repr:

[HumanMessage(content="What's the weather?", additional_kwargs={}, response_metadata={}), ...]
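
A minimal sketch of the size difference between the two formats. `FakeMessage` and `to_buffer_string` are hypothetical stand-ins that imitate the "Role: content" layout shown above; they are not LangChain's implementation, and the real get_buffer_string() may differ in details such as how tool calls are rendered. The metadata values are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMessage:
    """Hypothetical stand-in for a chat message; not a LangChain class."""

    role: str
    content: str
    additional_kwargs: dict = field(default_factory=dict)
    response_metadata: dict = field(default_factory=dict)
    usage_metadata: dict = field(default_factory=dict)


def to_buffer_string(messages: list[FakeMessage]) -> str:
    """Hypothetical compact formatter: one 'Role: content' line per message."""
    return "\n".join(f"{m.role}: {m.content}" for m in messages)


messages = [
    FakeMessage("Human", "What's the weather?"),
    # Made-up usage numbers, included only to show metadata in the repr.
    FakeMessage("AI", "Let me check...", usage_metadata={"input_tokens": 12}),
    FakeMessage("Tool", "72°F and sunny"),
]

verbose = str(messages)               # repr of every field on every message
compact = to_buffer_string(messages)  # role and content only

print(compact)
print(len(verbose), len(compact))
```

Character counts are only a proxy for tokens, but the gap illustrates why a token estimate based on message content undercounts the verbose form.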

@mdrxy mdrxy requested a review from eyurtsev as a code owner January 5, 2026 21:05
@github-actions github-actions bot added the core (`langchain-core` package issues & PRs), langchain (`langchain` package issues & PRs), and fix (For PRs that implement a fix) labels Jan 5, 2026

codspeed-hq bot commented Jan 5, 2026

Merging this PR will improve performance by 25.09%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

⚡ 2 improved benchmarks
✅ 11 untouched benchmarks
⏩ 21 skipped benchmarks [1]

Performance Changes

Mode | Benchmark | BASE | HEAD | Efficiency
--- | --- | --- | --- | ---
WallTime | test_async_callbacks_in_sync | 23.1 ms | 18.4 ms | +25.09%
WallTime | test_import_time[ChatPromptTemplate] | 626.8 ms | 567.6 ms | +10.43%

Comparing mdrxy/fix-summarization (ab08dbf) with master (c1f1641) [2]

Footnotes

  1. 21 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

  2. No successful run was found on master (0438f8c) during the generation of this report, so c1f1641 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

ccurme previously approved these changes Jan 6, 2026
@mdrxy mdrxy merged commit 8aeff95 into master Jan 7, 2026
91 checks passed
@mdrxy mdrxy deleted the mdrxy/fix-summarization branch January 7, 2026 00:05
@mdrxy mdrxy changed the title from "fix(core,langchain): use get_buffer_string for message summarization" to "fix(langchain): use get_buffer_string for message summarization" Jan 7, 2026
Development

Successfully merging this pull request may close these issues.

SummarizationMiddleware includes metadata in prompt causing context length overflow

3 participants