fix(agents): inject reasoning_content sequentially to avoid count mismatch #1557
mvanhorn wants to merge 4 commits into agentscope-ai:main
Conversation
The `FileBlockSupportFormatter._format` method compared assistant message counts before and after parent formatting to inject `reasoning_content`. The parent formatter drops assistant messages that contain only thinking blocks (no `content` or `tool_calls`), so the counts never matched and `reasoning_content` was always skipped with a warning. This change makes the injection sequential: collect reasoning values from input assistant messages in order, then assign them to output assistant messages by index. This handles the count difference from dropped thinking-only messages. Fixes agentscope-ai#1532 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical bug in the `reasoning_content` injection logic.

Highlights
Code Review
This pull request fixes an issue where `reasoning_content` was not being injected due to a message count mismatch. The proposed solution of sequential injection is a good direction, but the current implementation has a flaw that can lead to `reasoning_content` being assigned to the wrong message. I've identified the issue and suggested a more robust implementation that correctly aligns the reasoning content with the surviving messages.
src/copaw/agents/model_factory.py
Outdated
```diff
+reasoning_values = [
+    reasoning_contents[id(m)]
+    for m in msgs
+    if m.role == "assistant" and id(m) in reasoning_contents
+]
 out_assistant = [
     m for m in messages if m.get("role") == "assistant"
 ]
-if len(in_assistant) != len(out_assistant):
-    logger.warning(
-        "Assistant message count mismatch after formatting "
-        "(%d before, %d after). "
-        "Skipping reasoning_content injection.",
-        len(in_assistant),
-        len(out_assistant),
-    )
-else:
-    for in_msg, out_msg in zip(
-        in_assistant,
-        out_assistant,
-    ):
-        reasoning = reasoning_contents.get(id(in_msg))
-        if reasoning:
-            out_msg["reasoning_content"] = reasoning
+for i, out_msg in enumerate(out_assistant):
+    if i < len(reasoning_values):
+        out_msg["reasoning_content"] = reasoning_values[i]
```
This sequential injection logic can assign `reasoning_content` to the wrong assistant message. The `reasoning_values` list is built from all assistant messages that have reasoning, including those later dropped by the formatter. The code then injects these values sequentially into the surviving assistant messages, leading to a misalignment.

For example, if the message history is:

- Assistant message `m1` (survives, no reasoning)
- Assistant message `m2` (dropped, has reasoning)
- Assistant message `m3` (survives, no reasoning)

then `reasoning_values` will have one item (from `m2`), and `out_assistant` will have two messages (from `m1` and `m3`). The loop will incorrectly assign `m2`'s reasoning to `m1`.
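The misalignment is easy to reproduce in isolation. The following sketch uses plain dicts in place of the real message objects (names like `m1`/`m2` are illustrative, matching the example above):

```python
# Minimal reproduction of the sequential-injection misalignment.
# Plain dicts stand in for the real message objects; m2 represents a
# thinking-only message that the parent formatter drops.

in_assistant = [
    {"id": "m1"},                      # survives, no reasoning
    {"id": "m2", "reasoning": "why"},  # thinking-only: dropped by formatter
    {"id": "m3"},                      # survives, no reasoning
]

# Sequential injection collects reasoning from ALL input messages...
reasoning_values = [m["reasoning"] for m in in_assistant if "reasoning" in m]

# ...but the formatter output only contains the surviving messages.
out_assistant = [{"id": "m1"}, {"id": "m3"}]

for i, out_msg in enumerate(out_assistant):
    if i < len(reasoning_values):
        out_msg["reasoning_content"] = reasoning_values[i]

# m2's reasoning has landed on m1 -- the wrong message.
print(out_assistant[0])  # {'id': 'm1', 'reasoning_content': 'why'}
```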
To fix this, we should build a list of reasoning values that correctly aligns with the surviving assistant messages by predicting which messages will survive formatting. I've also re-introduced the length check as a safeguard.
```python
# Build a list of reasoning values that aligns with the surviving
# assistant messages to prevent incorrect injection.
aligned_reasoning = []
for m in (msg for msg in msgs if msg.role == "assistant"):
    # Predict if the message will be dropped by the parent formatter.
    # A message is dropped if it is thinking-only and has no tool calls.
    is_thinking_only = (
        isinstance(m.content, list)
        and m.content
        and all(b.get("type") == "thinking" for b in m.content)
    )
    if not (is_thinking_only and not getattr(m, "tool_calls", None)):
        aligned_reasoning.append(reasoning_contents.get(id(m)))
out_assistant = [
    m for m in messages if m.get("role") == "assistant"
]
# As a safeguard, check if our prediction of surviving messages
# matches reality.
if len(aligned_reasoning) != len(out_assistant):
    logger.warning(
        "Assistant message count mismatch after formatting "
        "(%d expected survivors, %d actual). "
        "Skipping reasoning_content injection.",
        len(aligned_reasoning),
        len(out_assistant),
    )
else:
    for i, out_msg in enumerate(out_assistant):
        if aligned_reasoning[i]:
            out_msg["reasoning_content"] = aligned_reasoning[i]
```
qbc2016 left a comment
Thank you for the contribution. Please refine the code based on the comments by gemini-code-assist.
Predict which assistant messages survive the parent formatter (drop thinking-only messages without tool_calls) and only inject reasoning_content into the correctly aligned output messages. Re-introduce the count mismatch safeguard as a fallback.
Addressed in ba7119d. The sequential injection was indeed misaligning reasoning with the wrong messages. The code now predicts which assistant messages survive the parent formatter (thinking-only messages without `tool_calls` are dropped) and injects reasoning only into the aligned output. Re-added the count mismatch safeguard as a fallback.
qbc2016 left a comment
Please see the inline comments, and run `pre-commit run --all-files` to pass the pre-commit checks.
src/copaw/agents/model_factory.py
Outdated
```python
    and all(b.get("type") == "thinking" for b in m.content)
)
if not (
    is_thinking_only and not getattr(m, "tool_calls", None)
```
`not getattr(m, "tool_calls", None)` is redundant: `Msg` has no `tool_calls` attribute, so this always evaluates to `True`. Please simplify to `if not is_thinking_only`.
Simplified in ea44e2a: removed the redundant `getattr(m, "tool_calls", None)` check since `Msg` has no `tool_calls` attribute.
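With the redundant check removed, the survivor prediction reduces to a single thinking-only test. A standalone sketch of the idea (hypothetical helper; the real code runs inline in `_format` against agentscope `Msg` objects, for which `SimpleNamespace` stands in here):

```python
from types import SimpleNamespace


def predict_aligned_reasoning(msgs, reasoning_contents):
    """Collect reasoning values aligned with the assistant messages the
    parent formatter is expected to keep (thinking-only ones are dropped)."""
    aligned = []
    for m in msgs:
        if m.role != "assistant":
            continue
        is_thinking_only = (
            isinstance(m.content, list)
            and m.content
            and all(b.get("type") == "thinking" for b in m.content)
        )
        if not is_thinking_only:
            aligned.append(reasoning_contents.get(id(m)))
    return aligned


# SimpleNamespace stands in for agentscope's Msg in this sketch.
m1 = SimpleNamespace(role="assistant", content=[{"type": "text", "text": "hi"}])
m2 = SimpleNamespace(role="assistant", content=[{"type": "thinking", "thinking": "hmm"}])

# m2 is thinking-only, so only m1's (absent) reasoning slot survives.
aligned = predict_aligned_reasoning([m1, m2], {id(m2): "hmm"})
print(aligned)  # [None]
```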
- Replace `X | Y` union syntax with `Union[str, List[dict]]` for mypy v1.7 compatibility (mypy defaults to Python 3.9 syntax rules)
- Add pylint `disable-next` for `too-many-statements` on the factory function

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
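The union-syntax change amounts to the following (illustrative helper, not the real `model_factory.py` code):

```python
from typing import List, Union

# `content: str | List[dict]` is PEP 604 syntax: fine on Python 3.10+,
# but rejected by mypy when it targets Python 3.9 (its default when no
# python_version is configured and the interpreter is 3.9).
# The typing.Union spelling is accepted on both:
def normalize_content(content: Union[str, List[dict]]) -> str:
    # Illustrative only: flatten mixed content into a single string.
    if isinstance(content, str):
        return content
    return " ".join(str(block) for block in content)
```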
Ran `pre-commit run --all-files`. All hooks pass now (mypy, black, flake8, pylint, prettier).
Remove redundant `not getattr(m, "tool_calls", None)` check — Msg has no tool_calls attribute so the condition always evaluated to True. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
`FileBlockSupportFormatter._format` in `model_factory.py` never injects `reasoning_content` into API requests because the assistant message count before and after formatting never matches, so every request logs the count-mismatch warning and skips injection.

The root cause: the parent `OpenAIChatFormatter._format` drops assistant messages that contain only thinking blocks (no `content` or `tool_calls`), so `out_assistant` is always smaller than `in_assistant`.

Changes
`model_factory.py`: Replace the count-based guard with sequential injection. Reasoning values are collected from input assistant messages in order, then assigned to output assistant messages by index. This handles the count difference from dropped thinking-only messages without skipping injection.

Before/After
Before:
`reasoning_content` is never injected. A warning is logged on every request. Models that support extended thinking never receive the reasoning chain.

After:
`reasoning_content` is injected into output assistant messages in order. No warning is logged. Models receive the reasoning chain as expected.

Fixes #1532
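The fixed behavior, as described above and refined during review, can be sketched end to end. This is a self-contained approximation with plain dicts in place of the real message types; the drop rule mirrors the parent formatter's described behavior (thinking-only assistant messages are removed):

```python
def inject_reasoning(in_msgs, out_msgs):
    """Align reasoning from input assistant messages with the assistant
    messages that survived formatting, skipping thinking-only inputs."""
    aligned = []
    for m in in_msgs:
        if m["role"] != "assistant":
            continue
        blocks = m.get("content")
        thinking_only = (
            isinstance(blocks, list)
            and blocks
            and all(b.get("type") == "thinking" for b in blocks)
        )
        if not thinking_only:
            aligned.append(m.get("reasoning"))

    out_assistant = [m for m in out_msgs if m.get("role") == "assistant"]
    if len(aligned) != len(out_assistant):
        return  # safeguard: prediction failed, skip injection entirely
    for reasoning, out_msg in zip(aligned, out_assistant):
        if reasoning:
            out_msg["reasoning_content"] = reasoning


# The thinking-only middle message is dropped by the formatter, yet the
# first survivor still receives its own reasoning, not the dropped one's.
in_msgs = [
    {"role": "assistant", "content": [{"type": "text"}], "reasoning": "r1"},
    {"role": "assistant", "content": [{"type": "thinking"}], "reasoning": "r2"},
    {"role": "assistant", "content": [{"type": "text"}]},
]
out_msgs = [{"role": "assistant"}, {"role": "assistant"}]
inject_reasoning(in_msgs, out_msgs)
print(out_msgs[0])  # {'role': 'assistant', 'reasoning_content': 'r1'}
```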
This contribution was developed with AI assistance (Claude Code).