
fix(agents): inject reasoning_content sequentially to avoid count mismatch #1557

Open

mvanhorn wants to merge 4 commits into agentscope-ai:main from mvanhorn:osc/1532-reasoning-content-mismatch

Conversation

@mvanhorn
Contributor

Summary

FileBlockSupportFormatter._format in model_factory.py never injects reasoning_content into API requests because the assistant message count before and after formatting never matches. Every request logs:

Assistant message count mismatch after formatting (140 before, 135 after). Skipping reasoning_content injection.

The root cause: the parent OpenAIChatFormatter._format drops assistant messages that contain only thinking blocks (no content or tool_calls), so out_assistant is always smaller than in_assistant.

Changes

  • model_factory.py: Replace the count-based guard with sequential injection. Reasoning values are collected from input assistant messages in order, then assigned to output assistant messages by index. This handles the count difference from dropped thinking-only messages without skipping injection.
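As a rough illustration of the described change, the sequential approach can be sketched as a standalone function (simplified: plain dicts stand in for the real message objects, and `inject_reasoning` is a hypothetical name, not the actual code in model_factory.py):

```python
def inject_reasoning(in_msgs, out_msgs, reasoning_contents):
    """Sketch of sequential reasoning_content injection.

    in_msgs: messages before formatting (dicts with a "role" key).
    out_msgs: messages after formatting (may have fewer assistant entries).
    reasoning_contents: maps id(input message) -> reasoning string.
    """
    # Collect reasoning values from input assistant messages, in order.
    reasoning_values = [
        reasoning_contents[id(m)]
        for m in in_msgs
        if m["role"] == "assistant" and id(m) in reasoning_contents
    ]
    # Assign them to output assistant messages by index; output messages
    # beyond the collected values are left untouched.
    out_assistant = [m for m in out_msgs if m.get("role") == "assistant"]
    for i, out_msg in enumerate(out_assistant):
        if i < len(reasoning_values):
            out_msg["reasoning_content"] = reasoning_values[i]
    return out_msgs
```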

Before/After

Before: reasoning_content is never injected. Warning logged on every request. Models that support extended thinking never receive the reasoning chain.

After: reasoning_content is injected into output assistant messages in order. No warning is logged. Models receive the reasoning chain as expected.

Fixes #1532

This contribution was developed with AI assistance (Claude Code).

…match

The FileBlockSupportFormatter._format method compared assistant message
counts before and after parent formatting to inject reasoning_content.
The parent formatter drops assistant messages that only contain thinking
blocks (no content or tool_calls), so the counts never matched and
reasoning_content was always skipped with a warning.

This changes the injection to work sequentially: collect reasoning
values from input assistant messages in order, then assign them to
output assistant messages by index. This handles the count difference
from dropped thinking-only messages.

Fixes agentscope-ai#1532

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions github-actions bot added the first-time-contributor PR created by a first time contributor label Mar 16, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug in the FileBlockSupportFormatter where reasoning_content was consistently omitted from API requests. Previously, a mismatch in assistant message counts after formatting triggered a guard that prevented the injection of reasoning chains. The implemented fix introduces a robust sequential injection mechanism, ensuring that reasoning content is correctly passed to models, thereby enhancing their ability to utilize extended thinking and preventing unnecessary warning logs.

Highlights

  • Reasoning Content Injection Fix: Resolved an issue where reasoning_content was never injected into API requests due to an assistant message count mismatch, leading to models not receiving the reasoning chain.
  • Sequential Injection Mechanism: Implemented a new sequential injection logic for reasoning_content that correctly handles discrepancies in assistant message counts between input and output, ensuring proper data flow.


Changelog
  • src/copaw/agents/model_factory.py
    • Replaced the conditional reasoning_content injection logic, which relied on matching input and output assistant message counts, with a sequential injection method.
    • Introduced a mechanism to collect reasoning_values from input assistant messages and then assign them to output assistant messages by index, accommodating potential message count discrepancies caused by dropped thinking-only messages.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request fixes an issue where reasoning_content was not being injected due to a message count mismatch. The proposed solution of sequential injection is a good direction, but the current implementation has a flaw that can lead to reasoning_content being assigned to the wrong message. I've identified the issue and suggested a more robust implementation that correctly aligns the reasoning content with the surviving messages.

Comment on lines +169 to +179
+           reasoning_values = [
+               reasoning_contents[id(m)]
+               for m in msgs
+               if m.role == "assistant" and id(m) in reasoning_contents
+           ]
            out_assistant = [
                m for m in messages if m.get("role") == "assistant"
            ]
-           if len(in_assistant) != len(out_assistant):
-               logger.warning(
-                   "Assistant message count mismatch after formatting "
-                   "(%d before, %d after). "
-                   "Skipping reasoning_content injection.",
-                   len(in_assistant),
-                   len(out_assistant),
-               )
-           else:
-               for in_msg, out_msg in zip(
-                   in_assistant,
-                   out_assistant,
-               ):
-                   reasoning = reasoning_contents.get(id(in_msg))
-                   if reasoning:
-                       out_msg["reasoning_content"] = reasoning
+           for i, out_msg in enumerate(out_assistant):
+               if i < len(reasoning_values):
+                   out_msg["reasoning_content"] = reasoning_values[i]
Copy link
Contributor


Severity: high

This sequential injection logic can assign reasoning_content to the wrong assistant message. The reasoning_values list is built from all assistant messages that have reasoning, including those later dropped by the formatter. The code then injects these values sequentially into the surviving assistant messages, leading to a misalignment.

For example, if the message history is:

  1. Assistant message m1 (survives, no reasoning)
  2. Assistant message m2 (dropped, has reasoning)
  3. Assistant message m3 (survives, no reasoning)

reasoning_values will have one item (from m2), and out_assistant will have two messages (from m1 and m3). The loop will incorrectly assign m2's reasoning to m1.
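This misalignment can be reproduced with a small standalone sketch of the sequential logic (plain dicts stand in for the real message objects; the "name" keys are only labels for this illustration):

```python
# m2 is the only message with reasoning, but it is dropped by the
# formatter, so naive sequential injection hands m2's reasoning to m1.
in_msgs = [
    {"role": "assistant", "name": "m1"},  # survives, no reasoning
    {"role": "assistant", "name": "m2"},  # dropped, has reasoning
    {"role": "assistant", "name": "m3"},  # survives, no reasoning
]
reasoning_contents = {id(in_msgs[1]): "m2's hidden chain of thought"}

# The parent formatter drops m2 (thinking-only), so only m1 and m3 survive.
out_assistant = [
    {"role": "assistant", "name": "m1"},
    {"role": "assistant", "name": "m3"},
]

reasoning_values = [
    reasoning_contents[id(m)]
    for m in in_msgs
    if m["role"] == "assistant" and id(m) in reasoning_contents
]
for i, out_msg in enumerate(out_assistant):
    if i < len(reasoning_values):
        out_msg["reasoning_content"] = reasoning_values[i]

# Result: m1 wrongly carries m2's reasoning; m3 gets none.
```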

To fix this, we should build a list of reasoning values that correctly aligns with the surviving assistant messages by predicting which messages will survive formatting. I've also re-introduced the length check as a safeguard.

                # Build a list of reasoning values that aligns with the surviving
                # assistant messages to prevent incorrect injection.
                aligned_reasoning = []
                for m in (msg for msg in msgs if msg.role == "assistant"):
                    # Predict if the message will be dropped by the parent formatter.
                    # A message is dropped if it is thinking-only and has no tool calls.
                    is_thinking_only = (
                        isinstance(m.content, list)
                        and m.content
                        and all(b.get("type") == "thinking" for b in m.content)
                    )
                    if not (is_thinking_only and not getattr(m, "tool_calls", None)):
                        aligned_reasoning.append(reasoning_contents.get(id(m)))

                out_assistant = [
                    m for m in messages if m.get("role") == "assistant"
                ]

                # As a safeguard, check if our prediction of surviving messages matches reality.
                if len(aligned_reasoning) != len(out_assistant):
                    logger.warning(
                        "Assistant message count mismatch after formatting "
                        "(%d expected survivors, %d actual). "
                        "Skipping reasoning_content injection.",
                        len(aligned_reasoning),
                        len(out_assistant),
                    )
                else:
                    for i, out_msg in enumerate(out_assistant):
                        if aligned_reasoning[i]:
                            out_msg["reasoning_content"] = aligned_reasoning[i]


@qbc2016 qbc2016 left a comment


Thank you for the contribution. Please refine the code based on the comments by gemini-code-assist.

Predict which assistant messages survive the parent formatter (drop
thinking-only messages without tool_calls) and only inject
reasoning_content into the correctly aligned output messages.
Re-introduce the count mismatch safeguard as a fallback.
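The aligned approach in this commit can be sketched as follows (a simplified standalone version: dicts stand in for the real message objects, and `is_thinking_only`/`inject_aligned` are illustrative names, not the exact code in model_factory.py):

```python
def is_thinking_only(m):
    # Predict whether the parent formatter drops this assistant message:
    # it is dropped when every content block is a thinking block.
    content = m.get("content")
    return (
        isinstance(content, list)
        and bool(content)
        and all(b.get("type") == "thinking" for b in content)
    )

def inject_aligned(in_msgs, out_assistant, reasoning_contents):
    # Build reasoning values only for assistant messages predicted to
    # survive formatting, so indices line up with out_assistant.
    aligned = [
        reasoning_contents.get(id(m))
        for m in in_msgs
        if m.get("role") == "assistant" and not is_thinking_only(m)
    ]
    # Safeguard: skip injection if the prediction disagrees with reality.
    if len(aligned) != len(out_assistant):
        return
    for i, out_msg in enumerate(out_assistant):
        if aligned[i]:
            out_msg["reasoning_content"] = aligned[i]
```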
@mvanhorn mvanhorn requested a deployment to maintainer-approved March 16, 2026 15:08 — with GitHub Actions Waiting
@mvanhorn
Contributor Author

Addressed in ba7119d. The sequential injection was indeed misaligning reasoning with the wrong messages. Now predicts which assistant messages survive the parent formatter (drops thinking-only messages without tool_calls) and injects reasoning only into the aligned output. Re-added the count mismatch safeguard as a fallback.


@qbc2016 qbc2016 left a comment


Please see the inline comments, and run pre-commit run --all-files to pass the pre-commit checks.

                    and all(b.get("type") == "thinking" for b in m.content)
                )
                if not (
                    is_thinking_only and not getattr(m, "tool_calls", None)
Copy link
Member


not getattr(m, "tool_calls", None) is redundant — Msg has no tool_calls attribute (it always evaluates to None). Please simplify to if not is_thinking_only.

Contributor Author


Simplified in ea44e2a - removed the redundant getattr(m, "tool_calls", None) check since Msg has no tool_calls attribute.

- Replace X | Y union syntax with Union[str, List[dict]] for mypy v1.7
  compatibility (mypy defaults to Python 3.9 syntax rules)
- Add pylint disable-next for too-many-statements on factory function

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mvanhorn
Contributor Author

Ran pre-commit run --all-files and fixed two issues in fe8d8f5:

  • Replaced str | list[dict] union syntax with Union[str, List[dict]] (mypy v1.7 defaults to pre-3.10 syntax rules)
  • Added pylint disable for too-many-statements on the factory function

All hooks pass now (mypy, black, flake8, pylint, prettier).
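For reference, the annotation change looks like this (an illustrative sketch only: `format_content` is a hypothetical helper, not the actual signature in model_factory.py):

```python
from typing import List, Union

# Before: PEP 604 union syntax, rejected by mypy under pre-3.10 syntax rules.
# def format_content(content: str | list[dict]) -> str: ...

# After: typing.Union/List spelling, accepted under mypy's default rules.
def format_content(content: Union[str, List[dict]]) -> str:
    # Hypothetical helper used only to illustrate the annotation change.
    if isinstance(content, str):
        return content
    return " ".join(str(block) for block in content)
```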

Remove redundant `not getattr(m, "tool_calls", None)` check — Msg has
no tool_calls attribute so the condition always evaluated to True.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

reasoning_content injection fails with "Assistant message count mismatch" warning in FileBlockSupportFormatter