Skip to content

fix: prevent tool calls from leaking into text and detect fake tool call output#14

Open
alzhang-git wants to merge 4 commits intomicrosoft:mainfrom
alzhang-git:alzhang-bugfix
Open

fix: prevent tool calls from leaking into text and detect fake tool call output#14
alzhang-git wants to merge 4 commits intomicrosoft:mainfrom
alzhang-git:alzhang-bugfix

Conversation

@alzhang-git
Copy link

Summary

Fixes two bugs in tool calling where tool calls are skipped and instead printed as text.

Fix 1 — converters.py (Root cause prevention)

  • _extract_content(): Now skips ool_call/ ool_use blocks so they don't leak into serialized text
  • convert_messages_to_prompt(): Extracts tool calls from content blocks when the ool_calls key is missing, so conversation history is correctly serialized

Fix 2 — provider.py (Defensive detection)

  • When the LLM produces text containing [Tool Call: ...] but no actual structured tool calls, the provider detects this, adds a correction message ("use actual tools, don't write tool calls as text"), and retries (up to 2 times)
  • This prevents the user from seeing fake tool results that never actually executed

Testing

  • All 691 existing tests pass (2 pre-existing failures in est_coverage_gaps.py unrelated to this change)
  • Both modified files pass syntax validation

…all output

Fix 1 - converters.py (root cause prevention):
- _extract_content(): Skip tool_call/tool_use blocks so they don't leak into text
- convert_messages_to_prompt(): Extract tool calls from content blocks when the
  tool_calls key is missing, so conversation history is correctly serialized

Fix 2 - provider.py (defensive detection):
- When the LLM produces text containing [Tool Call: ...] but no actual structured
  tool calls, the provider detects this, adds a correction message, and retries
  (up to 2 times) to prevent fake tool results from reaching the user

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Author

@alzhang-git alzhang-git left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to address issue time by time tool calling is skipped. E.g. when writing to a file, tool calling is not executed and instead of print file as text

alzhang-git and others added 2 commits February 28, 2026 14:14
Root cause: [Tool Call: name({args})] format in prompt history was being
mimicked by the LLM as plain text (36% of assistant messages had fake
tool calls). Changed to XML tags that are less likely to be reproduced:

- Tool calls: <tool_used name="name">args</tool_used>
- Tool results: <tool_result name="name">content</tool_result>

Additional improvements:
- _extract_content(): skip tool_result and thinking blocks too
- Detection regex: catch Tool Result(), <tool_used> patterns
- Stronger correction message on fake tool retry
- Updated all test assertions for new format

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@mowree
Copy link
Collaborator

mowree commented Mar 1, 2026

✅ PR Review: Stress Test Validation Complete

Ran comprehensive stress tests — all passed:

Test Suite Tests Result
Cross-Model Structured Calls 6 models ✅ Passed
Saturated History Mimicry (5/15/25 turns) 18 tests ✅ Passed
Token Economics 3 tests ✅ Passed
Unit Tests 29 ✅ Passed

Models: claude-sonnet-4, claude-opus-4.5, gpt-4.1, gpt-5-mini, o3-mini, gemini-3-pro-preview


🔧 One Fix Requested

Message list mutation at provider.py:1057:

The retry loop does messages.append(correction_msg) which mutates the caller's list in-place. After complete() returns, the caller's message history contains correction messages they didn't add. This can cause unexpected behavior in multi-turn conversations.

Suggested fix: Work on a copy with messages = list(messages) before appending.


💡 Other Suggestions (Non-blocking)

  • Consider moving regex to module level (style preference)
  • Consider adding <tool_result> to detection regex
  • Note: the fake-call regex matches <tool_used>, which the converter now emits to prompt history — potential for false positives if LLM echoes prompt content (our saturation tests didn't trigger this, but worth monitoring)

I'll plan to merge in 48 hours. Let me know if you'd like to address the mutation issue first — happy to wait for a quick fix. Otherwise LGTM overall! 🚀

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes prompt/serialization issues that caused tool calls/results to be emitted as plain text instead of remaining structured, and adds a defensive retry when the model “fakes” tool calls in text without producing structured tool calls.

Changes:

  • Update prompt serialization to represent tool usage/history via explicit XML-like tags and to prevent tool blocks from leaking into text content.
  • Recover missing assistant tool_calls by extracting them from OpenAI-style content blocks (tool_call / tool_use) when needed.
  • Add provider-side detection + retry when responses include tool-call-like text but no structured tool calls.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
amplifier_module_provider_github_copilot/converters.py Skips tool/thinking blocks in _extract_content, extracts tool calls from content blocks, and serializes tool usage/results with <tool_used> / <tool_result> tags.
amplifier_module_provider_github_copilot/provider.py Adds regex-based detection of “fake tool calls” in text and retries with a correction message.
tests/test_converters.py Updates assertions to match the new <tool_used> / <tool_result> serialization format.
tests/integration/test_multi_model_saturation.py Updates tool-call-pattern counting to match the new <tool_used ...> format in serialized prompts.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…d test

- Use spread operator for retry messages to avoid mutating caller's list
- Add <tool_result> to fake tool detection regex
- Add unit test for tool call extraction from content blocks

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@alzhang-git
Copy link
Author

Thanks for the thorough review and stress testing @mowree!

All three items addressed in 2ab91ff:

  1. Message mutation — Fixed. Using retry_messages = [*messages, correction_msg] so the caller's list is never mutated.
  2. <tool_result> detection — Added to the regex.
  3. Missing unit test — Added test_assistant_tool_calls_extracted_from_content_blocks.

Test results: 609 passed, 2 pre-existing failures in test_coverage_gaps.py (timing-related, unrelated to this PR).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants