fix(agents): detect and break tool-call reflex loops (#658) by penso · Pull Request #664 · moltis-org/moltis

penso · 2026-04-11T19:21:11Z

Summary

Fixes #658.

The runner previously dispatched tool calls with empty or malformed args straight to tool.execute without pre-validation, and had no detection for repeated identical failures. A model stuck in a reflex-retry loop (e.g. exec({}) on every iteration) burned through all 25 iterations before max_iterations fired, producing a ~4 minute dead zone with no visible progress.

Three defensive layers are added at the runner boundary — any one alone would have prevented the reported scenario; together they harden the runner against the whole class.

1. Pre-dispatch schema validation (Fix B from the issue)

New crates/agents/src/tool_arg_validator.rs checks each tool call's arguments against the tool's own parameters_schema() before execute runs. Missing required fields and top-level type mismatches short-circuit to a directive error message that names the failure, echoes the sent args, and explicitly tells the model not to retry with identical arguments:

Tool call rejected before execution by `exec`.
Missing required field(s): `command`.
You sent: {}
Do not retry with the same arguments. If you do not know what arguments to use,
respond in plain text and ask the user for clarification.

Deliberately narrow in scope — only catches the reflex-retry class (missing-required / wrong-type at the top level). Tools still own deeper semantic validation.
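
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code in tool_arg_validator.rs: the `Json` enum, `Schema` struct, `ArgError` enum, and both function names are hypothetical stand-ins chosen so the example has no external dependencies, and "integer" handling is left out for brevity.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a JSON value (hypothetical; not the PR's actual type).
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Null,
    Bool(bool),
    Num(f64),
    Str(String),
    Arr(Vec<Json>),
    Obj(BTreeMap<String, Json>),
}

/// Hypothetical flattened view of a tool's parameter schema.
pub struct Schema {
    pub required: Vec<&'static str>,
    /// (field name, expected JSON type name) for top-level properties.
    pub types: Vec<(&'static str, &'static str)>,
}

#[derive(Debug, PartialEq)]
pub enum ArgError {
    NotAnObject,
    MissingRequired(Vec<String>),
    TypeMismatch { field: String, expected: String },
}

/// Top-level type check only. Unknown type names (including "integer",
/// omitted in this sketch) are not enforced.
fn type_matches(value: &Json, expected: &str) -> bool {
    match expected {
        "string" => matches!(value, Json::Str(_)),
        "boolean" => matches!(value, Json::Bool(_)),
        "number" => matches!(value, Json::Num(_)),
        "array" => matches!(value, Json::Arr(_)),
        "object" => matches!(value, Json::Obj(_)),
        _ => true,
    }
}

pub fn validate(schema: &Schema, args: &Json) -> Result<(), ArgError> {
    let map = match args {
        Json::Obj(map) => map,
        _ => return Err(ArgError::NotAnObject),
    };
    // A field that is absent or explicitly null counts as missing.
    let missing: Vec<String> = schema
        .required
        .iter()
        .filter(|name| matches!(map.get(**name), None | Some(Json::Null)))
        .map(|name| name.to_string())
        .collect();
    if !missing.is_empty() {
        return Err(ArgError::MissingRequired(missing));
    }
    for (field, expected) in &schema.types {
        if let Some(value) = map.get(*field) {
            if *value != Json::Null && !type_matches(value, expected) {
                return Err(ArgError::TypeMismatch {
                    field: field.to_string(),
                    expected: expected.to_string(),
                });
            }
        }
    }
    Ok(())
}

/// Directive message for the missing-field case, modelled on the example above.
pub fn missing_fields_message(tool: &str, missing: &[String], raw_args: &str) -> String {
    let fields: Vec<String> = missing.iter().map(|f| format!("`{f}`")).collect();
    format!(
        "Tool call rejected before execution by `{tool}`.\n\
         Missing required field(s): {}.\n\
         You sent: {raw_args}\n\
         Do not retry with the same arguments. If you do not know what arguments to use,\n\
         respond in plain text and ask the user for clarification.",
        fields.join(", ")
    )
}
```

In this sketch a rejected call never reaches `execute`; the caller would turn the `ArgError` into the directive message and hand it back to the model as the tool result.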

2. Loop detector with escalating intervention (Fix A + Fix C)

New crates/agents/src/tool_loop_detector.rs tracks a ring buffer of recent (tool_name, args_hash, error_hash) outcomes and fires when N consecutive failures (N = agent_loop_detector_window, default 3) share the same tool and (args OR error).

Two escalation stages:

Stage 1 — Nudge: inject a directive intervention message into the conversation history listing the exact repeated calls, explicitly forbidding another tool call, and telling the model to respond in plain text.
Stage 2 — Strip tools: if a fourth consecutive failure lands after the nudge, pass schemas_for_api = vec![] for a single turn so the model physically cannot emit another tool call. After that forced-text turn, normal schemas are restored.

Any successful tool call resets both the ring buffer and the escalation stage, so legitimate retry patterns (fail → retry with different args → succeed) do not trip the detector.
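
The detection and escalation rules above can be sketched as below. This is a simplified illustration under stated assumptions, not the code in tool_loop_detector.rs: the `LoopDetector`, `Stage`, `Action`, and `Outcome` names, the method signatures, and the use of `DefaultHasher` are all hypothetical, and the sketch returns an action per call rather than modelling any batching behaviour.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::VecDeque;
use std::hash::{Hash, Hasher};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage { Idle, Nudged, StripTools }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action { Continue, InjectNudge, StripTools }

/// One recorded failure (hypothetical layout).
struct Outcome { tool: String, args_hash: u64, error_hash: u64 }

fn hash_str(s: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    s.hash(&mut hasher);
    hasher.finish()
}

pub struct LoopDetector {
    window: usize,
    strip_on_second_fire: bool,
    recent: VecDeque<Outcome>,
    stage: Stage,
}

impl LoopDetector {
    pub fn new(window: usize, strip_on_second_fire: bool) -> Self {
        Self { window, strip_on_second_fire, recent: VecDeque::new(), stage: Stage::Idle }
    }

    /// Any success clears both the ring buffer and the escalation stage.
    pub fn record_success(&mut self) {
        self.recent.clear();
        self.stage = Stage::Idle;
    }

    pub fn record_failure(&mut self, tool: &str, args: &str, error: &str) -> Action {
        if self.window == 0 {
            return Action::Continue; // window = 0 disables detection
        }
        if self.recent.len() == self.window {
            self.recent.pop_front(); // keep only the last `window` failures
        }
        self.recent.push_back(Outcome {
            tool: tool.to_string(),
            args_hash: hash_str(args),
            error_hash: hash_str(error),
        });
        if !self.is_stuck() {
            return Action::Continue;
        }
        match self.stage {
            Stage::Idle => { self.stage = Stage::Nudged; Action::InjectNudge }
            Stage::Nudged if self.strip_on_second_fire => {
                self.stage = Stage::StripTools;
                Action::StripTools
            }
            _ => Action::Continue,
        }
    }

    /// True when the buffer is full and every entry shares the same tool
    /// and either the same args or the same error.
    fn is_stuck(&self) -> bool {
        if self.recent.len() < self.window {
            return false;
        }
        let first = match self.recent.front() {
            Some(first) => first,
            None => return false,
        };
        let same_tool = self.recent.iter().all(|o| o.tool == first.tool);
        let same_args = self.recent.iter().all(|o| o.args_hash == first.args_hash);
        let same_error = self.recent.iter().all(|o| o.error_hash == first.error_hash);
        same_tool && (same_args || same_error)
    }
}
```

Because only failures are stored and a success clears the buffer, a fail, retry-with-different-args, succeed sequence never fills the window in this sketch.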

3. Event reorder + debug logging (Fix D)

RunnerEvent::ToolCallStart is now emitted only for calls that pass validation. Rejected calls emit a new ToolCallRejected event instead, so the UI stops showing 💻 Executing command... for calls that never executed.
The streaming tool-call accumulator now logs each finalized args string at debug! level so future variants of "default to {} because no deltas arrived" can be diagnosed from a single log file.

Config

Two new fields in [tools] (defaults are opt-out, per CLAUDE.md):

agent_loop_detector_window = 3                         # 0 = disable
agent_loop_detector_strip_tools_on_second_fire = true
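
As a rough illustration of how these two settings and their defaults might look on the Rust side, here is a sketch. The struct name `ToolsConfig` appears in the review summary, but the field types, the `Default` impl shown here, and the `loop_detection_enabled` helper are assumptions, and the serde wiring of the real struct is omitted.

```rust
/// Sketch of the two new settings only; the real struct has more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolsConfig {
    /// Number of consecutive matching failures that triggers the detector.
    /// 0 disables detection. (Field type is an assumption.)
    pub agent_loop_detector_window: usize,
    /// Whether a further failure after the nudge strips tool schemas for one turn.
    pub agent_loop_detector_strip_tools_on_second_fire: bool,
}

impl Default for ToolsConfig {
    fn default() -> Self {
        Self {
            agent_loop_detector_window: 3,
            agent_loop_detector_strip_tools_on_second_fire: true,
        }
    }
}

impl ToolsConfig {
    /// Hypothetical convenience helper, not part of the PR.
    pub fn loop_detection_enabled(&self) -> bool {
        self.agent_loop_detector_window > 0
    }
}
```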

New RunnerEvent variants

Both surfaced through crates/chat/src/lib.rs event forwarder so the UI and channels get appropriate signals:

ToolCallRejected { id, name, arguments, error } — reported as a tool_call_end with rejected: true
LoopInterventionFired { stage, tool_name } — reported as a notice with loopInterventionStage + stuckTool
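
The mapping from the two new variants to what the forwarder reports can be sketched like this. The variant and field names come from the list above; the field types, the `Forwarded` struct, and the `forward` function are hypothetical, and the real forwarder in crates/chat/src/lib.rs may shape its payloads differently.

```rust
/// Only the two new variants are modelled; field types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum RunnerEvent {
    ToolCallRejected { id: String, name: String, arguments: String, error: String },
    LoopInterventionFired { stage: u8, tool_name: String },
}

/// Hypothetical flattened payload: an event kind plus key/value fields.
#[derive(Debug, PartialEq)]
pub struct Forwarded {
    pub kind: &'static str,
    pub fields: Vec<(&'static str, String)>,
}

pub fn forward(event: &RunnerEvent) -> Forwarded {
    match event {
        // A rejected call is reported as a terminal tool_call_end with rejected: true.
        RunnerEvent::ToolCallRejected { id, name, error, .. } => Forwarded {
            kind: "tool_call_end",
            fields: vec![
                ("id", id.clone()),
                ("name", name.clone()),
                ("rejected", "true".to_string()),
                ("error", error.clone()),
            ],
        },
        // A loop intervention is reported as a notice naming the stage and the stuck tool.
        RunnerEvent::LoopInterventionFired { stage, tool_name } => Forwarded {
            kind: "notice",
            fields: vec![
                ("loopInterventionStage", stage.to_string()),
                ("stuckTool", tool_name.clone()),
            ],
        },
    }
}
```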

Test plan

Automated

New tests (all passing):

tool_arg_validator::tests::* — 13 unit tests covering empty schema, missing required, null-as-missing, type mismatch, non-object args, array/object types, unknown types, LLM error message formatting
tool_loop_detector::tests::* — 11 unit tests covering window=0 disabled, 3-identical-fires-nudge, 4th-strips-tools, strip-disabled-stays-nudged, success resets state, same-error-different-args still fires, different tools do not fire, legitimate-retry does not fire, canonicalization stability, intervention message content
runner::tests::reflex_loop_fires_detector_and_terminates_non_streaming — end-to-end: reflex exec({}) → validation rejects → detector fires → intervention → model returns text → run terminates at iter ≤5
runner::tests::reflex_loop_fires_detector_and_terminates_streaming — same scenario on the streaming path (uses stream_with_tools + mid-stream tool_use with no argument deltas)
runner::tests::legitimate_retry_does_not_fire_loop_detector — regression: fail once with a real error, retry with different args, succeed. Detector must not fire.

Validation

Completed

cargo test -p moltis-agents — 352 passed, 0 failed
cargo test -p moltis-config — 185 passed
cargo test -p moltis-chat — 173 passed
cargo test --workspace --exclude moltis-providers --exclude moltis-gateway — all green
cargo +nightly-2025-11-30 fmt --all -- --check — clean
cargo +nightly-2025-11-30 clippy -p moltis-agents -p moltis-config -p moltis-chat --all-targets -- -D warnings — clean

Remaining

just lint — local env missing CUDA toolkit for llama-cpp-sys-2; CI will run the full matrix
just test — same
Swift/iOS build steps — Darwin-only gates, will be exercised by CI

Manual QA

Reproduce the original #658 scenario ("[Bug]: Runner dispatches empty-args tool calls, no loop detection on repeated identical failures (25-iter dead zone)") with a real Claude Haiku session (ambiguous prompt that triggers empty exec args) and confirm the run terminates within ~4 iterations instead of hanging for 25 iterations
Confirm the UI activity log shows the new loop-detected notice and that rejected calls no longer display "Executing command..."
Verify a legitimate failure-then-retry flow (e.g. ls /nonexistent → ls /tmp) does not emit any LoopInterventionFired event
Confirm agent_loop_detector_window = 0 in moltis.toml disables detection end-to-end

Runner previously dispatched tool calls with empty or malformed args straight to tool.execute without pre-validation, then had no detection for repeated identical failures. A model stuck in a reflex-retry loop (e.g. exec({}) on every iteration) would burn through all 25 iterations before max_iterations fired, producing a ~4 minute dead zone with no visible progress.

Three defensive layers added at the runner boundary:

1. Pre-dispatch schema validation (tool_arg_validator.rs): each tool call's arguments are checked against the tool's parameters_schema before execute() runs. Missing required fields or top-level type mismatches short-circuit to a structured, directive error that names the failure, echoes the args, and tells the model not to retry with identical arguments.
2. Loop detector with escalating intervention (tool_loop_detector.rs): tracks a ring buffer of recent (tool, args_hash, error_hash) outcomes. Three consecutive failures sharing the same tool and (args OR error) fire stage 1 (inject a strong directive intervention message). A fourth consecutive failure after the nudge fires stage 2 (strip tool schemas for one turn, forcing a text response). Any successful tool call resets the state.
3. Event reordering + raw-args debug logging: ToolCallStart is now emitted only after validation passes; rejected calls emit a new ToolCallRejected event so the UI stops showing "Executing..." for calls that never executed. The streaming tool-call accumulator logs each finalized args string at debug level to aid diagnosis of future variants.

New config fields in [tools]:
- agent_loop_detector_window (default 3, 0 = off)
- agent_loop_detector_strip_tools_on_second_fire (default true)

New RunnerEvent variants surfaced through the chat event forwarder:
- ToolCallRejected: reported as a tool_call_end with rejected=true
- LoopInterventionFired: reported as a notice with stage + stuck tool

Integration tests cover the reflex-loop scenario end-to-end in both the non-streaming and streaming paths, plus a legitimate one-shot retry regression test to ensure normal failure/recovery patterns do not trip the detector.

Entire-Checkpoint: c441f764a037

codspeed-hq · 2026-04-11T19:24:22Z

Merging this PR will not alter performance

✅ 39 untouched benchmarks
⏩ 5 skipped benchmarks¹

Comparing evergreen-paper (158085a) with main (c3da499)

¹ 5 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

codecov · 2026-04-11T19:28:49Z

Codecov Report

❌ Patch coverage is 87.55328% with 146 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
crates/agents/src/runner.rs	83.62%	85 Missing ⚠️
crates/chat/src/lib.rs	5.26%	36 Missing ⚠️
crates/agents/src/tool_arg_validator.rs	94.96%	13 Missing ⚠️
crates/agents/src/tool_loop_detector.rs	96.62%	12 Missing ⚠️

📢 Thoughts on this report? Let us know!

greptile-apps · 2026-04-11T19:29:22Z

Greptile Summary

This PR adds three layered defenses against tool-call reflex loops in the agent runner: pre-dispatch schema validation (tool_arg_validator.rs), a rolling loop detector with two escalation stages (tool_loop_detector.rs), and event reordering so the UI never shows "Executing…" for calls that never ran. The implementation is well-tested with 24+ new unit tests and three end-to-end integration tests covering the non-streaming path, the streaming path, and the legitimate-retry regression.

All three issues flagged in the previous Greptile round have been resolved: clear_strip_tools now flushes the deque (not just transitions the stage), integer type-checking accepts 30.0 as a valid integer-valued float, and format_strip_tools_message now returns String for API consistency with format_intervention_message.

Confidence Score: 5/5

Safe to merge — all three previously-flagged issues are resolved and no new P0/P1 defects found.

All prior Greptile concerns addressed: clear_strip_tools flushes deque, integer check accepts 30.0, format_strip_tools_message returns String. 24+ unit tests plus 3 E2E integration tests cover all edge cases. No remaining P0/P1 findings.

No files require special attention.

Important Files Changed

Filename	Overview
crates/agents/src/tool_loop_detector.rs	New rolling loop detector with two escalation stages (nudge / strip-tools). consume_pending_action correctly handles trailing-success suppression and stage-skip edge cases. clear_strip_tools now flushes the deque to prevent oscillation.
crates/agents/src/tool_arg_validator.rs	New lightweight schema validator. Correctly handles required fields, null-as-missing, type mismatches, and integer-valued floats. 13 unit tests cover all branches.
crates/agents/src/runner.rs	Integration of schema validation and loop detector into both streaming and non-streaming loops. apply_loop_detector_intervention correctly derives post-batch action via consume_pending_action.
crates/chat/src/lib.rs	New RunnerEvent variants forwarded to WebSocket clients correctly. ToolCallRejected emits terminal tool_call_end with rejected: true.
crates/config/src/schema.rs	Two new ToolsConfig fields with sensible defaults and serde helpers, Default impl updated consistently.

Sequence Diagram

sequenceDiagram
    participant R as Runner Loop
    participant V as tool_arg_validator
    participant LD as ToolLoopDetector
    participant T as Tool.execute
    participant E as on_event callback

    R->>V: validate_tool_args(schema, args)
    alt Validation fails
        V-->>R: Err(ToolArgError)
        R->>E: ToolCallRejected { rejected: true }
        R->>LD: record(failure fingerprint)
    else Validation passes
        V-->>R: Ok(())
        R->>E: ToolCallStart
        R->>T: execute(args)
        T-->>R: (success/failure, result)
        R->>E: ToolCallEnd
        R->>LD: record(success/failure fingerprint)
    end

    R->>LD: consume_pending_action()
    alt No intervention
        LD-->>R: None
    else Stage 1 - Nudge
        LD-->>R: InjectNudge
        R->>E: LoopInterventionFired { stage: 1 }
        R->>R: messages.push(directive user message)
    else Stage 2 - Strip tools
        LD-->>R: StripTools
        R->>E: LoopInterventionFired { stage: 2 }
        R->>R: strip_tools_next_iter = true
        Note over R: Next iter: schemas_for_api = vec![]
        R->>LD: clear_strip_tools() [flushes deque + resets]
    end

Reviews (5): Last reviewed commit: "fix(agents): treat success=false without..."

Three P2 findings from code review:

1. Loop detector oscillation — `clear_strip_tools()` used to leave the `recent` deque full of `window` matching failures while only transitioning the stage from StripTools → Nudged. A single identical failure after tools were restored would immediately re-fire stage 2 with `stage: Nudged` + `strip_on_second_fire: true`, creating a strip → text → restore → single-fail → strip oscillation that burned through iterations with almost no runway for the model. Treat the forced-text turn as a full reset: clear both the stage AND the deque. Added a dedicated regression test (`post_strip_single_failure_does_not_immediately_refire`).
2. Integer type check rejected valid integer-valued floats. Some LLMs serialize integers with a trailing decimal (e.g. `"timeout": 30.0`) and `serde_json` stores those as f64-backed Numbers whose `as_i64()` / `as_u64()` return `None`. The validator now accepts any float whose fractional part is zero, so `30.0` passes while `30.5` is still rejected. Covered by `integer_accepts_integer_valued_floats`.
3. API asymmetry — `format_strip_tools_message` returned `&'static str` while `format_intervention_message` returned `String`. Both are consumed identically at the call site via `Into<String>`. Changed the former to return `String` for uniformity.

Entire-Checkpoint: 73323a6c31e5
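
The integer rule from the second finding can be illustrated with a small sketch. The helper name `is_integer_valued` is hypothetical and it operates on a plain `f64` rather than a `serde_json` Number, so it shows only the fractional-part test, not the validator's actual code path.

```rust
/// Hypothetical helper: true when `n` can stand in for a JSON integer,
/// i.e. it is finite and has no fractional part.
pub fn is_integer_valued(n: f64) -> bool {
    n.is_finite() && n.fract() == 0.0
}
```

Under this rule `30.0` is accepted where an integer is expected, `30.5` is not, and NaN or infinite values are rejected by the `is_finite` guard.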

Two P2 edge cases surfaced by the second Greptile review (PR #664):

1. **False intervention after trailing success in the same batch.** When a batch was `[fail, fail, success]` and the detector was one failure away from the window, the fail that pushed the window full would set a pending nudge, the trailing success would reset the detector, and the runner would still inject the stale intervention after the batch.
2. **Stage-skip when both escalations fire within one batch.** A parallel batch like `[fail, fail, fail, fail]` would fire `InjectNudge` on the third call and `StripTools` on the fourth. The old per-call accumulator shadowed the nudge with the strip action, robbing the model of its chance to recover via plain text before tools were stripped.

The fix changes both result-processing loops to derive the intervention action from the detector's *post-batch state* rather than from per-call `record()` return values:

- Add `consume_pending_action()` on the detector. It looks at the current `stage` plus a new internal `nudge_delivered` flag and returns the correct single action, one-shotting stage transitions so the same intervention can't be applied twice. When it would advance to `StripTools` without having delivered a nudge first, it demotes back to `Nudged` and returns `InjectNudge` so the nudge lands first; strip-tools can fire on the next batch if the pattern persists.
- Extract the intervention-application logic into a single helper, `apply_loop_detector_intervention`, shared by the streaming and non-streaming runner loops. This eliminates duplicated match arms and ensures both paths stay in sync.

Regression coverage:

- `tool_loop_detector::tests` — 5 new unit tests covering `consume_pending_action` behaviour for all stage/delivered combinations, trailing-success suppression, and the stage-skip guard.
- `runner::tests::mixed_batch_with_trailing_success_does_not_fire_intervention` — end-to-end, a parallel batch `[exec({}), exec({}), exec("true")]` must not emit any `LoopInterventionFired` event because the trailing success recovers cleanly.
- `runner::tests::parallel_batch_with_stage_skip_delivers_nudge_first` — end-to-end, a parallel batch of four identical `exec({})` calls must emit a stage-1 `LoopInterventionFired` first, not jump straight to stage 2.

Entire-Checkpoint: 37184e642bc0
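
The post-batch behaviour described in this commit can be sketched as a small state machine. This is an assumption-laden illustration rather than the detector's real code: `DetectorState`, its constructor `at`, the `reset` method, and the `strip_delivered` flag are hypothetical additions; only `stage`, `nudge_delivered`, `consume_pending_action`, `InjectNudge`, and `StripTools` are named in the commit message.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage { Idle, Nudged, StripTools }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action { InjectNudge, StripTools }

#[derive(Debug)]
pub struct DetectorState {
    pub stage: Stage,
    pub nudge_delivered: bool,
    /// Hypothetical flag so the strip action is also one-shot in this sketch.
    pub strip_delivered: bool,
}

impl DetectorState {
    pub fn at(stage: Stage) -> Self {
        Self { stage, nudge_delivered: false, strip_delivered: false }
    }

    /// Models a success in the batch: everything goes back to idle.
    pub fn reset(&mut self) {
        *self = Self::at(Stage::Idle);
    }

    /// Called once after a whole batch has been recorded.
    pub fn consume_pending_action(&mut self) -> Option<Action> {
        match (self.stage, self.nudge_delivered) {
            (Stage::Idle, _) => None,
            (Stage::Nudged, true) => None, // nudge already applied
            (Stage::Nudged, false) => {
                self.nudge_delivered = true;
                Some(Action::InjectNudge)
            }
            // Stage-skip guard: the batch escalated past the nudge, so demote
            // and deliver the nudge first.
            (Stage::StripTools, false) => {
                self.stage = Stage::Nudged;
                self.nudge_delivered = true;
                Some(Action::InjectNudge)
            }
            (Stage::StripTools, true) if self.strip_delivered => None,
            (Stage::StripTools, true) => {
                self.strip_delivered = true;
                Some(Action::StripTools)
            }
        }
    }
}
```

Reading the action from the state after the batch, instead of from each call's return value, is what lets a trailing success cancel a nudge that an earlier call in the same batch would have triggered.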

penso · 2026-04-11T22:52:26Z

@greptile-apps review

Tools that return `{success: false}` without an `error` key were treated as successes by the loop detector because `is_failure` derived from `error_hash.is_some()`. This means e.g. a `BrowserResponse` with `success: false` and no error text would silently reset the detector instead of contributing to the reflex-loop window.

Split `ToolCallFingerprint` into explicit `success()` / `failure()` constructors and store a `failed: bool` field. The runner now passes `!success` from the dispatch result, so logical failures without an error string are correctly counted.

New test: `failure_without_error_string_still_counts_as_failure`.

Entire-Checkpoint: 7aff7471302c
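
A sketch of the fingerprint split described in this commit follows. The type name, the `success()` / `failure()` constructors, and the `failed` field come from the commit message; the remaining field names, the constructor signatures, and the `hash_str` helper are assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn hash_str(s: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    s.hash(&mut hasher);
    hasher.finish()
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ToolCallFingerprint {
    pub tool_name: String,
    pub args_hash: u64,
    /// None when the tool reported failure without any error text.
    pub error_hash: Option<u64>,
    /// Explicit failure flag, independent of whether an error string exists.
    pub failed: bool,
}

impl ToolCallFingerprint {
    pub fn success(tool_name: &str, args: &str) -> Self {
        Self {
            tool_name: tool_name.to_string(),
            args_hash: hash_str(args),
            error_hash: None,
            failed: false,
        }
    }

    pub fn failure(tool_name: &str, args: &str, error: Option<&str>) -> Self {
        Self {
            tool_name: tool_name.to_string(),
            args_hash: hash_str(args),
            error_hash: error.map(hash_str),
            failed: true,
        }
    }

    /// Previously derived from `error_hash.is_some()`; now reads the flag.
    pub fn is_failure(&self) -> bool {
        self.failed
    }
}
```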

greptile-apps bot reviewed Apr 11, 2026

Comment thread crates/agents/src/tool_loop_detector.rs

Comment thread crates/agents/src/tool_arg_validator.rs Outdated

Comment thread crates/agents/src/tool_loop_detector.rs Outdated

penso added 2 commits April 11, 2026 20:37

github-actions bot mentioned this pull request Apr 12, 2026

🦞 OpenClaw Ecosystem Daily 2026-04-12 gsscsd/big_model_radar#173

Open

penso merged commit 8828cc5 into main Apr 12, 2026
40 checks passed

penso deleted the evergreen-paper branch April 12, 2026 08:35

Ari4ka approved these changes Apr 12, 2026

penso merged 4 commits into main from evergreen-paper
