No overall wall-clock timeout in MCP reflect: calls may hang for up to 40 minutes #642

@ThePlenkov

Bug Description

The MCP reflect tool and the layers beneath it have no overall wall-clock timeout. A reflect call can run up to 20 iterations (for the "high" budget), and each iteration can take up to the LLM timeout (default 120s), so in the worst case a single call can run for 20 × 120s = 40 minutes. The client (e.g., Claude Desktop) appears to hang, or times out long before the server finishes. No overall timeout exists in mcp_tools.py, memory_engine.py, or the related HTTP handler.
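The 40-minute figure follows directly from the two numbers quoted above. A small sketch of the arithmetic, assuming one LLM call per iteration (the constant names are illustrative, not identifiers from the codebase):

```python
# Worst-case wall time for one reflect call, using the figures from this report.
# Assumption: each iteration makes one LLM call that can run to its full timeout.
MAX_ITERATIONS_HIGH = 20   # iteration cap for the "high" budget
LLM_TIMEOUT_SECONDS = 120  # default per-LLM-call timeout

worst_case_seconds = MAX_ITERATIONS_HIGH * LLM_TIMEOUT_SECONDS
worst_case_minutes = worst_case_seconds / 60

print(f"{worst_case_seconds} s = {worst_case_minutes:.0f} min")  # 2400 s = 40 min
```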

Steps to Reproduce

  1. Call the MCP reflect endpoint with a query complex enough to cause long-running LLM calls or a high iteration count
  2. Do not set a very small timeout or budget limit
  3. Observe that the client hangs or times out well before the server completes

Expected Behavior

The MCP or API layer should enforce a timeout that aborts the operation and frees resources when the total reflect process exceeds a reasonable wall-clock time (e.g., 2–5 minutes).
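One way this could look at the MCP tool level is a thin `asyncio.wait_for` wrapper around the reflect coroutine. This is only a sketch: `reflect_with_timeout`, `REFLECT_WALL_CLOCK_TIMEOUT`, the returned dict shape, and the `slow_reflect`/`fast_reflect` stand-ins are all hypothetical and do not come from the project.

```python
import asyncio

REFLECT_WALL_CLOCK_TIMEOUT = 180.0  # hypothetical default, in seconds


async def reflect_with_timeout(reflect_call, timeout=REFLECT_WALL_CLOCK_TIMEOUT):
    """Await `reflect_call` (a coroutine), cancelling it if it exceeds `timeout`."""
    try:
        return await asyncio.wait_for(reflect_call, timeout=timeout)
    except asyncio.TimeoutError:
        # wait_for has already cancelled the inner coroutine at this point.
        return {"status": "timeout", "timeout_seconds": timeout}


async def slow_reflect():
    # Stand-in for a long-running reflect; not the real implementation.
    await asyncio.sleep(10)
    return {"status": "ok"}


async def fast_reflect():
    # Stand-in for a reflect that finishes well within the limit.
    await asyncio.sleep(0)
    return {"status": "ok"}
```

For example, `asyncio.run(reflect_with_timeout(slow_reflect(), timeout=0.05))` returns the timeout result instead of blocking for the full ten seconds. Whether a timeout should surface as an error or as a partial answer is a design decision this sketch leaves open.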

Actual Behavior

The call keeps running until it finishes (up to 40 minutes for the high budget) or until the client times out or cancels. There is no global timeout guard; only the per-LLM-call timeouts bound the operation. The locations where an overall timeout should be managed are:

  • mcp_tools.py (no asyncio.wait_for wrapper)
  • memory_engine.py/reflect_async (no overall timeout)
  • agent.py/run_reflect_agent (looped up to budgeted max_iterations)
  • api/http.py (no timeout added on endpoint)

This leads to a poor user experience for long or complex queries.
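Besides an outer wrapper, the agent loop itself could track a deadline so that no single LLM call is given more time than the overall budget has left. The sketch below assumes a hypothetical coroutine function `step(i)` that stands in for one agent iteration and returns `True` when the agent is done; `run_with_deadline` and its result dicts are illustrative and not the project's API. For simplicity, a per-call timeout also ends the run here.

```python
import asyncio
import time


async def run_with_deadline(step, max_iterations, per_call_timeout, overall_timeout):
    """Run `step(i)` up to `max_iterations` times without exceeding `overall_timeout`."""
    deadline = time.monotonic() + overall_timeout
    for i in range(max_iterations):
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            # Overall budget exhausted before this iteration could start.
            return {"status": "timeout", "iterations": i}
        try:
            # Cap each call by whichever is smaller: its own timeout or the time left.
            done = await asyncio.wait_for(
                step(i), timeout=min(per_call_timeout, remaining)
            )
        except asyncio.TimeoutError:
            return {"status": "timeout", "iterations": i}
        if done:
            return {"status": "ok", "iterations": i + 1}
    return {"status": "max_iterations", "iterations": max_iterations}
```

With this shape, the worst case is bounded by `overall_timeout` rather than by `max_iterations * per_call_timeout`.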

Version

No response

LLM Provider

No response

Labels: bug