
Conversation

@Danelegend Danelegend (Contributor) commented Jan 9, 2026

Description

There is a bug where, if we refresh while streaming, the connection is lost and FastAPI stops consuming results from the streaming object. This leaves message processing in the 'handle_stream_message_objects' function blocked (halting) on the yield from. As a result, we never continue down the control flow and never end up saving the finished chat in the database.

This PR spins the message creation/processing logic into its own producer thread and puts the results into a buffer that the consumer can stream to the client; a minimal sketch of the flow follows.

How Has This Been Tested?

Manually tested that chats still stream

Additional Options

  • [Optional] Override Linear Check

Summary by cubic

Fixes chat streaming hangs on refresh by moving message creation to a background producer thread and streaming from an async buffer. Ensures chats finish processing and are saved even if the SSE connection drops.

  • Bug Fixes
    • Converted send-chat-message handler to async and implemented a producer–consumer flow with an asyncio queue.
    • Moved message processing into a background thread; the consumer streams buffered events to the client.
    • Captured headers before threading, emitted error events, and added an explicit end-of-stream signal to prevent blocking.

Written for commit 5848975. Summary will update on new commits.

@Danelegend Danelegend requested a review from a team as a code owner January 9, 2026 01:28
@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 1 file

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="backend/onyx/server/query_and_chat/chat_backend.py">

<violation number="1" location="backend/onyx/server/query_and_chat/chat_backend.py:580">
P2: Use `asyncio.get_running_loop()` instead of `asyncio.get_event_loop()`. In Python 3.10+, `get_event_loop()` is deprecated inside async functions. Since this is an async endpoint with a running event loop, `get_running_loop()` is the recommended and explicit approach.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Greptile Summary

Fixes a bug where refreshing during streaming causes message processing to halt by implementing a producer-consumer pattern using asyncio.Queue and a background thread. The producer thread continues processing messages even if the client connection drops, ensuring chat messages are saved to the database. Headers are captured before thread creation to preserve request context across the thread boundary.

Confidence Score: 2/5

  • Moderate risk - fixes the refresh bug but introduces resource management issues with orphaned background threads
  • The PR successfully addresses the refresh bug by decoupling message processing from client streaming using a producer-consumer pattern. However, there's a critical issue: if the consumer fails or terminates early (e.g., client disconnect), the producer thread continues running indefinitely with no cleanup mechanism. The background thread reference isn't stored, making it impossible to clean up. Additionally, the unbounded queue could lead to memory issues, and there's temporary debug logging that should be removed. These resource management concerns warrant careful testing before merge.
  • backend/onyx/server/query_and_chat/chat_backend.py requires attention for thread lifecycle management and error handling between producer and consumer

Important Files Changed

File Analysis

Filename: backend/onyx/server/query_and_chat/chat_backend.py
Score: 2/5
Overview: Implements a producer-consumer pattern for streaming chat messages using asyncio.Queue and a background thread; missing error propagation and consumer cancellation handling.

Sequence Diagram

sequenceDiagram
    participant Client
    participant FastAPI as FastAPI Handler<br/>(async)
    participant Consumer as stream_from_buffer()<br/>(async generator)
    participant Queue as asyncio.Queue
    participant Producer as producer()<br/>(background thread)
    participant DB as handle_stream_message_objects

    Client->>FastAPI: POST /send-chat-message
    FastAPI->>FastAPI: Create asyncio.Queue
    FastAPI->>FastAPI: Capture request headers
    FastAPI->>Producer: run_in_background(producer)
    activate Producer
    FastAPI->>Consumer: StreamingResponse(stream_from_buffer())
    activate Consumer
    FastAPI-->>Client: Start SSE stream
    
    Producer->>DB: Iterate handle_stream_message_objects()
    activate DB
    
    loop For each message object
        DB-->>Producer: yield message object
        Producer->>Queue: loop.call_soon_threadsafe(put_nowait)
        Queue-->>Consumer: await buffer.get()
        Consumer-->>Client: yield JSON line
    end
    
    DB-->>Producer: Iteration complete
    deactivate DB
    Producer->>Queue: put_nowait(None) - end signal
    deactivate Producer
    
    Queue-->>Consumer: await buffer.get() returns None
    Consumer-->>Client: Close stream
    deactivate Consumer
    
    Note over Producer,Consumer: If client disconnects early,<br/>consumer stops but producer<br/>continues processing


# Start the producer in a background thread
run_in_background(producer)

return StreamingResponse(stream_generator(), media_type="text/event-stream")
greptile-apps bot (Contributor) commented:

The producer thread continues running even if the client disconnects, which is the intended fix for the refresh bug. However, if the consumer (stream_from_buffer) terminates early (e.g., client disconnect, error), the producer thread has no way to know and will keep processing. This could lead to wasted resources processing messages nobody will receive.

Consider tracking the consumer's state and providing a way for the producer to check if it should stop early, or use a cancellation token pattern.

@Danelegend Danelegend (Contributor Author) replied:

We should only stop if the user presses the stop button. There is another mechanism that is applied that takes care of this.

Comment on lines 616 to 629
async def stream_from_buffer() -> AsyncGenerator[str, None]:
    """
    Async generator that reads from the buffer and yields to the client.
    """
    try:
        while True:
            # Await the next item from the buffer
            item = await buffer.get()
            if item is None:
                # End of stream signal
                break
            yield item
    finally:
        logger.debug("Stream consumer finished")
greptile-apps bot (Contributor) commented:

If stream_from_buffer() raises an exception (e.g., during buffer.get()), the consumer terminates but the producer thread continues running in the background. The producer will keep writing to the buffer indefinitely, wasting resources. The background thread started by run_in_background has no reference stored and cannot be cleaned up.

Consider storing the thread reference and implementing cleanup logic to handle consumer failures, or add try-except around the consumer to signal the producer to stop.

@Danelegend Danelegend (Contributor Author) replied Jan 9, 2026:

We don't want to stop the producer. The producer should complete its job.

Comment on lines 579 to 580
buffer: asyncio.Queue[str | None] = asyncio.Queue()
loop = asyncio.get_event_loop()
greptile-apps bot (Contributor) commented:

The asyncio.Queue is created without a maxsize limit, which means the producer can fill the queue indefinitely if it produces faster than the consumer can consume. If the consumer is blocked or slow, this could lead to unbounded memory growth.

Consider adding a reasonable maxsize parameter (e.g., asyncio.Queue(maxsize=100)) to apply backpressure and prevent memory issues.

@Danelegend Danelegend (Contributor Author) replied Jan 9, 2026:

Not expecting this to be a limitation

Comment on lines +604 to +606
loop.call_soon_threadsafe(
    buffer.put_nowait, get_json_line(obj.model_dump())
)
greptile-apps bot (Contributor) commented:

If the asyncio queue is full (when a maxsize is set), put_nowait will raise asyncio.QueueFull. Because the call is scheduled with call_soon_threadsafe, the exception would surface in the event loop's exception handler and the event would be silently dropped; the current code doesn't handle this case.

While the queue is currently unbounded, if backpressure is added in the future, this needs proper error handling.

@Danelegend Danelegend (Contributor Author) replied Jan 9, 2026:

There is currently no maxsize set, and for now we don't expect to add one.

@Danelegend Danelegend closed this Jan 10, 2026