fix(chat): post llm loop callback #7309
Conversation
No issues found across 2 files
Greptile Overview
Greptile Summary
This PR attempts to fix a bug where chat state isn't saved to the database when the FastAPI stream stops prematurely. The fix moves the post-LLM logic (building final answer, collecting citations, saving to DB) into a completion_callback that runs in a finally block, ensuring it executes even if the stream halts.
Changes:
- Added a `completion_callback` parameter to `run_chat_loop_with_state_containers` that executes in the `finally` block after the LLM thread completes
- Extracted the post-LLM logic into a new function, `llm_loop_completion_handle`
- Created a callback, `llm_loop_completion_callback`, that invokes the handle function with the necessary context
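To make the described flow concrete, here is a minimal sketch of the callback-in-`finally` pattern. The function and parameter names mirror the summary above, but the body is purely illustrative and is not the actual implementation in `backend/onyx/chat/chat_state.py`:

```python
# Illustrative sketch only -- the real function runs the LLM loop in a
# background thread and emits typed packets; the point here is just where
# the completion callback fires.
from collections.abc import Callable, Iterator
from typing import Any


def run_chat_loop_with_state_containers(
    packet_stream: Iterator[Any],
    state_container: Any,
    completion_callback: Callable[[Any], None] | None = None,
) -> Iterator[Any]:
    try:
        # Forward packets from the LLM thread to the FastAPI response stream.
        yield from packet_stream
    finally:
        # Runs on normal completion, on exceptions, and when the client stops
        # consuming the stream (GeneratorExit), so post-LLM work such as
        # saving the chat turn is not silently skipped.
        if completion_callback is not None:
            completion_callback(state_container)
```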
Issues:
The implementation has a critical flaw in its exception handling. When an exception occurs in the LLM loop:
- The exception is caught and emitted as a `PacketException`
- It's re-raised in the main thread (line 179 of chat_state.py)
- The `finally` block executes the callback, which calls `save_chat_turn` and commits to the DB
- The exception continues propagating to `handle_stream_message_objects`
- Exception handlers there call `db_session.rollback()`, undoing the save
This means exceptions will cause the chat to be saved then immediately rolled back, defeating the purpose of the fix. The callback needs to only run on successful completion, or the exception handlers need to be aware that the callback has already run and avoid rolling back.
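One possible shape for the remedy the reviewer describes is to gate the callback on how the loop ended, so a propagating LLM exception does not trigger a commit that the outer handlers immediately roll back. This is a sketch under that assumption, not the PR's code:

```python
from collections.abc import Callable, Iterator
from typing import Any


def run_chat_loop_with_state_containers(
    packet_stream: Iterator[Any],
    state_container: Any,
    completion_callback: Callable[[Any], None] | None = None,
) -> Iterator[Any]:
    should_finalize = False
    try:
        yield from packet_stream
        should_finalize = True  # normal completion
    except GeneratorExit:
        # Client disconnect: persisting the partial turn is the point of the
        # fix, so still finalize before the generator is closed.
        should_finalize = True
        raise
    finally:
        # Skipped when an LLM exception is propagating, leaving the outer
        # handlers in handle_stream_message_objects free to rollback().
        if should_finalize and completion_callback is not None:
            completion_callback(state_container)
```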
Confidence Score: 1/5
- Not safe to merge - contains a critical database consistency bug in its exception handling
- The PR introduces a critical bug where exceptions cause the chat to be saved and then immediately rolled back. When an exception occurs in the LLM loop, the completion callback runs in the finally block (committing chat state to DB), but then the exception propagates to outer exception handlers that call db_session.rollback(). This creates database inconsistency and defeats the purpose of the fix. Additionally, the callback itself has no exception handling, which could mask original exceptions.
- backend/onyx/chat/chat_state.py (callback execution timing), backend/onyx/chat/process_message.py (exception handling in callback)
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| backend/onyx/chat/chat_state.py | 2/5 | Added completion_callback parameter that runs in finally block after LLM loop completes, enabling post-processing regardless of success/failure |
| backend/onyx/chat/process_message.py | 2/5 | Moved post-LLM logic into llm_loop_completion_handle callback to ensure chat is saved even if stream stops, but has exception handling issue with db rollback |
Sequence Diagram
sequenceDiagram
participant Client
participant handle_stream_message_objects
participant run_chat_loop_with_state_containers
participant Background Thread
participant Callback
participant DB
Client->>handle_stream_message_objects: Send message
handle_stream_message_objects->>run_chat_loop_with_state_containers: Start LLM loop
run_chat_loop_with_state_containers->>Background Thread: Launch LLM execution
alt Normal completion
Background Thread->>run_chat_loop_with_state_containers: Emit packets
run_chat_loop_with_state_containers->>Client: Yield packets
Background Thread->>run_chat_loop_with_state_containers: Emit OverallStop
Note over run_chat_loop_with_state_containers: Finally block executes
run_chat_loop_with_state_containers->>Callback: completion_callback(state_container)
Callback->>DB: save_chat_turn() + commit()
run_chat_loop_with_state_containers-->>handle_stream_message_objects: Return
else Exception in LLM
Background Thread->>Background Thread: Exception occurs
Background Thread->>run_chat_loop_with_state_containers: Emit PacketException
run_chat_loop_with_state_containers->>run_chat_loop_with_state_containers: Re-raise exception (line 179)
Note over run_chat_loop_with_state_containers: Finally block executes FIRST
run_chat_loop_with_state_containers->>Callback: completion_callback(state_container)
Callback->>DB: save_chat_turn() + commit()
Note over DB: Chat saved to DB
run_chat_loop_with_state_containers-->>handle_stream_message_objects: Exception propagates
handle_stream_message_objects->>DB: rollback()
Note over DB: CRITICAL: Chat save rolled back!
handle_stream_message_objects->>Client: Yield StreamingError
else User cancellation
Client->>run_chat_loop_with_state_containers: Disconnect
run_chat_loop_with_state_containers->>run_chat_loop_with_state_containers: Detect !is_connected()
Note over run_chat_loop_with_state_containers: Finally block executes
run_chat_loop_with_state_containers->>Callback: completion_callback(state_container)
Callback->>DB: save_chat_turn() with stop message + commit()
run_chat_loop_with_state_containers-->>handle_stream_message_objects: Return
end
if completed_normally:
    if state_container.answer_tokens is None:
        raise RuntimeError(
            "LLM run completed normally but did not return an answer."
        )
[P2] When the LLM completes normally but answer_tokens is None, this raises RuntimeError. However, since this runs in the completion callback (which executes in the finally block), this exception will propagate differently than the original design where it was caught by the exception handlers in handle_stream_message_objects. The error handling path has changed.
Prompt To Fix With AI
This is a comment left during a code review.
Path: backend/onyx/chat/process_message.py
Line: 678:682
Comment:
[P2] When the LLM completes normally but `answer_tokens` is None, this raises RuntimeError. However, since this runs in the completion callback (which executes in the `finally` block), this exception will propagate differently than the original design where it was caught by the exception handlers in `handle_stream_message_objects`. The error handling path has changed.
How can I resolve this? If you propose a fix, please make it concise.
This is handled in the layer that runs this func
Additional Comments (1)
Prompt To Fix With AI
This is a comment left during a code review.
Path: backend/onyx/chat/chat_state.py
Line: 195:200
Comment:
[P0] The callback runs before exceptions propagate, causing database inconsistency. When a `PacketException` is raised at line 179, the `finally` block executes the callback (which commits to DB), then the exception continues to `handle_stream_message_objects` where `db_session.rollback()` is called (lines 615, 662 in process_message.py). This rolls back the chat save that just happened. The callback should only run on successful completion, not when exceptions occur.
How can I resolve this? If you propose a fix, please make it concise.
evan-onyx left a comment
could you add an integration test for this before merging?
2 issues found across 2 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/tests/integration/common_utils/managers/chat.py">
<violation number="1" location="backend/tests/integration/common_utils/managers/chat.py:196">
P2: Docstring references non-existent parameter `disconnect_after_type`. This parameter is documented but not present in the function signature. Either add the parameter or remove the documentation.</violation>
<violation number="2" location="backend/tests/integration/common_utils/managers/chat.py:202">
P2: Docstring incorrectly states the function returns a `StreamedResponse`, but the function is annotated as `-> None` and returns `None`. Update the docstring to reflect the actual return type.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
message: The message to send
disconnect_after_packets: Disconnect after receiving this many packets.
    If None, disconnect_after_type must be specified.
disconnect_after_type: Disconnect after receiving a packet of this type
P2: Docstring references non-existent parameter disconnect_after_type. This parameter is documented but not present in the function signature. Either add the parameter or remove the documentation.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/tests/integration/common_utils/managers/chat.py, line 196:
<comment>Docstring references non-existent parameter `disconnect_after_type`. This parameter is documented but not present in the function signature. Either add the parameter or remove the documentation.</comment>
<file context>
@@ -164,6 +164,87 @@ def send_message(
+ message: The message to send
+ disconnect_after_packets: Disconnect after receiving this many packets.
+ If None, disconnect_after_type must be specified.
+ disconnect_after_type: Disconnect after receiving a packet of this type
+ (e.g., "message_start", "search_tool_start"). If None,
+ disconnect_after_packets must be specified.
</file context>
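One hedged way to reconcile the docstring with the signature is to add the documented parameter, as in the sketch below; alternatively, its docstring entry can simply be removed. The method and class shape here are assumptions for illustration, since the excerpt does not show the actual signature:

```python
class ChatSessionManager:  # excerpted shape only; the real class has many more members
    def send_message_and_disconnect(  # hypothetical name, not shown in the excerpt
        self,
        message: str,
        disconnect_after_packets: int | None = None,
        disconnect_after_type: str | None = None,  # add this parameter, or drop its docstring entry
    ) -> None:
        """Send a message, then simulate a client disconnect mid-stream.

        Args:
            message: The message to send
            disconnect_after_packets: Disconnect after receiving this many packets.
                If None, disconnect_after_type must be specified.
            disconnect_after_type: Disconnect after receiving a packet of this type
                (e.g., "message_start", "search_tool_start"). If None,
                disconnect_after_packets must be specified.
        """
```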
StreamedResponse containing data received before disconnect,
    with is_disconnected=True flag set.
P2: Docstring incorrectly states the function returns a StreamedResponse, but the function is annotated as -> None and returns None. Update the docstring to reflect the actual return type.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/tests/integration/common_utils/managers/chat.py, line 202:
<comment>Docstring incorrectly states the function returns a `StreamedResponse`, but the function is annotated as `-> None` and returns `None`. Update the docstring to reflect the actual return type.</comment>
<file context>
@@ -164,6 +164,87 @@ def send_message(
+ ... (other standard message parameters)
+
+ Returns:
+ StreamedResponse containing data received before disconnect,
+ with is_disconnected=True flag set.
+ """
</file context>
Suggested change:
- StreamedResponse containing data received before disconnect,
-     with is_disconnected=True flag set.
+ None. This method simulates a disconnect and does not return response data.
Description
There is a bug where, if the FastAPI client stops consuming the stream, the flow halts on a 'yield from' and never progresses. When this happens, the code after it does not run, including the logic that saves the chat to the database.
This PR moves that logic into a callback that runs at the conclusion of the LLM thread. It does somewhat implicitly rely on the magic of the 'finally' block, which is guaranteed (afaik) to run, but the placement still makes semantic sense in this instance.
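For readers unfamiliar with the generator behaviour being leaned on here: when the consumer stops iterating, Python closes the generator by raising GeneratorExit at the suspended yield, and the generator's `finally` clause still runs. A self-contained illustration (not project code):

```python
def stream_packets():
    try:
        for i in range(100):
            yield i  # execution can halt here if the client stops reading
    finally:
        # Still executes when the generator is closed mid-stream, which is
        # what makes it a safe place to persist chat state.
        print("finally ran")


gen = stream_packets()
print(next(gen))  # client reads one packet: prints 0
gen.close()       # client disconnects: prints "finally ran"
```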
How Has This Been Tested?
Manually tested cases including sending messages and refreshing mid-stream. Also tested the stop-message scenario.
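A rough shape for the integration test evan-onyx requested, built on the mid-stream disconnect helper described in the review comments above. The manager and helper names, signatures, and import path here are assumptions for illustration, not confirmed APIs:

```python
# Assumed import path for the integration-test manager referenced in this
# review (backend/tests/integration/common_utils/managers/chat.py).
from tests.integration.common_utils.managers.chat import ChatSessionManager


def test_chat_turn_saved_after_mid_stream_disconnect(admin_user) -> None:
    # Hypothetical sketch: the ChatSessionManager helpers and their signatures
    # are assumed, based only on the docstrings quoted in this review.
    chat_session = ChatSessionManager.create(user_performing_action=admin_user)

    # Stream a message but drop the connection after a few packets, simulating
    # the FastAPI client that stops yielding from the stream.
    ChatSessionManager.send_message_and_disconnect(
        chat_session_id=chat_session.id,
        message="tell me a long story",
        disconnect_after_packets=3,
        user_performing_action=admin_user,
    )

    # If the completion callback ran, the (partial) turn should be visible on
    # a fresh read of the session history rather than having been lost.
    history = ChatSessionManager.get_chat_history(
        chat_session=chat_session,
        user_performing_action=admin_user,
    )
    assert len(history) > 0, "chat turn was not persisted after disconnect"
```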
Summary by cubic
Fixes a bug where the chat flow could stall if the FastAPI stream stopped, causing the post-run logic (including DB save) to be skipped. Moves post-LLM work into a completion callback so the chat turn is always finalized and saved.
Written for commit 05b56df.