fix(chat): post llm loop callback #7309
Conversation
No issues found across 2 files
Greptile Overview
Greptile Summary
This PR attempts to fix a bug where chat state isn't saved to the database when the FastAPI stream stops prematurely. The fix moves the post-LLM logic (building final answer, collecting citations, saving to DB) into a completion_callback that runs in a finally block, ensuring it executes even if the stream halts.
Changes:
- Added a `completion_callback` parameter to `run_chat_loop_with_state_containers` that executes in the `finally` block after the LLM thread completes
- Extracted the post-LLM logic into a new function, `llm_loop_completion_handle`
- Created a callback, `llm_loop_completion_callback`, that invokes the handle function with the necessary context
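To make the described flow concrete, here is a minimal sketch of the callback-in-`finally` pattern. The function and parameter names mirror the summary above, but the body is purely illustrative and is not the actual implementation in `backend/onyx/chat/chat_state.py`:

```python
# Illustrative sketch only -- the real function runs the LLM loop in a
# background thread and emits typed packets; the point here is just where
# the completion callback fires.
from collections.abc import Callable, Iterator
from typing import Any


def run_chat_loop_with_state_containers(
    packet_stream: Iterator[Any],
    state_container: Any,
    completion_callback: Callable[[Any], None] | None = None,
) -> Iterator[Any]:
    try:
        # Forward packets from the LLM thread to the FastAPI response stream.
        yield from packet_stream
    finally:
        # Runs on normal completion, on exceptions, and when the client stops
        # consuming the stream (GeneratorExit), so post-LLM work such as
        # saving the chat turn is not silently skipped.
        if completion_callback is not None:
            completion_callback(state_container)
```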
Issues:
The implementation has a critical flaw in its exception handling. When an exception occurs in the LLM loop:
- The exception is caught and emitted as a `PacketException`
- It's re-raised in the main thread (line 179 of chat_state.py)
- The `finally` block executes the callback, which calls `save_chat_turn` and commits to the DB
- The exception continues propagating to `handle_stream_message_objects`
- Exception handlers there call `db_session.rollback()`, undoing the save
This means exceptions will cause the chat to be saved then immediately rolled back, defeating the purpose of the fix. The callback needs to only run on successful completion, or the exception handlers need to be aware that the callback has already run and avoid rolling back.
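One possible shape for the remedy the reviewer describes is to gate the callback on how the loop ended, so a propagating LLM exception does not trigger a commit that the outer handlers immediately roll back. This is a sketch under that assumption, not the PR's code:

```python
from collections.abc import Callable, Iterator
from typing import Any


def run_chat_loop_with_state_containers(
    packet_stream: Iterator[Any],
    state_container: Any,
    completion_callback: Callable[[Any], None] | None = None,
) -> Iterator[Any]:
    should_finalize = False
    try:
        yield from packet_stream
        should_finalize = True  # normal completion
    except GeneratorExit:
        # Client disconnect: persisting the partial turn is the point of the
        # fix, so still finalize before the generator is closed.
        should_finalize = True
        raise
    finally:
        # Skipped when an LLM exception is propagating, leaving the outer
        # handlers in handle_stream_message_objects free to rollback().
        if should_finalize and completion_callback is not None:
            completion_callback(state_container)
```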
Confidence Score: 1/5
- Not safe to merge - contains a critical database consistency bug in its exception handling
- The PR introduces a critical bug where exceptions cause the chat to be saved and then immediately rolled back. When an exception occurs in the LLM loop, the completion callback runs in the finally block (committing chat state to DB), but then the exception propagates to outer exception handlers that call db_session.rollback(). This creates database inconsistency and defeats the purpose of the fix. Additionally, the callback itself has no exception handling, which could mask original exceptions.
- backend/onyx/chat/chat_state.py (callback execution timing), backend/onyx/chat/process_message.py (exception handling in callback)
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| backend/onyx/chat/chat_state.py | 2/5 | Added completion_callback parameter that runs in finally block after LLM loop completes, enabling post-processing regardless of success/failure |
| backend/onyx/chat/process_message.py | 2/5 | Moved post-LLM logic into llm_loop_completion_handle callback to ensure chat is saved even if stream stops, but has exception handling issue with db rollback |
Sequence Diagram
sequenceDiagram
participant Client
participant handle_stream_message_objects
participant run_chat_loop_with_state_containers
participant Background Thread
participant Callback
participant DB
Client->>handle_stream_message_objects: Send message
handle_stream_message_objects->>run_chat_loop_with_state_containers: Start LLM loop
run_chat_loop_with_state_containers->>Background Thread: Launch LLM execution
alt Normal completion
Background Thread->>run_chat_loop_with_state_containers: Emit packets
run_chat_loop_with_state_containers->>Client: Yield packets
Background Thread->>run_chat_loop_with_state_containers: Emit OverallStop
Note over run_chat_loop_with_state_containers: Finally block executes
run_chat_loop_with_state_containers->>Callback: completion_callback(state_container)
Callback->>DB: save_chat_turn() + commit()
run_chat_loop_with_state_containers-->>handle_stream_message_objects: Return
else Exception in LLM
Background Thread->>Background Thread: Exception occurs
Background Thread->>run_chat_loop_with_state_containers: Emit PacketException
run_chat_loop_with_state_containers->>run_chat_loop_with_state_containers: Re-raise exception (line 179)
Note over run_chat_loop_with_state_containers: Finally block executes FIRST
run_chat_loop_with_state_containers->>Callback: completion_callback(state_container)
Callback->>DB: save_chat_turn() + commit()
Note over DB: Chat saved to DB
run_chat_loop_with_state_containers-->>handle_stream_message_objects: Exception propagates
handle_stream_message_objects->>DB: rollback()
Note over DB: CRITICAL: Chat save rolled back!
handle_stream_message_objects->>Client: Yield StreamingError
else User cancellation
Client->>run_chat_loop_with_state_containers: Disconnect
run_chat_loop_with_state_containers->>run_chat_loop_with_state_containers: Detect !is_connected()
Note over run_chat_loop_with_state_containers: Finally block executes
run_chat_loop_with_state_containers->>Callback: completion_callback(state_container)
Callback->>DB: save_chat_turn() with stop message + commit()
run_chat_loop_with_state_containers-->>handle_stream_message_objects: Return
end
if completed_normally:
    if state_container.answer_tokens is None:
        raise RuntimeError(
            "LLM run completed normally but did not return an answer."
        )
[P2] When the LLM completes normally but answer_tokens is None, this raises RuntimeError. However, since this runs in the completion callback (which executes in the finally block), this exception will propagate differently than the original design where it was caught by the exception handlers in handle_stream_message_objects. The error handling path has changed.
Prompt To Fix With AI
This is a comment left during a code review.
Path: backend/onyx/chat/process_message.py
Line: 678:682
Comment:
[P2] When the LLM completes normally but `answer_tokens` is None, this raises RuntimeError. However, since this runs in the completion callback (which executes in the `finally` block), this exception will propagate differently than the original design where it was caught by the exception handlers in `handle_stream_message_objects`. The error handling path has changed.
How can I resolve this? If you propose a fix, please make it concise.
This is handled in the layer that runs this func
Additional Comments (1)
Prompt To Fix With AI
This is a comment left during a code review.
Path: backend/onyx/chat/chat_state.py
Line: 195:200
Comment:
[P0] The callback runs before exceptions propagate, causing database inconsistency. When a `PacketException` is raised at line 179, the `finally` block executes the callback (which commits to DB), then the exception continues to `handle_stream_message_objects` where `db_session.rollback()` is called (lines 615, 662 in process_message.py). This rolls back the chat save that just happened. The callback should only run on successful completion, not when exceptions occur.
How can I resolve this? If you propose a fix, please make it concise.
evan-onyx left a comment
could you add an integration test for this before merging?
2 issues found across 2 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/tests/integration/common_utils/managers/chat.py">
<violation number="1" location="backend/tests/integration/common_utils/managers/chat.py:196">
P2: Docstring references non-existent parameter `disconnect_after_type`. This parameter is documented but not present in the function signature. Either add the parameter or remove the documentation.</violation>
<violation number="2" location="backend/tests/integration/common_utils/managers/chat.py:202">
P2: Docstring incorrectly states the function returns a `StreamedResponse`, but the function is annotated as `-> None` and returns `None`. Update the docstring to reflect the actual return type.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
message: The message to send
disconnect_after_packets: Disconnect after receiving this many packets.
    If None, disconnect_after_type must be specified.
disconnect_after_type: Disconnect after receiving a packet of this type
P2: Docstring references non-existent parameter disconnect_after_type. This parameter is documented but not present in the function signature. Either add the parameter or remove the documentation.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/tests/integration/common_utils/managers/chat.py, line 196:
<comment>Docstring references non-existent parameter `disconnect_after_type`. This parameter is documented but not present in the function signature. Either add the parameter or remove the documentation.</comment>
<file context>
@@ -164,6 +164,87 @@ def send_message(
+ message: The message to send
+ disconnect_after_packets: Disconnect after receiving this many packets.
+ If None, disconnect_after_type must be specified.
+ disconnect_after_type: Disconnect after receiving a packet of this type
+ (e.g., "message_start", "search_tool_start"). If None,
+ disconnect_after_packets must be specified.
</file context>
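One hedged way to reconcile the docstring with the signature is to add the documented parameter, as in the sketch below; alternatively, its docstring entry can simply be removed. The method and class shape here are assumptions for illustration, since the excerpt does not show the actual signature:

```python
class ChatSessionManager:  # excerpted shape only; the real class has many more members
    def send_message_and_disconnect(  # hypothetical name, not shown in the excerpt
        self,
        message: str,
        disconnect_after_packets: int | None = None,
        disconnect_after_type: str | None = None,  # add this parameter, or drop its docstring entry
    ) -> None:
        """Send a message, then simulate a client disconnect mid-stream.

        Args:
            message: The message to send
            disconnect_after_packets: Disconnect after receiving this many packets.
                If None, disconnect_after_type must be specified.
            disconnect_after_type: Disconnect after receiving a packet of this type
                (e.g., "message_start", "search_tool_start"). If None,
                disconnect_after_packets must be specified.
        """
```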
StreamedResponse containing data received before disconnect,
    with is_disconnected=True flag set.
P2: Docstring incorrectly states the function returns a StreamedResponse, but the function is annotated as -> None and returns None. Update the docstring to reflect the actual return type.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/tests/integration/common_utils/managers/chat.py, line 202:
<comment>Docstring incorrectly states the function returns a `StreamedResponse`, but the function is annotated as `-> None` and returns `None`. Update the docstring to reflect the actual return type.</comment>
<file context>
@@ -164,6 +164,87 @@ def send_message(
+ ... (other standard message parameters)
+
+ Returns:
+ StreamedResponse containing data received before disconnect,
+ with is_disconnected=True flag set.
+ """
</file context>
Suggested change:
- StreamedResponse containing data received before disconnect,
-     with is_disconnected=True flag set.
+ None. This method simulates a disconnect and does not return response data.
Description
There is a bug where, if the FastAPI client stops consuming the stream, the flow halts on a 'yield from' and never progresses. When this happens, the code after it does not run, including the logic that saves the chat to the database.
This PR moves that logic into a callback that runs at the conclusion of the LLM thread. It does somewhat implicitly rely on the magic of the 'finally' block, which is guaranteed (afaik) to run, but the placement still makes semantic sense in this instance.
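For readers unfamiliar with the generator behaviour being leaned on here: when the consumer stops iterating, Python closes the generator by raising GeneratorExit at the suspended yield, and the generator's `finally` clause still runs. A self-contained illustration (not project code):

```python
def stream_packets():
    try:
        for i in range(100):
            yield i  # execution can halt here if the client stops reading
    finally:
        # Still executes when the generator is closed mid-stream, which is
        # what makes it a safe place to persist chat state.
        print("finally ran")


gen = stream_packets()
print(next(gen))  # client reads one packet: prints 0
gen.close()       # client disconnects: prints "finally ran"
```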
How Has This Been Tested?
Manually tested cases including sending messages and refreshing mid-stream. Also tested the stop-message scenario.
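A rough shape for the integration test evan-onyx requested, built on the mid-stream disconnect helper described in the review comments above. The manager and helper names, signatures, and import path here are assumptions for illustration, not confirmed APIs:

```python
# Assumed import path for the integration-test manager referenced in this
# review (backend/tests/integration/common_utils/managers/chat.py).
from tests.integration.common_utils.managers.chat import ChatSessionManager


def test_chat_turn_saved_after_mid_stream_disconnect(admin_user) -> None:
    # Hypothetical sketch: the ChatSessionManager helpers and their signatures
    # are assumed, based only on the docstrings quoted in this review.
    chat_session = ChatSessionManager.create(user_performing_action=admin_user)

    # Stream a message but drop the connection after a few packets, simulating
    # the FastAPI client that stops yielding from the stream.
    ChatSessionManager.send_message_and_disconnect(
        chat_session_id=chat_session.id,
        message="tell me a long story",
        disconnect_after_packets=3,
        user_performing_action=admin_user,
    )

    # If the completion callback ran, the (partial) turn should be visible on
    # a fresh read of the session history rather than having been lost.
    history = ChatSessionManager.get_chat_history(
        chat_session=chat_session,
        user_performing_action=admin_user,
    )
    assert len(history) > 0, "chat turn was not persisted after disconnect"
```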
Summary by cubic
Fixes a bug where the chat flow could stall if the FastAPI stream stopped, causing the post-run logic (including DB save) to be skipped. Moves post-LLM work into a completion callback so the chat turn is always finalized and saved.
Written for commit 05b56df.