-
Notifications
You must be signed in to change notification settings - Fork 5.8k
Centralize generating history #5188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@codex review this |
Codex Review: Didn't find any major issues. Can't wait for the next one! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting
pub(crate) fn prepare_prompt_input( | ||
&mut self, | ||
task_kind: TaskKind, | ||
pending_input: Vec<ResponseItem>, | ||
) -> Vec<ResponseItem> { | ||
if !pending_input.is_empty() { | ||
self.history.add_pending_input(pending_input, task_kind); | ||
} | ||
self.history.handle_missing_tool_call_output(task_kind); | ||
self.history.prompt(task_kind) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Insert synthetic tool outputs before new user input
Missing tool-call responses are now appended after pending user messages in prepare_prompt_input
. prepare_prompt_input
first writes any pending_input
into the history (lines 58‑65) and only afterwards calls handle_missing_tool_call_output
(lines 66‑67), which records the synthetic CustomToolCallOutput { call_id, output: "aborted" }
items at the end of the history (conversation_history.rs lines 81‑125). When a user interrupts a tool call and immediately sends follow‑up input, the resulting history is [ …, FunctionCall, <user follow‑up>, CustomToolCallOutput "aborted" ]
, so the model sees the follow‑up message before it ever receives the synthetic output that resolves the prior call id. This defeats the purpose of guaranteeing that the conversation shows completed tool calls and can lead the model to believe the previous call is still pending. The synthetic output should be inserted before the new user input (or immediately after the corresponding call) so the call appears resolved when the next user message is processed.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good comment
self.record_items(pending_input, task_kind); | ||
} | ||
|
||
pub(crate) fn handle_missing_tool_call_output(&mut self, task_kind: TaskKind) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't love the task_kind: TaskKind that we have to pass around and fork based on.
Can we toss the ConversationHistory onto TaskContext and have Review create it's own ConversationHistory destination that we discard?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea. I was thinking of introducing threads
also for something like that. will help with our general direction. wdut?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would that help solve? Is there existing layer we can user to solve the same thing?
self.record_items(pending_input, task_kind); | ||
} | ||
|
||
pub(crate) fn handle_missing_tool_call_output(&mut self, task_kind: TaskKind) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is a terminal workaround for buggy logic that doesn't add correct tool result or doesn't filter items correctly. Can we find ways to surface the original issue without breaking users?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I know the issue will fix it in a follow up PR. However, I think it's a good guardrail to have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree but how do we help devs discover the original issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with @pakrym-oai
} | ||
|
||
self.items.push(item.clone()); | ||
match task_kind { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks a bit odd and not that much scalable
What if we have other tasks kind that also require dedicated history?
Instead I would stop a HashMap<String, Vec> and then you impl TaskKing to have a function key
And just make sure that key
of Regular and Compact are the same
self.items = items; | ||
} | ||
|
||
pub(crate) fn initialize_review_history( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment, we should have something as generic as possible here and avoid dedicated functions for each task kind
self.record_items(pending_input, task_kind); | ||
} | ||
|
||
pub(crate) fn handle_missing_tool_call_output(&mut self, task_kind: TaskKind) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with @pakrym-oai
618ca43
to
b8fda27
Compare
b8fda27
to
79f4124
Compare
This PR moves the logic for generating the context history and for adding an input into
ConversationHistory
. It adds guarantees around missing tool calls and handles them more gracefully, without breaking the cache.