Stabilize checkpoint diffs for sparse turns and improve unavailable-diff UX#1078
Stabilize checkpoint diffs for sparse turns and improve unavailable-diff UX#1078Meetpatel006 wants to merge 3 commits intopingdotgg:mainfrom
Conversation
- tolerate sparse checkpoint turn counts in diff queries - keep completed turn captures from being dropped on activeTurnId mismatch - dedupe repeated provider diff placeholder events per turn - retry checkpoint-ref unavailable errors in the web diff query - add focused regression coverage for each path
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip You can make CodeRabbit's review stricter and more nitpicky using the `assertive` profile, if that's what you prefer.Change the |
juliusmarminge
left a comment
There was a problem hiding this comment.
the root cause of this issue is that we're not handling subagents very well yet (they emit unique turn ids so a single user-assistant pass emits multiple turns but we are not capturing these), and codex is constantly 'improving' them. checkpointing worked flawlessly before we shipped, never had issues.
This seems like another bandaid to me(?), I was working on a real and proper fix earlier this week but got distracted.
Really like the UI improvements though!
Thanks for the context on the subagent root cause that's genuinely helpful to have documented. I dug into the codex-rs app-server some of the source to understand the event model better before putting this together. A few things that stood out: The subagent fan-out case is documented in the app-server model too spawn/collab flows carry sender/receiver thread relationships, meaning one user-visible exchange can fan out into multiple turns across multiple threads. So you're correct that the root fix needs to be thread-aware first, then turn-aware, not flattened. This PR doesn't go there. That said, I'd push back slightly on "bandaid" for the PR. The surface-area is really two separate things:
Happy to split the PR if that unblocks review UI changes fast-tracked, structural mitigations held until your fix lands. Or if you want to share what you had in mind for the subagent fix, I can rebase onto that approach instead. |
What Changed
This PR fixes checkpoint diff behavior for sparse turn histories and improves Diff Panel handling when checkpoints are unavailable.
turn.diff.updated) for the same thread+turn.activeTurnIdpointed to a different turn, so completed turns are still captured.Why
Users could hit unreliable or confusing diff behavior:
This approach keeps behavior predictable under partial/missing checkpoint data, avoids duplicate placeholder writes, and provides clearer UI feedback without expanding scope beyond bug fixes.
UI Changes
Before
After
Checklist
Note
Stabilize checkpoint diffs for sparse turns and improve unavailable-diff UX
getTurnDiffin CheckpointDiffQuery.ts now resolves to the nearest checkpoint at or before the requested turn count instead of throwing when an exact match is missing.activeTurnIdreferences a different turn.turn.diff.updatedevents for the same turn only enqueue one placeholder.isCheckpointTemporarilyUnavailablenow treats'checkpoint ref is unavailable for turn'as a retriable condition.Macroscope summarized b0fa258.