@ReyNeill commented Dec 29, 2025

Problem

When auto-compaction triggers mid-implementation, the summary plus preserved user messages can cause the model to re-check the entire session instead of continuing the current task.

Solution

Refocus the compaction prompt and summary prefix so the summary targets only the active task and explicitly avoids re-opening completed work unless the user asks.

Notes / Future options

  • Keep only the most recent N user messages after compaction
  • Store both an active-task summary and a full-session summary, reintroducing the latter only if needed
  • Use retrieval for prior context instead of always re-injecting full history
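The first two options above can be sketched together as follows. This is a minimal illustration only, not code from codex-rs and not part of this PR: the `Message` and `CompactedHistory` types, the `compact` and `build_context` helpers, the `"summary"` role, and the `keep_last_n` parameter are all hypothetical names chosen for the example, and the two summaries are assumed to have been produced already by the model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    role: str  # "user", "assistant", or "summary" (hypothetical roles)
    text: str


@dataclass
class CompactedHistory:
    active_task_summary: str    # always re-injected after compaction
    full_session_summary: str   # stored, re-injected only when asked for
    recent_user_messages: List[Message]


def compact(history: List[Message],
            active_task_summary: str,
            full_session_summary: str,
            keep_last_n: int = 2) -> CompactedHistory:
    """Keep both summaries but only the last `keep_last_n` user messages."""
    user_messages = [m for m in history if m.role == "user"]
    # Guard against keep_last_n == 0, because a [-0:] slice keeps everything.
    recent = user_messages[-keep_last_n:] if keep_last_n > 0 else []
    return CompactedHistory(active_task_summary, full_session_summary, recent)


def build_context(compacted: CompactedHistory,
                  include_full_session: bool = False) -> List[Message]:
    """Rebuild the post-compaction context, focused on the active task."""
    context: List[Message] = []
    if include_full_session:
        context.append(Message(
            "summary",
            "Full-session summary (reference only): "
            + compacted.full_session_summary))
    context.append(Message(
        "summary", "Active-task summary: " + compacted.active_task_summary))
    context.extend(compacted.recent_user_messages)
    return context
```

With `keep_last_n=1` and the default `include_full_session=False`, the rebuilt context holds only the active-task summary and the latest user message, so older requests are not re-injected unless the caller opts in.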

Changes

  • codex-rs/core/templates/compact/prompt.md
  • codex-rs/core/templates/compact/summary_prefix.md

Tests

Not run (content-only changes).

@github-actions
Contributor

github-actions bot commented Dec 29, 2025

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@ReyNeill
Author

I have read the CLA Document and I hereby sign the CLA

github-actions bot added a commit that referenced this pull request Dec 29, 2025
@etraut-openai
Collaborator

Is there a bug report filed for this issue? In our contributor's guidance we ask that all bug fix PRs start with a bug report. This helps us prioritize and track issues, especially in cases where we decide to reject a PR.

@etraut-openai added the needs-response label (Additional information is requested) Dec 30, 2025
@etraut-openai
Collaborator

We do significant testing with our system prompts. Changes to these prompts can easily introduce regressions, so we have a high bar for accepting PRs that modify prompts.

@ReyNeill
Author

> Is there a bug report filed for this issue? In our contributor's guidance we ask that all bug fix PRs start with a bug report. This helps us prioritize and track issues, especially in cases where we decide to reject a PR.

Understood. Would that mean deleting this PR and starting a new one that cites the bug report, or can I simply add it here?

@etraut-openai
Collaborator

@ReyNeill, no need to create a new PR. Just add a link to a bug report.

@ReyNeill
Author

ReyNeill commented Dec 30, 2025

I just ran into the issue again and submitted a bug report: https://github.com/openai/codex/issues/new?template=2-bug-report.yml&steps=Uploaded%20thread:%20019b704d-04f3-7031-b1bb-fb5af42be238

thread ID 019b704d-04f3-7031-b1bb-fb5af42be238

The model wastes a lot of tokens and time re-verifying all of the session's tasks instead of continuing the task it was working on when the session history was auto-compacted.

@etraut-openai
Collaborator

How much testing have you done with your modified prompt? I'm trying to get a sense for whether the solution is speculative or whether you have strong evidence that it improves the behavior, at least for your usage.

@ReyNeill
Author

Not enough testing yet, but I'll do it and come back. Thanks for the fast feedback.

@Zaczero

Zaczero commented Jan 8, 2026

I can confirm this is an issue, and I'm surprised there isn't more activity here. Codex very commonly gets confused about which task it is working on and starts redoing something that was already finished. Example from just now:

› <file reference> i feel like the amount of patching is not justified, there must be a better way to run this historical museum software

<truncated>

  Evaluating global function conflicts

  I’m considering the global functions that might conflict, particularly with min and max templates in
  the global namespace potentially clashing with std::min. It seems they include cstring and climits
  but I wonder if the problem arises because the code includes <algorithm> and uses min/max without
  specifying the namespace. Instead of generating a header, we could try passing -include algorithm
  and -Dmin=std::min, but I'm not sure that’ll work for templates. I think it would be wise to check
  the upstream code for usage of min and max to spot any issues.

• Ran cd /home/user/Source/nixpkgs && src=$(nix-instantiate --eval -E 'with import ./. {};
  │ aburatan.src.outPath' | tr -d '"') && rg -n "\bmin\(|\bmax\(" "$src"/*.cpp "$src"/*.hpp | head
  │ -n 200
  └ /nix/store/xgj5a068d5w086w6vw9qqn7cs2h05d1i-source/dialog.cpp:44:    int widestText =
    max( Title.length() + 2, content.length() );
    … +59 lines
    /nix/store/xgj5a068d5w086w6vw9qqn7cs2h05d1i-source/standard.hpp:34:T min( T a, T b ) {
    /nix/store/xgj5a068d5w086w6vw9qqn7cs2h05d1i-source/standard.hpp:40:T max( T a, T b ) {

• Ran cd /home/user/Source/nixpkgs && src=$(nix-instantiate --eval -E 'with import ./. {};
  │ aburatan.src.outPath' | tr -d '"') && sed -n '1,120p' "$src/standard.hpp"
  └ // [standard.hpp]
    // (C) Copyright 2000 Michael Blackney
    … +77 lines
    #endif // STANDARD_HPP
    

─ Worked for 3m 01s ──────────────────────────────────────────────────────────────────────────────────

• Context compacted

• I need to address the user’s question about what the "rev" parameter does and how it interacts with
  update scripts. It’s a straightforward explanation. In Nix, "rev" refers to the revision in
  fetchFromGitHub or fetchgit, which indicates a specific commit or tag used to derive the output
  hash. Update scripts like nix-update rely on the "src" to locate the latest version; for
  "fetchFromGitHub," this is usually done by referencing the "v${version}" or commit. If "rev" derives
  from "version," then only the version change affects "rev" automatically.

  Clarifying rev in updates

  If "rev" is set to a commit hash, then an update script might adjust the version to something like
  "0-unstable-date" while setting "rev" to the commit, updating the hash as needed. The nix-update-
  script then calls nix-update, which evaluates the derivation and modifies the relevant fields. It’s
  important to mention that "rev" is part of a fixed-output fetch; changes to it necessitate a hash
  update. For tarballs, "fetchurl" uses a URL but doesn’t involve "rev." Let's provide this
  explanation to the user!

<aborted>

The rev question was asked quite a while ago, four or more messages back, and other tasks were completed in the meantime. It was a completely off-topic, random thing to resume working on. This happens too often.

I'll get a modified codex build running and test some ideas on improving the prompt and report my findings when I'm happy with them.
