Preflight Checklist
- I have searched existing requests and this feature hasn't been requested yet
- This is a single feature request (not multiple features)
Problem Statement
When Claude Code performs compaction (auto or manual), the entire conversation is replaced with a summary. This creates a jarring "conversation interrupted" experience - like someone cutting you off mid-sentence and saying "let me summarize what we talked about."
While the summary preserves important context, it breaks the natural flow of conversation. The agent loses track of what it was just doing, what files it just read, and what the user just asked. This is particularly disruptive during iterative development where the last 5-10 messages often contain the active working context.
This is NOT about:
- Better summarization ([BUG] Memory Loss After Auto-compact #1534, [BUG] Severe Session Memory Loss - Instructions and Context Not Retained #2545) - The summaries are fine
- Selective compaction ([FEATURE REQUEST]: Selective Context Compaction #2705) - We don't need complex range selection
- Adjusting compaction timing (feat: suggest adjusting context retention or timing of Compacting Conversation #2265) - The trigger point is reasonable
- Persistent memory - External tools (Graphiti, MCP servers) handle this well
The Specific Problem
When I'm debugging with Claude and compaction triggers:
- Before: "I see the error in line 42, let me fix that..."
- After compaction: "Hello! Based on our previous work on the debugging task..."
- User has to re-orient: "You were just looking at line 42"
Proposed Solution
Add a --keep-recent N parameter to compaction:
```bash
claude --keep-recent 10   # Keep last 10 messages verbatim after summary
```
Or in SDK configuration:
```js
{
  compaction: {
    keepRecent: 10  // Preserve last 10 messages
  }
}
```
How It Works
When compaction triggers with --keep-recent 10:
- Summarize messages 1 through N-10
- Keep messages N-9 through N verbatim
- Continue with: [Summary] + [Last 10 messages] + [New messages] (see the sketch below)
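Below is a minimal sketch of this flow in TypeScript. The Message type and the summarize() helper are hypothetical placeholders, not actual Claude Code SDK types or APIs; the sketch only illustrates the split between the summarized prefix and the verbatim tail.

```ts
// Minimal sketch of the proposed --keep-recent behavior.
// Message and summarize() are hypothetical placeholders, not real
// Claude Code SDK types or APIs.
interface Message {
  role: "user" | "assistant";
  content: string;
}

// Stand-in for the model-generated summary of the older history.
async function summarize(messages: Message[]): Promise<Message> {
  return {
    role: "assistant",
    content: `Summary of ${messages.length} earlier messages.`,
  };
}

async function compactWithKeepRecent(
  history: Message[],
  keepRecent: number,
): Promise<Message[]> {
  // Nothing to compact if the history is already short enough.
  if (history.length <= keepRecent) return history;

  // Everything except the last `keepRecent` messages is summarized;
  // the tail is carried over verbatim.
  const toSummarize = history.slice(0, history.length - keepRecent);
  const preserved = history.slice(history.length - keepRecent);

  const summary = await summarize(toSummarize);

  // Resulting context: [Summary] + [Last keepRecent messages];
  // new messages are appended as the conversation continues.
  return [summary, ...preserved];
}
```

Applied to the 100-message example below with keepRecent = 10, messages 1-90 are summarized and messages 91-100 are kept verbatim.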
Example
Before compaction (100 messages):
Message 1: Help me build a CLI tool
Messages 2-90: [long development conversation]
Message 91: Now let's add error handling
Message 92: I'll add try-catch blocks
Message 93: [Shows code]
...
Message 100: The error is on line 42
After compaction WITH --keep-recent 10:
Summary: We built a CLI tool with features X, Y, Z...
Message 91: Now let's add error handling [PRESERVED]
Message 92: I'll add try-catch blocks [PRESERVED]
...
Message 100: The error is on line 42 [PRESERVED]
The agent maintains continuity and knows it was just looking at line 42.
With tools like Graphiti and MCP Memory Keeper, we've largely solved persistent memory. The remaining pain point is conversational disruption. A simple rolling window would make compaction nearly invisible to users while still managing context effectively.
Implementation Priority
- SDK parameter (most important) - Let developers configure this
- CLI flag (nice to have) - Let users set their preference
- Default behavior (ideal) - Even --keep-recent 5 would greatly improve UX
This is a small change that would eliminate the most jarring aspect of compaction without requiring complex memory management systems.
Alternative Solutions
One could imagine a post-compact hook (sketched below) that:
- Detects compaction has occurred
- Retrieves last N messages from external storage
- Injects them into the conversation
Why this fails:
- Hooks can only inject into system prompts or as tool results, not as conversation history
- The agent sees this as "context about previous messages," not "my actual previous messages"
- Creates a confusing double-context where the agent sees both the summary AND the injected "memories"
- The conversational flow is still broken - the agent restarts its response pattern
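For concreteness, here is a rough sketch of what such a hook-based workaround would look like. The hook name, the MessageStore interface, and the string return value are all hypothetical (Claude Code exposes no such post-compact hook); the point is that the best a hook could do is return context text, which produces exactly the double-context problem described above.

```ts
// Hypothetical post-compact hook; the hook name, MessageStore API, and
// string-based injection are illustrative only, not real Claude Code interfaces.
interface StoredMessage {
  role: "user" | "assistant";
  content: string;
}

interface MessageStore {
  lastN(n: number): Promise<StoredMessage[]>;
}

// The best a hook could do: return extra context *text*, not real history.
async function onPostCompact(store: MessageStore): Promise<string> {
  const recent = await store.lastN(10);

  // The preserved turns are flattened into a system-prompt-style note, which
  // the agent reads as "context about previous messages" rather than as its
  // own prior turns, so the response pattern still restarts after compaction.
  return [
    "Recent messages before compaction:",
    ...recent.map((m) => `${m.role}: ${m.content}`),
  ].join("\n");
}
```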
Priority
Medium - Would be very helpful
Feature Category
Configuration and settings
Use Case Example
Scenario: Debugging a complex multi-file issue
Without --keep-recent:
User: The API tests are failing
Claude: I'll check the test files... [reads 5 files, analyzes error]
Claude: The issue is in auth.js line 142, the token validation is...
[COMPACTION TRIGGERS]
Claude: Hello! I understand we're working on fixing API tests. Let me examine the situation.
User: You just found it! Line 142 of auth.js!
Claude: Let me look at auth.js line 142... [re-reads file]
With --keep-recent 5:
User: The API tests are failing
Claude: I'll check the test files... [reads 5 files, analyzes error]
Claude: The issue is in auth.js line 142, the token validation is...
[COMPACTION TRIGGERS]
Claude: ...checking for null values. Here's the fix for line 142:
The conversation continues seamlessly. The agent maintains its working context and completes the thought it was in the middle of expressing.
Additional Context
While #6390 proposes a comprehensive "context pruning" system with multiple commands (/prune --keep-last, /checkpoint, --from-message), this request suggests a single parameter that works with the existing compaction system. Rather than introducing new commands and checkpoint management, --keep-recent N simply modifies compaction behavior to preserve recent messages. This approach requires no new user commands, no message ID tracking, and no checkpoint system - just a configuration parameter that makes existing compaction less disruptive. It's the difference between adjusting an existing feature and adding a new one.