Subagent context compaction policy #78
Merged
Working subagents now have a configurable in-memory history
compaction policy. The on-disk transcript is unaffected — only the
live model context is rewritten.
Config (defaults):

  subagents:
    context_compaction:
      mode: inherit_strict     # inherit_strict | inherit | off
      compact_pct: 75          # child threshold (percent of model limit)
      keep_recent_turns: 1     # turns preserved verbatim after compact
      min_messages: 6          # below this, never compact
      timeout_seconds: 60      # hard wall on the summarizer call
Modes:

  inherit_strict: effective threshold = min(parent compact_pct, child
                  compact_pct). Strict means the child's threshold can
                  only be equal to or lower than the parent's; async
                  work shouldn't silently outgrow the parent's window.
  inherit:        use parent's context_compact_pct verbatim.
  off:            never compact this child.
Compaction flow (maybe_compact_turn_session):

  - skip immediately if mode == off or kind == archive_holder
  - skip if history shorter than min_messages
  - measure context via context_usage_pct (PR #76); below threshold
    means skip
  - compact_find_cut locates the largest safe cut that doesn't
    split a tool_use / tool_result pair and keeps the most recent
    keep_recent_turns turns verbatim
  - summarize the prefix via llm.api::chat (provider/model from the
    child session), under a setTimeLimit
  - on success, prepend a single [compacted history] assistant
    entry to the kept tail; on any failure, log and leave history
    untouched
Hook lives in subagent_turn_prompt after the disk-transcript append
and after archival has had its shot. Archive holders are skipped
via .subagent_state$kind, stamped in subagent_seed_history.
24 offline tests cover threshold resolution, the cut-point finder
(including the open-tool-use case), the pure history rewrite, and
the early-bail paths (archive holder / off / too short).
End-to-end smoke against moonshot: hook runs without error, exits
as a no-op below min_messages, transcript stays intact.
P1: compact_find_cut() was treating every role == "user" entry as
a user-turn boundary, but Anthropic-style tool_result messages also
have role == "user" with a tool_result block as content. A long
turn like
user -> assistant tool_use -> user tool_result -> assistant tool_use
-> user tool_result -> assistant final
could end up cut between the assistant tool_use and the user
tool_result that satisfies it.
Two changes:
- compact_entry_is_tool_result_only() filters those second-half
messages out of the user-turn boundary list.
- The pair-safety check now scans the entire prefix and walks
the cut back only if a tool_use in [1..cut] has no matching
tool_result in [1..cut]. The previous check only looked at
history[[cut]].
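
The two changes can be illustrated with a small, self-contained sketch. Only the name `compact_entry_is_tool_result_only` comes from the commit message; the message shape (Anthropic-style blocks with `id` / `tool_use_id`), the helpers `collect_ids` and `walk_cut_back`, and the one-entry-at-a-time walk are assumptions made for illustration, not the PR's actual code.

```r
# TRUE only for a user entry whose content is made up entirely of
# tool_result blocks (illustrative re-implementation).
compact_entry_is_tool_result_only <- function(entry) {
  content <- entry$content
  if (!identical(entry$role, "user") || !is.list(content) ||
      length(content) == 0) {
    return(FALSE)
  }
  all(vapply(content, function(b) identical(b$type, "tool_result"),
             logical(1)))
}

# Hypothetical helper: ids found in blocks of a given type in history[1..cut].
collect_ids <- function(history, cut, type, field) {
  ids <- character(0)
  for (entry in history[seq_len(cut)]) {
    if (!is.list(entry$content)) next          # plain-text content
    for (block in entry$content) {
      if (identical(block$type, type)) ids <- c(ids, block[[field]])
    }
  }
  ids
}

# Hypothetical helper: walk the cut back while any tool_use in
# history[1..cut] has no matching tool_result in history[1..cut].
walk_cut_back <- function(history, cut) {
  while (cut > 0) {
    used    <- collect_ids(history, cut, "tool_use", "id")
    results <- collect_ids(history, cut, "tool_result", "tool_use_id")
    if (length(setdiff(used, results)) == 0) break
    cut <- cut - 1
  }
  cut
}

# The long turn from the commit message, as a toy history.
history <- list(
  list(role = "user", content = "hi"),
  list(role = "assistant", content = list(list(type = "tool_use", id = "t1"))),
  list(role = "user",
       content = list(list(type = "tool_result", tool_use_id = "t1"))),
  list(role = "assistant", content = list(list(type = "tool_use", id = "t2"))),
  list(role = "user",
       content = list(list(type = "tool_result", tool_use_id = "t2"))),
  list(role = "assistant", content = "done")
)

# Only entry 1 counts as a user-turn boundary.
boundaries <- which(vapply(history, function(e) {
  identical(e$role, "user") && !compact_entry_is_tool_result_only(e)
}, logical(1)))

walk_cut_back(history, 4)  # 3: t2 is still open at 4, so the cut moves back
```

Scanning the whole prefix rather than just `history[[cut]]` is what catches an open `tool_use` that sits several entries before the cut.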
P2: maybe_compact_turn_session() called context_usage_pct() with
tools = NULL, but turn() resolves session$tools_filter via
skills_as_api_tools() whenever its tools arg is NULL. The hook now
resolves the same way so the trigger pct matches what the next
turn will actually send.
Tests cover the multi-tool turn case (tool_result-only user msgs
do not split a single turn), the cut-walks-back case for an open
pair, and compact_entry_is_tool_result_only() across pure-text,
pure-tool_result, and mixed contents. 1516/1516 OK.
Summary
Working subagents now have a configurable in-memory history compaction policy. The on-disk transcript (from PR #75) is untouched — only the live context sent to the model is rewritten.
Config
```yaml
subagents:
  context_compaction:
    mode: inherit_strict     # inherit_strict | inherit | off
    compact_pct: 75          # child threshold (percent of model limit)
    keep_recent_turns: 1     # turns preserved verbatim after compact
    min_messages: 6          # below this, never compact
    timeout_seconds: 60      # hard wall on the summarizer call
```
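Modes
- `inherit_strict`: the effective threshold is `min(parent compact_pct, child compact_pct)`, so the child can only compact at or below the parent's threshold. Async work shouldn't silently outgrow the parent's window.
- `inherit`: use the parent's `context_compact_pct` verbatim.
- `off`: never compact this child.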
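
A minimal sketch of how the effective threshold could be resolved per mode. The function name `resolve_compact_pct`, its arguments, and the `NA` return for `off` are assumptions made for illustration; the PR's own resolver is not shown here.

```r
# Hypothetical helper: effective compaction threshold for a child subagent.
# `parent_pct` and `child_pct` are percentages of the model limit.
resolve_compact_pct <- function(mode, parent_pct, child_pct) {
  switch(mode,
    inherit_strict = min(parent_pct, child_pct),  # child never exceeds parent
    inherit        = parent_pct,                  # parent's value verbatim
    off            = NA_real_,                    # assumption: NA means "never compact"
    stop("unknown compaction mode: ", mode)
  )
}

resolve_compact_pct("inherit_strict", parent_pct = 80, child_pct = 75)  # 75
resolve_compact_pct("inherit_strict", parent_pct = 60, child_pct = 75)  # 60
resolve_compact_pct("inherit",        parent_pct = 80, child_pct = 75)  # 80
```

With `inherit_strict`, raising the child's `compact_pct` above the parent's has no effect, which is the point of the strict mode.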
Flow
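`maybe_compact_turn_session` runs after the turn finishes (never mid-turn) and after archival has had its shot:
- Skip immediately if `mode == off` or `kind == archive_holder`.
- Skip if the history is shorter than `min_messages`.
- Measure context via `context_usage_pct` (PR #76); skip if it is below the threshold.
- `compact_find_cut` locates the largest safe cut that doesn't split a `tool_use` / `tool_result` pair and keeps the most recent `keep_recent_turns` turns verbatim.
- Summarize the prefix via `llm.api::chat` (provider/model from the child session) under a `setTimeLimit`.
- On success, prepend a single `[compacted history]` assistant entry to the kept tail; on any failure, log and leave the history untouched.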
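
The success and failure paths of the rewrite step can be sketched as below. This is an illustration under stated assumptions: `compact_history` is a hypothetical name, and `summarize` stands in for the `llm.api::chat` call, whose real signature is not shown in this PR.

```r
# Sketch only: replace history[1..cut] with one summary entry, or leave the
# history untouched if the summarizer fails or hits the time limit.
compact_history <- function(history, cut, summarize, timeout_seconds = 60) {
  idx    <- seq_along(history)
  prefix <- history[idx <= cut]
  kept   <- history[idx > cut]

  text <- tryCatch(
    {
      # Hard wall on the summarizer call; exceeding it raises an R error.
      setTimeLimit(elapsed = timeout_seconds, transient = TRUE)
      summarize(prefix)
    },
    error = function(e) {
      message("compaction skipped: ", conditionMessage(e))
      NULL
    },
    finally = setTimeLimit(elapsed = Inf)  # always clear the limit
  )

  if (is.null(text)) return(history)  # failure: history untouched

  entry <- list(role = "assistant",
                content = paste("[compacted history]", text))
  c(list(entry), kept)
}

history <- list(
  list(role = "user", content = "a"),
  list(role = "assistant", content = "b"),
  list(role = "user", content = "c"),
  list(role = "assistant", content = "d")
)

ok  <- compact_history(history, cut = 2,
                       summarize = function(p) "earlier turns")
bad <- compact_history(history, cut = 2,
                       summarize = function(p) stop("provider down"))
```

Here `ok` holds three entries (the summary plus the two kept messages), while `bad` is the original history, because the summarizer error is logged and swallowed.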
Archive-holder protection
`subagent_seed_history` now stamps `.subagent_state$kind <- "archive_holder"`. The compaction hook sees that and bails immediately, so an archived-turn holder never loses its seeded transcript.
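
The early-bail checks, including the archive-holder one, might look like the following. The `kind` field and the `"archive_holder"` value come from the description above; the function name `should_skip_compaction` and its arguments are hypothetical.

```r
# Hypothetical gate run at the top of the compaction hook.
# `state` is a list standing in for .subagent_state.
should_skip_compaction <- function(state, history, mode, min_messages = 6) {
  if (identical(mode, "off")) return(TRUE)                    # disabled for this child
  if (identical(state$kind, "archive_holder")) return(TRUE)   # keep seeded transcript
  length(history) < min_messages                              # too short to compact
}

msgs <- rep(list(list(role = "user", content = "x")), 8)

should_skip_compaction(list(kind = "archive_holder"), msgs, "inherit_strict")  # TRUE
should_skip_compaction(list(kind = "worker"), msgs, "inherit_strict")          # FALSE
```

Checking `kind` before anything else means an archive holder never reaches the context measurement or the summarizer call.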
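Tests
- 24 offline tests cover threshold resolution, the cut-point finder (including the open-tool-use case), the pure history rewrite, and the early-bail paths (archive holder / off / too short).
- End-to-end smoke against moonshot: the hook runs without error, exits as a no-op below `min_messages`, and the transcript stays intact.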
Stack