Skip to content

feat(chat): add summary budget discipline for compaction#652

Open
Cstewart-HC wants to merge 2 commits intomoltis-org:mainfrom
Cstewart-HC:feat/summary-budget-discipline
Open

feat(chat): add summary budget discipline for compaction#652
Cstewart-HC wants to merge 2 commits intomoltis-org:mainfrom
Cstewart-HC:feat/summary-budget-discipline

Conversation

@Cstewart-HC
Copy link
Copy Markdown
Contributor

Summary

Enforce hard budget on LLM-generated compaction summaries via new compress_summary() function in crates/chat/src/lib.rs.

Without this, auto-compact can produce arbitrarily long summaries that overflow the context window on the next turn — defeating the purpose of compaction.

Changes

  • crates/chat/src/lib.rs (+269 lines) — new compress_summary() function:
    • Max 1,200 characters total
    • Max 24 lines
    • Max 160 characters per line
    • Case-insensitive line deduplication
    • Priority-based line dropping: headers > bullets > plain text
    • Omission notice appended when lines are dropped
  • Applied to both compact() method and standalone compact_session() function
  • 8 unit tests covering all budget enforcement paths

Testing

running 8 tests
test tests::compress_summary_empty_input ... ok
test tests::compress_summary_long_line_truncation ... ok
test tests::compress_summary_deduplication ... ok
test tests::compress_summary_single_very_long_line ... ok
test tests::compress_summary_under_budget_returns_unchanged ... ok
test tests::compress_summary_over_char_limit ... ok
test tests::compress_summary_header_preservation ... ok
test tests::compress_summary_over_line_count ... ok

Also verified: cargo check, cargo clippy, just format-check — all clean.

Checklist

  • Code follows project Rust style (no .unwrap()/.expect(), iterators over loops, #[must_use] on pure fns)
  • Tests included with high coverage
  • Single-file, minimal scope change
  • Commit message follows conventional commits format

Enforce hard budget on LLM-generated compaction summaries via new
compress_summary() function:
- Max 1,200 characters total
- Max 24 lines
- Max 160 characters per line
- Case-insensitive line deduplication
- Priority-based line dropping: headers > bullets > plain text
- Omission notice appended when lines are dropped

Applied to both compact() method and standalone compact_session() function.
Includes 8 unit tests covering all budget enforcement paths.
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 10, 2026

Greptile Summary

This PR adds compress_summary() to enforce a hard budget (1,200 chars, 24 lines, 160 chars/line) on LLM-generated compaction summaries, applying it in both compact() and compact_session(). The logic is well-structured and the 8 unit tests cover the main paths, but two minor inconsistencies were found beyond the already-flagged off-by-one and notice-undercount issues: (1) the SUMMARY_MAX_LINES doc comment says "excluding omission notice" while the drop loop actually counts the notice toward the limit, and (2) the edge-case char_budget is pre-computed with a potentially shorter notice string than what gets appended, allowing up to 2 bytes of overage in a degenerate all-header input.

Confidence Score: 4/5

Safe to merge after addressing the previously flagged off-by-one and notice-undercount bugs; the two new findings here are low-impact.

The three issues from prior review threads (off-by-one in fold budget, omission notice undercount for dropped headers, blank-line dedup) remain open and affect correctness of the budget enforcement in real inputs. The two new findings (doc comment vs implementation mismatch and stale notice estimate) are P2 but the doc mismatch weakens the test contract. Score is 4 pending resolution of at least the previously flagged P1-level issues.

crates/chat/src/lib.rs — the compress_summary function, specifically the fold-based total_len calculations and edge-case char_budget path

Important Files Changed

Filename Overview
crates/chat/src/lib.rs Adds compress_summary() with character/line/dedup budget enforcement and applies it to both compact() and compact_session(); two minor inconsistencies remain: the SUMMARY_MAX_LINES doc comment contradicts the drop-loop implementation (notice is included in the count, not excluded), and the edge-case char_budget calculation uses a stale notice length estimate that can allow 1–2 bytes of overage in rare inputs.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A([compress_summary input]) --> B{text.trim is empty?}
    B -- yes --> Z([return empty string])
    B -- no --> C[Step 1: deduplicate lines case-insensitive, keep first]
    C --> D[Step 2: truncate lines > 160 chars at char boundary]
    D --> E{len <= 24 lines AND <= 1200 chars?}
    E -- yes --> F([return joined as-is])
    E -- no --> G[Partition into headers / bullets / others]
    G --> H[candidates = bullets ++ others, headers always kept]
    H --> I{all candidates fit with headers?}
    I -- yes --> J([return headers ++ candidates])
    I -- no --> K[Drop loop: drop from tail of candidates until line_count + notice <= 24 AND total chars <= 1200]
    K -- fits --> L([return headers ++ kept ++ notice])
    K -- never fits --> M[Edge case: headers alone too long, keep headers until char/line budget, append notice with full drop count]
    M --> N([return kept headers ++ notice])
Loading

Reviews (2): Last reviewed commit: "fix(chat): correct budget accounting bug..." | Re-trigger Greptile

Fix three Greptile review findings from PR moltis-org#652:

1. Off-by-one in fold-based character budget: fold adds l.len() + 1
   per element but join("\n") produces N-1 separators, overcounting
   by 1 byte and dropping an extra line at the budget boundary.
   Added .saturating_sub(1) to both fold sites.

2. Omission notice undercounts when headers are also truncated:
   all_dropped_count only counted candidates, not headers skipped in
   the edge-case loop. Track header_drop_count separately and build
   the notice after the loop with the combined total. Changed break
   to continue so shorter headers later in the list still get a chance
   to fit.

3. Empty lines bypass deduplication: key.is_empty() short-circuited
   seen.insert(), allowing unlimited blank lines through. Removed the
   short-circuit so blank lines are deduplicated like any other line.
@Cstewart-HC
Copy link
Copy Markdown
Contributor Author

@greptileai new commit landed to address issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant