feat(session): add batch memory compression CLI command#1112

Open
mvanhorn wants to merge 1 commit intovolcengine:mainfrom
mvanhorn:osc/350-batch-memory-compression

@mvanhorn
Contributor

Problem Statement

Memory directories grow without bound as agents run over time, and verbose abstracts waste tokens during overview generation. A user in #350 reported that a single overview consumed 150K tokens because the abstracts were longer than the source memories themselves.

Related: #350 (decoupling ingestion), #578 (custom prompts), RFC #712 (memory templating).

Proposed Solution

Add an `ov compress` CLI command that scans a directory for memories with verbose abstracts and truncates them to a target length.

ov compress viking://user/memories/ --max-abstract-length 128 --dry-run

Evidence

| Source | Evidence | Engagement |
| --- | --- | --- |
| #350 | Decoupling ingestion from summarization | 3 thumbsup |
| #350 comment | 150K tokens per overview from verbose abstracts | direct user report |
| #578 | Prompt template customization demand | 2 thumbsup, @qin-ctx engaged |
| Prism MCP | 10x memory compression demand | 64 upvotes, 19 comments |

Changes

  • openviking/utils/compress_service.py (new): CompressService scans directory, filters memories by abstract length, truncates excess
  • openviking/server/routers/content.py: POST /v1/content/compress endpoint following the reindex pattern
  • crates/ov_cli/src/client.rs: compress() HTTP client method
  • crates/ov_cli/src/commands/content.rs: compress command handler
  • crates/ov_cli/src/main.rs: Compress subcommand with --max-abstract-length and --dry-run flags
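
The scan, filter, and truncate flow described for `CompressService` could be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the function name `compress_abstracts`, the `CompressStats` dataclass, the in-memory `dict` of URI to abstract standing in for a directory scan, and the plain character cut are all assumptions made for the example. It also assumes that a dry run reports the would-be savings without writing anything back.

```python
from dataclasses import dataclass, field


@dataclass
class CompressStats:
    """Counters named after the fields the PR says the command returns."""
    files_scanned: int = 0
    files_compressed: int = 0
    chars_saved: int = 0
    verbose_files: list = field(default_factory=list)  # hypothetical field name


def compress_abstracts(abstracts, max_abstract_length=128, dry_run=False):
    """Shorten every abstract longer than max_abstract_length.

    `abstracts` maps a memory URI to its abstract text (a stand-in for the
    real directory scan). With dry_run=True nothing is modified.
    """
    stats = CompressStats()
    # Iterate over a snapshot so values can be rewritten safely.
    for uri, abstract in list(abstracts.items()):
        stats.files_scanned += 1
        if len(abstract) <= max_abstract_length:
            continue  # already within the target length
        stats.verbose_files.append(uri)
        # Placeholder hard cut; the PR uses word-boundary splitting instead.
        shortened = abstract[:max_abstract_length]
        stats.files_compressed += 1
        stats.chars_saved += len(abstract) - len(shortened)
        if not dry_run:
            abstracts[uri] = shortened
    return stats
```

Keeping the dry-run branch at the very end means both modes share the same filtering and counting path, so a preview cannot drift from what a real run would do.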

Usage

# Preview what would be compressed (no changes made)
ov compress viking://user/memories/ --dry-run

# Compress abstracts exceeding 128 chars (default)
ov compress viking://user/memories/

# Custom threshold
ov compress viking://user/memories/ --max-abstract-length 256

Returns: files_scanned, files_compressed, chars_saved, and a list of verbose files.
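
For scripting around the command, a caller might summarize the returned stats along these lines. This is only a sketch: it assumes the response is a flat JSON object, and the key `verbose_files` for the file list is a hypothetical name, since the PR description names only `files_scanned`, `files_compressed`, and `chars_saved`.

```python
import json


def summarize_compress_response(body):
    """Turn a JSON response body (string) into a one-line summary."""
    data = json.loads(body)
    # "verbose_files" is an assumed key name for the list of verbose files.
    verbose = data.get("verbose_files", [])
    return (
        f"scanned {data['files_scanned']} files, "
        f"compressed {data['files_compressed']}, "
        f"saved {data['chars_saved']} chars, "
        f"{len(verbose)} verbose files listed"
    )
```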

Testing

Three Python unit tests cover CompressService initialization and error handling.

tests/unit/utils/test_compress_service.py ...  [100%]
3 passed

Implementation Notes

  • Follows the same CLI -> HTTP -> service pattern as ov reindex (merged in PR #795, "feat(cli): add reindex command to trigger content re-indexing")
  • --dry-run returns stats without modifying anything, safe to run anytime
  • Truncation uses word-boundary splitting to avoid cutting mid-word
  • Caps verbose file list at 20 entries in the response to avoid payload bloat
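
The word-boundary truncation and the 20-entry cap noted above might look roughly like this. It is a sketch under stated assumptions rather than the PR's implementation: the helper names `truncate_at_word_boundary` and `cap_verbose_list` are hypothetical, words are assumed to be separated by single spaces, and text with no space inside the limit falls back to a hard cut.

```python
MAX_LISTED_FILES = 20  # cap taken from the implementation notes


def truncate_at_word_boundary(text, max_length):
    """Return text shortened to at most max_length chars without splitting a word."""
    if len(text) <= max_length:
        return text
    cut = text[:max_length]
    # If the next character is whitespace, the cut already ends on a whole word.
    if text[max_length].isspace():
        return cut.rstrip()
    # Otherwise drop the partial last word by backing up to the previous space.
    head = cut.rpartition(" ")[0].rstrip()
    # No usable space within the limit: fall back to a hard cut.
    return head if head else cut.strip()


def cap_verbose_list(verbose_files, limit=MAX_LISTED_FILES):
    """Keep only the first `limit` entries so the response payload stays small."""
    return verbose_files[:limit]
```

Backing up to the previous space means the result can be somewhat shorter than the requested length, but it never exceeds it.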

Feature Area

Session Management

This contribution was developed with AI assistance (Claude Code).

Add `ov compress` command that scans a directory for memories with
verbose abstracts and truncates them to a target length.

- CompressService: scans directory, filters by abstract length,
  truncates excess (or dry-run preview)
- POST /v1/content/compress endpoint following reindex pattern
- `ov compress viking://user/memories/ --max-abstract-length 128 --dry-run`
- Returns stats: files_scanned, files_compressed, chars_saved

Relates to volcengine#350, volcengine#578

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@CLAassistant

CLAassistant commented Mar 31, 2026

CLA assistant check
All committers have signed the CLA.

@github-actions

Failed to generate code suggestions for PR

@qin-ctx qin-ctx self-requested a review March 31, 2026 07:19
@qin-ctx qin-ctx self-assigned this Mar 31, 2026
@mvanhorn mvanhorn force-pushed the osc/350-batch-memory-compression branch from f7f3573 to ed98d22 Compare March 31, 2026 12:56
Projects

Status: Backlog

3 participants