fix(gemini): add conversation history truncation to prevent O(N²) token cost growth #1368

Open
vnz wants to merge 1 commit into thedotmack:main from vnz:fix/gemini-context-truncation

Conversation

vnz (Contributor) commented Mar 15, 2026

Summary

  • GeminiAgent sends the full conversation history with every API call, so cumulative input tokens grow as O(N²) over a session: a 100-observation session sends ~30M cumulative input tokens
  • Ports the proven truncateHistory() sliding window from OpenRouterAgent to GeminiAgent
  • Adds CLAUDE_MEM_GEMINI_MAX_CONTEXT_MESSAGES (default: 20) and CLAUDE_MEM_GEMINI_MAX_TOKENS (default: 100000) settings
  • Always preserves at least the newest message to avoid empty contents: [] API requests (a bug also present in OpenRouterAgent)
  • Reuses shared estimateTokens() from src/shared/timeline-formatting.ts instead of duplicating
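
A minimal sketch of the sliding window described in the bullets above, not the code from this PR. The `HistoryMessage` shape and the characters-per-token estimator are assumptions for illustration; the real `truncateHistory()` and the shared `estimateTokens()` may differ in signature and behavior.

```typescript
// Hypothetical message shape; the real GeminiAgent types may differ.
interface HistoryMessage {
  role: "user" | "model";
  text: string;
}

// Stand-in estimator (roughly 4 characters per token). This is an assumption:
// the PR reuses estimateTokens() from src/shared/timeline-formatting.ts,
// whose implementation is not shown in this description.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function truncateHistory(
  history: HistoryMessage[],
  maxMessages: number = 20, // default of CLAUDE_MEM_GEMINI_MAX_CONTEXT_MESSAGES
  maxTokens: number = 100000, // default of CLAUDE_MEM_GEMINI_MAX_TOKENS
): HistoryMessage[] {
  const kept: HistoryMessage[] = [];
  let tokens = 0;

  // Walk from newest to oldest so the most recent context survives.
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = estimateTokens(history[i].text);
    const overBudget = kept.length >= maxMessages || tokens + cost > maxTokens;

    // The kept.length > 0 guard keeps the newest message even when it alone
    // exceeds the token budget, so the request never carries empty contents.
    if (overBudget && kept.length > 0) break;

    kept.unshift(history[i]);
    tokens += cost;
  }
  return kept;
}
```

With the defaults, a 22-message history of short messages is cut to the newest 20, and a single prompt larger than the token budget is still sent on its own rather than dropped.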

Files Changed

| File | Change |
| --- | --- |
| src/shared/SettingsDefaultsManager.ts | Add 2 settings to interface + defaults |
| src/services/worker/GeminiAgent.ts | Add truncation constants, methods, call site, USER_SETTINGS_PATH import |
| src/services/worker/http/routes/SettingsRoutes.ts | Register 2 new setting keys + validation |
| tests/gemini_agent.test.ts | Add truncation tests + oversized single-prompt edge case |
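
The validation added to SettingsRoutes is not shown in this description. As a rough illustration of the bounds stated in the commit message (1-100 messages, 1K-1M tokens), a range check could look like the sketch below. The helper name `validateGeminiSetting`, the `RANGES` table, and the assumption that values arrive as strings are all hypothetical.

```typescript
// Hypothetical range table; only the two setting names and their bounds
// come from this PR. Everything else here is an illustrative assumption.
const RANGES: Record<string, { min: number; max: number }> = {
  CLAUDE_MEM_GEMINI_MAX_CONTEXT_MESSAGES: { min: 1, max: 100 },
  CLAUDE_MEM_GEMINI_MAX_TOKENS: { min: 1000, max: 1000000 },
};

// Returns an error message for an out-of-range value, or null when the value
// is acceptable or the key is not one of the two new settings.
function validateGeminiSetting(key: string, raw: string): string | null {
  const range = RANGES[key];
  if (!range) return null;

  const value = Number(raw);
  if (!Number.isInteger(value) || value < range.min || value > range.max) {
    return `${key} must be an integer between ${range.min} and ${range.max}`;
  }
  return null;
}
```

Rejecting non-integers as well as out-of-range numbers keeps a typo such as an empty string or `20.5` from silently becoming the message limit.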

Test plan

  • bun test tests/gemini_agent.test.ts — 10/10 tests pass
  • E2E: after build-and-sync, worker logs show "Context window truncated to prevent runaway costs {originalMessages=22, keptMessages=20}"
  • Observations still stored correctly in DB after truncation

Note

OpenRouterAgent has the same empty-history bug (no truncated.length > 0 guard). Follow-up issue to be opened after this merges.

🤖 Generated with Claude Code

vnz force-pushed the fix/gemini-context-truncation branch from 000d1b9 to 6f2607f on March 17, 2026 07:00
…en cost growth

GeminiAgent sends the full conversation history with every API call,
causing quadratic token growth per session. A 100-observation session
sends ~30M cumulative input tokens. This ports the proven truncateHistory()
sliding window from OpenRouterAgent to GeminiAgent.

- Add CLAUDE_MEM_GEMINI_MAX_CONTEXT_MESSAGES (default: 20) and
  CLAUDE_MEM_GEMINI_MAX_TOKENS (default: 100000) settings
- Add truncateHistory() to GeminiAgent using shared estimateTokens()
- Always preserve at least the newest message to avoid empty API requests
- Add settings validation in SettingsRoutes (1-100 messages, 1K-1M tokens)
- Add regression tests for truncation and oversized single-prompt edge case

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
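
The quadratic growth named in the commit message can be checked with back-of-the-envelope arithmetic: if call i resends all i messages so far, the total is proportional to 1 + 2 + ... + N = N(N+1)/2. The sketch below is a simplified model, not a measurement. It assumes one message per call and a flat 6,000 tokens per message, a figure chosen only because it reproduces the ~30M total quoted above.

```typescript
// Cumulative input tokens over a session in which call i resends the most
// recent `window` messages (all i messages when window is Infinity).
// Simplifying assumptions: one message per call, constant message size.
function cumulativeInputTokens(
  calls: number,
  tokensPerMessage: number,
  window: number = Infinity,
): number {
  let total = 0;
  for (let i = 1; i <= calls; i++) {
    total += Math.min(i, window) * tokensPerMessage;
  }
  return total;
}

// Full history: 5050 message-sends * 6000 tokens = 30,300,000
console.log(cumulativeInputTokens(100, 6000));

// 20-message window: 1810 message-sends * 6000 tokens = 10,860,000
console.log(cumulativeInputTokens(100, 6000, 20));
```

Under this model the capped session grows linearly once the window is full, so the gap to the uncapped session widens as sessions get longer.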
vnz force-pushed the fix/gemini-context-truncation branch from 6f2607f to 81553ac on March 17, 2026 07:00