Description
While caching support is already implemented for Anthropic, OpenAI, and Bedrock, it's missing for Google/VertexAI. Caching enables up to a 90% cost reduction and significantly reduces latency for long-running sessions.
Proposed Changes
1. Provider Transformation
In packages/opencode/src/provider/transform.ts, the applyCaching function currently excludes Google. We need to:
- Update applyCaching to recognize Gemini models.
- Add Google-specific caching configurations to the providerOptions object.
- Ensure ProviderTransform.options correctly sets up the environment for caching when using @ai-sdk/google or @ai-sdk/google-vertex (a sketch follows below).
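A minimal sketch of what the Gemini branch could look like, assuming the helper maps over messages and that @ai-sdk/google accepts an explicit cache handle via providerOptions. The function shape and the cachedContent option name are assumptions, not the actual opencode or SDK code:

```ts
// Hypothetical sketch only: the real applyCaching lives in
// packages/opencode/src/provider/transform.ts and its shape may differ.
type Msg = { role: string; content: unknown; providerOptions?: Record<string, unknown> }

function applyCaching(messages: Msg[], providerID: string): Msg[] {
  if (providerID === "google" || providerID === "google-vertex") {
    // Attach Google provider options to the last message so the request can
    // reference an explicit cache entry. cachedContent is an assumption about
    // the @ai-sdk/google option name; implicit caching on newer Gemini models
    // may need no per-message option at all.
    return messages.map((msg, index) =>
      index === messages.length - 1
        ? {
            ...msg,
            providerOptions: {
              ...msg.providerOptions,
              google: { cachedContent: "cachedContents/<cache-id>" },
            },
          }
        : msg,
    )
  }
  return messages
}
```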
2. Usage Tracking
In packages/opencode/src/session/index.ts, the getUsage function needs to be updated to handle Gemini's metadata:
- Currently, it handles anthropic- and bedrock-specific metadata for cache writes.
- We need to add logic to extract cachedInputTokens (or the equivalent Google metadata) so that costs are calculated accurately and displayed correctly in the TUI/Stats (see the sketch below).
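A rough sketch of the accounting, assuming the AI SDK exposes cachedInputTokens on the usage object and that cached tokens are billed at a cache-read rate. The usage and cost field names here are illustrative, not the actual getUsage signature:

```ts
// Illustrative only: field names (cache_read, etc.) are assumptions, not the
// real opencode model metadata.
type Usage = { inputTokens?: number; outputTokens?: number; cachedInputTokens?: number }
type Cost = { input: number; output: number; cache_read: number } // $ per 1M tokens

function getUsage(usage: Usage, cost: Cost) {
  const cached = usage.cachedInputTokens ?? 0
  // Whether inputTokens already includes cached tokens differs by provider;
  // verify against Gemini's usage metadata before subtracting.
  const input = Math.max((usage.inputTokens ?? 0) - cached, 0)
  const output = usage.outputTokens ?? 0
  return {
    tokens: { input, output, cache: { read: cached, write: 0 } },
    cost: (input * cost.input + output * cost.output + cached * cost.cache_read) / 1_000_000,
  }
}
```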
3. LLM Orchestration
In packages/opencode/src/session/llm.ts, verify that the streamText call correctly passes through the necessary providerOptions or experimental_cache parameters required by the AI SDK for Google context caching.
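A minimal sketch of the pass-through, assuming providerOptions at the streamText call is the relevant hook; the commented cachedContent handle is an assumption about @ai-sdk/google, and whether an explicit handle is needed depends on the model and SDK version:

```ts
import { streamText } from "ai"
import { createGoogleGenerativeAI } from "@ai-sdk/google"

const google = createGoogleGenerativeAI()

// Sketch: verify that whatever ProviderTransform.options produces ends up here.
const result = streamText({
  model: google("gemini-2.5-flash"),
  messages: [{ role: "user", content: "Summarize the repository layout." }],
  providerOptions: {
    google: {
      // cachedContent: "cachedContents/<cache-id>", // explicit caching handle (assumption)
    },
  },
})

for await (const chunk of result.textStream) process.stdout.write(chunk)
```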
Context
Gemini 3 Flash offers a 10x cost improvement for cached tokens ($0.05/1M vs $0.50/1M), making this optimization essential for high-volume usage. Reference: Gemini Pricing & Features.
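As rough arithmetic at those prices: a session that resends a 200K-token context across 20 turns costs about 20 × 200,000 × $0.50/1M ≈ $2.00 without caching, versus roughly 200,000 × $0.50/1M + 19 × 200,000 × $0.05/1M ≈ $0.29 once the context is served from cache after the first turn.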