
[FEATURE]: Google/VertexAI Context Caching #6851

@thisisryanswift

Description

While caching support is already implemented for Anthropic, OpenAI, and Bedrock, it's missing for Google/VertexAI. Caching enables up to a 90% cost reduction and significantly reduces latency for long-running sessions.

Proposed Changes

1. Provider Transformation

In packages/opencode/src/provider/transform.ts, the applyCaching function currently excludes Google. We need to:

  • Update applyCaching to recognize Gemini models.
  • Add Google-specific caching configurations to the providerOptions object.
  • Ensure ProviderTransform.options correctly sets up the environment for caching when using @ai-sdk/google or @ai-sdk/google-vertex (a rough sketch follows this list).
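
As a rough illustration of the last two bullets: unlike Anthropic's per-message cache markers, Gemini's implicit caching (2.5+) needs no annotation at all, and explicit caching references a pre-created cache entry, so the Google branch probably attaches call-level options instead. This is a minimal sketch, not opencode's actual applyCaching code, and the providerOptions.google.cachedContent key is an assumption to verify against the installed @ai-sdk/google / @ai-sdk/google-vertex version.

```ts
// Hypothetical helper, not the real applyCaching: builds the call-level
// options a Gemini request would need. Verify the exact option names against
// the @ai-sdk/google docs before relying on them.
export function googleCachingOptions(providerID: string, cacheName?: string) {
  // Only Gemini models served via @ai-sdk/google or @ai-sdk/google-vertex
  // need this branch; Anthropic/Bedrock/OpenAI keep their existing paths.
  if (providerID !== "google" && providerID !== "google-vertex") return {}

  // Implicit caching on newer Gemini models requires no option at all, so an
  // empty object is a valid result when no explicit cache entry exists.
  if (!cacheName) return {}

  return {
    providerOptions: {
      google: {
        // Assumed option name; references a cache created ahead of time,
        // e.g. "cachedContents/abc123".
        cachedContent: cacheName,
      },
    },
  }
}
```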

2. Usage Tracking

In packages/opencode/src/session/index.ts, the getUsage function needs to be updated to handle Gemini's metadata:

  • Currently, it handles anthropic- and bedrock-specific metadata for cache writes.
  • We need to add logic that extracts cachedInputTokens (or the equivalent Google metadata) so costs are calculated accurately and displayed correctly in the TUI/Stats; a sketch follows this list.
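
As a sketch of the cost side, assuming the AI SDK's normalized usage shape (inputTokens, outputTokens, cachedInputTokens) and assuming Google reports the cached count as a subset of the total prompt count (unlike Anthropic, which reports cache reads separately), the calculation could look roughly like this; the field semantics need to be confirmed against the metadata opencode actually receives:

```ts
// Illustrative only; field names mirror the AI SDK's normalized usage shape
// and the rates are dollars per 1M tokens.
interface UsageLike {
  inputTokens?: number
  outputTokens?: number
  cachedInputTokens?: number
}

interface Rates {
  input: number // $/1M uncached input tokens
  cacheRead: number // $/1M cached input tokens
  output: number // $/1M output tokens
}

export function estimateGeminiCost(usage: UsageLike, rates: Rates): number {
  const cached = usage.cachedInputTokens ?? 0
  const output = usage.outputTokens ?? 0
  // Assumption: Google counts cached tokens inside the total prompt count,
  // so subtract them before applying the uncached rate.
  const uncached = Math.max((usage.inputTokens ?? 0) - cached, 0)
  return (uncached * rates.input + cached * rates.cacheRead + output * rates.output) / 1_000_000
}
```

With the rates quoted in the Context section below ($0.50/1M uncached vs $0.05/1M cached), a fully cached 1M-token prompt drops from $0.50 to $0.05, which is where the 10x figure comes from.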

3. LLM Orchestration

In packages/opencode/src/session/llm.ts, verify that the streamText call correctly passes through the necessary providerOptions or experimental_cache parameters required by the AI SDK for Google context caching.
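
For reference, a hedged sketch of what that pass-through could look like at the call site, assuming a recent AI SDK streamText signature and that explicit caches are referenced via providerOptions.google.cachedContent (the cache id below is a placeholder, not a real value):

```ts
import { streamText } from "ai"
import { google } from "@ai-sdk/google"

// Sketch only: the cachedContent value stands in for a cache entry created
// ahead of time; implicit caching on newer Gemini models needs no
// providerOptions at all.
const result = streamText({
  model: google("gemini-2.5-flash"),
  messages: [{ role: "user", content: "Summarize the cached project context." }],
  providerOptions: {
    google: { cachedContent: "cachedContents/example-cache-id" },
  },
})

for await (const chunk of result.textStream) process.stdout.write(chunk)
```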

Context

Gemini 3 Flash offers a 10x cost improvement for cached tokens ($0.05/1M vs $0.50/1M), making this optimization essential for high-volume usage. Reference: Gemini Pricing & Features.
