
Does EXPLICIT prompt caching work in Gemini 2.5 Pro? #4482

@zerocorebeta


Issue

Currently, aider appears to rely on the following behavior:

Caching automatically occurs when subsequent requests contain the identical text, images, and cache_control parameter as the first request. All requests must also include the cache_control parameter in the same blocks.

This is how prompt caching works with Anthropic models.
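
For illustration, a minimal sketch of that Anthropic-style flow (the model name and prompt text are placeholders, not taken from aider's code):

import anthropic

client = anthropic.Anthropic()

# The large, stable prefix is marked with cache_control; resending the
# identical block in later requests lets the API reuse the cached prefix.
response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # placeholder model name
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "LARGE_REPO_MAP_OR_CONTEXT",  # placeholder stable prefix
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Explain the main module."}],
)
print(response.content[0].text)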

But the Vertex SDK documentation gives this example for explicit prompt caching:

import google.generativeai as genai

# cached_content is assumed to have been created beforehand
# (e.g. with genai.caching.CachedContent.create(...)).

# Create a model instance bound to the cached content
model = genai.GenerativeModel.from_cached_content(cached_content=cached_content)

# Now, generate content using the cache.
# The content in the cache will be used as a prefix to your prompt.
response = model.generate_content("Based on the document, what is the main topic?")

print(response.text)

There is no mention of whether sending the same content without calling genai.GenerativeModel.from_cached_content(cached_content=cached_content) will still result in the cache being used effectively.
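
For context, the explicit flow in the google-generativeai SDK looks roughly like the sketch below. The model name, TTL, display name, and document contents are placeholders; whether aider would need to create and manage such a cache object for Gemini 2.5 Pro is exactly the open question here.

import datetime
import google.generativeai as genai
from google.generativeai import caching

genai.configure(api_key="YOUR_API_KEY")  # placeholder

# Explicit caching: the large, stable context is uploaded once with a TTL.
cached_content = caching.CachedContent.create(
    model="models/gemini-1.5-pro-001",        # placeholder; caching requires an explicit model version
    display_name="repo-context-cache",        # placeholder name
    system_instruction="You are a coding assistant.",
    contents=["LARGE_DOCUMENT_OR_REPO_MAP"],  # placeholder content
    ttl=datetime.timedelta(minutes=30),
)

# Later requests reference the cache by handle instead of resending the content.
model = genai.GenerativeModel.from_cached_content(cached_content=cached_content)
response = model.generate_content("Based on the document, what is the main topic?")
print(response.text)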

Version and model info

No response
