How to implement prompt caching for sub-agents in Google ADK async workflow? #3368
Replies: 1 comment
Response from ADK Answering Agent (experimental, answer may be inaccurate)

TLDR: Google ADK offers a built-in "Context Caching" feature that is likely what you're looking for. It's configured at the App level.

Hello! Great question on optimizing your multi-agent workflow. Here's a breakdown of how you can implement caching in ADK.

Built-in Context Caching

Google ADK provides a built-in mechanism called Context Caching. This feature is designed to reduce latency and costs by caching the LLM prompt and response context, which is especially useful for agents with large instructions or in multi-turn scenarios. [1, 2]

You can enable this by wrapping your root agent in an App. Here is how you can apply it to your setup:

from google.adk.agents.context_cache_config import ContextCacheConfig
from google.adk.apps.app import App
# Your existing agent setup
# SupervisorAgent = LlmAgent(...)
# 1. Define the cache configuration
cache_config = ContextCacheConfig(
    min_tokens=4096,    # Only cache if the context is larger than this
    ttl_seconds=600,    # Cache entries expire after 10 minutes
    cache_intervals=3,  # Force a refresh every 3 invocations
)
# 2. Wrap your root agent in an App with the cache config
supervisor_app = App(
    name="supervisor_app",
    root_agent=SupervisorAgent,
    context_cache_config=cache_config,
)
# 3. Run the app (instead of the agent directly)
# Note: You'll need an AppRunner to run the App
# await runner.run_async(input_data)  # this call will now go through the App's runner

You can see a full example of this implementation in the ADK samples. [1]

Custom Caching with Plugins

If the built-in context caching doesn't fit your needs and you specifically want to integrate an external cache like Redis or Memorystore, the recommended ADK-native approach is to use Plugins. Plugins allow you to introduce custom, cross-cutting logic (like caching, logging, or authentication) into the agent's execution lifecycle without modifying the core agent logic. [3] You could create a plugin that intercepts each model request, checks the external cache for a stored response, and either returns the hit (skipping the LLM call) or stores the fresh response after the call; see the sketch below.
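Here is a minimal sketch of such a plugin. It assumes the BasePlugin before_model_callback / after_model_callback hooks; the class name, the cache-key logic, and the in-memory dict (a stand-in for a Redis or Memorystore client) are all illustrative, so verify the callback signatures against your installed ADK version.

from typing import Optional

from google.adk.agents.callback_context import CallbackContext
from google.adk.models.llm_request import LlmRequest
from google.adk.models.llm_response import LlmResponse
from google.adk.plugins.base_plugin import BasePlugin


class PromptCachePlugin(BasePlugin):
    """Illustrative caching plugin: skips the LLM call on a cache hit."""

    def __init__(self) -> None:
        super().__init__(name="prompt_cache")
        self._cache: dict[str, LlmResponse] = {}  # stand-in for Redis/Memorystore

    async def before_model_callback(
        self, *, callback_context: CallbackContext, llm_request: LlmRequest
    ) -> Optional[LlmResponse]:
        # To cache only certain sub-agents, you could gate here on the calling
        # agent, e.g. callback_context.agent_name in ("sub_agent2", "sub_agent3").
        # Simplistic key; a real key should also cover the model name, tools,
        # and system instruction.
        key = str(llm_request.contents)
        cached = self._cache.get(key)
        if cached is not None:
            return cached  # returning a response here skips the real LLM call
        callback_context.state["temp:prompt_cache_key"] = key
        return None  # fall through to the real LLM call

    async def after_model_callback(
        self, *, callback_context: CallbackContext, llm_response: LlmResponse
    ) -> Optional[LlmResponse]:
        # Store the fresh response under the key recorded above.
        key = callback_context.state.get("temp:prompt_cache_key")
        if key is not None:
            self._cache[key] = llm_response
        return None  # keep the original response unchanged

You would then register the plugin wherever you construct your runner, for example:

from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService

runner = Runner(
    agent=SupervisorAgent,
    app_name="supervisor_app",
    session_service=InMemorySessionService(),
    plugins=[PromptCachePlugin()],
)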
This approach gives you full control over the caching strategy and is the standard way to extend ADK with custom behaviors.

[1] ADK Cache Analysis Sample Agent
Original question:
I’m working on a multi-agent workflow using Google ADK where a root_agent coordinates three sub-agents.
Here’s a simplified version of how I’ve set up the agents:
from google.adk.agents import LlmAgent
from google.adk.models.lite_llm import LiteLlm

SupervisorAgent = LlmAgent(
    name="root_agent",
    model=LiteLlm(model=llm_model),
    description="Coordinates and monitors tasks across sub-agents.",
    instruction=instruction,
    sub_agents=[sub_agent1, sub_agent2, sub_agent3],
)
The workflow runs asynchronously using:
await runner.run_async(input_data)
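(For reference, run_async in recent ADK versions is an async generator that takes keyword arguments, so the fuller call looks roughly like this; the user and session identifiers below are illustrative and assume a session created earlier:)

from google.genai import types

async for event in runner.run_async(
    user_id="user-1",          # illustrative
    session_id=session.id,     # assumes an existing session object
    new_message=types.Content(role="user", parts=[types.Part(text=input_data)]),
):
    if event.is_final_response() and event.content:
        print(event.content.parts[0].text)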
Now, I want to implement prompt caching specifically for sub_agent2 and sub_agent3 to reduce redundant LLM calls and improve performance on repeated or similar tasks.
I have a few questions on how best to integrate caching in this setup:
Does Google ADK currently provide a built-in caching mechanism for LLM prompts/responses? If so, how do I implement it in this workflow?
If not, what’s the recommended approach to integrate custom caching (e.g., using Redis or Google Cloud Memorystore) within this async agent workflow?
I'm using an up-to-date ADK version.