Google Gemini model integration for Amplifier via Google AI API.
The simplest way to use Gemini with Amplifier. Once installed, Amplifier will automatically discover it.
- Set your API key:
export GOOGLE_API_KEY="your-api-key-here"
Get your API key from Google AI Studio.
- Add the module:
amplifier module add provider-gemini --source git+https://github.com/microsoft/amplifier-module-provider-gemini@main --global
- Use Gemini as your provider:
amplifier provider use gemini --global
- Start using it:
amplifier run "Hello from Gemini!"
That's it! The module is now available for all your projects. Use --project instead of --global to install for just the current project.
For more control over configuration or to compose with other capabilities, use a bundle:
- Set your API key:
export GOOGLE_API_KEY="your-api-key-here"
Get your API key from Google AI Studio.
- Create a bundle in your project or home directory (e.g., gemini-bundle/bundle.md):

    ---
    bundle:
      name: gemini-dev
      version: 1.0.0
      description: Gemini provider with full 1M context

    includes:
      - bundle: foundation

    session:
      context:
        config:
          max_tokens: 1048576  # Full 1M input context

    providers:
      - module: provider-gemini
        source: git+https://github.com/microsoft/amplifier-module-provider-gemini@main
        config:
          default_model: gemini-2.5-flash
          max_tokens: 65536  # Full 65K output capacity
          temperature: 0.7
          priority: 50  # Lower number = higher priority (beats default 100)
    ---

    # Gemini Development Bundle

    This bundle configures Gemini with full context windows and includes foundation capabilities.

    ## Available Models

    - **Gemini Flash** - `gemini-2.5-flash` - Balanced performance with 1M token context
    - **Gemini Flash-Lite** - `gemini-2.5-flash-lite` - Fastest and most cost-efficient model
    - **Gemini Pro** - `gemini-2.5-pro` - Most powerful model with extended thinking capabilities
    - **Gemini 3.0 (Preview)** - `gemini-3-pro-preview` - Best model for advanced reasoning and text generation

- Use it:
amplifier run --bundle ./gemini-bundle "Hello from Gemini!"
- Python 3.11+
- UV - Fast Python package manager
# macOS/Linux/WSL
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

Provides access to Google's Gemini models as an LLM provider for Amplifier with 1M token context windows and extended thinking capabilities.
Module Type: Provider
Mount Point: providers
Entry Point: amplifier_module_provider_gemini:mount
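The entry point uses the common `module:attribute` notation. As a rough illustration of how such a string can be resolved with the standard library, here is a minimal sketch. `resolve_entry_point` is a hypothetical helper, not Amplifier's actual loader, and the usage line targets a stdlib function so it runs without the provider installed.

```python
import importlib
from typing import Any


def resolve_entry_point(spec: str) -> Any:
    """Resolve a 'package.module:attribute' string to the object it names."""
    module_name, sep, attr_name = spec.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError(f"expected 'module:attribute', got {spec!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Stand-in target from the standard library. With the provider installed,
# the spec would be "amplifier_module_provider_gemini:mount".
dumps = resolve_entry_point("json:dumps")
```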
Current support: Text generation, tool calling, and thinking. Multimodal capabilities (images, video, audio) are not yet implemented.
- gemini-3-pro-preview - Best model for advanced reasoning and text generation (1M context, 65K max output)
- gemini-2.5-flash - Best price-performance for large-scale processing (1M context, 65K max output, default)
- gemini-2.5-pro - State-of-the-art thinking model for complex reasoning (1M context, 65K max output)
- gemini-2.5-flash-lite - Fastest model optimized for cost-efficiency (1M context, 65K max output)
- gemini-2.0-flash - Well-rounded capabilities with focus on price-performance (1M context, 8K max output)
- gemini-2.0-flash-lite - Optimized for cost efficiency and low latency (1M context, 8K max output)
Note: Image/video/audio models not listed as the provider doesn't support multimodal capabilities yet.
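Because the 2.0 models have a smaller output limit than the 2.5 and 3 models, a `max_tokens` value tuned for one model may be too large for another. The sketch below shows one way to cap a requested value per model. The exact token counts behind "65K" and "8K" (65536 and 8192) are assumptions, `clamp_max_tokens` is a hypothetical helper, and nothing here claims the provider performs this clamping itself.

```python
# Output limits transcribed from the model list above. The exact counts
# for "65K" and "8K" are assumed to be 65536 and 8192.
MAX_OUTPUT_TOKENS = {
    "gemini-3-pro-preview": 65536,
    "gemini-2.5-flash": 65536,
    "gemini-2.5-pro": 65536,
    "gemini-2.5-flash-lite": 65536,
    "gemini-2.0-flash": 8192,
    "gemini-2.0-flash-lite": 8192,
}


def clamp_max_tokens(model: str, requested: int) -> int:
    """Return the requested output budget, capped at the model's limit."""
    if requested <= 0:
        raise ValueError("requested must be positive")
    limit = MAX_OUTPUT_TOKENS.get(model)
    if limit is None:
        return requested  # unknown model: leave the value untouched
    return min(requested, limit)
```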
[[providers]]
module = "provider-gemini"
name = "gemini"
config = { default_model = "gemini-2.5-flash", max_tokens = 8192, temperature = 0.7, debug = false, raw_debug = false }

Standard Debug (debug: true):
- Emits llm:request:debug and llm:response:debug events
- Contains request/response summaries with truncated values (default 180 chars)
- Moderate log volume, suitable for development
Raw Debug (debug: true, raw_debug: true):
- Emits llm:request:raw and llm:response:raw events
- Contains complete, unmodified request params and response objects
- Extreme log volume, use only for deep provider integration debugging
- Captures the exact data sent to/from Gemini API before any processing
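To make the difference between the two levels concrete, the following sketch shows what truncating values in a debug summary could look like. It is an illustration only: `truncate_values` is a hypothetical helper, and the suffix appended to shortened strings is an assumption, not the provider's actual output format.

```python
from typing import Any


def truncate_values(value: Any, max_length: int = 180) -> Any:
    """Recursively shorten long strings inside dicts and lists."""
    if isinstance(value, str):
        if len(value) <= max_length:
            return value
        # Suffix format is an assumption made for this sketch.
        return value[:max_length] + f"... ({len(value) - max_length} more chars)"
    if isinstance(value, dict):
        return {k: truncate_values(v, max_length) for k, v in value.items()}
    if isinstance(value, list):
        return [truncate_values(v, max_length) for v in value]
    return value  # numbers, booleans, None pass through unchanged
```

A raw debug event, by contrast, would carry the payload without passing it through a step like this.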
Example:
providers:
  - module: provider-gemini
    config:
      debug: true  # Enable debug events
      raw_debug: true  # Enable raw API I/O capture
      debug_truncate_length: 180  # Control truncation length
      default_model: gemini-2.5-flash

| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | string | env: GOOGLE_API_KEY | Google AI API key |
| default_model | string | gemini-2.5-flash | Default model to use |
| max_tokens | int | 8192 | Maximum output tokens |
| temperature | float | 0.7 | Sampling temperature (0.0-1.0) |
| timeout | float | 300.0 | API timeout in seconds |
| priority | int | 100 | Provider selection priority |
| debug | bool | false | Enable debug-level logging with truncated values |
| raw_debug | bool | false | Enable ultra-verbose raw API I/O logging (requires debug=true) |
| debug_truncate_length | int | 180 | Maximum string length in debug logs |
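The defaults in the table can be read as a merge: user settings override defaults, `api_key` falls back to the environment, and the provider with the lowest `priority` number wins. The sketch below illustrates that reading. `resolve_config` and `pick_provider` are hypothetical helpers, and treating `raw_debug` as disabled when `debug` is off is an assumption about how the "requires debug=true" rule is enforced.

```python
import os
from typing import Any, Mapping

# Defaults copied from the parameter table.
DEFAULTS: dict[str, Any] = {
    "default_model": "gemini-2.5-flash",
    "max_tokens": 8192,
    "temperature": 0.7,
    "timeout": 300.0,
    "priority": 100,
    "debug": False,
    "raw_debug": False,
    "debug_truncate_length": 180,
}


def resolve_config(user_config: Mapping[str, Any],
                   env: Mapping[str, str] = os.environ) -> dict[str, Any]:
    """Merge user settings over the documented defaults."""
    config = {**DEFAULTS, **user_config}
    # api_key falls back to the GOOGLE_API_KEY environment variable.
    config.setdefault("api_key", env.get("GOOGLE_API_KEY"))
    # Assumption: raw_debug has no effect unless debug is also enabled.
    config["raw_debug"] = bool(config["raw_debug"] and config["debug"])
    return config


def pick_provider(configs: Mapping[str, Mapping[str, Any]]) -> str:
    """Return the provider name with the lowest priority number."""
    return min(configs, key=lambda name: configs[name].get("priority", 100))
```

For example, a Gemini entry with `priority: 50` would be chosen over another provider left at the default of 100.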
export GOOGLE_API_KEY="your-api-key-here"

Get your API key from Google AI Studio.
# In amplifier configuration
[provider]
name = "gemini"
default_model = "gemini-2.5-flash"

For advanced configuration, add Gemini to any bundle. These examples show the YAML configuration section (the frontmatter between --- markers in your bundle.md file):
Basic Configuration:
providers:
  - module: provider-gemini
    source: git+https://github.com/microsoft/amplifier-module-provider-gemini@main
    config:
      default_model: gemini-2.5-flash
      max_tokens: 65536  # Use full 65K output capacity
      temperature: 0.7
      priority: 50  # IMPORTANT: Lower number = higher priority (beats default 100)

Balanced (1M context, cost-effective):
providers:
  - module: provider-gemini
    source: git+https://github.com/microsoft/amplifier-module-provider-gemini@main
    config:
      default_model: gemini-2.5-flash
      max_tokens: 65536  # Full 65K output capacity
      priority: 50  # Lower number = higher priority

Thinking (complex reasoning with full 1M context):
session:
  context:
    config:
      max_tokens: 1048576  # Full 1M input context
  orchestrator:
    module: loop-streaming
    source: git+https://github.com/microsoft/amplifier-module-loop-streaming@main
    config:
      extended_thinking: true  # Show thinking content
providers:
  - module: provider-gemini
    source: git+https://github.com/microsoft/amplifier-module-provider-gemini@main
    config:
      default_model: gemini-2.5-pro
      max_tokens: 65536  # Full 65K output capacity
      temperature: 1.0
      priority: 50  # Lower number = higher priority

Fast (simple queries, low cost):
providers:
  - module: provider-gemini
    source: git+https://github.com/microsoft/amplifier-module-provider-gemini@main
    config:
      default_model: gemini-2.5-flash-lite
      max_tokens: 65536  # Full 65K output capacity
      temperature: 0.5
      priority: 50  # Lower number = higher priority

- Text Generation - Single and multi-turn conversations
- Tool/Function Calling - OpenAPI schema format
- Extended Thinking - Reasoning with adjustable token budget
- Streaming Support - Incremental response generation
- 1M Token Context - Process extremely large inputs (Flash models)
- Message Validation - Defense-in-depth error checking
Gemini 2.5 models (Pro and Flash) think by default using dynamic token budgets. The provider automatically captures thinking content from the Gemini API.
To display thinking output, configure your orchestrator (not the provider):
session:
  orchestrator:
    module: loop-streaming  # Required for thinking display
    source: git+https://github.com/microsoft/amplifier-module-loop-streaming@main
    config:
      extended_thinking: true  # Show thinking content to user

Model thinking behavior:
- gemini-2.5-pro: Thinks by default (best for complex reasoning)
- gemini-2.5-flash: Thinks by default (good for most tasks)
- gemini-2.5-flash-lite: Does NOT think by default
Note: The provider captures thinking from the API automatically. The orchestrator's extended_thinking: true config controls whether it's displayed. Without this config, thinking still happens but isn't shown to the user.
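The split between capturing and displaying can be pictured as a filter applied at display time. The sketch below assumes captured content arrives as a list of dict blocks with a `type` field, where `"thinking"` marks reasoning content. That block shape and the `visible_blocks` helper are assumptions for illustration, not the orchestrator's actual code.

```python
from typing import Any


def visible_blocks(blocks: list[dict[str, Any]],
                   extended_thinking: bool = False) -> list[dict[str, Any]]:
    """Filter captured content blocks down to what is shown to the user."""
    if extended_thinking:
        return list(blocks)
    # Thinking was still captured; it is only hidden from display.
    return [b for b in blocks if b.get("type") != "thinking"]


captured = [
    {"type": "thinking", "text": "Compare both options first..."},
    {"type": "text", "text": "Option B is cheaper."},
]
```

With `extended_thinking=False` only the text block would be shown, while both blocks remain in `captured`.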
Functions are declared using OpenAPI schema format:
tools = [{
    "name": "get_weather",
    "description": "Get weather for a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name"}
        },
        "required": ["location"]
    }
}]

The provider handles tool call marshaling and response integration automatically.
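One practical use of the `required` list in a declaration is checking a model's tool-call arguments before running the tool. The sketch below is a hypothetical helper for that check, written against the same declaration shape; it is not part of the provider.

```python
from typing import Any

get_weather_tool: dict[str, Any] = {
    "name": "get_weather",
    "description": "Get weather for a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name"},
        },
        "required": ["location"],
    },
}


def missing_required(tool: dict[str, Any], arguments: dict[str, Any]) -> list[str]:
    """Return the required parameter names absent from the call arguments."""
    required = tool.get("parameters", {}).get("required", [])
    return [name for name in required if name not in arguments]
```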
The provider implements automatic repair for incomplete tool call sequences:
The Problem: If tool results are missing from conversation history (due to context compaction bugs, parsing errors, or state corruption), the Gemini API rejects the entire request, breaking the user's session.
The Solution: The provider automatically detects and repairs missing tool_results by injecting synthetic results:
- Repair before API call - Detects missing tool_results and injects synthetic ones
- Make failures visible - Synthetic results contain [SYSTEM ERROR: Tool result missing] messages
- Maintain conversation validity - API accepts repaired messages, session continues
- Enable recovery - LLM acknowledges error and can ask user to retry
- Provide observability - Emits provider:tool_sequence_repaired event with repair details
Example:
# Conversation with missing tool result
messages = [
    {
        "role": "assistant",
        "content": [
            {"type": "tool_call", "id": "gemini_call_abc123", "name": "get_weather", "input": {...}}
        ]
    },
    # MISSING: {"role": "tool", "tool_call_id": "gemini_call_abc123", "content": "..."}
    {"role": "user", "content": "Thanks"}
]
# Provider repairs by injecting synthetic result:
{
    "role": "tool",
    "tool_call_id": "gemini_call_abc123",
    "name": "get_weather",
    "content": "[SYSTEM ERROR: Tool result missing from conversation history]\n\nTool: get_weather\nCall ID: gemini_call_abc123\n\nThis indicates the tool result was lost after execution.\nLikely causes: context compaction bug, message parsing error, or state corruption.\n\nThe tool may have executed successfully, but the result was lost.\nPlease acknowledge this error and offer to retry the operation."
}

This is a defense-in-depth safety net. The orchestrator should handle tool execution errors at runtime, so this repair only triggers when results go missing due to bugs in context management.
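The detect-and-inject step can be sketched in a few lines. This is an illustrative reconstruction based on the message shapes in the example above, not the provider's implementation: `repair_tool_sequence` is a hypothetical name, the synthetic content is shortened, and placing the synthetic result directly after the assistant turn is an assumption.

```python
from typing import Any

Message = dict[str, Any]


def repair_tool_sequence(messages: list[Message]) -> tuple[list[Message], list[str]]:
    """Insert a synthetic tool result for every tool call that has none."""
    # IDs that already have a matching tool result anywhere in the history.
    answered = {m.get("tool_call_id") for m in messages if m.get("role") == "tool"}
    repaired: list[Message] = []
    repaired_ids: list[str] = []
    for message in messages:
        repaired.append(message)
        content = message.get("content")
        if message.get("role") != "assistant" or not isinstance(content, list):
            continue
        for block in content:
            if block.get("type") != "tool_call" or block["id"] in answered:
                continue
            repaired.append({
                "role": "tool",
                "tool_call_id": block["id"],
                "name": block["name"],
                "content": "[SYSTEM ERROR: Tool result missing from conversation history]",
            })
            repaired_ids.append(block["id"])
    return repaired, repaired_ids
```

The returned list of repaired IDs is the kind of detail an event such as provider:tool_sequence_repaired could carry.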
The Gemini API does not provide tool call IDs (unlike Anthropic and OpenAI). The provider generates synthetic IDs using the format gemini_call_{uuid} to maintain compatibility with Amplifier's tool protocol.
Impact: Tool call IDs are unique and functional but not provided by the API itself. This is transparent to users but documented for debugging purposes.
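A minimal sketch of generating IDs in the gemini_call_{uuid} format is shown below. Rendering the UUID as 32 hex characters without hyphens is an assumption, and `new_tool_call_id` is a hypothetical name; the provider may format the UUID differently.

```python
import uuid


def new_tool_call_id() -> str:
    """Generate a synthetic tool call ID in the gemini_call_{uuid} format."""
    # uuid4 is random, so collisions are negligible in practice.
    return f"gemini_call_{uuid.uuid4().hex}"
```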
The provider implements text generation, tool calling, and thinking support. Multimodal capabilities (images, video, audio) are not yet supported.
google-genai>=1.40.0 - Official Google AI Python SDK
Test your local provider changes with the installed amplifier CLI:
# From the provider repository root
cd amplifier-module-provider-gemini
# Add your local provider (use --local for development, not --global)
amplifier module add provider-gemini --source file://. --local
# Set your API key
export GOOGLE_API_KEY="your-api-key-here"
# Now you can use amplifier init and see your local provider in the menu
amplifier init
# Or configure it directly
amplifier provider use gemini --local
# Test your local changes
amplifier run "Hello, testing local Gemini provider!"
# List modules to verify your local provider is registered
amplifier module list -t provider

When you're done testing:
# Remove the local provider registration
amplifier module remove provider-gemini --local

Why --local instead of --global?
- --local registers the provider only for the current working directory
- --global would affect all your projects (not ideal during development)
- --project works if you want to share with your team
For rapid iteration without the CLI:
cd amplifier-module-provider-gemini
# Install dependencies
uv sync --dev
# Run validation tests (protocol compliance)
uv run pytest
# Test with coverage
uv run pytest --cov

These tests validate that the provider implements the required protocol without needing the full CLI.
Note
This project is not currently accepting external contributions, but we're actively working toward opening this up. We value community input and look forward to collaborating in the future. For now, feel free to fork and experiment!
Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit Contributor License Agreements.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.