A comprehensive collection of runnable examples for building voice, video, and telephony agents with LiveKit
This repository contains everything you need to learn and build production-ready voice AI agents using LiveKit Agents. From single-file quickstarts to multi-agent orchestration systems with companion frontends, these examples demonstrate real-world patterns and best practices.
python-agents-examples/
├── docs/examples/ # 50+ focused, single-concept demos
└── complex-agents/ # 20+ production-style applications with frontends
Every example includes YAML frontmatter metadata (title, category, tags, difficulty, description) for easy discovery by both humans and tooling.
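
Because the metadata is machine-readable, tooling can pull it out with a few lines of Python. The sketch below uses only the standard library. It assumes the frontmatter is a block delimited by `---` lines near the top of each file and holds only flat `key: value` pairs and inline `[a, b]` lists. The `parse_frontmatter` helper and the sample text are illustrative, not part of the repository; a real tool would more likely use a full YAML parser.

```python
def parse_frontmatter(source: str) -> dict:
    """Parse a flat `key: value` block delimited by `---` lines.

    Hypothetical helper: handles only simple scalars and inline lists,
    which is an assumption about how the examples are annotated.
    """
    lines = source.splitlines()
    try:
        # Locate the opening and closing `---` delimiters.
        start = next(i for i, line in enumerate(lines) if line.strip() == "---")
        end = next(i for i in range(start + 1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        return {}  # no complete frontmatter block found

    metadata = {}
    for line in lines[start + 1:end]:
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not `key: value`
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            # Inline list such as `tags: [a, b]`.
            metadata[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            metadata[key.strip()] = value
    return metadata


# Illustrative sample using the metadata fields listed above.
SAMPLE = '''"""
---
title: Tool Calling
category: basics
tags: [tool-calling, deepgram, openai, cartesia]
difficulty: beginner
description: Shows how to use tool calling in an agent.
---
"""
'''

meta = parse_frontmatter(SAMPLE)
```

Filtering a set of parsed files by `meta["difficulty"]` or by membership in `meta["tags"]` then becomes an ordinary list comprehension.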
| Requirement | Version | Notes |
|---|---|---|
| Python | 3.10+ | Required |
| pip / uv | Latest | Package management |
| LiveKit Account | N/A | Sign up free |
| Node.js | 18+ | Only for frontend demos |
| pnpm | Latest | Only for frontend demos |
# Clone the repository
git clone https://github.com/livekit-examples/python-agents-examples.git
cd python-agents-examples
# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
Create a .env file in the repository root:
# Required - LiveKit credentials
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your_api_key
LIVEKIT_API_SECRET=your_api_secret
# Provider keys (add as needed for specific examples)
OPENAI_API_KEY=sk-...
DEEPGRAM_API_KEY=...
CARTESIA_API_KEY=...
ELEVENLABS_API_KEY=...
ANTHROPIC_API_KEY=...
# Start an interactive voice session
python docs/examples/listen_and_respond/listen_and_respond.py console
The console argument opens an interactive terminal session where you can speak or type to the agent.
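
Missing credentials are a likely cause of a failed first run. As a quick pre-flight check, the sketch below reports which of the three LiveKit variables from the .env template are unset. The `missing_vars` helper is hypothetical. It reads only the process environment, so it assumes the .env values have already been exported or loaded (for example by a dotenv loader).

```python
import os

# Variable names taken from the .env template above.
REQUIRED_VARS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


def missing_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All LiveKit credentials are set.")
```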
Start here to understand core agent concepts.
| Example | Description | Difficulty |
|---|---|---|
| Listen and Respond | The simplest voice agent: listens and responds | Beginner |
| Tool Calling | Add function tools agents can invoke | Beginner |
| Context Variables | Inject user context into agent instructions | Beginner |
| Playing Audio | Play audio files within an agent | Beginner |
| Repeater | Echo back exactly what the user says | Beginner |
| Uninterruptable | Complete responses without interruptions | Beginner |
| Exit Message | Handle graceful session endings | Beginner |
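
The Context Variables pattern amounts to filling the agent's instructions with per-user data before the session starts. Here is a minimal sketch of that idea, independent of the LiveKit API. The template text, the placeholder names, and the `build_instructions` helper are all illustrative assumptions.

```python
from string import Template

# Hypothetical instruction template; the placeholder names are illustrative.
INSTRUCTIONS = Template(
    "You are a helpful voice assistant. The caller's name is $name "
    "and they live in $city. Greet them by name."
)


def build_instructions(context: dict) -> str:
    """Fill the template, leaving unknown placeholders untouched."""
    # safe_substitute avoids a KeyError when a field is missing.
    return INSTRUCTIONS.safe_substitute(context)


prompt = build_instructions({"name": "Ada", "city": "London"})
```

The resulting string would then be passed to the agent as its instructions, however the chosen example wires that up.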
Build complex workflows with multiple specialized agents.
| Example | Description | Difficulty |
|---|---|---|
| Agent Transfer | Switch between agents mid-call using function tools | Intermediate |
| Medical Office Triage | Multi-department routing with context preservation | Advanced |
| Personal Shopper | E-commerce with triage, sales, and returns agents | Advanced |
| Doheny Surf Desk | Phone booking system with background observer agent | Advanced |
Voice AI for phone systems.
| Example | Description | Difficulty |
|---|---|---|
| Answer Call | Basic inbound call handling | Beginner |
| Make Call | Outbound calling via SIP trunks | Beginner |
| Warm Handoff | Transfer calls to human agents | Intermediate |
| SIP Lifecycle | Complete call lifecycle management | Advanced |
| Survey Caller | Automated surveys with CSV data collection | Intermediate |
| IVR Navigator | Navigate phone menus using DTMF | Advanced |
Intercept and modify the STT → LLM → TTS pipeline.
| Example | Description | Difficulty |
|---|---|---|
| Simple Content Filter | Keyword-based output filtering | Beginner |
| LLM Content Filter | Dual-LLM moderation system | Advanced |
| TTS Node Override | Custom text replacements before speech | Intermediate |
| Transcription Node | Modify transcriptions before LLM | Intermediate |
| Short Replies Only | Interrupt verbose responses | Beginner |
| LLM Output Replacement | Strip thinking tags from reasoning models | Intermediate |
Agents that can see.
| Example | Description | Difficulty |
|---|---|---|
| Gemini Live Vision | Real-time vision with Gemini 2.0 | Beginner |
| Vision Agent | Camera vision with Grok-2 Vision | Intermediate |
| Moondream Vision | Add vision to non-vision LLMs | Intermediate |
Bring your agent to life with animated avatars.
| Example | Description | Difficulty |
|---|---|---|
| Hedra Pipeline Avatar | Static image avatar with pipeline architecture | Intermediate |
| Hedra Realtime Avatar | OpenAI Realtime + Hedra avatar | Intermediate |
| Dynamic Avatar | Create avatars on-the-fly | Intermediate |
| Education Avatar | Teaching avatar with flash cards via RPC | Advanced |
| Tavus Avatar | Tavus-powered avatar assistant | Intermediate |
Break language barriers.
| Example | Description | Difficulty |
|---|---|---|
| Pipeline Translator | English → French voice translation | Intermediate |
| TTS Translator | Advanced translation with Gladia code-switching | Advanced |
| Change Language | Dynamic language switching via function tools | Intermediate |
Monitor and debug your agents.
| Example | Description | Difficulty |
|---|---|---|
| LLM Metrics | Token counts, TTFT, throughput | Beginner |
| STT Metrics | Transcription timing and errors | Beginner |
| TTS Metrics | Speech synthesis performance | Beginner |
| VAD Metrics | Voice activity detection stats | Beginner |
| Langfuse Tracing | Full session tracing with Langfuse | Intermediate |
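
Two of the LLM metrics listed above are simple derived quantities. Time to first token (TTFT) is the delay between sending the request and receiving the first token. Throughput is the number of tokens generated per second. Here is a sketch of the arithmetic, using a hypothetical `LLMTiming` record rather than LiveKit's own metrics types. It assumes throughput is measured over the whole request duration; some tools measure it from the first token instead.

```python
from dataclasses import dataclass


@dataclass
class LLMTiming:
    """Hypothetical timing record; all times are in seconds."""

    request_sent: float
    first_token: float
    last_token: float
    completion_tokens: int

    @property
    def ttft(self) -> float:
        """Time to first token."""
        return self.first_token - self.request_sent

    @property
    def tokens_per_second(self) -> float:
        # Assumption: throughput covers the full request duration.
        duration = self.last_token - self.request_sent
        return self.completion_tokens / duration if duration > 0 else 0.0


timing = LLMTiming(request_sent=0.0, first_token=0.25, last_token=2.0, completion_tokens=100)
```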
React to conversation events and manage state.
| Example | Description | Difficulty |
|---|---|---|
| Basic Events | Register event listeners with on/off/once | Beginner |
| Event Emitters | Custom event handling patterns | Beginner |
| Conversation Monitoring | Log and inspect conversation events | Beginner |
| State Tracking | Complex NPC state with rapport system | Advanced |
| RPC State Management | CRUD operations over RPC | Advanced |
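
The on/off/once pattern named in the Basic Events row can be illustrated with a small stand-in emitter. This is a sketch of the general pattern, not LiveKit's implementation. The `MiniEmitter` class and the `user_speech` event name are hypothetical.

```python
from collections import defaultdict


class MiniEmitter:
    """Minimal event emitter with on/off/once semantics."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event, handler):
        """Register a handler that runs on every emit."""
        self._handlers[event].append(handler)
        return handler

    def off(self, event, handler):
        """Unregister a handler; unknown handlers are ignored."""
        if handler in self._handlers[event]:
            self._handlers[event].remove(handler)

    def once(self, event, handler):
        """Register a handler that runs on the next emit only."""
        def wrapper(*args):
            self.off(event, wrapper)  # unregister before the first call
            handler(*args)
        return self.on(event, wrapper)

    def emit(self, event, *args):
        # Iterate over a copy so handlers may unregister themselves.
        for handler in list(self._handlers[event]):
            handler(*args)


emitter = MiniEmitter()
seen = []
emitter.on("user_speech", lambda text: seen.append(("on", text)))
emitter.once("user_speech", lambda text: seen.append(("once", text)))
emitter.emit("user_speech", "hello")
emitter.emit("user_speech", "again")
```

After the two `emit` calls, the `on` handler has fired twice and the `once` handler a single time.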
Connect to external services.
| Example | Description | Difficulty |
|---|---|---|
| MCP Client (stdio) | Connect to local MCP servers | Beginner |
| MCP Client (HTTP) | Connect to remote MCP servers | Beginner |
| Home Automation | Control smart home devices | Intermediate |
| RAG Voice Agent | Vector search with Annoy + embeddings | Advanced |
| Shopify Voice | Voice shopping with MCP + Shopify | Advanced |
These full-stack applications include both backend agents and React frontends.
Voice-driven D&D RPG with narrator/combat agents, character progression, and turn-based combat.
cd complex-agents/role-playing
python agent.py dev
# In another terminal
cd role_playing_frontend && pnpm install && pnpm dev
Features: Multi-agent switching, dice mechanics, NPC generation, inventory system, combat AI
Phone booking system with background observer agent and task groups.
cd complex-agents/doheny-surf-desk
python agent.py dev
Features: 5 specialized agents, LLM-based guardrails, sequential tasks, context injection
Voice-controlled research agent using EXA for web intelligence.
cd complex-agents/exa-deep-researcher
python agent.py dev
# In another terminal
cd frontend && pnpm install && pnpm dev
Features: Background research jobs, RPC streaming, cited reports
Multi-department medical system with agent transfers.
cd complex-agents/medical_office_triage
python triage.py dev
Features: Triage → Specialist routing, chat history preservation, YAML prompts
Fast food ordering system with menu management.
cd complex-agents/drive-thru/drive-thru-agent
python agent.py dev
# In another terminal
cd ../frontend && pnpm install && pnpm dev
Job application interview with AWS Realtime.
cd complex-agents/nova-sonic
python form_agent.py dev
# In another terminal
cd nova-sonic-form-agent && pnpm install && pnpm dev
Features: AWS Realtime model, structured data collection, live form updates
Examples demonstrate integration with these providers:
| Category | Providers |
|---|---|
| LLM | OpenAI, Anthropic, Google Gemini, Groq, Cerebras, AWS Bedrock, X.AI |
| STT | Deepgram, AssemblyAI, Gladia, Cartesia |
| TTS | Cartesia, ElevenLabs, Rime, PlayAI, Inworld, OpenAI |
| VAD | Silero |
| Avatar | Hedra, Tavus |
| Vision | OpenAI GPT-4V, Google Gemini, X.AI Grok, Moondream |
| Realtime | OpenAI Realtime, Google Gemini Live, AWS Nova Sonic |
The complete catalog lives in docs/index.yaml with metadata for every example:
- file_path: docs/examples/tool_calling/tool_calling.py
  title: Tool Calling
  category: basics
  tags: [tool-calling, deepgram, openai, cartesia]
  difficulty: beginner
  description: Shows how to use tool calling in an agent.
  demonstrates:
    - Using the most basic form of tool calling
# Find all telephony examples
rg "tags:.*telephony" docs/index.yaml
# Find all advanced examples
rg "difficulty: advanced" docs/index.yaml
Every Python example starts with YAML frontmatter, so the example files can also be searched directly:
# Find examples using specific providers
rg "tags:.*elevenlabs" -g "*.py"
The repository includes testing utilities in complex-agents/testing/:
# Basic greeting test
async def test_agent_greeting():
    session = await create_test_session()
    response = await session.generate_reply()
    assert "hello" in response.lower()
Run tests with pytest:
cd complex-agents/testing
pytest -v
| Resource | Link |
|---|---|
| LiveKit Agents Documentation | docs.livekit.io/agents |
| LiveKit Agents GitHub | github.com/livekit/agents |
| LiveKit Cloud | cloud.livekit.io |
We welcome contributions! Please open an issue or PR if you:
- Find a bug or have a suggestion
- Want to add a new example
- Want to improve the documentation
Built with ❤️ by the LiveKit team
