livekit-examples/python-agents-examples


LiveKit

Python Agents Examples

A comprehensive collection of runnable examples for building voice, video, and telephony agents with LiveKit

Documentation LiveKit Agents Python 3.10+


Overview

This repository contains everything you need to learn and build production-ready voice AI agents using LiveKit Agents. From single-file quickstarts to multi-agent orchestration systems with companion frontends, these examples demonstrate real-world patterns and best practices.

python-agents-examples/
├── docs/examples/          # 50+ focused, single-concept demos
└── complex-agents/         # 20+ production-style applications with frontends

Every example includes YAML frontmatter metadata (title, category, tags, difficulty, description) for easy discovery by both humans and tooling.
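As a rough illustration of how tooling might read this metadata, the sketch below pulls simple `key: value` pairs out of a frontmatter block. It assumes the block is delimited by `---` lines near the top of the file (the exact layout in each example may differ), and `parse_frontmatter` is a hypothetical helper, not part of the repository. It handles only flat fields and inline lists; a real YAML parser would be more robust.

```python
def parse_frontmatter(text: str) -> dict:
    """Extract flat key/value pairs from a '---'-delimited frontmatter block."""
    lines = text.splitlines()
    # Locate the opening and closing '---' delimiter lines (assumed layout).
    markers = [i for i, line in enumerate(lines) if line.strip() == "---"]
    if len(markers) < 2:
        return {}
    meta = {}
    for line in lines[markers[0] + 1 : markers[1]]:
        if ":" not in line or line.startswith((" ", "-")):
            continue  # skip nested or list-item lines in this simple sketch
        key, _, value = line.partition(":")
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            # Inline list such as: tags: [tool-calling, openai]
            meta[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            meta[key.strip()] = value
    return meta


# Sample text: the docstring wrapper is an assumption about how examples embed metadata.
sample = '''"""
---
title: Tool Calling
category: basics
tags: [tool-calling, deepgram, openai, cartesia]
difficulty: beginner
description: Shows how to use tool calling in an agent.
---
"""
'''
meta = parse_frontmatter(sample)
print(meta["title"], meta["difficulty"], meta["tags"])
```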


Quick Start

Prerequisites

| Requirement | Version | Notes |
| --- | --- | --- |
| Python | 3.10+ | Required |
| pip / uv | Latest | Package management |
| LiveKit Account | - | Sign up free |
| Node.js | 18+ | Only for frontend demos |
| pnpm | Latest | Only for frontend demos |

Installation

# Clone the repository
git clone https://github.com/livekit-examples/python-agents-examples.git
cd python-agents-examples

# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Environment Setup

Create a .env file in the repository root:

# Required - LiveKit credentials
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your_api_key
LIVEKIT_API_SECRET=your_api_secret

# Provider keys (add as needed for specific examples)
OPENAI_API_KEY=sk-...
DEEPGRAM_API_KEY=...
CARTESIA_API_KEY=...
ELEVENLABS_API_KEY=...
ANTHROPIC_API_KEY=...
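Before running an example, it can help to confirm that the required credentials are actually visible to the process. The check below is a minimal sketch: `missing_vars` is a hypothetical helper, and it only inspects the process environment, so it assumes the values from .env have already been exported or loaded by whatever mechanism the example uses.

```python
import os

# The three variables marked as required above.
REQUIRED_VARS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


def missing_vars(env, required=REQUIRED_VARS):
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars(os.environ)
    if missing:
        print("Missing required variables:", ", ".join(missing))
    else:
        print("All required LiveKit variables are set.")
```

Provider keys are left out of the check on purpose, since which ones are needed depends on the example being run.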

Run Your First Agent

# Start an interactive voice session
python docs/examples/listen_and_respond/listen_and_respond.py console

The console argument opens an interactive terminal session where you can talk to the agent by voice or by typing.


Examples by Category

Fundamentals

Start here to understand core agent concepts.

| Example | Description | Difficulty |
| --- | --- | --- |
| Listen and Respond | The simplest voice agent: listens and responds | Beginner |
| Tool Calling | Add function tools agents can invoke | Beginner |
| Context Variables | Inject user context into agent instructions | Beginner |
| Playing Audio | Play audio files within an agent | Beginner |
| Repeater | Echo back exactly what the user says | Beginner |
| Uninterruptable | Complete responses without interruptions | Beginner |
| Exit Message | Handle graceful session endings | Beginner |

Multi-Agent Systems

Build complex workflows with multiple specialized agents.

| Example | Description | Difficulty |
| --- | --- | --- |
| Agent Transfer | Switch between agents mid-call using function tools | Intermediate |
| Medical Office Triage | Multi-department routing with context preservation | Advanced |
| Personal Shopper | E-commerce with triage, sales, and returns agents | Advanced |
| Doheny Surf Desk | Phone booking system with background observer agent | Advanced |

Telephony & SIP

Voice AI for phone systems.

| Example | Description | Difficulty |
| --- | --- | --- |
| Answer Call | Basic inbound call handling | Beginner |
| Make Call | Outbound calling via SIP trunks | Beginner |
| Warm Handoff | Transfer calls to human agents | Intermediate |
| SIP Lifecycle | Complete call lifecycle management | Advanced |
| Survey Caller | Automated surveys with CSV data collection | Intermediate |
| IVR Navigator | Navigate phone menus using DTMF | Advanced |

Pipeline Customization

Intercept and modify the STT → LLM → TTS pipeline.

| Example | Description | Difficulty |
| --- | --- | --- |
| Simple Content Filter | Keyword-based output filtering | Beginner |
| LLM Content Filter | Dual-LLM moderation system | Advanced |
| TTS Node Override | Custom text replacements before speech | Intermediate |
| Transcription Node | Modify transcriptions before LLM | Intermediate |
| Short Replies Only | Interrupt verbose responses | Beginner |
| LLM Output Replacement | Strip thinking tags from reasoning models | Intermediate |
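To give a feel for what keyword-based filtering between the LLM and TTS stages involves, here is a small sketch using plain async generators. It does not use the LiveKit Agents API: `filter_chunks`, `fake_llm_stream`, and the `BLOCKED` list are hypothetical names for illustration, and the examples in this repository hook into the agent pipeline in their own way.

```python
import asyncio
from typing import AsyncIterator, Iterable

BLOCKED = {"darn", "heck"}  # hypothetical keyword list


async def filter_chunks(
    chunks: AsyncIterator[str], blocked: Iterable[str] = BLOCKED
) -> AsyncIterator[str]:
    """Yield text chunks, masking blocked keywords before they reach speech synthesis."""
    blocked_lower = {word.lower() for word in blocked}
    async for chunk in chunks:
        words = chunk.split(" ")
        cleaned = [
            "[filtered]" if word.strip(".,!?").lower() in blocked_lower else word
            for word in words
        ]
        yield " ".join(cleaned)


async def fake_llm_stream() -> AsyncIterator[str]:
    """Stand-in for streamed LLM output."""
    for chunk in ["Well, heck!", " That went", " fine."]:
        yield chunk


async def collect() -> str:
    """Join the filtered chunks into one string for inspection."""
    return "".join([chunk async for chunk in filter_chunks(fake_llm_stream())])


print(asyncio.run(collect()))
```

Because this sketch filters each chunk independently, a keyword split across two chunks would slip through; buffering to word boundaries would be needed to handle that case.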

Vision & Multimodal

Agents that can see.

| Example | Description | Difficulty |
| --- | --- | --- |
| Gemini Live Vision | Real-time vision with Gemini 2.0 | Beginner |
| Vision Agent | Camera vision with Grok-2 Vision | Intermediate |
| Moondream Vision | Add vision to non-vision LLMs | Intermediate |

Avatars & Visual Agents

Bring your agent to life with animated avatars.

| Example | Description | Difficulty |
| --- | --- | --- |
| Hedra Pipeline Avatar | Static image avatar with pipeline architecture | Intermediate |
| Hedra Realtime Avatar | OpenAI Realtime + Hedra avatar | Intermediate |
| Dynamic Avatar | Create avatars on-the-fly | Intermediate |
| Education Avatar | Teaching avatar with flash cards via RPC | Advanced |
| Tavus Avatar | Tavus-powered avatar assistant | Intermediate |

Translation & Multilingual

Break language barriers.

| Example | Description | Difficulty |
| --- | --- | --- |
| Pipeline Translator | English → French voice translation | Intermediate |
| TTS Translator | Advanced translation with Gladia code-switching | Advanced |
| Change Language | Dynamic language switching via function tools | Intermediate |

Metrics & Observability

Monitor and debug your agents.

| Example | Description | Difficulty |
| --- | --- | --- |
| LLM Metrics | Token counts, TTFT, throughput | Beginner |
| STT Metrics | Transcription timing and errors | Beginner |
| TTS Metrics | Speech synthesis performance | Beginner |
| VAD Metrics | Voice activity detection stats | Beginner |
| Langfuse Tracing | Full session tracing with Langfuse | Intermediate |
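For readers new to these terms, the arithmetic behind time to first token (TTFT) and throughput can be sketched as follows. `LLMTiming` is a hypothetical record invented for this illustration, not the metrics object the examples receive, and dividing tokens by total request duration is just one common way to define throughput.

```python
from dataclasses import dataclass


@dataclass
class LLMTiming:
    """Hypothetical timing record for one LLM response (times in seconds)."""

    request_start: float
    first_token_at: float
    finished_at: float
    completion_tokens: int

    @property
    def ttft(self) -> float:
        """Time to first token: delay between the request and the first output token."""
        return self.first_token_at - self.request_start

    @property
    def tokens_per_second(self) -> float:
        """Throughput: completion tokens divided by total request duration."""
        duration = self.finished_at - self.request_start
        return self.completion_tokens / duration if duration > 0 else 0.0


timing = LLMTiming(request_start=0.0, first_token_at=0.25, finished_at=2.0, completion_tokens=100)
print(f"TTFT: {timing.ttft:.2f}s, throughput: {timing.tokens_per_second:.1f} tokens/s")
```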

Events & State

React to conversation events and manage state.

| Example | Description | Difficulty |
| --- | --- | --- |
| Basic Events | Register event listeners with on/off/once | Beginner |
| Event Emitters | Custom event handling patterns | Beginner |
| Conversation Monitoring | Log and inspect conversation events | Beginner |
| State Tracking | Complex NPC state with rapport system | Advanced |
| RPC State Management | CRUD operations over RPC | Advanced |
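The on/off/once listener pattern mentioned for Basic Events can be illustrated with a tiny standalone emitter. This is a sketch of the general pattern only: the `EventEmitter` class below is written for this illustration and is not the LiveKit implementation, and the "speech" event name is made up.

```python
from collections import defaultdict
from typing import Callable


class EventEmitter:
    """Minimal illustrative emitter; not the LiveKit implementation."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable]] = defaultdict(list)

    def on(self, event: str, handler: Callable) -> None:
        """Register a handler that runs on every emit of `event`."""
        self._handlers[event].append(handler)

    def off(self, event: str, handler: Callable) -> None:
        """Unregister a handler if it is currently registered."""
        if handler in self._handlers[event]:
            self._handlers[event].remove(handler)

    def once(self, event: str, handler: Callable) -> None:
        """Register a handler that removes itself after its first call."""
        def wrapper(*args):
            self.off(event, wrapper)
            handler(*args)

        self.on(event, wrapper)

    def emit(self, event: str, *args) -> None:
        # Iterate over a copy because handlers may unregister themselves.
        for handler in list(self._handlers[event]):
            handler(*args)


log = []


def record(text: str) -> None:
    log.append(text)


emitter = EventEmitter()
emitter.on("speech", record)
emitter.once("speech", lambda text: log.append(f"first: {text}"))
emitter.emit("speech", "hello")
emitter.emit("speech", "again")
emitter.off("speech", record)
emitter.emit("speech", "ignored")
print(log)
```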

Advanced Integrations

Connect to external services.

| Example | Description | Difficulty |
| --- | --- | --- |
| MCP Client (stdio) | Connect to local MCP servers | Beginner |
| MCP Client (HTTP) | Connect to remote MCP servers | Beginner |
| Home Automation | Control smart home devices | Intermediate |
| RAG Voice Agent | Vector search with Annoy + embeddings | Advanced |
| Shopify Voice | Voice shopping with MCP + Shopify | Advanced |

Full Applications

These full-stack applications include both backend agents and React frontends.

🎮 Dungeons & Agents

Voice-driven D&D RPG with narrator/combat agents, character progression, and turn-based combat.

cd complex-agents/role-playing
python agent.py dev

# In another terminal
cd role_playing_frontend && pnpm install && pnpm dev

Features: Multi-agent switching, dice mechanics, NPC generation, inventory system, combat AI


📞 Doheny Surf Desk

Phone booking system with background observer agent and task groups.

cd complex-agents/doheny-surf-desk
python agent.py dev

Features: 5 specialized agents, LLM-based guardrails, sequential tasks, context injection


🔬 EXA Deep Researcher

Voice-controlled research agent using EXA for web intelligence.

cd complex-agents/exa-deep-researcher
python agent.py dev

# In another terminal  
cd frontend && pnpm install && pnpm dev

Features: Background research jobs, RPC streaming, cited reports


๐Ÿฅ Medical Office Triage

Multi-department medical system with agent transfers.

cd complex-agents/medical_office_triage
python triage.py dev

Features: Triage → Specialist routing, chat history preservation, YAML prompts


๐Ÿ” Drive-Thru

Fast food ordering system with menu management.

cd complex-agents/drive-thru/drive-thru-agent
python agent.py dev

# In another terminal
cd ../frontend && pnpm install && pnpm dev

๐Ÿ“ Nova Sonic Form Agent

Job application interview with AWS Realtime.

cd complex-agents/nova-sonic
python form_agent.py dev

# In another terminal
cd nova-sonic-form-agent && pnpm install && pnpm dev

Features: AWS Realtime model, structured data collection, live form updates


Provider Support

Examples demonstrate integration with these providers:

| Category | Providers |
| --- | --- |
| LLM | OpenAI, Anthropic, Google Gemini, Groq, Cerebras, AWS Bedrock, X.AI |
| STT | Deepgram, AssemblyAI, Gladia, Cartesia |
| TTS | Cartesia, ElevenLabs, Rime, PlayAI, Inworld, OpenAI |
| VAD | Silero |
| Avatar | Hedra, Tavus |
| Vision | OpenAI GPT-4V, Google Gemini, X.AI Grok, Moondream |
| Realtime | OpenAI Realtime, Google Gemini Live, AWS Nova Sonic |

Discovery Tools

Browse the Index

The complete catalog lives in docs/index.yaml with metadata for every example:

- file_path: docs/examples/tool_calling/tool_calling.py
  title: Tool Calling
  category: basics
  tags: [tool-calling, deepgram, openai, cartesia]
  difficulty: beginner
  description: Shows how to use tool calling in an agent.
  demonstrates:
    - Using the most basic form of tool calling

Find Examples by Tag

# Find all telephony examples
rg "tags:.*telephony" docs/index.yaml

# Find all advanced examples  
rg "difficulty: advanced" docs/index.yaml
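The same lookups can be done programmatically once the catalog has been loaded. The sketch below assumes docs/index.yaml has already been parsed into a list of dictionaries shaped like the entry shown earlier (for instance with a YAML parser of your choice); `find_examples` is a hypothetical helper, and the second sample entry is made up for the demonstration.

```python
def find_examples(entries, tag=None, difficulty=None):
    """Return index entries matching the given tag and/or difficulty."""
    results = []
    for entry in entries:
        if tag is not None and tag not in entry.get("tags", []):
            continue
        if difficulty is not None and entry.get("difficulty") != difficulty:
            continue
        results.append(entry)
    return results


# Sample data: the first entry mirrors the catalog excerpt; the second is invented.
entries = [
    {
        "file_path": "docs/examples/tool_calling/tool_calling.py",
        "title": "Tool Calling",
        "tags": ["tool-calling", "deepgram", "openai", "cartesia"],
        "difficulty": "beginner",
    },
    {
        "file_path": "docs/examples/hypothetical/example.py",
        "title": "Hypothetical Telephony Example",
        "tags": ["telephony"],
        "difficulty": "advanced",
    },
]

for entry in find_examples(entries, tag="telephony"):
    print(entry["title"], "->", entry["file_path"])
```

Unlike the rg patterns, which match substrings anywhere on the tags line, this compares whole tag values, so searching for "tool" would not match "tool-calling".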

Frontmatter Search

Every Python example starts with YAML frontmatter, so the same metadata can be searched directly in the source files:

# Find examples using specific providers
rg "tags:.*elevenlabs" -g "*.py"

Testing

The repository includes testing utilities in complex-agents/testing/:

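import pytest

# Basic greeting test.
# create_test_session is assumed to be a helper provided by the testing utilities.
@pytest.mark.asyncio
async def test_agent_greeting():
    session = await create_test_session()
    response = await session.generate_reply()
    # Assumes the reply is returned as plain text
    assert "hello" in response.lower()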

Run tests with pytest:

cd complex-agents/testing
pytest -v

Resources

| Resource | Link |
| --- | --- |
| LiveKit Agents Documentation | docs.livekit.io/agents |
| LiveKit Agents GitHub | github.com/livekit/agents |
| LiveKit Cloud | cloud.livekit.io |

Contributing

We welcome contributions! Please open an issue or PR if you:

  • Find a bug or have a suggestion
  • Want to add a new example
  • Want to improve documentation

Built with ❤️ by the LiveKit team
