A production-ready FastAPI service for AI-powered web automation using browser-use with multiple LLM providers
BrowserUse AI Agent API is a professional-grade service that combines browser automation with Large Language Models (LLMs) to execute complex web tasks. Built with enterprise features like structured logging, configurable templates, and multiple LLM provider support, it's designed to be both powerful and easy to customize.
- 🔌 Multiple LLM Providers: Google (Gemini), OpenAI, Anthropic, Ollama
- 📝 Template System: Jinja2-based task templates for flexible task definition
- 📊 Professional Logging: JSON/text structured logging with rotation and multiple handlers
- ⚙️ YAML Configuration: Centralized, validated configuration management
- 🐳 Docker Ready: Production-ready Docker setup with multi-stage builds
- 🔄 RESTful API: Clean, documented API with Pydantic validation
- 🛠️ Extensible: Easy to add custom data processors and utilities
- 📈 Monitoring: Health checks and metrics endpoints
- 🎯 Type Safe: Full type hints and Pydantic models throughout
- Python 3.11 or higher
- An API key for your chosen LLM provider (Gemini, OpenAI, or Anthropic; not required for local Ollama)
- Clone the repository

```bash
git clone https://github.com/yourusername/browseruse-ai-api.git
cd browseruse-ai-api
```

- Install dependencies

```bash
pip install -r requirements.txt
```

- Configure the service

```bash
# Review the default configuration (config/config.yaml ships with the repository)

# Copy example environment file
cp .env.example .env

# Edit .env and add your API key
nano .env
```

- Start the service

```bash
python run.py
```

The service will be available at http://localhost:8000. Visit http://localhost:8000/docs for the interactive API documentation.
```bash
# Set your API key in .env file
echo "GEMINI_API_KEY=your_api_key_here" > .env

# Start the service
docker-compose up -d

# View logs
docker-compose logs -f

# Stop the service
docker-compose down
```

Or build and run the image manually:

```bash
# Build the image
docker build -t browseruse-ai-api .

# Run the container
docker run -d \
  -p 8000:8000 \
  -e GEMINI_API_KEY=your_api_key \
  -v $(pwd)/logs:/app/logs \
  --name browseruse-ai-api \
  browseruse-ai-api
```

Send a POST request to the /task endpoint:
```bash
curl -X POST "http://localhost:8000/task" \
  -H "Content-Type: application/json" \
  -d '{
    "template_name": "default",
    "task_data": {
      "search_location": "New York, USA",
      "primary_name": "Empire State Building",
      "address": "350 5th Ave",
      "city": "New York",
      "district": "Manhattan"
    }
  }'
```

Or from Python:

```python
import requests

response = requests.post(
    "http://localhost:8000/task",
    json={
        "template_name": "default",
        "task_data": {
            "search_location": "Paris, France",
            "primary_name": "Eiffel Tower",
            "city": "Paris"
        }
    }
)
result = response.json()
print(f"Success: {result['success']}")
print(f"Result: {result['result']}")
```

| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Service information |
| `/health` | GET | Health check |
| `/config` | GET | Current configuration |
| `/templates` | GET | List available templates |
| `/task` | POST | Execute a task |
| `/docs` | GET | Interactive API documentation |
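As the examples above show, `/task` expects a JSON body with `template_name` and `task_data`. A small helper for assembling that body client-side (illustrative only, not part of the service):

```python
def build_task_payload(template_name: str, **task_data) -> dict:
    """Assemble the JSON body expected by POST /task."""
    if not template_name:
        raise ValueError("template_name is required")
    return {"template_name": template_name, "task_data": task_data}

payload = build_task_payload(
    "default",
    search_location="New York, USA",
    primary_name="Empire State Building",
)
```

The resulting dict can be passed directly as the `json=` argument to `requests.post`.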
Edit `config/config.yaml` to customize the service:

```yaml
# LLM Configuration
llm:
  provider: "google"  # google, openai, anthropic, ollama
  model_name: "gemini-2.0-flash"
  temperature: 0.7
  max_tokens: 4096

# Logging Configuration
logging:
  level: "INFO"   # DEBUG, INFO, WARNING, ERROR, CRITICAL
  format: "json"  # json, text
  console:
    enabled: true
    level: "INFO"
  file:
    enabled: true
    level: "DEBUG"
    path: "logs/app.log"
    max_bytes: 10  # MB
    backup_count: 5

# Task Configuration
tasks:
  default_template: "default"
  templates_dir: "templates"
  timeout: 300
  max_retries: 3
```

Create a `.env` file in the project root:
```bash
# Google Gemini
GEMINI_API_KEY=your_gemini_api_key_here

# Or OpenAI
OPENAI_API_KEY=your_openai_api_key_here

# Or Anthropic
ANTHROPIC_API_KEY=your_anthropic_api_key_here
```

**Google Gemini (Default)**

```yaml
llm:
  provider: "google"
  model_name: "gemini-2.0-flash"
  api_key_env: "GEMINI_API_KEY"
```

**OpenAI**

```yaml
llm:
  provider: "openai"
  model_name: "gpt-4"
  api_key_env: "OPENAI_API_KEY"
```

**Anthropic Claude**

```yaml
llm:
  provider: "anthropic"
  model_name: "claude-3-opus"
  api_key_env: "ANTHROPIC_API_KEY"
```

**Ollama (Local)**

```yaml
llm:
  provider: "ollama"
  model_name: "llama2"
  # No API key needed for local Ollama
```

Templates are stored in the `templates/` directory and use Jinja2 syntax.
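To see what the service ultimately feeds the agent, you can render a template by hand with Jinja2. This is a minimal sketch with an inline template string; the actual rendering lives in `src/services/template_service.py` and may differ:

```python
from jinja2 import Template

# Inline version of a task template; file-based templates render the same way.
# Variables left unset fall back to their | default(...) values.
template = Template(
    'Search the web for information about {{ topic }}. '
    'Extract the top {{ num_results | default(5) }} results.'
)

task_text = template.render(topic="climate change solutions")
print(task_text)
# → Search the web for information about climate change solutions. Extract the top 5 results.
```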
**Example: Custom Search Template**

Create `templates/web_search.txt`:

```text
Search the web for information about {{ topic }}.

Steps:
1. Go to {{ search_engine | default("https://www.google.com") }}
2. Search for: "{{ topic }}"
{% if specific_site %}
3. Focus on results from {{ specific_site }}
{% endif %}
4. Extract the top {{ num_results | default(5) }} results

Return the results as a JSON array with title, URL, and snippet for each result.
```

Using the template:

```python
import requests

response = requests.post(
    "http://localhost:8000/task",
    json={
        "template_name": "web_search",
        "task_data": {
            "topic": "climate change solutions",
            "search_engine": "https://www.google.com",
            "num_results": 10
        }
    }
)
```

You can add custom data processors in `src/utils/data_utils.py`:
```python
from src.utils import register_processor

@register_processor("my_custom_processor")
def process_my_data(data: dict) -> dict:
    """Custom data processing logic."""
    # Your processing logic here
    processed = {
        "original": data,
        "processed": data.get("field", "").upper()
    }
    return processed
```

Then reference it in your configuration:

```yaml
data_processing:
  processors:
    - "my_custom_processor"
```

BrowserUse AI API provides comprehensive logging:
**Console logs**
- Colored output for easy reading
- Configurable log level
- Real-time request/response logging

**File logs**
- JSON structured logs for easy parsing
- Automatic rotation when size limit reached
- Configurable retention (backup count)

**Conversation logs**
- Full agent conversation history
- Saved per task execution
- Located in `logs/conversations/`
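With `format: "json"`, each line of `logs/app.log` is a standalone JSON object, which makes post-processing straightforward. A sketch of pulling out high-severity entries (the `level` and `message` field names are assumptions — check your actual log output):

```python
import json

def errors_only(lines, levels=("ERROR", "CRITICAL")):
    """Parse JSON log lines and keep only high-severity entries."""
    entries = (json.loads(line) for line in lines if line.strip())
    return [e for e in entries if e.get("level") in levels]

sample = [
    '{"level": "INFO", "message": "task started"}',
    '{"level": "ERROR", "message": "navigation failed"}',
]
print(errors_only(sample))
# → [{'level': 'ERROR', 'message': 'navigation failed'}]
```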
Adjust in `config/config.yaml`:

```yaml
logging:
  level: "DEBUG"     # See everything
  # level: "INFO"    # Production default
  # level: "WARNING" # Only warnings and errors
  # level: "ERROR"   # Only errors
```

Run the example request:

```bash
# Test health endpoint
curl http://localhost:8000/health

# List available templates
curl http://localhost:8000/templates

# Test task execution
curl -X POST http://localhost:8000/task \
  -H "Content-Type: application/json" \
  -d @example_request.json
```

```text
browseruse-ai-api/
├── config/
│   └── config.yaml             # Main configuration file
├── src/
│   ├── api/
│   │   ├── __init__.py
│   │   └── models.py           # Pydantic models for API
│   ├── core/
│   │   ├── __init__.py
│   │   ├── config.py           # Configuration management
│   │   └── logging.py          # Logging setup
│   ├── services/
│   │   ├── __init__.py
│   │   ├── agent_service.py    # Agent execution logic
│   │   ├── llm_service.py      # LLM provider management
│   │   └── template_service.py # Template rendering
│   ├── utils/
│   │   ├── __init__.py
│   │   └── data_utils.py       # Utility functions (customize here!)
│   └── app.py                  # FastAPI application
├── templates/
│   ├── default.txt             # Default task template
│   └── google_maps_en.txt      # Example: Google Maps search
├── logs/                       # Log files (auto-created)
├── tests/                      # Test files
├── .env                        # Environment variables (create this)
├── .env.example                # Environment variables example
├── .gitignore
├── docker-compose.yaml         # Docker Compose configuration
├── Dockerfile                  # Docker image definition
├── requirements.txt            # Python dependencies
├── run.py                      # Application entry point
└── README.md                   # This file
```
- ✅ Never commit `.env` files or API keys to version control
- ✅ Use environment variables for sensitive data
- ✅ Run Docker containers as non-root user (configured)
- ✅ Enable CORS only for trusted origins in production
- ✅ Use HTTPS in production deployments
- ✅ Regularly update dependencies
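The "environment variables for sensitive data" rule can be enforced with a tiny fail-fast helper. This is an illustrative sketch; the service's own config loading in `src/core/config.py` may handle this differently:

```python
import os

def require_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Read an API key from the environment, failing loudly if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; add it to your .env file")
    return key
```

Failing at startup with a clear message beats a cryptic authentication error on the first task.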
Contributions are welcome! Here's how you can help:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
Please see CONTRIBUTING.md for detailed guidelines.
This project is licensed under the MIT License - see the LICENSE file for details.
- Built with FastAPI
- Powered by browser-use
- Templating by Jinja2
- Configuration management with Pydantic
- Add more LLM provider support
- Implement rate limiting
- Add authentication/authorization
- Create web UI for task management
- Add task queue for async processing
- Implement caching for repeated tasks
- Add metrics and monitoring dashboard
BrowserUse AI Agent API - AI-Powered Web Automation Service