
LLM Gateway

A production-ready enterprise gateway for Large Language Model (LLM) APIs. Provides unified access to multiple LLM providers (OpenAI, Anthropic, OpenRouter, etc.) with built-in authentication, rate limiting, cost tracking, and observability.

Features

  • Multi-Provider Support: OpenAI, Anthropic, OpenRouter, and generic OpenAI-compatible APIs
  • Streaming Support: Full SSE streaming proxy with time-to-first-token metrics
  • Authentication: JWT-based auth with SSO integration
  • Role-Based Access Control: Hierarchical roles with granular permissions
  • Rate Limiting: Configurable per-application rate limits
  • Cost Tracking: Usage metering and spending limits
  • Observability: Prometheus metrics, structured logging, OpenTelemetry traces
  • Failover: Automatic provider failover and health checking
  • Content Guardrails: Request/response filtering and safety checks
  • Real-Time Updates: Server-Sent Events for live dashboard updates
  • Admin Dashboard: React-based UI for management and monitoring

Quick Start

Prerequisites

  • Go 1.25+
  • Node.js 18+
  • Docker & Docker Compose
  • PostgreSQL (via Docker)
  • Redis (via Docker)

Development Setup

# Start databases
make db-up

# Run migrations and seed data
make db-reset

# Start development (backend + frontend)
make dev

Or run services individually:

# Backend only
make backend

# Frontend only (in separate terminal)
make frontend

Access Points

Project Structure

.
├── cmd/server/          # Application entrypoint
├── internal/            # Private application packages
│   ├── admin/          # Admin API handlers
│   ├── audit/          # Audit logging
│   ├── auth/           # JWT authentication
│   ├── cache/          # Redis caching layer
│   ├── cost/           # Cost tracking & spending limits
│   ├── failover/       # Provider failover logic
│   ├── guardrails/     # Content filtering
│   ├── middleware/     # HTTP middleware (auth, rate-limit, logging)
│   ├── proxy/          # LLM API proxy & streaming
│   ├── realtime/       # SSE event broadcasting
│   ├── routing/        # Request routing
│   ├── secrets/        # Secret management
│   ├── server/         # HTTP server & router
│   ├── sso/            # SSO integration
│   ├── store/          # Data persistence
│   ├── telemetry/      # Metrics & tracing
│   └── webhooks/       # Webhook handlers
├── frontend/            # React admin dashboard
├── migrations/          # Database migrations
├── deploy/              # Deployment configurations
│   ├── kubernetes/     # Kubernetes manifests
│   ├── helm/           # Helm chart
│   ├── prometheus/     # Prometheus alerts & rules
│   └── grafana/        # Grafana dashboards
├── test/                # Load & integration tests
└── docs/                # Documentation

Configuration

Environment variables:

Variable       Description                                Default
--------       -----------                                -------
PORT           Server port                                8080
DATABASE_URL   PostgreSQL connection string               -
REDIS_URL      Redis connection string                    -
JWT_SECRET     JWT signing secret                         -
LOG_LEVEL      Logging level (debug, info, warn, error)   info
LOG_FORMAT     Log format (json, text)                    json

API Usage

Proxy LLM Requests

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer <app-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'

Streaming Response

The gateway preserves SSE streaming from upstream providers:

curl -N -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer <app-token>" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [...], "stream": true}'

Testing

# Run unit tests
go test ./...

# Run integration tests
go test ./internal/integration/...

# Run load tests (requires OpenRouter API key)
OPENROUTER_API_KEY=<key> go test ./test/load/... -v

# Run frontend tests
cd frontend && npm test

Deployment

Docker

docker build -t llm-gateway .
docker run -p 8080:8080 llm-gateway

Kubernetes

kubectl apply -f deploy/kubernetes/

Helm

helm install llm-gateway deploy/helm/llm-gateway

Monitoring

The gateway exposes Prometheus metrics at /metrics:

  • llm_gateway_requests_total - Total requests by provider, status
  • llm_gateway_request_duration_seconds - Request latency histogram
  • llm_gateway_streaming_ttft_seconds - Time-to-first-token for streaming
  • llm_gateway_tokens_total - Token usage by provider

Pre-built Grafana dashboards are in deploy/grafana/dashboards/.

License

Proprietary - All rights reserved