LLM Gateway

A production-ready enterprise gateway for Large Language Model (LLM) APIs. It provides unified access to multiple LLM providers (OpenAI, Anthropic, OpenRouter, and other OpenAI-compatible APIs) with built-in authentication, rate limiting, cost tracking, and observability.

Features

Multi-Provider Support: OpenAI, Anthropic, OpenRouter, and generic OpenAI-compatible APIs
Streaming Support: Full SSE streaming proxy with time-to-first-token metrics
Authentication: JWT-based auth with SSO integration
Role-Based Access Control: Hierarchical roles with granular permissions
Rate Limiting: Configurable per-application rate limits
Cost Tracking: Usage metering and spending limits
Observability: Prometheus metrics, structured logging, OpenTelemetry traces
Failover: Automatic provider failover and health checking
Content Guardrails: Request/response filtering and safety checks
Real-Time Updates: Server-Sent Events for live dashboard updates
Admin Dashboard: React-based UI for management and monitoring

Quick Start

Prerequisites

Go 1.25+
Node.js 18+
Docker & Docker Compose
PostgreSQL (via Docker)
Redis (via Docker)

Development Setup

# Start databases
make db-up

# Run migrations and seed data
make db-reset

# Start development (backend + frontend)
make dev

Or run services individually:

# Backend only
make backend

# Frontend only (in separate terminal)
make frontend

Access Points

Gateway API: http://localhost:8080
Admin UI: http://localhost:5173
Health Check: http://localhost:8080/health
Metrics: http://localhost:8080/metrics

Project Structure

.
├── cmd/server/          # Application entrypoint
├── internal/            # Private application packages
│   ├── admin/          # Admin API handlers
│   ├── audit/          # Audit logging
│   ├── auth/           # JWT authentication
│   ├── cache/          # Redis caching layer
│   ├── cost/           # Cost tracking & spending limits
│   ├── failover/       # Provider failover logic
│   ├── guardrails/     # Content filtering
│   ├── middleware/     # HTTP middleware (auth, rate-limit, logging)
│   ├── proxy/          # LLM API proxy & streaming
│   ├── realtime/       # SSE event broadcasting
│   ├── routing/        # Request routing
│   ├── secrets/        # Secret management
│   ├── server/         # HTTP server & router
│   ├── sso/            # SSO integration
│   ├── store/          # Data persistence
│   ├── telemetry/      # Metrics & tracing
│   └── webhooks/       # Webhook handlers
├── frontend/            # React admin dashboard
├── migrations/          # Database migrations
├── deploy/              # Deployment configurations
│   ├── kubernetes/     # Kubernetes manifests
│   ├── helm/           # Helm chart
│   ├── prometheus/     # Prometheus alerts & rules
│   └── grafana/        # Grafana dashboards
├── test/                # Load & integration tests
└── docs/                # Documentation

Configuration

Environment variables:

| Variable | Description | Default |
|----------|-------------|---------|
| `PORT` | Server port | `8080` |
| `DATABASE_URL` | PostgreSQL connection string | - |
| `REDIS_URL` | Redis connection string | - |
| `JWT_SECRET` | JWT signing secret | - |
| `LOG_LEVEL` | Logging level (debug, info, warn, error) | `info` |
| `LOG_FORMAT` | Log format (json, text) | `json` |
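
As an illustration of how these variables and defaults might be consumed, here is a minimal Go sketch. It is not the gateway's actual loader: the `Config` type, `LoadConfig`, and `getenv` are hypothetical names, and treating the variables without a default as required is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// Config mirrors the documented environment variables (hypothetical type).
type Config struct {
	Port        string
	DatabaseURL string
	RedisURL    string
	JWTSecret   string
	LogLevel    string
	LogFormat   string
}

// getenv returns the value for key, or fallback when it is unset or empty.
func getenv(lookup func(string) string, key, fallback string) string {
	if v := lookup(key); v != "" {
		return v
	}
	return fallback
}

// LoadConfig reads settings through lookup (os.Getenv in normal use).
// Assumption: variables documented without a default are required.
func LoadConfig(lookup func(string) string) (Config, error) {
	cfg := Config{
		Port:        getenv(lookup, "PORT", "8080"),
		DatabaseURL: lookup("DATABASE_URL"),
		RedisURL:    lookup("REDIS_URL"),
		JWTSecret:   lookup("JWT_SECRET"),
		LogLevel:    getenv(lookup, "LOG_LEVEL", "info"),
		LogFormat:   getenv(lookup, "LOG_FORMAT", "json"),
	}
	required := []struct{ name, value string }{
		{"DATABASE_URL", cfg.DatabaseURL},
		{"REDIS_URL", cfg.RedisURL},
		{"JWT_SECRET", cfg.JWTSecret},
	}
	for _, r := range required {
		if r.value == "" {
			return Config{}, fmt.Errorf("missing required environment variable %s", r.name)
		}
	}
	return cfg, nil
}

func main() {
	cfg, err := LoadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		return
	}
	fmt.Printf("port=%s log_level=%s log_format=%s\n", cfg.Port, cfg.LogLevel, cfg.LogFormat)
}
```

Passing the lookup function in, rather than calling `os.Getenv` directly, keeps the loader easy to exercise with a plain map in tests.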

API Usage

Proxy LLM Requests

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer <app-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'

Streaming Response

The gateway preserves SSE streaming from upstream providers:

curl -N -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer <app-token>" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [...], "stream": true}'
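
The same streaming request can be consumed from Go. The sketch below is one way a client might read the event stream, under stated assumptions: the upstream follows the OpenAI convention of ending the stream with `data: [DONE]`, and `GATEWAY_TOKEN` is a hypothetical environment variable holding the app token. `readSSE` and `collectSSE` are illustrative helpers, not part of the gateway or its SDK.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
)

// readSSE calls onData for each "data:" payload in an SSE stream and stops
// at the "[DONE]" sentinel (an OpenAI-style assumption) or at end of stream.
func readSSE(r io.Reader, onData func(payload string)) error {
	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024) // allow long event lines
	for sc.Scan() {
		line := sc.Text()
		if !strings.HasPrefix(line, "data:") {
			continue // skip blank separators, comments, and other fields
		}
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		if payload == "[DONE]" {
			return nil
		}
		onData(payload)
	}
	return sc.Err()
}

// collectSSE gathers every payload from raw SSE text; handy for testing.
func collectSSE(raw string) ([]string, error) {
	var out []string
	err := readSSE(strings.NewReader(raw), func(p string) { out = append(out, p) })
	return out, err
}

func main() {
	body := strings.NewReader(`{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}], "stream": true}`)
	req, err := http.NewRequest(http.MethodPost, "http://localhost:8080/v1/chat/completions", body)
	if err != nil {
		fmt.Fprintln(os.Stderr, "build request:", err)
		return
	}
	// GATEWAY_TOKEN is a hypothetical variable name for the app token.
	req.Header.Set("Authorization", "Bearer "+os.Getenv("GATEWAY_TOKEN"))
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Fprintln(os.Stderr, "request failed:", err)
		return
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		fmt.Fprintln(os.Stderr, "unexpected status:", resp.Status)
		return
	}
	// Print each JSON chunk as it arrives.
	if err := readSSE(resp.Body, func(p string) { fmt.Println(p) }); err != nil {
		fmt.Fprintln(os.Stderr, "stream error:", err)
	}
}
```

Each payload printed here is a raw JSON chunk; decoding it into a struct depends on the provider's response schema, which this sketch leaves out.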

Testing

# Run unit tests
go test ./...

# Run integration tests
go test ./internal/integration/...

# Run load tests (requires OpenRouter API key)
OPENROUTER_API_KEY=<key> go test ./test/load/... -v

# Run frontend tests
cd frontend && npm test

Deployment

Docker

docker build -t llm-gateway .
docker run -p 8080:8080 llm-gateway

Kubernetes

kubectl apply -f deploy/kubernetes/

Helm

helm install llm-gateway deploy/helm/llm-gateway

Monitoring

The gateway exposes Prometheus metrics at /metrics:

llm_gateway_requests_total - Total requests by provider and status
llm_gateway_request_duration_seconds - Request latency histogram
llm_gateway_streaming_ttft_seconds - Time-to-first-token for streaming
llm_gateway_tokens_total - Token usage by provider
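
For a quick check without a Prometheus server, the metrics endpoint can be scraped and summed directly. The Go sketch below parses the Prometheus text exposition format using only the standard library and totals one counter across all label combinations. `sumMetric` and `sumMetricString` are hypothetical helpers, and the label names shown in the comments (`provider`, `status`) are assumptions based on the descriptions above.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
	"os"
	"strconv"
	"strings"
)

// sumMetric adds up every sample of the named metric in Prometheus text
// exposition format, e.g. lines such as:
//   llm_gateway_requests_total{provider="openai",status="200"} 12
// (label names here are assumed, not confirmed by the gateway).
func sumMetric(r io.Reader, name string) (float64, error) {
	var total float64
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blanks and HELP/TYPE comments
		}
		// The metric name ends at '{' (labels follow) or at the first space.
		end := strings.IndexAny(line, "{ ")
		if end < 0 || line[:end] != name {
			continue
		}
		// The value follows the closing '}' of the label set, if any.
		rest := line[end:]
		if i := strings.LastIndex(rest, "}"); i >= 0 {
			rest = rest[i+1:]
		}
		fields := strings.Fields(rest)
		if len(fields) == 0 {
			continue
		}
		v, err := strconv.ParseFloat(fields[0], 64)
		if err != nil {
			return 0, fmt.Errorf("parse %q: %w", line, err)
		}
		total += v
	}
	return total, sc.Err()
}

// sumMetricString is a convenience wrapper over in-memory text.
func sumMetricString(text, name string) (float64, error) {
	return sumMetric(strings.NewReader(text), name)
}

func main() {
	resp, err := http.Get("http://localhost:8080/metrics")
	if err != nil {
		fmt.Fprintln(os.Stderr, "scrape failed:", err)
		return
	}
	defer resp.Body.Close()
	total, err := sumMetric(resp.Body, "llm_gateway_requests_total")
	if err != nil {
		fmt.Fprintln(os.Stderr, "parse failed:", err)
		return
	}
	fmt.Printf("llm_gateway_requests_total (all labels): %g\n", total)
}
```

The name match is exact, so histogram series such as `llm_gateway_request_duration_seconds_bucket` are not folded into a query for the base name.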

Pre-built Grafana dashboards are in deploy/grafana/dashboards/.
