
LLM Gateway

A production-ready enterprise gateway for Large Language Model (LLM) APIs. Provides unified access to multiple LLM providers (OpenAI, Anthropic, OpenRouter, etc.) with built-in authentication, rate limiting, cost tracking, and observability.

Features

  • Multi-Provider Support: OpenAI, Anthropic, OpenRouter, and generic OpenAI-compatible APIs
  • Streaming Support: Full SSE streaming proxy with time-to-first-token metrics
  • Authentication: JWT-based auth with SSO integration
  • Role-Based Access Control: Hierarchical roles with granular permissions
  • Rate Limiting: Configurable per-application rate limits
  • Cost Tracking: Usage metering and spending limits
  • Observability: Prometheus metrics, structured logging, OpenTelemetry traces
  • Failover: Automatic provider failover and health checking
  • Content Guardrails: Request/response filtering and safety checks
  • Real-Time Updates: Server-Sent Events for live dashboard updates
  • Admin Dashboard: React-based UI for management and monitoring

Quick Start

Prerequisites

  • Go 1.25+
  • Node.js 18+
  • Docker & Docker Compose
  • PostgreSQL (via Docker)
  • Redis (via Docker)

Development Setup

# Start databases
make db-up

# Run migrations and seed data
make db-reset

# Start development (backend + frontend)
make dev

Or run services individually:

# Backend only
make backend

# Frontend only (in separate terminal)
make frontend

Access Points

Project Structure

.
├── cmd/server/          # Application entrypoint
├── internal/            # Private application packages
│   ├── admin/          # Admin API handlers
│   ├── audit/          # Audit logging
│   ├── auth/           # JWT authentication
│   ├── cache/          # Redis caching layer
│   ├── cost/           # Cost tracking & spending limits
│   ├── failover/       # Provider failover logic
│   ├── guardrails/     # Content filtering
│   ├── middleware/     # HTTP middleware (auth, rate-limit, logging)
│   ├── proxy/          # LLM API proxy & streaming
│   ├── realtime/       # SSE event broadcasting
│   ├── routing/        # Request routing
│   ├── secrets/        # Secret management
│   ├── server/         # HTTP server & router
│   ├── sso/            # SSO integration
│   ├── store/          # Data persistence
│   ├── telemetry/      # Metrics & tracing
│   └── webhooks/       # Webhook handlers
├── frontend/            # React admin dashboard
├── migrations/          # Database migrations
├── deploy/              # Deployment configurations
│   ├── kubernetes/     # Kubernetes manifests
│   ├── helm/           # Helm chart
│   ├── prometheus/     # Prometheus alerts & rules
│   └── grafana/        # Grafana dashboards
├── test/                # Load & integration tests
└── docs/                # Documentation

Configuration

Environment variables:

Variable       Description                                Default
--------       -----------                                -------
PORT           Server port                                8080
DATABASE_URL   PostgreSQL connection string               -
REDIS_URL      Redis connection string                    -
JWT_SECRET     JWT signing secret                         -
LOG_LEVEL      Logging level (debug, info, warn, error)   info
LOG_FORMAT     Log format (json, text)                    json

API Usage

Proxy LLM Requests

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer <app-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'

Streaming Response

The gateway preserves SSE streaming from upstream providers:

curl -N -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer <app-token>" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [...], "stream": true}'

Testing

# Run unit tests
go test ./...

# Run integration tests
go test ./internal/integration/...

# Run load tests (requires OpenRouter API key)
OPENROUTER_API_KEY=<key> go test ./test/load/... -v

# Run frontend tests
cd frontend && npm test

Deployment

Docker

docker build -t llm-gateway .
docker run -p 8080:8080 llm-gateway

Kubernetes

kubectl apply -f deploy/kubernetes/

Helm

helm install llm-gateway deploy/helm/llm-gateway

Monitoring

The gateway exposes Prometheus metrics at /metrics:

  • llm_gateway_requests_total - Total requests by provider, status
  • llm_gateway_request_duration_seconds - Request latency histogram
  • llm_gateway_streaming_ttft_seconds - Time-to-first-token for streaming
  • llm_gateway_tokens_total - Token usage by provider

Pre-built Grafana dashboards are in deploy/grafana/dashboards/.

License

Proprietary - All rights reserved