CLAUDE.md - OpenGuardrails Project Context

AI assistant guide for OpenGuardrails architecture and development workflows.

CRITICAL DEPLOYMENT REQUIREMENT

ONE-COMMAND DEPLOYMENT MUST ALWAYS WORK: docker compose up -d

Development mode:

cd frontend; npm run dev
cd backend; python start_admin_service.py
cd backend; python start_detection_service.py
cd backend; python start_proxy_service.py

Production Environment Notes:

Current environment uses systemctl to start Python services (not Docker)
Backend service ports:
- Admin service: 53333
- Detection service: 53334
- Proxy service: 53335

Before ANY changes affecting database/services/dependencies/config/Docker:

✅ Test: docker compose down -v && docker compose up -d
✅ All services start without manual intervention
✅ Database migrations run automatically
✅ Services have proper health checks

Testing checklist:

docker compose down -v && docker compose up -d
docker logs -f openguardrails-platform
docker ps  # All services healthy
curl http://localhost:3000/platform/
docker exec openguardrails-postgres psql -U openguardrails -d openguardrails -c "\dt"

Project Overview

OpenGuardrails: Enterprise AI safety platform with prompt attack detection, content safety (19 risk categories), and data leak detection.

License: Apache 2.0
Model: OpenGuardrails-Text-2510 (3.3B, 119 languages)
Repo: https://huggingface.co/openguardrails/OpenGuardrails-Text-2510
Contact: thomas@openguardrails.com

Core Features

API Call Mode (5001): Active detection API
Security Gateway Mode (5002): Transparent proxy (WAF-style)
Multi-turn conversation, multimodal (text+image), knowledge base responses, ban policy

Architecture

Four Containers

PostgreSQL (54321): Database
Text Model (58002): OpenGuardrails-Text-2510 (vLLM, GPU)
Embedding (58004): BAAI/bge-m3 (vLLM, GPU)
Platform (unified): Nginx + 3 FastAPI services
- Admin (5000, 2 workers): User/config management
- Detection (5001, 32 workers): Safety detection
- Proxy (5002, 24 workers): OpenAI-compatible gateway
- Frontend (3000): React UI

Start all: docker compose up -d (models auto-download from HuggingFace)

Project Structure

openguardrails/
├── backend/
│   ├── {admin,detection,proxy}_service.py  # FastAPI apps
│   ├── start_{admin,detection,proxy}_service.py  # Startup scripts
│   ├── database/{connection,models}.py  # SQLAlchemy ORM
│   ├── routers/  # API routes: auth, guardrails, proxy_api, config_api, etc.
│   ├── services/  # Business logic: guardrail_service, model_service, etc.
│   ├── models/{requests,responses}.py  # Pydantic models
│   ├── migrations/  # Auto-run migrations (entrypoint.sh)
│   └── entrypoint.sh  # Service startup + migrations
├── frontend/
│   ├── src/{pages,components,services,contexts}/
│   └── nginx.conf
└── docs/  # API_REFERENCE.md, DEPLOYMENT.md, MIGRATION_GUIDE.md, DATA_LEAKAGE_GUIDE.md

Database Schema (Key Tables)

Access DB: docker exec openguardrails-postgres psql -U openguardrails -d openguardrails

tenants: Users (id, email, api_key, is_super_admin)
detection_results: Detection logs (risk levels, categories, actions)
blacklist/whitelist: Keywords (tenant_id, keywords JSON)
response_templates: Custom responses (risk_category, template_text)
risk_type_config: Risk type settings (compliance/security/data configs)
ban_policy: Auto-ban rules (thresholds, duration)
knowledge_base: Q&A pairs (embeddings for similarity search)
proxy_keys: Proxy API keys (upstream provider configs)
upstream_api_config: Model configs (provider, api_base_url, safety attributes)
rate_limits: Rate limit settings
data_security_entity_types: Data leak entity types (regex/GenAI recognition)
application_data_leakage_policy: Data leakage disposal policies per application

Risk Categories (19 Types)

High Risk (S2,S3,S5,S9,S15,S17): Sensitive politics, violent crime, prompt attacks, WMDs, sexual crimes Medium Risk (S4,S6,S7,S16): Harm to minors, non-violent crime, pornography, self-harm Low Risk (S1,S8,S10-S14,S18,S19): General politics, hate, profanity, privacy, commercial, IP, harassment, threats, professional advice

Processing: High→preset responses, Medium→knowledge base, Low→allow

SCANNER PACKAGE SYSTEM (Planning Phase)

Status: Replacing hardcoded S1-S21 with flexible scanner packages

New Features:

Built-in packages (S1-S21 migrated)
Purchasable packages (admin-published)
Custom scanners (S100+, user-defined)
Three types: genai, regex, keyword

Tag Allocation: S1-S21 (built-in), S22-S99 (reserved), S100+ (custom)

New Tables: scanner_packages, scanners, application_scanner_configs, package_purchases, custom_scanners

API Endpoints: /api/v1/scanners/{packages,configs,custom,purchases/*}

See: docs/SCANNER_PACKAGE_IMPLEMENTATION_PLAN.md

DATA LEAKAGE PREVENTION SYSTEM

Status: Production-ready (v5.1.0+)

v5.1.0 Highlight: Automatic Private Model Switching - Enterprise AI agents can now seamlessly protect sensitive data without affecting user experience. When sensitive data is detected, requests are automatically routed to private/on-premise models instead of blocking, ensuring data never leaves your infrastructure while maintaining uninterrupted user workflows.

Architecture: Multi-layer protection system with format-aware detection, intelligent disposal strategies, and automatic private model switching

Key Components

Format Detection Service (backend/services/format_detection_service.py):

Auto-detects content format (JSON, YAML, CSV, Markdown, Plain Text)
Enables format-aware smart segmentation

Segmentation Service (backend/services/segmentation_service.py):

JSON: Split by top-level objects (max 50 segments)
YAML: Split by top-level keys (max 50 segments)
CSV: Split by rows with header retention (max 100 rows)
Markdown: Split by ## sections (max 30 sections)
Plain Text: Single segment (no segmentation)

Data Security Service (backend/services/data_security_service.py):

Regex entities: Applied to full text (ID cards, credit cards, etc.)
GenAI entities: Applied per-segment with parallel processing
Risk aggregation: Highest risk from all segments wins

Data Leakage Disposal Service (backend/services/data_leakage_disposal_service.py):

Block: Reject request completely (default for high risk)
Anonymize: Replace sensitive entities with placeholders (default for medium/low risk)
Switch Private Model: Redirect to data-private model (optional)
Pass: Allow request, log only (audit mode)

Data Leakage Policy Flow:

User Request → DLP Detection → Sensitive Data Found → Risk Level Determined
                                                              ↓
                           ┌──────────────────────────────────┼──────────────────────────────────┐
                           ↓                                  ↓                                  ↓
                    High Risk Action              Medium Risk Action                 Low Risk Action
                           ↓                                  ↓                                  ↓
                  Block (default)              Anonymize (default)               Anonymize (default)
                           ↓                                  ↓                                  ↓
                    Return Error           Replace with placeholders         Replace with placeholders

Enterprise Value: Users continue their workflow seamlessly while sensitive data automatically stays within your infrastructure.

Private Model System (v5.1.0 Enhanced)

Model Safety Attributes (upstream_api_config table):

is_data_safe: Mark as safe for sensitive data (on-premise, private cloud, air-gapped)
is_default_private_model: Tenant-wide default private model
private_model_priority: Priority ranking (0-100, higher = preferred)

Private Model Selection Priority:

Application policy private_model_id (explicit configuration)
Tenant default private model (is_default_private_model = true)
Highest priority private model (private_model_priority DESC)

Configuration via Admin UI:

Navigate to Proxy Model Management → Select model → Safety Settings
Set is_data_safe = true for on-premise/private models
Optionally set as default or configure priority

Application-Level Policies

Table: application_data_leakage_policy

Configuration per application:

high_risk_action: block | switch_private_model | anonymize | pass
medium_risk_action: block | switch_private_model | anonymize | pass
low_risk_action: block | switch_private_model | anonymize | pass
private_model_id: Override private model for this app (nullable)
enable_format_detection: Enable/disable format detection
enable_smart_segmentation: Enable/disable smart segmentation

Default Strategy:

High Risk → block
Medium Risk → anonymize
Low Risk → anonymize

Performance Characteristics

Format Detection: ~5-10ms overhead
Smart Segmentation: ~10-20ms overhead
Parallel Processing Gain: 20-60% faster for large content (> 1KB)
Net Impact: Faster overall for medium/large content

See: docs/DATA_LEAKAGE_GUIDE.md (comprehensive user guide)

Key Environment Variables

Database: DATABASE_URL, RESET_DATABASE_ON_STARTUP (dev only) Auth: JWT_SECRET_KEY, SUPER_ADMIN_USERNAME/PASSWORD Models: GUARDRAILS_MODEL_API_URL, EMBEDDING_API_BASE_URL Services: {ADMIN,DETECTION,PROXY}_{PORT,UVICORN_WORKERS} Detection: MAX_DETECTION_CONTEXT_LENGTH (default: 7168, should be model max-len - 1000) Other: CORS_ORIGINS, DEBUG, LOG_LEVEL, DATA_DIR

API Authentication

API Key: Authorization: Bearer sk-xxai-{key} (Detection/Proxy)
JWT Token: Authorization: Bearer {jwt} (Admin)
Admin Switch: X-Switch-User: {user_id}

Key API Endpoints

Detection (5001)

POST /v1/guardrails - Main detection API
- Note: extra_body is SDK-only (OpenAI Python SDK unfolds it)
- HTTP/curl: flatten params to top level (no extra_body)
POST /v1/guardrails/input - Dify input moderation
POST /v1/guardrails/output - Dify output moderation

Proxy (5002)

POST /v1/chat/completions - OpenAI-compatible endpoint

Admin (5000)

/api/v1/auth/{login,register} - Authentication
/api/v1/config/{blacklist,whitelist,templates,ban-policy} - Config
/api/v1/config/data-leakage-policy - Data leakage disposal policy (per-app)
/api/v1/config/private-models - List available private models
/api/v1/risk-config - Risk types
/api/v1/proxy/{keys,models} - Proxy management (includes safety attributes)
/api/v1/dashboard/stats - Statistics
/api/v1/results - Detection results

Detection Flow

Check ban status (IP/user_id)
Whitelist check (early pass)
Blacklist check (early reject)
Model detection (security/compliance/data risks) - with sliding window for long content
Aggregate risks (highest wins)
Determine action (pass/reject/replace)
Get response (knowledge base or templates)
Log to DB (async)

Proxy mode: Run detection on input → forward if pass → run detection on output → return if pass

Sliding Window Detection (Long Content)

For content exceeding MAX_DETECTION_CONTEXT_LENGTH, the system applies sliding window:

User-only messages: Slide window on user content (window size = MAX_DETECTION_CONTEXT_LENGTH)
User+Assistant messages: Cross-product detection
- User window size = 1/2 MAX_DETECTION_CONTEXT_LENGTH
- Assistant window size = 1/2 MAX_DETECTION_CONTEXT_LENGTH
- Each user window × each assistant window = total detection windows
Window overlap: 20% overlap to avoid missing content at boundaries
Parallel execution: All windows detected in parallel for performance
Aggregation: Scanner matched if it triggers in ANY window (highest sensitivity wins)

Deployment

Quick Start

# Production (pre-built images)
curl -O https://raw.githubusercontent.com/openguardrails/openguardrails/main/docker-compose.prod.yml
export HF_TOKEN=your-hf-token
docker compose -f docker-compose.prod.yml up -d

# Development (build from source)
git clone https://github.com/openguardrails/openguardrails
cd openguardrails
export HF_TOKEN=your-hf-token
docker compose up -d --build

Access: http://localhost:3000/platform/ (admin@yourdomain.com / CHANGE-THIS-PASSWORD-IN-PRODUCTION)

Automatic Migrations

All migrations run automatically on docker compose up -d

Flow:

PostgreSQL starts (healthcheck)
Admin service → entrypoint.sh → wait for PG → run migrations (with lock) → start service
Detection/Proxy → entrypoint.sh → wait for PG → skip migrations → start service

Key Points:

✅ Migrations run at CONTAINER level (once), NOT worker level
✅ Admin runs migrations BEFORE uvicorn starts (before workers fork)
✅ Detection/Proxy skip migrations (SERVICE_NAME != admin)
✅ PostgreSQL advisory locks prevent concurrent execution
✅ 58 total workers, but migrations execute ONLY ONCE

Monitor: docker logs -f openguardrails-admin | grep -i migration

Check history:

docker exec openguardrails-postgres psql -U openguardrails -d openguardrails \
  -c "SELECT version, description, executed_at, success FROM schema_migrations ORDER BY version;"

See: backend/migrations/README.md, docs/MIGRATION_FAQ.md

Common Workflows

Add New API Endpoint

Create route in backend/routers/
Add service logic in backend/services/
Update Pydantic models in backend/models/
Add frontend service in frontend/src/services/
Create/update page component
⚠️ TEST: docker compose down -v && docker compose up -d

Add Database Migration

cd backend/migrations
./create_migration.sh description_of_change
# Edit versions/XXX_description_of_change.sql (use idempotent SQL)
docker compose down -v && docker compose up -d
docker logs openguardrails-admin | grep -i migration
git add backend/migrations/versions/XXX_description_of_change.sql
git commit -m "Add migration: description of change"

NEVER: Manually modify schema, require manual SQL, edit existing migrations ALWAYS: Use migrations, test from clean state, use idempotent SQL (IF EXISTS)

Troubleshooting

Deployment fails:

Check PostgreSQL: docker logs openguardrails-postgres
Check migrations: docker logs openguardrails-admin | grep -i migration
Check health: docker ps
Reset: docker compose down -v && docker system prune -f && docker compose up -d

Common issues:

PostgreSQL not ready → healthcheck in docker-compose.yml
Migration failed → SQL syntax error
Port conflicts → check 3000,5000,5001,5002,54321 available

FAQs

Q: Won't migrations run multiple times with 58 workers? A: NO. Migrations run at container level (once), before uvicorn forks workers. Only admin runs migrations; detection/proxy skip.

Q: Can I use RESET_DATABASE_ON_STARTUP? A: NO. Deprecated. Use migrations (no data loss).

Q: Do I need to run migrations manually? A: NO. Automatic on docker compose up -d.

Q: Can I change worker count? A: YES. Safe. Migrations unaffected (run once per container, not per worker).

Dependencies

Backend: FastAPI, SQLAlchemy, Pydantic, Uvicorn, PostgreSQL, OpenAI SDK, Pillow, PyJWT Frontend: React 18, Ant Design, React Router, i18next, Axios, Vite

Caching & Performance

Auth cache (1h TTL)
Keyword cache (5m TTL)
Risk config cache (5m TTL)
Template cache (5m TTL)
Rate limiting (per-tenant, global concurrent limits)

Security

bcrypt password hashing
JWT expiration
Secure API key generation
SQLAlchemy ORM (SQL injection protection)
CORS configured
Rate limiting
Multi-tenant isolation

Last Updated: 2026-01-06 Generated for: AI assistants to quickly understand OpenGuardrails architecture.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CLAUDE.md - OpenGuardrails Project Context

CRITICAL DEPLOYMENT REQUIREMENT

Project Overview

Core Features

Architecture

Four Containers

Project Structure

Database Schema (Key Tables)

Risk Categories (19 Types)

SCANNER PACKAGE SYSTEM (Planning Phase)

DATA LEAKAGE PREVENTION SYSTEM

Key Components

Private Model System (v5.1.0 Enhanced)

Application-Level Policies

Performance Characteristics

Key Environment Variables

API Authentication

Key API Endpoints

Detection (5001)

Proxy (5002)

Admin (5000)

Detection Flow

Sliding Window Detection (Long Content)

Deployment

Quick Start

Automatic Migrations

Common Workflows

Add New API Endpoint

Add Database Migration

Troubleshooting

FAQs

Dependencies

Caching & Performance

Security

FilesExpand file tree

CLAUDE.md

Latest commit

History

CLAUDE.md

File metadata and controls

CLAUDE.md - OpenGuardrails Project Context

CRITICAL DEPLOYMENT REQUIREMENT

Project Overview

Core Features

Architecture

Four Containers

Project Structure

Database Schema (Key Tables)

Risk Categories (19 Types)

SCANNER PACKAGE SYSTEM (Planning Phase)

DATA LEAKAGE PREVENTION SYSTEM

Key Components

Private Model System (v5.1.0 Enhanced)

Application-Level Policies

Performance Characteristics

Key Environment Variables

API Authentication

Key API Endpoints

Detection (5001)

Proxy (5002)

Admin (5000)

Detection Flow

Sliding Window Detection (Long Content)

Deployment

Quick Start

Automatic Migrations

Common Workflows

Add New API Endpoint

Add Database Migration

Troubleshooting

FAQs

Dependencies

Caching & Performance

Security