Security Considerations

Overview

This document outlines security considerations, best practices, and potential risks when using and deploying the LLM Red Teaming Platform.

Security Architecture

Threat Model

This platform evaluates LLM security by:

Generating adversarial prompts
Testing for jailbreaks and prompt injections
Assessing model robustness

Key Threats Addressed:

Prompt injection attacks
Jailbreak attempts
Data exfiltration via prompts
Indirect prompt injection
Model bias exploitation

Security Layers

┌─────────────────────────────────────┐
│   Application Security              │
│   - Input validation                │
│   - Output sanitization             │
│   - Error handling                  │
└─────────────────────────────────────┘
           │
┌─────────────────────────────────────┐
│   Authentication & Authorization    │
│   - Session management              │
│   - API key validation              │
│   - Role-based access (future)      │
└─────────────────────────────────────┘
           │
┌─────────────────────────────────────┐
│   Data Security                     │
│   - Encryption at rest              │
│   - Secure transmission (HTTPS)     │
│   - Database access control         │
└─────────────────────────────────────┘
           │
┌─────────────────────────────────────┐
│   Infrastructure Security           │
│   - Network isolation               │
│   - Firewall rules                  │
│   - Rate limiting                   │
└─────────────────────────────────────┘

API Key Management

Best Practices

1. Environment Variables

# .env file (NEVER commit to Git)
OPENAI_API_KEY=sk-proj-...
GROQ_API_KEY=gsk_...
ANTHROPIC_API_KEY=sk-ant-...
AZURE_OPENAI_KEY=...
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/

# Use strong secret key for session management
SECRET_KEY=$(openssl rand -hex 32)

2. Key Rotation

Rotate API keys every 90 days
Use separate keys for development, staging, and production
Immediately revoke compromised keys

3. Access Control

# Restrict key access
os.chmod('.env', 0o600)  # Only owner can read/write

# Never log API keys
logger.info(f"Using model: {model_name}")  # Good
logger.info(f"API Key: {api_key}")  # NEVER DO THIS

4. Key Storage (Production)

Recommended Solutions:

AWS Secrets Manager
Azure Key Vault
HashiCorp Vault
Google Cloud Secret Manager

# Example: AWS Secrets Manager
import boto3
from botocore.exceptions import ClientError

def get_secret(secret_name):
    client = boto3.client('secretsmanager')
    try:
        response = client.get_secret_value(SecretId=secret_name)
        return response['SecretString']
    except ClientError as e:
        logger.error(f"Failed to retrieve secret: {e}")
        raise

What NOT to Do

❌ Never commit API keys to version control
❌ Never hardcode keys in source code
❌ Never share keys via email or chat
❌ Never use production keys in development
❌ Never log API keys or include in error messages

Authentication & Authorization

Current Implementation

Session-Based Authentication

# auth/authentication.py
def require_auth():
    """Enforce authentication for Streamlit pages."""
    if "authenticated" not in st.session_state:
        st.session_state.authenticated = False
    
    if not st.session_state.authenticated:
        username = st.text_input("Username")
        password = st.text_input("Password", type="password")
        
        if st.button("Login"):
            if verify_credentials(username, password):
                st.session_state.authenticated = True
                st.rerun()
            else:
                st.error("Invalid credentials")
        st.stop()

Production Recommendations

1. Password Security

import bcrypt

def hash_password(password: str) -> str:
    """Hash password with bcrypt."""
    salt = bcrypt.gensalt(rounds=12)
    return bcrypt.hashpw(password.encode(), salt).decode()

def verify_password(password: str, hashed: str) -> bool:
    """Verify password against hash."""
    return bcrypt.checkpw(password.encode(), hashed.encode())

2. Multi-Factor Authentication (MFA)

import pyotp

# Generate MFA secret
secret = pyotp.random_base32()

# Create TOTP instance
totp = pyotp.TOTP(secret)

# Verify code
if totp.verify(user_code):
    # Grant access
    pass

3. JWT Tokens (FastAPI)

from fastapi import Depends, HTTPException
from fastapi.security import HTTPBearer
import jwt

security = HTTPBearer()

def verify_token(credentials: HTTPBearer = Depends(security)):
    token = credentials.credentials
    try:
        payload = jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
        return payload
    except jwt.InvalidTokenError:
        raise HTTPException(status_code=401, detail="Invalid token")

4. Role-Based Access Control (RBAC)

class Role(enum.Enum):
    ADMIN = "admin"
    ANALYST = "analyst"
    VIEWER = "viewer"

def require_role(required_role: Role):
    def decorator(func):
        def wrapper(*args, **kwargs):
            user_role = get_current_user_role()
            if user_role != required_role:
                raise PermissionError("Insufficient permissions")
            return func(*args, **kwargs)
        return wrapper
    return decorator

@require_role(Role.ADMIN)
def delete_scan(scan_id: str):
    """Only admins can delete scans."""
    pass

Data Privacy

Sensitive Data Handling

1. Prompt Data

Risk: Prompts may contain PII or sensitive information
Mitigation:
- Warn users before testing with real data
- Implement data scrubbing for logs
- Provide option to delete scan history

2. Model Responses

Risk: Target models may leak training data
Mitigation:
- Store responses securely
- Implement data retention policies
- Provide export controls

3. Database Security

# Encrypt sensitive fields
from cryptography.fernet import Fernet

class EncryptedField:
    def __init__(self, key: bytes):
        self.cipher = Fernet(key)
    
    def encrypt(self, data: str) -> bytes:
        return self.cipher.encrypt(data.encode())
    
    def decrypt(self, encrypted_data: bytes) -> str:
        return self.cipher.decrypt(encrypted_data).decode()

4. Data Retention

# Implement automatic cleanup
import datetime

def cleanup_old_scans(days: int = 90):
    """Delete scans older than specified days."""
    cutoff = datetime.datetime.now() - datetime.timedelta(days=days)
    db_manager.delete_scans_before(cutoff)

Compliance Considerations

GDPR: Right to deletion, data portability
CCPA: Consumer data rights
HIPAA: If testing healthcare models
SOC 2: Audit logging, access controls

Network Security

HTTPS/TLS

# nginx configuration for HTTPS
server {
    listen 443 ssl http2;
    server_name redteaming.example.com;

    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;

    location / {
        proxy_pass http://localhost:8501;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

Rate Limiting

from fastapi import Request
from slowapi import Limiter
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)

@app.post("/api/scan")
@limiter.limit("10/minute")
async def create_scan(request: Request):
    """Rate-limited scan endpoint."""
    pass

CORS Configuration

from fastapi.middleware.cors import CORSMiddleware

app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://redteaming.example.com"],
    allow_credentials=True,
    allow_methods=["GET", "POST"],
    allow_headers=["*"],
)

Vulnerability Disclosure

Reporting Security Issues

DO NOT open public GitHub issues for security vulnerabilities.

Instead:

Email: security@example.com
Include:
- Description of vulnerability
- Steps to reproduce
- Potential impact
- Suggested fix (if any)

Response Timeline

24 hours: Initial acknowledgment
7 days: Preliminary assessment
30 days: Fix development and testing
90 days: Public disclosure (coordinated)

Bug Bounty

Consider implementing a bug bounty program
Reward responsible disclosure
Recognize contributors in security advisories

Secure Deployment

Production Checklist

Application Security

# 1. Update dependencies
pip list --outdated
pip install --upgrade package-name

# 2. Security audit
pip-audit
safety check

# 3. Static analysis
bandit -r core/ ui/ database/

Environment Configuration

# Disable debug mode
DEBUG=False
STREAMLIT_SERVER_RUN_ON_SAVE=False

# Secure session
SESSION_COOKIE_SECURE=True
SESSION_COOKIE_HTTPONLY=True
SESSION_COOKIE_SAMESITE=Strict

# Database
DATABASE_URL=postgresql://user:pass@localhost/db
SQLALCHEMY_ECHO=False

Infrastructure

Deploy behind firewall
Use VPN for administrative access
Enable audit logging
Implement intrusion detection
Regular backups

Docker Security

# Use specific version, not 'latest'
FROM python:3.10.14-slim

# Run as non-root user
RUN useradd -m -u 1000 appuser
USER appuser

# Set secure permissions
COPY --chown=appuser:appuser . /app
WORKDIR /app

# Install only production dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Health check
HEALTHCHECK --interval=30s --timeout=3s \
  CMD python -c "import requests; requests.get('http://localhost:8501')"

EXPOSE 8501
CMD ["streamlit", "run", "app.py"]

Known Limitations

Current Security Gaps

No encryption at rest for database
- Impact: Sensitive data readable if DB compromised
- Mitigation: Implement database encryption
Basic authentication only
- Impact: No MFA, password policies
- Mitigation: Add OAuth2, MFA
No audit logging
- Impact: Cannot track user actions
- Mitigation: Implement comprehensive logging
SQLite in production
- Impact: File-based DB, limited security
- Mitigation: Migrate to PostgreSQL
API keys in environment
- Impact: Risk if server compromised
- Mitigation: Use secret management service

Responsible Use

⚠️ This tool generates adversarial prompts

Only test models you have permission to test
Don't use on third-party APIs without authorization
Be aware of terms of service for LLM providers
Don't use for malicious purposes
Consider ethical implications

Security Checklist

Development

Deployment

Operations

Resources

Last Updated: February 2026
Security Contact: security@example.com

Remember: Security is a continuous process, not a destination. Stay vigilant and keep learning.

Security: jasoncobra3/LLM_Sentinel

Security

docs/SECURITY.md