Skip to content

Security: jasoncobra3/LLM_Sentinel

Security

docs/SECURITY.md

Security Considerations

Overview

This document outlines security considerations, best practices, and potential risks when using and deploying the LLM Red Teaming Platform.

Table of Contents

  1. Security Architecture
  2. API Key Management
  3. Authentication & Authorization
  4. Data Privacy
  5. Network Security
  6. Vulnerability Disclosure
  7. Secure Deployment
  8. Known Limitations
  9. Security Checklist

Security Architecture

Threat Model

This platform evaluates LLM security by:

  • Generating adversarial prompts
  • Testing for jailbreaks and prompt injections
  • Assessing model robustness

Key Threats Addressed:

  • Prompt injection attacks
  • Jailbreak attempts
  • Data exfiltration via prompts
  • Indirect prompt injection
  • Model bias exploitation

Security Layers

┌─────────────────────────────────────┐
│   Application Security              │
│   - Input validation                │
│   - Output sanitization             │
│   - Error handling                  │
└─────────────────────────────────────┘
           │
┌─────────────────────────────────────┐
│   Authentication & Authorization    │
│   - Session management              │
│   - API key validation              │
│   - Role-based access (future)      │
└─────────────────────────────────────┘
           │
┌─────────────────────────────────────┐
│   Data Security                     │
│   - Encryption at rest              │
│   - Secure transmission (HTTPS)     │
│   - Database access control         │
└─────────────────────────────────────┘
           │
┌─────────────────────────────────────┐
│   Infrastructure Security           │
│   - Network isolation               │
│   - Firewall rules                  │
│   - Rate limiting                   │
└─────────────────────────────────────┘

API Key Management

Best Practices

1. Environment Variables

# .env file (NEVER commit to Git)
OPENAI_API_KEY=sk-proj-...
GROQ_API_KEY=gsk_...
ANTHROPIC_API_KEY=sk-ant-...
AZURE_OPENAI_KEY=...
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/

# Use strong secret key for session management
SECRET_KEY=$(openssl rand -hex 32)

2. Key Rotation

  • Rotate API keys every 90 days
  • Use separate keys for development, staging, and production
  • Immediately revoke compromised keys

3. Access Control

# Restrict key access
os.chmod('.env', 0o600)  # Only owner can read/write

# Never log API keys
logger.info(f"Using model: {model_name}")  # Good
logger.info(f"API Key: {api_key}")  # NEVER DO THIS

4. Key Storage (Production)

Recommended Solutions:

  • AWS Secrets Manager
  • Azure Key Vault
  • HashiCorp Vault
  • Google Cloud Secret Manager
# Example: AWS Secrets Manager
import boto3
from botocore.exceptions import ClientError

def get_secret(secret_name):
    client = boto3.client('secretsmanager')
    try:
        response = client.get_secret_value(SecretId=secret_name)
        return response['SecretString']
    except ClientError as e:
        logger.error(f"Failed to retrieve secret: {e}")
        raise

What NOT to Do

Never commit API keys to version control
Never hardcode keys in source code
Never share keys via email or chat
Never use production keys in development
Never log API keys or include in error messages

Authentication & Authorization

Current Implementation

Session-Based Authentication

# auth/authentication.py
def require_auth():
    """Enforce authentication for Streamlit pages."""
    if "authenticated" not in st.session_state:
        st.session_state.authenticated = False
    
    if not st.session_state.authenticated:
        username = st.text_input("Username")
        password = st.text_input("Password", type="password")
        
        if st.button("Login"):
            if verify_credentials(username, password):
                st.session_state.authenticated = True
                st.rerun()
            else:
                st.error("Invalid credentials")
        st.stop()

Production Recommendations

1. Password Security

import bcrypt

def hash_password(password: str) -> str:
    """Hash password with bcrypt."""
    salt = bcrypt.gensalt(rounds=12)
    return bcrypt.hashpw(password.encode(), salt).decode()

def verify_password(password: str, hashed: str) -> bool:
    """Verify password against hash."""
    return bcrypt.checkpw(password.encode(), hashed.encode())

2. Multi-Factor Authentication (MFA)

import pyotp

# Generate MFA secret
secret = pyotp.random_base32()

# Create TOTP instance
totp = pyotp.TOTP(secret)

# Verify code
if totp.verify(user_code):
    # Grant access
    pass

3. JWT Tokens (FastAPI)

from fastapi import Depends, HTTPException
from fastapi.security import HTTPBearer
import jwt

security = HTTPBearer()

def verify_token(credentials: HTTPBearer = Depends(security)):
    token = credentials.credentials
    try:
        payload = jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
        return payload
    except jwt.InvalidTokenError:
        raise HTTPException(status_code=401, detail="Invalid token")

4. Role-Based Access Control (RBAC)

class Role(enum.Enum):
    ADMIN = "admin"
    ANALYST = "analyst"
    VIEWER = "viewer"

def require_role(required_role: Role):
    def decorator(func):
        def wrapper(*args, **kwargs):
            user_role = get_current_user_role()
            if user_role != required_role:
                raise PermissionError("Insufficient permissions")
            return func(*args, **kwargs)
        return wrapper
    return decorator

@require_role(Role.ADMIN)
def delete_scan(scan_id: str):
    """Only admins can delete scans."""
    pass

Data Privacy

Sensitive Data Handling

1. Prompt Data

  • Risk: Prompts may contain PII or sensitive information
  • Mitigation:
    • Warn users before testing with real data
    • Implement data scrubbing for logs
    • Provide option to delete scan history

2. Model Responses

  • Risk: Target models may leak training data
  • Mitigation:
    • Store responses securely
    • Implement data retention policies
    • Provide export controls

3. Database Security

# Encrypt sensitive fields
from cryptography.fernet import Fernet

class EncryptedField:
    def __init__(self, key: bytes):
        self.cipher = Fernet(key)
    
    def encrypt(self, data: str) -> bytes:
        return self.cipher.encrypt(data.encode())
    
    def decrypt(self, encrypted_data: bytes) -> str:
        return self.cipher.decrypt(encrypted_data).decode()

4. Data Retention

# Implement automatic cleanup
import datetime

def cleanup_old_scans(days: int = 90):
    """Delete scans older than specified days."""
    cutoff = datetime.datetime.now() - datetime.timedelta(days=days)
    db_manager.delete_scans_before(cutoff)

Compliance Considerations

  • GDPR: Right to deletion, data portability
  • CCPA: Consumer data rights
  • HIPAA: If testing healthcare models
  • SOC 2: Audit logging, access controls

Network Security

HTTPS/TLS

# nginx configuration for HTTPS
server {
    listen 443 ssl http2;
    server_name redteaming.example.com;

    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;

    location / {
        proxy_pass http://localhost:8501;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

Rate Limiting

from fastapi import Request
from slowapi import Limiter
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)

@app.post("/api/scan")
@limiter.limit("10/minute")
async def create_scan(request: Request):
    """Rate-limited scan endpoint."""
    pass

CORS Configuration

from fastapi.middleware.cors import CORSMiddleware

app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://redteaming.example.com"],
    allow_credentials=True,
    allow_methods=["GET", "POST"],
    allow_headers=["*"],
)

Vulnerability Disclosure

Reporting Security Issues

DO NOT open public GitHub issues for security vulnerabilities.

Instead:

  1. Email: security@example.com
  2. Include:
    • Description of vulnerability
    • Steps to reproduce
    • Potential impact
    • Suggested fix (if any)

Response Timeline

  • 24 hours: Initial acknowledgment
  • 7 days: Preliminary assessment
  • 30 days: Fix development and testing
  • 90 days: Public disclosure (coordinated)

Bug Bounty

  • Consider implementing a bug bounty program
  • Reward responsible disclosure
  • Recognize contributors in security advisories

Secure Deployment

Production Checklist

Application Security

# 1. Update dependencies
pip list --outdated
pip install --upgrade package-name

# 2. Security audit
pip-audit
safety check

# 3. Static analysis
bandit -r core/ ui/ database/

Environment Configuration

# Disable debug mode
DEBUG=False
STREAMLIT_SERVER_RUN_ON_SAVE=False

# Secure session
SESSION_COOKIE_SECURE=True
SESSION_COOKIE_HTTPONLY=True
SESSION_COOKIE_SAMESITE=Strict

# Database
DATABASE_URL=postgresql://user:pass@localhost/db
SQLALCHEMY_ECHO=False

Infrastructure

  • Deploy behind firewall
  • Use VPN for administrative access
  • Enable audit logging
  • Implement intrusion detection
  • Regular backups

Docker Security

# Use specific version, not 'latest'
FROM python:3.10.14-slim

# Run as non-root user
RUN useradd -m -u 1000 appuser
USER appuser

# Set secure permissions
COPY --chown=appuser:appuser . /app
WORKDIR /app

# Install only production dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Health check
HEALTHCHECK --interval=30s --timeout=3s \
  CMD python -c "import requests; requests.get('http://localhost:8501')"

EXPOSE 8501
CMD ["streamlit", "run", "app.py"]

Known Limitations

Current Security Gaps

  1. No encryption at rest for database

    • Impact: Sensitive data readable if DB compromised
    • Mitigation: Implement database encryption
  2. Basic authentication only

    • Impact: No MFA, password policies
    • Mitigation: Add OAuth2, MFA
  3. No audit logging

    • Impact: Cannot track user actions
    • Mitigation: Implement comprehensive logging
  4. SQLite in production

    • Impact: File-based DB, limited security
    • Mitigation: Migrate to PostgreSQL
  5. API keys in environment

    • Impact: Risk if server compromised
    • Mitigation: Use secret management service

Responsible Use

⚠️ This tool generates adversarial prompts

  • Only test models you have permission to test
  • Don't use on third-party APIs without authorization
  • Be aware of terms of service for LLM providers
  • Don't use for malicious purposes
  • Consider ethical implications

Security Checklist

Development

  • No hardcoded secrets
  • Input validation on all endpoints
  • SQL injection prevention
  • XSS protection
  • CSRF protection
  • Dependency vulnerability scanning

Deployment

  • HTTPS enabled
  • API keys in secure vault
  • Database encryption
  • Rate limiting configured
  • Firewall rules applied
  • Monitoring and alerting setup

Operations

  • Regular security updates
  • Log monitoring
  • Incident response plan
  • Backup and recovery tested
  • Access control review
  • Penetration testing

Resources


Last Updated: February 2026
Security Contact: security@example.com

Remember: Security is a continuous process, not a destination. Stay vigilant and keep learning.

There aren’t any published security advisories