This document outlines security considerations, best practices, and potential risks when using and deploying the LLM Red Teaming Platform.
- Security Architecture
- API Key Management
- Authentication & Authorization
- Data Privacy
- Network Security
- Vulnerability Disclosure
- Secure Deployment
- Known Limitations
- Security Checklist
This platform evaluates LLM security by:
- Generating adversarial prompts
- Testing for jailbreaks and prompt injections
- Assessing model robustness
Key Threats Addressed:
- Prompt injection attacks
- Jailbreak attempts
- Data exfiltration via prompts
- Indirect prompt injection
- Model bias exploitation
┌─────────────────────────────────────┐
│ Application Security │
│ - Input validation │
│ - Output sanitization │
│ - Error handling │
└─────────────────────────────────────┘
│
┌─────────────────────────────────────┐
│ Authentication & Authorization │
│ - Session management │
│ - API key validation │
│ - Role-based access (future) │
└─────────────────────────────────────┘
│
┌─────────────────────────────────────┐
│ Data Security │
│ - Encryption at rest │
│ - Secure transmission (HTTPS) │
│ - Database access control │
└─────────────────────────────────────┘
│
┌─────────────────────────────────────┐
│ Infrastructure Security │
│ - Network isolation │
│ - Firewall rules │
│ - Rate limiting │
└─────────────────────────────────────┘
# .env file (NEVER commit to Git)
OPENAI_API_KEY=sk-proj-...
GROQ_API_KEY=gsk_...
ANTHROPIC_API_KEY=sk-ant-...
AZURE_OPENAI_KEY=...
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
# Use strong secret key for session management
SECRET_KEY=$(openssl rand -hex 32)- Rotate API keys every 90 days
- Use separate keys for development, staging, and production
- Immediately revoke compromised keys
# Restrict key access
os.chmod('.env', 0o600) # Only owner can read/write
# Never log API keys
logger.info(f"Using model: {model_name}") # Good
logger.info(f"API Key: {api_key}") # NEVER DO THISRecommended Solutions:
- AWS Secrets Manager
- Azure Key Vault
- HashiCorp Vault
- Google Cloud Secret Manager
# Example: AWS Secrets Manager
import boto3
from botocore.exceptions import ClientError
def get_secret(secret_name):
client = boto3.client('secretsmanager')
try:
response = client.get_secret_value(SecretId=secret_name)
return response['SecretString']
except ClientError as e:
logger.error(f"Failed to retrieve secret: {e}")
raise❌ Never commit API keys to version control
❌ Never hardcode keys in source code
❌ Never share keys via email or chat
❌ Never use production keys in development
❌ Never log API keys or include in error messages
# auth/authentication.py
def require_auth():
"""Enforce authentication for Streamlit pages."""
if "authenticated" not in st.session_state:
st.session_state.authenticated = False
if not st.session_state.authenticated:
username = st.text_input("Username")
password = st.text_input("Password", type="password")
if st.button("Login"):
if verify_credentials(username, password):
st.session_state.authenticated = True
st.rerun()
else:
st.error("Invalid credentials")
st.stop()import bcrypt
def hash_password(password: str) -> str:
"""Hash password with bcrypt."""
salt = bcrypt.gensalt(rounds=12)
return bcrypt.hashpw(password.encode(), salt).decode()
def verify_password(password: str, hashed: str) -> bool:
"""Verify password against hash."""
return bcrypt.checkpw(password.encode(), hashed.encode())import pyotp
# Generate MFA secret
secret = pyotp.random_base32()
# Create TOTP instance
totp = pyotp.TOTP(secret)
# Verify code
if totp.verify(user_code):
# Grant access
passfrom fastapi import Depends, HTTPException
from fastapi.security import HTTPBearer
import jwt
security = HTTPBearer()
def verify_token(credentials: HTTPBearer = Depends(security)):
token = credentials.credentials
try:
payload = jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
return payload
except jwt.InvalidTokenError:
raise HTTPException(status_code=401, detail="Invalid token")class Role(enum.Enum):
ADMIN = "admin"
ANALYST = "analyst"
VIEWER = "viewer"
def require_role(required_role: Role):
def decorator(func):
def wrapper(*args, **kwargs):
user_role = get_current_user_role()
if user_role != required_role:
raise PermissionError("Insufficient permissions")
return func(*args, **kwargs)
return wrapper
return decorator
@require_role(Role.ADMIN)
def delete_scan(scan_id: str):
"""Only admins can delete scans."""
pass- Risk: Prompts may contain PII or sensitive information
- Mitigation:
- Warn users before testing with real data
- Implement data scrubbing for logs
- Provide option to delete scan history
- Risk: Target models may leak training data
- Mitigation:
- Store responses securely
- Implement data retention policies
- Provide export controls
# Encrypt sensitive fields
from cryptography.fernet import Fernet
class EncryptedField:
def __init__(self, key: bytes):
self.cipher = Fernet(key)
def encrypt(self, data: str) -> bytes:
return self.cipher.encrypt(data.encode())
def decrypt(self, encrypted_data: bytes) -> str:
return self.cipher.decrypt(encrypted_data).decode()# Implement automatic cleanup
import datetime
def cleanup_old_scans(days: int = 90):
"""Delete scans older than specified days."""
cutoff = datetime.datetime.now() - datetime.timedelta(days=days)
db_manager.delete_scans_before(cutoff)- GDPR: Right to deletion, data portability
- CCPA: Consumer data rights
- HIPAA: If testing healthcare models
- SOC 2: Audit logging, access controls
# nginx configuration for HTTPS
server {
listen 443 ssl http2;
server_name redteaming.example.com;
ssl_certificate /path/to/cert.pem;
ssl_certificate_key /path/to/key.pem;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers HIGH:!aNULL:!MD5;
location / {
proxy_pass http://localhost:8501;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
}from fastapi import Request
from slowapi import Limiter
from slowapi.util import get_remote_address
limiter = Limiter(key_func=get_remote_address)
@app.post("/api/scan")
@limiter.limit("10/minute")
async def create_scan(request: Request):
"""Rate-limited scan endpoint."""
passfrom fastapi.middleware.cors import CORSMiddleware
app.add_middleware(
CORSMiddleware,
allow_origins=["https://redteaming.example.com"],
allow_credentials=True,
allow_methods=["GET", "POST"],
allow_headers=["*"],
)DO NOT open public GitHub issues for security vulnerabilities.
Instead:
- Email: security@example.com
- Include:
- Description of vulnerability
- Steps to reproduce
- Potential impact
- Suggested fix (if any)
- 24 hours: Initial acknowledgment
- 7 days: Preliminary assessment
- 30 days: Fix development and testing
- 90 days: Public disclosure (coordinated)
- Consider implementing a bug bounty program
- Reward responsible disclosure
- Recognize contributors in security advisories
# 1. Update dependencies
pip list --outdated
pip install --upgrade package-name
# 2. Security audit
pip-audit
safety check
# 3. Static analysis
bandit -r core/ ui/ database/# Disable debug mode
DEBUG=False
STREAMLIT_SERVER_RUN_ON_SAVE=False
# Secure session
SESSION_COOKIE_SECURE=True
SESSION_COOKIE_HTTPONLY=True
SESSION_COOKIE_SAMESITE=Strict
# Database
DATABASE_URL=postgresql://user:pass@localhost/db
SQLALCHEMY_ECHO=False- Deploy behind firewall
- Use VPN for administrative access
- Enable audit logging
- Implement intrusion detection
- Regular backups
# Use specific version, not 'latest'
FROM python:3.10.14-slim
# Run as non-root user
RUN useradd -m -u 1000 appuser
USER appuser
# Set secure permissions
COPY --chown=appuser:appuser . /app
WORKDIR /app
# Install only production dependencies
RUN pip install --no-cache-dir -r requirements.txt
# Health check
HEALTHCHECK --interval=30s --timeout=3s \
CMD python -c "import requests; requests.get('http://localhost:8501')"
EXPOSE 8501
CMD ["streamlit", "run", "app.py"]-
No encryption at rest for database
- Impact: Sensitive data readable if DB compromised
- Mitigation: Implement database encryption
-
Basic authentication only
- Impact: No MFA, password policies
- Mitigation: Add OAuth2, MFA
-
No audit logging
- Impact: Cannot track user actions
- Mitigation: Implement comprehensive logging
-
SQLite in production
- Impact: File-based DB, limited security
- Mitigation: Migrate to PostgreSQL
-
API keys in environment
- Impact: Risk if server compromised
- Mitigation: Use secret management service
- Only test models you have permission to test
- Don't use on third-party APIs without authorization
- Be aware of terms of service for LLM providers
- Don't use for malicious purposes
- Consider ethical implications
- No hardcoded secrets
- Input validation on all endpoints
- SQL injection prevention
- XSS protection
- CSRF protection
- Dependency vulnerability scanning
- HTTPS enabled
- API keys in secure vault
- Database encryption
- Rate limiting configured
- Firewall rules applied
- Monitoring and alerting setup
- Regular security updates
- Log monitoring
- Incident response plan
- Backup and recovery tested
- Access control review
- Penetration testing
Last Updated: February 2026
Security Contact: security@example.com
Remember: Security is a continuous process, not a destination. Stay vigilant and keep learning.