Pokémon Opus

An autonomous AI playthrough of Pokémon Blue, powered by Claude Opus 4.7.
No human input. No save scumming. No walkthroughs. Just the model, the screen, and the buttons.

🌐 opusplays.com · 𝕏 @OpusPlays · Quick Start · Architecture · How it works

🎮 What is Pokémon Opus?

Pokémon Opus is an autonomous AI playthrough of Pokémon Blue for Game Boy. Claude Opus 4.7 reads the screen and the emulator's RAM, decides which buttons to press, and works through the entire game on its own — choosing a starter, navigating routes, training a team, fighting gym leaders, and (eventually) taking down the Elite Four.

There is no human in the loop. No walkthrough is fed in. No script is hard-coded for the routes. The model gets the same information any player would — the screen, the party, the bag, the current dialog — and has to figure out what to do.

This repository contains the agent and game server. The companion live dashboard at opusplays.com displays the current team, badges earned, Pokédex progress, and a streaming feed of the model's reasoning, in real time.

Watching an AI think out loud while it tries to find Misty is a very specific kind of fun.

✨ Highlights

🧠 Claude Opus 4.7 (1M context) — the entire run history fits in a single conversation; no summarization required for the first ~150 hours of play
🎮 Headless PyBoy emulator with full RAM reading for Gen 1 (party, bag, badges, location, every flag)
🎯 Long-horizon objectives — the agent maintains its own scratchpad of goals, attempts, and failures across thousands of turns
🧩 Mode-aware reasoning — different sub-agents handle exploration, battles, dialog, and menus, each with their own tailored prompt
🔁 Streaming events — every decision is broadcast via WebSocket for the live dashboard at opusplays.com
🚫 Pure model-on-game — no save scumming, no external lookups, no human nudges

🏗️ Architecture

┌─────────────────────────────────────────────────────────────┐
│  PokemonOpenClaude (game server, port 8765)                 │
│  Headless PyBoy emulator + RAM reader + REST API            │
└──────────────────────────┬──────────────────────────────────┘
                           │ HTTP (GET /state, POST /action,
                           │       GET /screenshot)
┌──────────────────────────▼──────────────────────────────────┐
│  Pokémon Opus Backend (Python, port 3000)                   │
│                                                             │
│  ┌─────────────┐  ┌────────────┐  ┌──────────────────────┐ │
│  │ Orchestrator│  │ LLM Client │  │ Streaming Server     │ │
│  │ (state      │  │ (Anthropic,│  │ (FastAPI + WebSocket)│ │
│  │  machine)   │  │  OpenAI,   │  └──────────┬───────────┘ │
│  │             │  │  local)    │             │             │
│  │ ┌─────────┐ │  └────────────┘             │             │
│  │ │ Explore │ │                             │             │
│  │ │ Battle  │ │  ┌────────────┐             │             │
│  │ │ Menu    │ │  │ Memory     │             │             │
│  │ │ Strategy│ │  │ Objectives │             │             │
│  │ └─────────┘ │  │ Map Graph  │             │             │
│  └─────────────┘  │ Context    │             │             │
│                   └────────────┘             │             │
└──────────────────────────────────────────────┼─────────────┘
                                               │ WebSocket
┌──────────────────────────────────────────────▼─────────────┐
│  React Viewer (TypeScript, port 5173)                      │
│  Live game screen, AI reasoning, team, map, objectives     │
└────────────────────────────────────────────────────────────┘
                                               │
                                               │ HTTPS
┌──────────────────────────────────────────────▼─────────────┐
│  opusplays.com — public live dashboard                     │
│  Team, badges, Pokédex, milestones, FAQ, activity feed     │
└────────────────────────────────────────────────────────────┘

The system is built on three open-source foundations:

Project	Role
PokemonOpenClaude	Headless Game Boy emulator with REST API and full Gen 1 RAM reading
Zork-Opus	Proven AI game-playing architecture — memory, objectives, multi-model orchestration
Archon	Infrastructure patterns for streaming, events, and React dashboards

🧠 Features

AI Brain (adapted from Zork-Opus)

Game-mode state machine with five modes (explore, battle, dialog, menu, intro); explore is the fallback when no other mode is detected
Mode-specific sub-agents with tailored prompts and heuristics (pokemon_opus/agents/)
Battle agent with the full Gen 1 type chart (including the bugs — Ghost doesn't hit Psychic, Psychic is OP), STAB awareness, and a heuristic fast-path that skips the LLM for trivial wild encounters
Dual-cache memory system — persistent cross-episode memory plus an ephemeral working set
Strategic objective generation with gym-progression planning and Pokédex-completion targets
Map graph with BFS pathfinding and exploration-frontier tracking
Stuck detection, oscillation warnings, auto-save
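
To make the map-graph bullet concrete, here is a minimal sketch of BFS pathfinding and frontier tracking over an adjacency map. The graph shape (map name to list of neighbor names) and the function names `bfs_path` and `frontier` are assumptions for illustration, not the contents of `pokemon_opus/map/graph.py`.

```python
from collections import deque

def bfs_path(graph, start, goal):
    """Return the shortest list of map names from start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in graph.get(current, ()):
            if neighbor in previous:
                continue
            previous[neighbor] = current
            if neighbor == goal:
                # Walk the predecessor links back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None

def frontier(graph, visited):
    """Maps seen as a neighbor of some known map but not yet visited."""
    known = {name for neighbors in graph.values() for name in neighbors}
    return known - set(visited)

# Hypothetical slice of the early-game map graph.
GRAPH = {
    "Pallet Town": ["Route 1"],
    "Route 1": ["Pallet Town", "Viridian City"],
    "Viridian City": ["Route 1", "Route 2", "Route 22"],
    "Route 2": ["Viridian City", "Viridian Forest"],
}
```

With this graph, `bfs_path(GRAPH, "Pallet Town", "Viridian Forest")` walks through Route 1, Viridian City, and Route 2, while `frontier` reports the maps that have been seen as exits but never entered.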

Game Interface

Talks to PokemonOpenClaude via REST API
Full Gen 1 RAM state: party, bag, battle, dialog, map, badges, Pokédex
Frame-accurate button input (respects Game Boy timing — no rapid-fire glitches)
Screenshot capture for the viewer and for vision-model analysis
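
For orientation, a small sketch of how a client might build requests against the three endpoints shown in the architecture diagram (`GET /state`, `POST /action`, `GET /screenshot`). Only the paths and the default port come from this README; the JSON body shape `{"buttons": [...]}` and the helper names are assumptions, so check the game server's own API before relying on them.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8765"  # default game-server port from this README

def state_request(base_url=BASE_URL):
    """Build the GET /state request."""
    return Request(f"{base_url}/state", method="GET")

def screenshot_request(base_url=BASE_URL):
    """Build the GET /screenshot request."""
    return Request(f"{base_url}/screenshot", method="GET")

def action_request(buttons, base_url=BASE_URL):
    """Build the POST /action request.

    The payload shape is hypothetical; the real server may expect
    different field names.
    """
    body = json.dumps({"buttons": list(buttons)}).encode("utf-8")
    return Request(
        f"{base_url}/action",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the game server running, each request can be sent with `urllib.request.urlopen(request)`; building the request separately keeps the sketch testable without a live emulator.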

LLM Integration

Per-role models — exploration, battle, strategy, and memory each pick their own backend
Anthropic, OpenRouter, and local LLM support out of the box
Circuit breaker with exponential backoff retry
Token and cost tracking — know exactly how much each gym leader cost in API spend

Live Viewer

React + TypeScript + Tailwind v4
Live game screen with pixel-perfect rendering
Streaming AI reasoning panel
Team display with Pokémon sprites and HP bars
Badge timeline, objectives, inventory, milestones
WebSocket with auto-reconnection

🚀 Quick Start

Prerequisites

Python 3.11+
Node.js 18+ (for the local viewer)
A Pokémon Blue ROM — Pokemon - Blue Version (USA, Europe).gb (legally dump your own copy)
An Anthropic API key (or any OpenAI-compatible endpoint)

1. Start the game server

# Clone and install PokemonOpenClaude
git clone https://github.com/NousResearch/pokemon-agent
cd pokemon-agent
pip install -e ".[all]"

# Start it with your ROM
pokemon-agent serve --rom "path/to/Pokemon - Blue Version (USA, Europe).gb" --port 8765

2. Configure Pokémon Opus

git clone https://github.com/OpusPlays/Pokemon-Opus
cd Pokemon-Opus

# Create .env from the example
cp .env.example .env
# Edit .env and add your ANTHROPIC_API_KEY

# Install Python dependencies
pip install -e .

3. Run the agent

python -m pokemon_opus.main

The agent starts polling the game server and making decisions. You'll see logs scrolling in your terminal — the model's current goal, what it sees, what it chose to do, and why.

4. (Optional) Start the local viewer

cd viewer
npm install
npm run dev
# Open http://localhost:5173

For the public-facing dashboard, see opusplays.com.

⚙️ Configuration

Game and model settings live in pyproject.toml under [tool.pokemon-opus]; API keys and endpoint overrides go in .env (see Environment variables below):

[tool.pokemon-opus.game]
server_url = "http://localhost:8765"
max_turns_per_episode = 10000
save_interval = 50

[tool.pokemon-opus.llm]
client_base_url = "https://api.anthropic.com/v1"
agent_model      = "claude-opus-4-20250514"
battle_model     = "claude-opus-4-20250514"   # can swap for a faster local model
strategist_model = "claude-opus-4-20250514"
memory_model     = "claude-opus-4-20250514"

[tool.pokemon-opus.llm.battle_sampling]
temperature = 0.3
max_tokens  = 2048

Per-role model configuration

Role	Purpose	Recommended
`agent`	Overworld exploration decisions	Opus (needs reasoning)
`battle`	Move selection, switching, item use	Opus, or a fast local model
`strategist`	Long-term planning, objective updates	Opus
`memory`	Memory synthesis between turns	Opus or Sonnet

Each role can have its own base_url, model, and sampling parameters — mix and match providers freely.

Environment variables

See .env.example:

ANTHROPIC_API_KEY=sk-ant-...                # default provider
OPENROUTER_API_KEY=sk-or-...                # multi-model support
LOCAL_LLM_BASE_URL=http://localhost:8082/v1 # for local fast tactical models
GAME_SERVER_URL=http://localhost:8765       # PokemonOpenClaude

🔁 How a Turn Works

Each turn of the agent follows an 11-phase loop:

Read state — GET /state from the emulator's RAM
Detect mode — battle? dialog? menu? intro? → fall back to explore
Route to agent — pass to the mode-specific sub-agent
Execute actions — POST /action with the chosen button presses
Read post-state — capture what changed
Compute deltas — location, badges, party, items, battle results
Record history — append to the action log with the model's reasoning
Track milestones — badges, catches, level-ups, evolutions
Memory synthesis — create or update location/trainer/strategy memories
Map update — record visits, connections, and warps
Stream to viewer — broadcast new state + screenshot via WebSocket
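
The first seven phases above can be sketched as a single function. The state keys (`in_battle`, `location`, and so on), the agent call signature, and the log record shape are assumptions made for illustration; phases 8 to 11 (milestones, memory synthesis, map update, streaming) are omitted to keep the sketch short.

```python
MODE_PRIORITY = ("battle", "dialog", "menu", "intro")

def detect_mode(state):
    """Return the first active mode flag, falling back to explore."""
    for mode in MODE_PRIORITY:
        if state.get(f"in_{mode}"):   # hypothetical flag names
            return mode
    return "explore"

def compute_deltas(before, after, keys=("location", "badges", "party", "items")):
    """Map each changed key to its (old, new) pair."""
    return {
        key: (before.get(key), after.get(key))
        for key in keys
        if before.get(key) != after.get(key)
    }

def run_turn(read_state, send_action, agents, log):
    state = read_state()                      # 1. read state
    mode = detect_mode(state)                 # 2. detect mode
    buttons, reasoning = agents[mode](state)  # 3. route to agent
    send_action(buttons)                      # 4. execute actions
    after = read_state()                      # 5. read post-state
    deltas = compute_deltas(state, after)     # 6. compute deltas
    log.append({                              # 7. record history
        "mode": mode,
        "buttons": buttons,
        "reasoning": reasoning,
        "deltas": deltas,
    })
    return deltas
```

Here `read_state` and `send_action` stand in for the `GET /state` and `POST /action` calls, so the loop can be exercised with plain functions instead of a running emulator.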

Memory Categories

The memory manager classifies what the agent learns into typed buckets so older info can be retrieved by relevance, not just recency:

Category	Persistence	Example
`ROUTE`	Core	"Route 3 connects Pewter City to Mt. Moon"
`TRAINER`	Permanent	"Bug Catcher on Route 3 has Caterpie Lv9"
`ITEM`	Permanent	"Found Potion at Viridian Forest (12, 8)"
`POKEMON`	Permanent	"Pikachu spawns in Viridian Forest"
`BATTLE`	Permanent	"Brock's Onix is Lv14, Water Gun was super effective"
`STRATEGY`	Permanent	"Need Lv16 minimum before challenging Misty"
`LANDMARK`	Core	"Pokémon Center in Cerulean City at map ID 3"
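
A minimal sketch of typed memories with relevance-based retrieval follows. The category names come from the table above; the `Memory` dataclass and the keyword-overlap scoring are stand-ins chosen for illustration, since this README does not describe how the memory manager actually ranks entries.

```python
from dataclasses import dataclass
from enum import Enum

class Category(Enum):
    ROUTE = "ROUTE"
    TRAINER = "TRAINER"
    ITEM = "ITEM"
    POKEMON = "POKEMON"
    BATTLE = "BATTLE"
    STRATEGY = "STRATEGY"
    LANDMARK = "LANDMARK"

@dataclass
class Memory:
    category: Category
    text: str

def retrieve(memories, query, limit=3):
    """Rank memories by shared words with the query (assumed heuristic).

    Ties keep insertion order, and entries with no overlap are dropped,
    so an old but relevant memory outranks a recent unrelated one.
    """
    query_words = set(query.lower().split())
    scored = []
    for index, memory in enumerate(memories):
        words = set(memory.text.lower().replace(",", " ").split())
        score = len(query_words & words)
        if score:
            scored.append((score, index, memory))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [memory for _, _, memory in scored[:limit]]

MEMORIES = [
    Memory(Category.ROUTE, "Route 3 connects Pewter City to Mt. Moon"),
    Memory(Category.TRAINER, "Bug Catcher on Route 3 has Caterpie Lv9"),
    Memory(Category.POKEMON, "Pikachu spawns in Viridian Forest"),
    Memory(Category.STRATEGY, "Need Lv16 minimum before challenging Misty"),
]
```

Calling `retrieve(MEMORIES, "route 3 trainers")` returns the two Route 3 entries and skips the more recently stored Pikachu and Misty notes.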

Gen 1 Battle Intelligence

Full type effectiveness chart with Gen 1 quirks (no Dark/Steel types, Bug is super effective against Poison, Ghost moves have no effect on Psychic, which leaves Bug as Psychic's only weakness, etc.)
STAB (Same Type Attack Bonus) awareness baked into move scoring
Move type guessing from name keywords when the bag/move metadata is incomplete
Heuristic fast-path that skips the LLM entirely for clearly-winning matchups against trash-tier wild Pokémon — saves a lot of tokens
LLM fallback for any non-trivial trainer battle, switching decisions, or status interactions
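
The scoring ideas above can be illustrated with a partial chart. The multipliers listed are real Gen 1 values, but the chart is deliberately incomplete, and the move dictionary shape, the function names, and the level-gap threshold for the fast path are assumptions rather than the logic in `agents/battle.py`.

```python
# Partial Gen 1 chart: (attacking type, defending type) -> multiplier.
# Pairs not listed here are treated as neutral (1.0) in this sketch.
TYPE_CHART = {
    ("Water", "Rock"): 2.0,
    ("Water", "Ground"): 2.0,
    ("Water", "Fire"): 2.0,
    ("Water", "Grass"): 0.5,
    ("Electric", "Water"): 2.0,
    ("Electric", "Ground"): 0.0,
    ("Bug", "Poison"): 2.0,      # Gen 1 only
    ("Bug", "Psychic"): 2.0,
    ("Ghost", "Psychic"): 0.0,   # Gen 1 quirk: no effect
    ("Normal", "Rock"): 0.5,
    ("Normal", "Ghost"): 0.0,
}

def effectiveness(move_type, defender_types):
    """Multiply the chart entries for each of the defender's types."""
    multiplier = 1.0
    for defender_type in defender_types:
        multiplier *= TYPE_CHART.get((move_type, defender_type), 1.0)
    return multiplier

def score_move(move, attacker_types, defender_types):
    """Base power times STAB (1.5x) times type effectiveness."""
    stab = 1.5 if move["type"] in attacker_types else 1.0
    return move["power"] * stab * effectiveness(move["type"], defender_types)

def fast_path_move(moves, attacker_types, defender_types, is_wild, level_gap,
                   min_gap=10):
    """Pick a move without the LLM, or return None to defer to it.

    min_gap is a hypothetical threshold for a "clearly winning" matchup.
    """
    if not is_wild or level_gap < min_gap:
        return None
    best = max(moves, key=lambda m: score_move(m, attacker_types, defender_types))
    if score_move(best, attacker_types, defender_types) <= 0:
        return None
    return best

MOVES = [
    {"name": "Tackle", "type": "Normal", "power": 35},
    {"name": "Water Gun", "type": "Water", "power": 40},
]
```

Against a Rock/Ground defender, a Water-type attacker's Water Gun scores 40 × 1.5 × 4 = 240 while Tackle scores 17.5, which is the kind of gap a heuristic can resolve without a model call.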

📁 Project Structure

Pokemon-Opus/
├── pokemon_opus/
│   ├── main.py              # Entry point — spins up the orchestrator
│   ├── config.py            # Pydantic config from TOML + env
│   ├── game_client.py       # HTTP client for PokemonOpenClaude
│   ├── orchestrator.py      # Game-mode state machine + turn loop
│   ├── state.py             # GameState, Pokemon, Objective models
│   ├── agents/
│   │   ├── explore.py       # Overworld navigation
│   │   ├── battle.py        # Battle decisions (type-aware)
│   │   ├── menu.py          # Dialog / menu handling (mostly mechanical)
│   │   ├── intro.py         # Intro / starter pick / new game flow
│   │   └── strategist.py    # Long-term planning + objectives
│   ├── memory/
│   │   └── manager.py       # Dual-cache memory system
│   ├── objectives/
│   │   └── manager.py       # Objective lifecycle tracking
│   ├── map/
│   │   └── graph.py         # Room connectivity + BFS pathfinding
│   ├── context/
│   │   └── builder.py       # Per-mode prompt assembly
│   ├── llm/
│   │   └── client.py        # Multi-provider LLM client
│   ├── streaming/
│   │   └── server.py        # FastAPI + WebSocket for the viewer
│   └── data/
│       ├── type_chart.py    # Gen 1 type effectiveness (with all the quirks)
│       └── map_data.py      # Gym order, HMs, progression milestones
├── viewer/                  # React + TypeScript local frontend
│   └── src/
│       ├── components/      # GameScreen, TeamPanel, MapView, etc.
│       ├── hooks/           # useWebSocket
│       └── lib/             # Types, sprite URLs, colors
├── tests/                   # pytest suite
├── memories.md              # Long-form notes the agent has accumulated
├── pyproject.toml           # Config + dependencies
└── .env.example             # API keys template

🌐 Live Dashboard

The public-facing dashboard at opusplays.com is a Next.js site that polls the agent's streaming server and renders:

🟢 Live status — current location, current goal, playtime timer
🐾 Current team — sprites, levels, HP bars, statuses, known moves
🏆 Gym badges — 8-badge grid that lights up as they're earned
📖 Pokédex progress — 151-cell visual grid (caught / seen / unknown)
📰 Activity feed — the model's actions, thoughts, battles, milestones, and deaths in real time
🪜 Milestones timeline — every badge, evolution, region transition, and party wipe
❓ FAQ + run rules — what counts as fair play

It polls every 5 seconds and updates without a page refresh.

🗺️ Roadmap

Follow @OpusPlays for milestone announcements.

🤝 Contributing

Issues and PRs welcome — especially:

Better prompts for stuck situations (puzzle gyms, Team Rocket hideouts, the Safari Zone)
Move metadata improvements (more accurate damage estimates)
Viewer features — anything you'd want to see while watching the run
Bug reports with the exact game state where the agent got stuck (memories.md is dumped on every save)

If you want to run your own AI Pokémon agent off this codebase, please credit the upstream projects below.

🙏 Credits

PokemonOpenClaude by NousResearch — emulator + RAM reading
Zork-Opus — AI agent architecture patterns (memory, objectives, orchestration)
Archon by Cole Medin — infrastructure patterns for streaming and dashboards
Pokémon sprites from PokeAPI
Built with Claude Opus by Anthropic

📜 License

MIT — see LICENSES/ for the full texts of bundled dependencies.

Pokémon and all related media are © Nintendo / Game Freak / Creatures. This is an unofficial fan project; no game code or copyrighted ROM data is distributed.

Run by @OpusPlays · Watch live at opusplays.com
