Circus Maximus Arena Framework - An open-source adversarial arena framework for AI agents.
CMAX Arena lets developers create and run AI agents in deterministic, replayable sandbox "disciplines" (games/simulations). The framework is:
- Plugin-based: Add new arenas by implementing a `GameDefinition` interface
- Agent-agnostic: Agents can be rule-based, LLM-backed, multi-model, or remote-hosted
- Deterministic + Replayable: Every match produces an event log that can be replayed byte-for-byte
- Fair: Enforces per-turn timeouts and supports budget hooks for token/cost tracking
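
To make the determinism claim concrete, here is a minimal sketch of how a string seed can drive every random choice in a match so that two runs produce identical logs. It is not the framework's actual RNG: `hashSeed`, `createRng`, and `simulateLog` are hypothetical names, and the generator is a plain Park-Miller (MINSTD) one chosen only for brevity.

```typescript
// Hypothetical helpers for illustration; CMAX Arena's real RNG may differ.
const MODULUS = 2147483647; // 2^31 - 1, a prime

// Fold a string seed into an integer in [1, MODULUS - 1].
function hashSeed(seed: string): number {
  let h = 0;
  for (let i = 0; i < seed.length; i++) {
    h = (h * 31 + seed.charCodeAt(i)) % MODULUS;
  }
  return h === 0 ? 1 : h;
}

// Park-Miller (MINSTD) generator returning floats in [0, 1).
function createRng(seed: string): () => number {
  let state = hashSeed(seed);
  return () => {
    state = (state * 48271) % MODULUS;
    return (state - 1) / (MODULUS - 1);
  };
}

type MatchEvent = { turn: number; roll: number };

// Toy "match": the only randomness comes from the seeded RNG, and the
// events are serialized as JSONL (one JSON object per line).
function simulateLog(seed: string, turns: number): string {
  const rng = createRng(seed);
  const events: MatchEvent[] = [];
  for (let turn = 0; turn < turns; turn++) {
    events.push({ turn, roll: Math.floor(rng() * 6) + 1 });
  }
  return events.map((e) => JSON.stringify(e)).join("\n");
}

// Replaying with the same seed reproduces the log exactly.
console.log(simulateLog("demo", 5) === simulateLog("demo", 5)); // true
```

The point of the design is that no call to `Math.random()` or the system clock appears anywhere in the match logic, so the seed alone determines the log.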
```bash
# Clone the repository
git clone https://github.com/wakesync/cmax-arena.git
cd cmax-arena

# Install dependencies
pnpm install

# Build all packages
pnpm build

# Run tests
pnpm test
```

```bash
# Rock-Paper-Scissors with random agents
pnpm --filter @cmax/cli start -- run match --game rps --agents random,random --seed "demo" --rounds 100

# Kuhn Poker
pnpm --filter @cmax/cli start -- run match --game kuhn_poker --agents random,kuhn_rule --seed "demo"
```

```bash
# Verify a match log is deterministic
pnpm --filter @cmax/cli start -- replay --log ./logs/<match-id>.jsonl --verify
```

```bash
# Run multiple matches and compute Elo ratings
pnpm --filter @cmax/cli start -- run ladder --game kuhn_poker --agents random,kuhn_rule --matches 20 --seed "season-1"
```

```typescript
import { runMatch, runLadder } from "@cmax/core";
import { rps, kuhnPoker } from "@cmax/games";
// counterAgent is assumed to be exported from @cmax/agents alongside the other reference agents
import { randomAgent, kuhnRuleAgent, counterAgent } from "@cmax/agents";

// Single match
const report = await runMatch(kuhnPoker, [randomAgent, kuhnRuleAgent], {
  seed: "my-match",
});
console.log(`Winner: Player ${report.results.winner}`);

// Ladder tournament
const ladder = await runLadder(rps, [randomAgent, counterAgent], {
  matchesPerPair: 10,
});
console.log(`Leaderboard:`, ladder.leaderboard);
```

See the `examples/` directory for more usage patterns.
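
The ladder mode computes Elo ratings from match results. As a rough illustration of what such an update looks like, the sketch below applies the standard Elo formula after a single match. The K-factor of 32 and the 400-point scale are common defaults assumed here, not values confirmed for CMAX Arena, and `expectedScore` and `updateElo` are hypothetical names.

```typescript
// Assumed constant: K = 32 is a common default, not taken from the framework.
const K_FACTOR = 32;

// Probability-like expected score of player A against player B.
function expectedScore(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

// scoreA: 1 if A wins, 0.5 for a draw, 0 if A loses.
// Returns the updated ratings as [newRatingA, newRatingB].
function updateElo(ratingA: number, ratingB: number, scoreA: number): [number, number] {
  const delta = K_FACTOR * (scoreA - expectedScore(ratingA, ratingB));
  return [ratingA + delta, ratingB - delta];
}

const [winner, loser] = updateElo(1500, 1500, 1);
console.log(winner, loser); // 1516 1484
```

Because the two deltas are equal and opposite, the total rating in the pool stays constant from match to match.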
| Package | Description |
|---|---|
| `@cmax/core` | Framework core: game loop, events, RNG, ratings, types |
| `@cmax/games` | Built-in game disciplines (RPS, Kuhn Poker, Texas Hold'em) |
| `@cmax/agents` | Reference agents (random, rule-based, LLM-powered) |
| `@cmax/cli` | Command line interface |
| `@cmax/runner` | Match runner service for Supabase integration |
Implement the `GameDefinition` interface:

```typescript
import { GameDefinition, GameState, Action } from "@cmax/core";

export const myGame: GameDefinition = {
  id: "my_game",
  version: "1.0.0",
  numPlayers: 2,

  reset({ seed, config }) {
    // Initialize game state
  },

  observe({ state, playerId }) {
    // Return player's observation
  },

  legalActions({ state, playerId }) {
    // Return list of legal actions
  },

  step({ state, playerId, action, rng }) {
    // Apply action and return new state
  },

  isTerminal(state) {
    // Return true if game is over
  },

  getResults(state) {
    // Return final scores/winner
  },
};
```

See docs/game-interface.md for full documentation.
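
To show how those six methods fit together, here is a self-contained sketch of a tiny single-pile Nim game plus a loop that drives it to completion. The types are local stand-ins, not the real `@cmax/core` types, and the exact argument and return shapes are assumptions based only on the skeleton above. Nim needs no randomness, so the `rng` argument is omitted.

```typescript
// Local, simplified stand-in types (hypothetical; not the @cmax/core definitions).
type NimState = { pile: number; currentPlayer: number; lastMover: number | null };
type NimAction = 1 | 2 | 3;

const nim = {
  id: "nim_sketch",
  version: "0.0.1",
  numPlayers: 2,

  // Start with a configurable pile; player 0 moves first.
  reset({ config }: { seed: string; config?: { pile?: number } }): NimState {
    return { pile: config?.pile ?? 7, currentPlayer: 0, lastMover: null };
  },

  // Nim has no hidden information, so every player sees the pile size.
  observe({ state }: { state: NimState; playerId: number }) {
    return { pile: state.pile };
  },

  // Only the player to move may act, taking 1-3 tokens (never more than remain).
  legalActions({ state, playerId }: { state: NimState; playerId: number }): NimAction[] {
    if (playerId !== state.currentPlayer) return [];
    return ([1, 2, 3] as NimAction[]).filter((n) => n <= state.pile);
  },

  // Return a new state instead of mutating the old one.
  step({ state, playerId, action }: { state: NimState; playerId: number; action: NimAction }): NimState {
    return { pile: state.pile - action, currentPlayer: 1 - playerId, lastMover: playerId };
  },

  isTerminal(state: NimState): boolean {
    return state.pile === 0;
  },

  // Whoever took the last token wins.
  getResults(state: NimState) {
    return { winner: state.lastMover };
  },
};

// Minimal driver: both players greedily take as many tokens as allowed.
let state = nim.reset({ seed: "demo", config: { pile: 7 } });
while (!nim.isTerminal(state)) {
  const playerId = state.currentPlayer;
  const actions = nim.legalActions({ state, playerId });
  const action = actions.reduce((best, n) => (n > best ? n : best));
  state = nim.step({ state, playerId, action });
}
console.log(nim.getResults(state)); // { winner: 0 }
```

Keeping `step` pure (old state in, new state out) is what makes it straightforward to record each transition in an event log and replay it later.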
Implement the `Agent` interface:

```typescript
import { Agent, DecideInput, DecideOutput } from "@cmax/core";

export const myAgent: Agent = {
  id: "my_agent",
  version: "1.0.0",
  displayName: "My Agent",
  kind: "local",

  async decide(input: DecideInput): Promise<DecideOutput> {
    // Choose an action from input.legalActions
    return { action: input.legalActions[0] };
  },
};
```

See docs/agent-interface.md for full documentation.
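
For a slightly less trivial agent than "pick the first legal action", the sketch below shows a rule-based Rock-Paper-Scissors agent that counters the opponent's most frequent move. The input shape (`SketchInput`, with an `opponentHistory` field) is a hypothetical stand-in, since the real `DecideInput` fields are not listed here; only the `decide` method returning `{ action }` follows the skeleton above.

```typescript
type Move = "rock" | "paper" | "scissors";

// For each move, the move that beats it.
const BEATS: Record<Move, Move> = { rock: "paper", paper: "scissors", scissors: "rock" };

// Hypothetical input shape; the real DecideInput may expose history differently.
interface SketchInput {
  legalActions: Move[];
  opponentHistory: Move[];
}

// Pure decision logic, kept separate from the agent object so it is easy to test.
function chooseCounter(input: SketchInput): Move {
  const counts: Record<Move, number> = { rock: 0, paper: 0, scissors: 0 };
  for (const move of input.opponentHistory) counts[move] += 1;

  // Most frequent opponent move; ties resolve in rock, paper, scissors order.
  let likely: Move = "rock";
  for (const move of ["paper", "scissors"] as Move[]) {
    if (counts[move] > counts[likely]) likely = move;
  }

  // Fall back to the first legal action if the counter is not allowed.
  const counter = BEATS[likely];
  return input.legalActions.indexOf(counter) !== -1 ? counter : input.legalActions[0];
}

const counterSketchAgent = {
  id: "counter_sketch",
  version: "0.0.1",
  displayName: "Counter (sketch)",
  kind: "local" as const,
  async decide(input: SketchInput): Promise<{ action: Move }> {
    return { action: chooseCounter(input) };
  },
};

const all: Move[] = ["rock", "paper", "scissors"];
console.log(chooseCounter({ legalActions: all, opponentHistory: ["rock", "rock", "paper"] })); // paper
```

An agent like this is deterministic given its input, so matches against it remain replayable from the seed.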
Use any LLM via OpenRouter to play games:

```typescript
import { runMatch } from "@cmax/core";
import { kuhnPoker } from "@cmax/games";
import { createOpenRouterAgent, createClaudeAgent, randomAgent } from "@cmax/agents";

// Create an LLM agent
const claude = createClaudeAgent(process.env.OPENROUTER_API_KEY);

// Or use any OpenRouter model
const gpt4 = createOpenRouterAgent({
  apiKey: process.env.OPENROUTER_API_KEY,
  model: "openai/gpt-4-turbo",
  temperature: 0.2,
});

// Run a match
const report = await runMatch(kuhnPoker, [claude, randomAgent], {
  seed: "llm-test",
});
```

CLI usage:
```bash
# Use LLM agents with the llm: prefix
OPENROUTER_API_KEY=your_key pnpm --filter @cmax/cli start -- run match \
  --game kuhn_poker \
  --agents llm:anthropic/claude-3.5-sonnet,random \
  --seed "llm-match"
```

- Architecture - Package overview and data flow
- Game Interface - How to implement games
- Agent Interface - How to implement agents
- Event Log - JSONL schema and replay format
- Determinism - Seed, RNG, and replay verification
- Roadmap - Future plans
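Rock-Paper-Scissors (`rps`): A simple 2-player game for testing, with a configurable number of rounds.
Kuhn Poker (`kuhn_poker`): A classic 2-player poker variant used in game theory research. It uses a 3-card deck and a single betting round.
Texas Hold'em (`texas_holdem`): Full No-Limit Texas Hold'em poker with 2-6 players. Includes: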
- Standard 52-card deck with deterministic shuffle
- All betting rounds (preflop, flop, turn, river)
- Complete hand evaluation (high card to royal flush)
- Side pot support for all-in situations
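
Side pots are the least obvious item in that list, so here is a minimal sketch of one common way to compute them from each player's total contribution to the hand. It is an illustration of the general technique, not the framework's implementation; `Pot` and `buildPots` are hypothetical names, and uncalled bets are assumed to have been returned before this runs.

```typescript
interface Pot {
  amount: number;     // chips in this pot
  eligible: number[]; // indices of players who can win it
}

// contributions[i]: total chips player i put in during the hand.
// folded[i]: true if player i folded (their chips stay in, but they cannot win).
function buildPots(contributions: number[], folded: boolean[]): Pot[] {
  // Distinct positive contribution levels, ascending.
  const levels = contributions
    .filter((c) => c > 0)
    .sort((a, b) => a - b)
    .filter((c, i, arr) => i === 0 || c !== arr[i - 1]);

  const pots: Pot[] = [];
  let previous = 0;
  for (const level of levels) {
    // Everyone who put in at least `level` pays the slice between levels.
    const contributors = contributions.filter((c) => c >= level).length;
    const amount = (level - previous) * contributors;
    const eligible: number[] = [];
    contributions.forEach((c, i) => {
      if (c >= level && !folded[i]) eligible.push(i);
    });

    // Eligible sets only shrink as the level rises, so equal length means
    // the same players: merge this slice into the previous pot.
    const last = pots[pots.length - 1];
    if (last && last.eligible.length === eligible.length) {
      last.amount += amount;
    } else {
      pots.push({ amount, eligible });
    }
    previous = level;
  }
  return pots;
}

// Player 0 is all-in for 50; players 1 and 2 each put in 100.
console.log(buildPots([50, 100, 100], [false, false, false]));
// main pot: 150 chips, players 0, 1, 2; side pot: 100 chips, players 1, 2
```

Each pot is then awarded independently to the best hand among its eligible players, which is why a short-stacked all-in player can win the main pot but never a side pot.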
```bash
# Run a Texas Hold'em match
pnpm --filter @cmax/cli start -- run match --game texas_holdem --agents random,random --seed "poker-night"
```

This framework does not support real-money gambling. It is designed for AI research, competitions, and educational purposes only.
See CONTRIBUTING.md for guidelines.
MIT - see LICENSE