| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +**Browser Operator** is an AI-native browser built on the Chrome DevTools frontend. It adds a multi-agent AI framework to the DevTools panel, enabling automation and web interaction through specialized AI agents. |
| 8 | + |
| 9 | +## Build & Development Commands |
| 10 | + |
| 11 | +### Initial Setup |
| 12 | + |
| 13 | +```bash |
| 14 | +# Prerequisites: depot_tools in PATH (https://chromium.googlesource.com/chromium/tools/depot_tools.git) |
| 15 | +gclient sync |
| 16 | +npm install |
| 17 | +cp .env.example .env # Configure API keys |
| 18 | +``` |
| 19 | + |
| 20 | +### Build |
| 21 | + |
| 22 | +```bash |
| 23 | +npm run build # Standard build (runs gn gen automatically) |
| 24 | +npm run build -- --watch # Watch mode for development |
| 25 | +npm run build -- -t Debug # Build to out/Debug instead of out/Default |
| 26 | + |
| 27 | +# Fast build (skip type checking and bundling) |
| 28 | +gn gen out/fast-build --args="devtools_skip_typecheck=true devtools_bundle=false" |
| 29 | +npm run build -- -t fast-build |
| 30 | +``` |
| 31 | + |
| 32 | +### Running DevTools with Custom Build |
| 33 | + |
| 34 | +```bash |
| 35 | +# Terminal 1: Build with watch |
| 36 | +npm run build -- --watch |
| 37 | + |
| 38 | +# Terminal 2: Serve the built files |
| 39 | +cd out/Default/gen/front_end && python3 -m http.server 9000 |
| 40 | + |
| 41 | +# Terminal 3: Launch Browser Operator with custom DevTools |
| 42 | +/Applications/Browser\ Operator.app/Contents/MacOS/Browser\ Operator \ |
| 43 | + --disable-infobars \ |
| 44 | + --custom-devtools-frontend=http://localhost:9000/ \ |
| 45 | + --remote-debugging-port=9222 |
| 46 | +``` |
| 47 | + |
| 48 | +### Testing |
| 49 | + |
| 50 | +```bash |
| 51 | +npm run test # Unit tests (Karma/Mocha) |
| 52 | +npm run webtest # E2E tests (Puppeteer) |
| 53 | +npm run debug-webtest -- --spec=path/to/test # Debug specific test |
| 54 | +npm run lint # ESLint |
| 55 | +``` |
| 56 | + |
| 57 | +### Eval Runner (Agent Testing) |
| 58 | + |
| 59 | +**Recommended: Use the eval-runner-analyst agent** to run evals and get detailed analysis: |
| 60 | + |
| 61 | +``` |
| 62 | +# In Claude Code, use the Task tool with eval-runner-analyst agent: |
| 63 | +"Run the action agent evals with cerebras gpt-oss-120b" |
| 64 | +"Test action-agent-checkbox-001 and action-agent-form-001" |
| 65 | +"Compare V0 and V1 action agents on iframe tests" |
| 66 | +``` |
| 67 | + |
| 68 | +The eval-runner-analyst agent handles the complete workflow: running tests, collecting results, and providing detailed analysis of pass/fail patterns. |
| 69 | + |
| 70 | +**Manual CLI usage** (if needed): |
| 71 | + |
| 72 | +The eval runner automatically loads environment variables from `.env` in the project root. |
| 73 | + |
| 74 | +```bash |
| 75 | +# Run agent evaluations (launches headless Chrome by default) |
| 76 | +npx tsx scripts/eval-runner/cli.ts --tool action_agent --verbose |
| 77 | +npx tsx scripts/eval-runner/cli.ts --test action-agent-click-001 --verbose |
| 78 | + |
| 79 | +# Use Cerebras for fast inference (preferred models: zai-glm-4.6, gpt-oss-120b) |
| 80 | +npx tsx scripts/eval-runner/cli.ts --provider cerebras --model zai-glm-4.6 --tool action_agent |
| 81 | +npx tsx scripts/eval-runner/cli.ts --provider cerebras --model gpt-oss-120b --tool action_agent |
| 82 | + |
| 83 | +# Run V0 agent variant |
| 84 | +npx tsx scripts/eval-runner/cli.ts --tool action_agent --tool-override action_agent_v0 --provider cerebras --model gpt-oss-120b |
| 85 | + |
| 86 | +# Connect to running Browser Operator (bypasses bot detection, uses authenticated sessions) |
| 87 | +npx tsx scripts/eval-runner/cli.ts --tool action_agent --remote-debugging-port 9222 --verbose |
| 88 | + |
| 89 | +# Run with visible browser |
| 90 | +npx tsx scripts/eval-runner/cli.ts --tool action_agent --no-headless |
| 91 | +``` |
| 92 | + |
| 93 | +**Note:** The LLM judge defaults to OpenAI (`gpt-4o`) regardless of agent provider. Override with `--judge-provider` and `--judge-model`. |
| 94 | + |
| 95 | +## Architecture |
| 96 | + |
| 97 | +### DevTools Module Hierarchy |
| 98 | + |
| 99 | +``` |
| 100 | +front_end/ |
| 101 | +├── core/ # Shared utilities, CDP backend integration |
| 102 | +├── models/ # Business logic, data handling |
| 103 | +├── panels/ # High-level panels (one per DevTools tab) |
| 104 | +├── ui/components/ # Reusable UI components |
| 105 | +└── entrypoints/ # Application entrypoints (devtools_app.ts) |
| 106 | +``` |
| 107 | + |
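| 108 | +Visibility rules (enforced by the GN build): `core/` → `models/` → `panels/` → `entrypoints/`. Dependencies flow left to right: a layer may import from layers to its left, not from layers to its right. |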
| 109 | + |
| 110 | +### AI Chat Panel (`front_end/panels/ai_chat/`) |
| 111 | + |
| 112 | +``` |
| 113 | +ai_chat/ |
| 114 | +├── agent_framework/ # Agent execution engine |
| 115 | +│ ├── AgentRunner.ts # LLM loop, tool execution, handoffs |
| 116 | +│ ├── ConfigurableAgentTool.ts # Agent definition via config objects |
| 117 | +│ └── implementation/ # Concrete agent configs (ActionAgent, etc.) |
| 118 | +├── LLM/ # Provider integrations |
| 119 | +│ ├── LLMClient.ts # Client facade |
| 120 | +│ ├── LLMProviderRegistry.ts # Provider management |
| 121 | +│ └── *Provider.ts # OpenAI, Cerebras, Anthropic, Groq, etc. |
| 122 | +├── cdp/ # Chrome DevTools Protocol adapters |
| 123 | +│ ├── CDPSessionAdapter.ts # Abstract CDP interface |
| 124 | +│ ├── DirectCDPAdapter.ts # Direct CDP connection (eval runner) |
| 125 | +│ └── SDKTargetAdapter.ts # DevTools SDK integration |
| 126 | +├── tools/ # Agent tools (~30 tools for browser actions) |
| 127 | +├── dom/ # Element resolution (shadow DOM, iframes) |
| 128 | +├── common/ # Shared utilities (geometry, mouse, xpath) |
| 129 | +├── core/ # Orchestration, LLMConfigurationManager |
| 130 | +├── evaluation/ # Test case definitions |
| 131 | +└── ui/ # Chat panel UI components |
| 132 | +``` |
| 133 | + |
| 134 | +### Key Concepts |
| 135 | + |
| 136 | +**Agent Framework** |
| 137 | +- `ConfigurableAgentTool`: Agents defined via config (name, prompt, tools, schema, handoffs) |
| 138 | +- `AgentRunner`: Executes agent loop - LLM calls, tool execution, agent handoffs |
| 139 | +- `ToolRegistry`: Central registry for tools/agents (`ToolRegistry.registerToolFactory()`) |
| 140 | +- Handoffs: Agents transfer to specialists via LLM tool calls or when max iterations is reached |
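The handoff rule above can be sketched as a small decision function. This is a minimal illustration, not the `AgentRunner` implementation: `pickHandoff`, `LoopState`, and the `'max_iterations'` trigger value are hypothetical names (only `'llm_tool_call'` and `targetAgentName` appear in this file).

```typescript
// 'llm_tool_call' appears in this file; 'max_iterations' is an assumed name.
type HandoffTrigger = 'llm_tool_call' | 'max_iterations';

interface HandoffConfig {
  targetAgentName: string;
  trigger: HandoffTrigger;
}

// Hypothetical snapshot of the agent loop at the end of one iteration.
interface LoopState {
  iteration: number;
  maxIterations: number;
  requestedAgent?: string; // set when the LLM asked for a handoff via a tool call
}

// Returns the agent to hand off to, or null to keep running the current agent.
function pickHandoff(handoffs: HandoffConfig[], state: LoopState): string | null {
  // An explicit LLM request takes priority, if it matches a configured handoff.
  if (state.requestedAgent !== undefined) {
    const match = handoffs.find(
      h => h.trigger === 'llm_tool_call' && h.targetAgentName === state.requestedAgent,
    );
    if (match) {
      return match.targetAgentName;
    }
  }
  // Otherwise fall back to a max-iterations handoff once the budget is spent.
  if (state.iteration >= state.maxIterations) {
    const fallback = handoffs.find(h => h.trigger === 'max_iterations');
    if (fallback) {
      return fallback.targetAgentName;
    }
  }
  return null;
}
```

Checking the explicit request first is a design assumption of this sketch; the real runner may order or combine the triggers differently.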
| 141 | + |
| 142 | +**CDP Adapters** - Abstraction layer for Chrome DevTools Protocol: |
| 143 | +- `SDKTargetAdapter`: Used when running inside DevTools (has SDK access) |
| 144 | +- `DirectCDPAdapter`: Used by eval runner (connects via chrome-remote-interface) |
| 145 | +- Both implement `CDPSessionAdapter` interface with `getAgent(domain)` method |
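To show why a shared adapter interface is useful, here is a minimal sketch of code written against `getAgent(domain)` only. The shapes are assumptions: `SketchCDPSessionAdapter`, `DomainAgent`, its `invoke()` method, and `RecordingAdapter` are hypothetical stand-ins, and the real `CDPSessionAdapter` may look different.

```typescript
// Hypothetical agent shape; the real per-domain agents may expose typed methods instead.
type DomainAgent = {
  invoke(method: string, params?: object): Promise<unknown>;
};

// Hypothetical stand-in for the CDPSessionAdapter interface described above.
interface SketchCDPSessionAdapter {
  getAgent(domain: string): DomainAgent;
}

// Fake adapter for illustration: records calls instead of talking to a browser.
class RecordingAdapter implements SketchCDPSessionAdapter {
  readonly calls: string[] = [];

  getAgent(domain: string): DomainAgent {
    return {
      invoke: async (method) => {
        this.calls.push(`${domain}.${method}`);
        return {};
      },
    };
  }
}

// Code that depends only on the interface runs unchanged with any adapter.
async function enableDomains(adapter: SketchCDPSessionAdapter): Promise<void> {
  await adapter.getAgent('DOM').invoke('enable');
  await adapter.getAgent('Accessibility').invoke('enable');
}
```

A function like `enableDomains` would not need to know whether it runs inside DevTools or under the eval runner, which is the point of the abstraction.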
| 146 | + |
| 147 | +**LLM Configuration** (via `LLMConfigurationManager`): |
| 148 | +- 3-tier models: Main (powerful), Mini (fast), Nano (simple tasks) |
| 149 | +- Override system: Per-request overrides for eval without affecting localStorage |
| 150 | +- Providers: openai, cerebras, anthropic, groq, openrouter, litellm |
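The override idea can be sketched as a pure merge: a per-request override wins over the persisted configuration, and the persisted object is left untouched. This is an illustration under assumptions, not the `LLMConfigurationManager` API; `resolveConfig`, `LLMConfigSketch`, and `LLMOverrideSketch` are hypothetical names.

```typescript
type ModelTier = 'main' | 'mini' | 'nano';

// Hypothetical persisted configuration (for example, what localStorage would hold).
interface LLMConfigSketch {
  provider: string;
  models: Record<ModelTier, string>;
}

// Hypothetical per-request override: any subset of the configuration.
interface LLMOverrideSketch {
  provider?: string;
  models?: Partial<Record<ModelTier, string>>;
}

// Builds a new object; the persisted configuration is never mutated.
function resolveConfig(
  persisted: LLMConfigSketch,
  override?: LLMOverrideSketch,
): LLMConfigSketch {
  return {
    provider: override?.provider ?? persisted.provider,
    models: { ...persisted.models, ...(override?.models ?? {}) },
  };
}
```

An eval run could then pass `{ provider: 'cerebras', models: { main: 'gpt-oss-120b' } }` for a single request while the stored Mini and Nano choices stay in effect.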
| 151 | + |
| 152 | +### Adding a New Agent |
| 153 | + |
| 154 | +```typescript |
| 155 | +// In implementation/ConfiguredAgents.ts |
| 156 | +function createMyAgentConfig(): AgentToolConfig { |
| 157 | + return { |
| 158 | + name: 'my_agent', |
| 159 | + description: 'What this agent does', |
| 160 | + systemPrompt: 'Instructions for agent behavior', |
| 161 | + tools: ['navigate_url', 'perform_action'], // Registered tool names |
| 162 | + schema: { /* JSON schema for input */ }, |
| 163 | + handoffs: [{ targetAgentName: 'specialist_agent', trigger: 'llm_tool_call' }], |
| 164 | + maxIterations: 10, |
| 165 | + }; |
| 166 | +} |
| 167 | + |
| 168 | +// Register in initializeConfiguredAgents() |
| 169 | +const myAgent = new ConfigurableAgentTool(createMyAgentConfig()); |
| 170 | +ToolRegistry.registerToolFactory('my_agent', () => myAgent); |
| 171 | +``` |
| 172 | + |
| 173 | +### Adding a New Tool |
| 174 | + |
| 175 | +Tools implement the `Tool` interface with `name`, `description`, `schema`, and `execute()`. Register via `ToolRegistry.registerToolFactory()`. |
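A minimal sketch of what a tool and its registration might look like. The `Tool` interface and `SketchToolRegistry` below are local, hypothetical stand-ins written only from the description above; the repository's real `Tool` interface and `ToolRegistry` may have additional members or different signatures, and `word_count` is an invented example tool.

```typescript
// Hypothetical stand-in for the repository's Tool interface.
interface Tool<Args, Result> {
  name: string;
  description: string;
  schema: {
    type: 'object';
    properties: Record<string, unknown>;
    required?: string[];
  };
  execute(args: Args): Promise<Result>;
}

// Hypothetical minimal registry mirroring the registerToolFactory() call shape.
class SketchToolRegistry {
  private static factories = new Map<string, () => Tool<any, any>>();

  static registerToolFactory(name: string, factory: () => Tool<any, any>): void {
    this.factories.set(name, factory);
  }

  static getToolInstance(name: string): Tool<any, any> | null {
    const factory = this.factories.get(name);
    return factory ? factory() : null;
  }
}

// Illustrative tool: counts whitespace-separated words in a string.
class WordCountTool implements Tool<{ text: string }, { words: number }> {
  name = 'word_count';
  description = 'Counts the words in a piece of text';
  schema = {
    type: 'object' as const,
    properties: { text: { type: 'string' } },
    required: ['text'],
  };

  async execute(args: { text: string }): Promise<{ words: number }> {
    const trimmed = args.text.trim();
    return { words: trimmed === '' ? 0 : trimmed.split(/\s+/).length };
  }
}

SketchToolRegistry.registerToolFactory('word_count', () => new WordCountTool());
```

Registering a factory rather than an instance lets the registry create the tool lazily; whether the real registry caches instances is not stated in this file.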
| 176 | + |
| 177 | +### Eval Runner Architecture |
| 178 | + |
| 179 | +``` |
| 180 | +scripts/eval-runner/ |
| 181 | +├── cli.ts # CLI entry point |
| 182 | +├── TestRunner.ts # Test orchestration |
| 183 | +├── BrowserExecutor.ts # Puppeteer/CDP automation |
| 184 | +├── AgentBridge.ts # Connects runner to agent tools |
| 185 | +├── LLMJudge.ts # LLM-based evaluation scoring |
| 186 | +└── reporters/ # Console, JSON, Markdown output |
| 187 | +``` |
| 188 | + |
| 189 | +Test cases defined in `front_end/panels/ai_chat/evaluation/test-cases/`. |
| 190 | + |
| 191 | +## Environment Variables |
| 192 | + |
| 193 | +```bash |
| 194 | +OPENAI_API_KEY=... # OpenAI |
| 195 | +CEREBRAS_API_KEY=... # Cerebras (fast inference) |
| 196 | +ANTHROPIC_API_KEY=... # Anthropic |
| 197 | +BRAINTRUST_API_KEY=... # Experiment tracking (optional) |
| 198 | +``` |
| 199 | + |
| 200 | +## Key Patterns |
| 201 | + |
| 202 | +- **Lazy loading**: Features dynamically imported via `*-meta.ts` files |
| 203 | +- **GN build system**: Visibility rules enforce module boundaries; edit BUILD.gn when adding files |
| 204 | +- **EventBus**: Uses `Common.ObjectWrapper.ObjectWrapper` for DevTools-compatible events |
| 205 | +- **Shadow DOM/iframe support**: `EnhancedElementResolver` and `buildBackendIdMaps()` handle composed trees |
| 206 | +- **Node ID mapping**: Accessibility tree `nodeId` differs from DOM `backendDOMNodeId`; use mapping utilities |
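The node ID distinction can be illustrated with a small lookup table. In the Chrome DevTools Protocol, each accessibility node carries its own `nodeId` and, when it is backed by a DOM node, a `backendDOMNodeId`. The helper below is a hypothetical sketch of such a mapping, not the repository's `buildBackendIdMaps()`.

```typescript
// Subset of the CDP accessibility node shape that matters for the mapping.
interface AXNodeLike {
  nodeId: string;            // accessibility-tree identifier
  backendDOMNodeId?: number; // DOM backend identifier; absent for some nodes
}

// Hypothetical helper: maps accessibility nodeId to backendDOMNodeId,
// skipping accessibility nodes that have no DOM counterpart.
function buildAxToBackendMap(nodes: AXNodeLike[]): Map<string, number> {
  const map = new Map<string, number>();
  for (const node of nodes) {
    if (node.backendDOMNodeId !== undefined) {
      map.set(node.nodeId, node.backendDOMNodeId);
    }
  }
  return map;
}
```

A tool that receives an accessibility `nodeId` from the LLM would look it up in such a map before issuing DOM commands, instead of passing the accessibility ID straight through.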