Independent experiment. Not affiliated with, endorsed by, or approved by the T3Code team or Ping.gg. One developer's architectural exploration — nothing more.
An OTP supervision layer for multi-provider AI coding agents. Four providers, real responses, real tool execution — managed through a single supervised runtime instead of per-provider adapter discipline.
Built as a fork of T3Code. For the full architectural case: Why T3Code Might Want Two Runtimes.
```
Browser ──── Node Server ──────────────── T3Otp-Engine (Elixir)
                  │                               │
                  │◄───── Phoenix Channel WS ────►│
                  │      (single connection)      │
                  │                               ├── CodexSession (stdio JSON-RPC)
                  ├── ClaudeAdapter (Agent SDK)   ├── CursorSession (stdio stream-json)
                  │                               └── OpenCodeSession (HTTP + SSE)
                  │                               │
                  │       normalized events       │
                  │◄──────────────────────────────│
```
Note: Claude uses the Anthropic Agent SDK directly in Node (same as upstream T3Code). The remaining three providers — Codex, Cursor, OpenCode — are managed by the Elixir harness.
Once you manage several concurrent agent sessions — each with its own protocol, failure modes, and lifecycle — the job stops looking like "serve some HTTP" and starts looking like "supervise a tree of unstable runtimes." OTP was designed for exactly this class of problem.
The engine owns supervision, crash isolation, and per-session memory containment. Node keeps SDK access, TypeScript contracts, desktop integration, and the orchestration layer. Both runtimes share SQLite for durability.
| Concern | Owner |
|---|---|
| Provider process lifecycle | Elixir |
| Crash isolation + supervision | Elixir |
| Per-session memory containment | Elixir |
| Session + event durability | Both (SQLite) |
| Pending request crash recovery | Elixir |
| Canonical event mapping (TS types) | Node |
| Claude Agent SDK | Node |
| Browser/Electron WebSocket | Node |
| Desktop + product integration | Node |
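The "canonical event mapping" row is the seam between the two runtimes: every provider's wire format is collapsed into one event shape before the browser sees it. A minimal sketch of what such a discriminated union and mapper could look like (all type, field, and method names here are illustrative, not the actual contracts in `packages/contracts`):

```typescript
// Hypothetical canonical event union. Field names are illustrative,
// not the real types in packages/contracts.
type HarnessEvent =
  | { kind: "text-delta"; sessionId: string; text: string }
  | { kind: "tool-start"; sessionId: string; toolName: string; input: unknown }
  | { kind: "tool-complete"; sessionId: string; toolName: string; output: unknown }
  | { kind: "session-ended"; sessionId: string; reason: "done" | "crash" };

// One mapper per provider collapses its wire format into this shape.
// "agent_message_delta" is an assumed method name for illustration.
function mapCodexLine(sessionId: string, line: string): HarnessEvent | null {
  const msg = JSON.parse(line); // stdio JSON-RPC: one JSON object per line
  if (msg.method === "agent_message_delta") {
    return { kind: "text-delta", sessionId, text: msg.params?.delta ?? "" };
  }
  return null; // frames this sketch doesn't model are dropped
}
```

Keeping the union closed means the browser and the ingestion layer can exhaustively switch on `kind` regardless of which runtime produced the event.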
Prerequisites: Pixi (manages the Erlang toolchain), Elixir 1.19+, Node.js 20+, bun, and at least one provider CLI (`codex`, `claude`, `cursor`, `opencode`).
```sh
git clone https://github.com/Ranvier-Technologies/t3code-OTP.git t3otp-engine
cd t3otp-engine
pixi run setup   # Install all dependencies (Node + Hex)
pixi run dev     # Start Elixir harness + Node server + Vite
```

Open the browser, switch providers, and send a prompt.
```sh
pixi run harness      # Elixir harness only (port 4321)
pixi run dev-server   # Node server only
pixi run dev-web      # Vite dev server only
bun run scripts/harness-dry-run.ts   # Connect over WS, verify all channel commands
pixi run test-elixir  # ExUnit tests
pixi run credo        # Elixir linter
```

All four providers verified end-to-end in the browser with real prompts and real tool execution:
| Provider | Runtime | Transport | Tool Use |
|---|---|---|---|
| Claude | Node | Agent SDK (direct) | file write via bypassPermissions |
| Codex | Elixir | stdio JSON-RPC via Erlang Port | commandExecution — file created |
| Cursor | Elixir | stdio stream-json via Erlang Port | file write via --yolo |
| OpenCode | Elixir | HTTP + SSE via raw TCP + Req | file_change write via permission reply |
Full feature matrix including session lifecycle, approval requests, streaming tool output, and thread persistence: see the companion writeup.
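Two of the harness-routed providers above speak newline-delimited JSON over stdio. The recurring hazard there is that pipe reads do not align with message boundaries, so a partial trailing line must be buffered across reads. A sketch of that framing concern (written in TypeScript for illustration; in the fork this is handled on the Elixir side via Erlang Ports):

```typescript
// Sketch: newline-delimited JSON framing over a stdio stream.
// Illustrative only; the fork's Codex/Cursor sessions do this in Elixir.
class LineFramer {
  private buf = "";

  // Feed a raw chunk; returns every complete JSON message received so far.
  push(chunk: string): unknown[] {
    this.buf += chunk;
    const lines = this.buf.split("\n");
    this.buf = lines.pop() ?? ""; // retain the trailing partial line
    return lines.filter((l) => l.trim() !== "").map((l) => JSON.parse(l));
  }
}
```

A message split across two reads only parses once its closing newline arrives, which is exactly the property a per-session GenServer wrapping a Port has to preserve.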
Eight test types compare Node/V8 (a shared heap) with Elixir/OTP (per-process heaps). Single-run results are framed as architectural demonstrations, not statistical claims. The structural properties (per-process heaps, supervision cleanup) are BEAM guarantees, not observations.
| Metric | Node | Elixir |
|---|---|---|
| Memory with 1 leaky session | Shared heap → 2,800% of baseline (158 MB) | BEAM total → 102% of baseline |
| Event loop lag during leak | p99 = 169ms | N/A (no shared event loop) |
| Sibling sessions affected | All degraded | Zero |
| Latency at 200 concurrent sessions | 3,314ms p99 | 607ms p99 |
| Throughput at 200 sessions | 5,779 ev/s (event loop saturated) | 18,000 ev/s |
| Per-session memory (5→200 sessions) | Unmeasurable (shared heap) | Constant ~268KB |
| Crash lag spike (up to 20 simultaneous) | 1.4–2.7ms | 0.0ms at every count |
Codex gpt-5.4 in plan mode, 10 subagents across 2 sessions: Node oscillated 1.8–54 MB. Elixir held 54–63 MB flat while processing 14,918 events and 824 tool calls.
Honest scorecard: Elixir wins on isolation, observability, and per-session memory attribution. Node wins on raw throughput and SDK ecosystem access. Hence: two runtimes.
Full analysis with methodology, caveats, and adversarial review: output/stress-test/analysis.md.
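The "event loop lag" rows are typically produced by a timer-based probe: schedule a tick at a fixed interval and record how late each one fires, then summarize with a percentile. A minimal sketch of both pieces, under the assumption that the repo's scripts do something similar (they may differ):

```typescript
// Illustrative event-loop lag probe; not the repo's actual scripts.
// A saturated event loop delays timer callbacks, so lateness ≈ lag.
function measureLag(samples: number, intervalMs = 10): Promise<number[]> {
  return new Promise((resolve) => {
    const lags: number[] = [];
    let expected = Date.now() + intervalMs;
    const timer = setInterval(() => {
      lags.push(Math.max(0, Date.now() - expected)); // how late we fired
      expected = Date.now() + intervalMs;
      if (lags.length >= samples) {
        clearInterval(timer);
        resolve(lags);
      }
    }, intervalMs);
  });
}

// p99 over the collected samples (nearest-rank, floor-indexed).
function p99(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.min(sorted.length - 1, Math.floor(sorted.length * 0.99))];
}
```

On the BEAM there is no shared event loop to probe this way, which is why the Elixir column reports N/A for that metric.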
```sh
# Mock provider tests (no API keys needed)
bun run scripts/stress-test-runner.ts          # Scaling + latency
bun run scripts/stress-test-node.ts            # Same tests, Node baseline
bun run scripts/stress-test-memory-leak.ts     # Memory leak isolation
bun run scripts/stress-test-exception.ts       # Crash injection
bun run scripts/stress-test-scale50.ts         # 50-session scale
bun run scripts/stress-test-gc-lab-elixir.ts   # GC cross-contamination

# Real provider tests (require API keys)
bun run scripts/stress-test-real-workload.ts --runtime=elixir
bun run scripts/stress-test-real-subagent.ts --runtime=elixir

# Generate charts
python3 output/stress-test/viz.py
python3 output/stress-test/viz-real.py
```

| Module | Role |
|---|---|
| `SessionManager` | DynamicSupervisor routing, session lifecycle |
| `CodexSession` | Codex JSON-RPC GenServer |
| `CursorSession` | Cursor stream-json + tool mapping |
| `OpenCodeSession` | OpenCode HTTP+SSE + tool mapping |
| `ClaudeSession` | Claude CLI GenServer (stress tests only) |
| `MockSession` | Configurable mock for stress testing |
| `SnapshotServer` | In-memory event store + WAL replay + recovery |
| `Storage` | SQLite durability (sessions, events, pending reqs) |
| `Projector` | Pure event → snapshot projection |
| `ModelDiscovery` | CLI-based model listing with ETS cache |
| `HarnessChannel` | Phoenix Channel — single WS entry point |
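The `Projector` + `SnapshotServer` pairing implies a specific discipline: a snapshot is a pure left fold over the event log, so replaying the WAL after a crash reproduces the exact same state. A sketch of that idea (in TypeScript for illustration; the real `Projector` is Elixir, and these event and field names are invented for the example):

```typescript
// Sketch of pure event -> snapshot projection. Names are illustrative.
interface Snapshot {
  messages: string[];
  toolCalls: number;
  ended: boolean;
}

type SessionEvent =
  | { kind: "text"; text: string }
  | { kind: "tool-complete" }
  | { kind: "session-ended" };

const empty: Snapshot = { messages: [], toolCalls: 0, ended: false };

// Pure: no IO, no clock; same events in, same snapshot out.
function project(snap: Snapshot, ev: SessionEvent): Snapshot {
  switch (ev.kind) {
    case "text":
      return { ...snap, messages: [...snap.messages, ev.text] };
    case "tool-complete":
      return { ...snap, toolCalls: snap.toolCalls + 1 };
    case "session-ended":
      return { ...snap, ended: true };
  }
}

// Crash recovery is then just a replay of the durable log:
const replay = (events: SessionEvent[]) => events.reduce(project, empty);
```

Because `project` is deterministic, the in-memory store and the SQLite log can never disagree after a restart: the snapshot is always derivable from the events.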
| File | Role |
|---|---|
| `HarnessClientAdapter.ts` | Single adapter: all 13 `ProviderAdapterShape` methods |
| `HarnessClientManager.ts` | Phoenix Channel WS client with reconnection |
| `codexEventMapping.ts` | Canonical event mapping (shared with the existing CodexAdapter) |
| `ClaudeAdapter.ts` | Claude Agent SDK adapter (Node-native, not via harness) |
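Since the single WebSocket is the only path between the runtimes, the reconnection policy in `HarnessClientManager.ts` matters more than usual. A sketch of the standard shape of such a policy: exponential delay with a cap, reset once a join succeeds (the constants and class name here are illustrative, not the fork's actual tuning):

```typescript
// Illustrative reconnect-backoff policy for a channel client.
// Constants are assumptions, not the fork's real values.
class Backoff {
  private attempt = 0;
  constructor(
    private baseMs = 250,
    private maxMs = 10_000,
  ) {}

  // Delay before the next reconnect attempt: base * 2^attempt, capped.
  nextDelay(): number {
    const delay = Math.min(this.baseMs * 2 ** this.attempt, this.maxMs);
    this.attempt += 1;
    return delay;
  }

  reset(): void {
    this.attempt = 0; // call after the channel join succeeds
  }
}
```

A caller would schedule `setTimeout(connect, backoff.nextDelay())` on socket close and call `reset()` on a successful channel join, so a flapping harness is retried quickly at first but never hammered.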
- Create `apps/harness/lib/harness/providers/new_session.ex`
- Add a clause to `SessionManager.provider_module/1`
- Add the provider kind to `packages/contracts/src/orchestration.ts`
No new TypeScript adapter needed — HarnessClientAdapter handles all harness-routed providers generically.
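One way to picture that generic dispatch is a string-literal union of harness-routed kinds in the contracts package. The sketch below is purely illustrative; the actual shape of `orchestration.ts`, and the `"newprovider"` kind, are assumptions:

```typescript
// Hypothetical contracts-side union; extending it is the only
// TypeScript change a new harness-routed provider needs.
type HarnessProviderKind = "codex" | "cursor" | "opencode" | "newprovider";

const HARNESS_KINDS: readonly HarnessProviderKind[] =
  ["codex", "cursor", "opencode", "newprovider"];

// Type guard the adapter can use to decide whether a session
// is routed through the Elixir harness or handled Node-natively.
function isHarnessRouted(kind: string): kind is HarnessProviderKind {
  return (HARNESS_KINDS as readonly string[]).includes(kind);
}
```

Everything kind-specific (spawning, protocol, tool mapping) lives behind `SessionManager.provider_module/1` on the Elixir side, so the Node adapter stays provider-agnostic.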
- Sidebar stale state (KI-1): `ProviderRuntimeIngestion.ts` requires a thread to exist before processing events. Fix identified, not shipped. See known-issues.md.
- No streaming tool output for Cursor/OpenCode (KI-3): upstream protocol limitation — only tool start/complete events emitted.
- Cursor: no rollback API, no user-input question events (protocol limitations).
- Desktop packaging: BEAM runtime adds ~60–80 MB to Electron bundle.
- Single-run stress tests: architectural demonstrations, not statistically significant.
This is a fork of T3Code by Ping.gg. The Elixir harness is entirely additive — existing Node adapters (CodexAdapter, ClaudeAdapter) continue to work unchanged.
The T3Code core team has an open PR exploring the same space: #581: Centralize all harnesses with single websocket server. Both approaches work. The difference is in execution guarantees: behavioral isolation (Node) vs structural isolation (OTP).
This fork is offered as an architectural exploration, not a merge request.
- Why T3Code Might Want Two Runtimes — the companion writeup with interactive diagrams
- George Guimarães: "Your Agent Framework Is Just a Bad Clone of Elixir" — industry context on the BEAM/agent convergence
- codingsh: "Why Elixir is Perfect for AI Agents" — practical BEAM patterns for long-running agent nodes
- Full stress test analysis — methodology, caveats, adversarial review
Same as upstream T3Code. See LICENSE.
Bastian Venegas Arevalo (@ranvier2d2) · CTO, Ranvier Technologies