Independent experiment. Not affiliated with, endorsed by, or approved by the T3Code team or Ping.gg. One developer's architectural exploration — nothing more.
An OTP supervision layer for multi-provider AI coding agents. Four providers, real responses, real tool execution — managed through a single supervised runtime instead of per-provider adapter discipline.
Built as a fork of T3Code. For the full architectural case: Why T3Code Might Want Two Runtimes.
```
Browser ──── Node Server ──────────────── T3Otp-Engine (Elixir)
                  │                               │
                  │◄───── Phoenix Channel WS ────►│
                  │      (single connection)      │
                  │                               ├── CodexSession (stdio JSON-RPC)
                  ├── ClaudeAdapter (Agent SDK)   ├── CursorSession (stdio stream-json)
                  │                               └── OpenCodeSession (HTTP + SSE)
                  │                               │
                  │       normalized events       │
                  │◄──────────────────────────────│
```
Note: Claude uses the Anthropic Agent SDK directly in Node (same as upstream T3Code). The remaining three providers — Codex, Cursor, OpenCode — are managed by the Elixir harness.
Once you manage several concurrent agent sessions — each with its own protocol, failure modes, and lifecycle — the job stops looking like "serve some HTTP" and starts looking like "supervise a tree of unstable runtimes." OTP was designed for exactly this class of problem.
The engine owns supervision, crash isolation, and per-session memory containment. Node keeps SDK access, TypeScript contracts, desktop integration, and the orchestration layer. Both runtimes share SQLite for durability.
| Concern | Owner |
|---|---|
| Provider process lifecycle | Elixir |
| Crash isolation + supervision | Elixir |
| Per-session memory containment | Elixir |
| Session + event durability | Both (SQLite) |
| Pending request crash recovery | Elixir |
| Canonical event mapping (TS types) | Node |
| Claude Agent SDK | Node |
| Browser/Electron WebSocket | Node |
| Desktop + product integration | Node |
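The "canonical event mapping" row is the seam between the two runtimes: every provider's wire format is collapsed into one event shape before the browser sees it. A minimal sketch of what such a discriminated union and mapper could look like (all type, field, and method names here are illustrative, not the actual contracts in `packages/contracts`):

```typescript
// Hypothetical canonical event union. Field names are illustrative,
// not the real types in packages/contracts.
type HarnessEvent =
  | { kind: "text-delta"; sessionId: string; text: string }
  | { kind: "tool-start"; sessionId: string; toolName: string; input: unknown }
  | { kind: "tool-complete"; sessionId: string; toolName: string; output: unknown }
  | { kind: "session-ended"; sessionId: string; reason: "done" | "crash" };

// One mapper per provider collapses its wire format into this shape.
// "agent_message_delta" is an assumed method name for illustration.
function mapCodexLine(sessionId: string, line: string): HarnessEvent | null {
  const msg = JSON.parse(line); // stdio JSON-RPC: one JSON object per line
  if (msg.method === "agent_message_delta") {
    return { kind: "text-delta", sessionId, text: msg.params?.delta ?? "" };
  }
  return null; // frames this sketch doesn't model are dropped
}
```

Keeping the union closed means the browser and the ingestion layer can exhaustively switch on `kind` regardless of which runtime produced the event.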
Prerequisites: Pixi (manages the Erlang toolchain), Elixir 1.19+, Node.js 20+, bun, and at least one provider CLI (`codex`, `claude`, `cursor`, `opencode`).
```sh
git clone https://github.com/Ranvier-Technologies/t3code-OTP.git t3otp-engine
cd t3otp-engine
pixi run setup   # Install all dependencies (Node + Hex)
pixi run dev     # Start Elixir harness + Node server + Vite
```

Open the browser, switch providers, and send a prompt.
```sh
pixi run harness      # Elixir harness only (port 4321)
pixi run dev-server   # Node server only
pixi run dev-web      # Vite dev server only
bun run scripts/harness-dry-run.ts   # Connect over WS, verify all channel commands
pixi run test-elixir  # ExUnit tests
pixi run credo        # Elixir linter
```

All four providers verified end-to-end in the browser with real prompts and real tool execution:
| Provider | Runtime | Transport | Tool Use |
|---|---|---|---|
| Claude | Node | Agent SDK (direct) | file write via bypassPermissions |
| Codex | Elixir | stdio JSON-RPC via Erlang Port | commandExecution — file created |
| Cursor | Elixir | stdio stream-json via Erlang Port | file write via --yolo |
| OpenCode | Elixir | HTTP + SSE via raw TCP + Req | file_change write via permission reply |
Full feature matrix including session lifecycle, approval requests, streaming tool output, and thread persistence: see the companion writeup.
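Two of the harness-routed providers above speak newline-delimited JSON over stdio. The recurring hazard there is that pipe reads do not align with message boundaries, so a partial trailing line must be buffered across reads. A sketch of that framing concern (written in TypeScript for illustration; in the fork this is handled on the Elixir side via Erlang Ports):

```typescript
// Sketch: newline-delimited JSON framing over a stdio stream.
// Illustrative only; the fork's Codex/Cursor sessions do this in Elixir.
class LineFramer {
  private buf = "";

  // Feed a raw chunk; returns every complete JSON message received so far.
  push(chunk: string): unknown[] {
    this.buf += chunk;
    const lines = this.buf.split("\n");
    this.buf = lines.pop() ?? ""; // retain the trailing partial line
    return lines.filter((l) => l.trim() !== "").map((l) => JSON.parse(l));
  }
}
```

A message split across two reads only parses once its closing newline arrives, which is exactly the property a per-session GenServer wrapping a Port has to preserve.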
Eight test types compare Node/V8 (a shared heap) with Elixir/OTP (per-process heaps). Single-run results are framed as architectural demonstrations, not statistical claims. The structural properties (per-process heaps, supervision cleanup) are BEAM guarantees, not observations.
| Metric | Node | Elixir |
|---|---|---|
| Memory with 1 leaky session | Shared heap → 2,800% of baseline (158 MB) | BEAM total → 102% of baseline |
| Event loop lag during leak | p99 = 169ms | N/A (no shared event loop) |
| Sibling sessions affected | All degraded | Zero |
| Latency at 200 concurrent sessions | 3,314ms p99 | 607ms p99 |
| Throughput at 200 sessions | 5,779 ev/s (event loop saturated) | 18,000 ev/s |
| Per-session memory (5→200 sessions) | Unmeasurable (shared heap) | Constant ~268KB |
| Crash lag spike (up to 20 simultaneous) | 1.4–2.7ms | 0.0ms at every count |
Codex gpt-5.4 in plan mode, 10 subagents across 2 sessions: Node oscillated 1.8–54 MB. Elixir held 54–63 MB flat while processing 14,918 events and 824 tool calls.
Honest scorecard: Elixir wins on isolation, observability, and per-session memory attribution. Node wins on raw throughput and SDK ecosystem access. Hence: two runtimes.
Full analysis with methodology, caveats, and adversarial review: output/stress-test/analysis.md.
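The "event loop lag" rows are typically produced by a timer-based probe: schedule a tick at a fixed interval and record how late each one fires, then summarize with a percentile. A minimal sketch of both pieces, under the assumption that the repo's scripts do something similar (they may differ):

```typescript
// Illustrative event-loop lag probe; not the repo's actual scripts.
// A saturated event loop delays timer callbacks, so lateness ≈ lag.
function measureLag(samples: number, intervalMs = 10): Promise<number[]> {
  return new Promise((resolve) => {
    const lags: number[] = [];
    let expected = Date.now() + intervalMs;
    const timer = setInterval(() => {
      lags.push(Math.max(0, Date.now() - expected)); // how late we fired
      expected = Date.now() + intervalMs;
      if (lags.length >= samples) {
        clearInterval(timer);
        resolve(lags);
      }
    }, intervalMs);
  });
}

// p99 over the collected samples (nearest-rank, floor-indexed).
function p99(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.min(sorted.length - 1, Math.floor(sorted.length * 0.99))];
}
```

On the BEAM there is no shared event loop to probe this way, which is why the Elixir column reports N/A for that metric.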
```sh
# Mock provider tests (no API keys needed)
bun run scripts/stress-test-runner.ts          # Scaling + latency
bun run scripts/stress-test-node.ts            # Same tests, Node baseline
bun run scripts/stress-test-memory-leak.ts     # Memory leak isolation
bun run scripts/stress-test-exception.ts       # Crash injection
bun run scripts/stress-test-scale50.ts         # 50-session scale
bun run scripts/stress-test-gc-lab-elixir.ts   # GC cross-contamination

# Real provider tests (require API keys)
bun run scripts/stress-test-real-workload.ts --runtime=elixir
bun run scripts/stress-test-real-subagent.ts --runtime=elixir

# Generate charts
python3 output/stress-test/viz.py
python3 output/stress-test/viz-real.py
```

| Module | Role |
|---|---|
| `SessionManager` | DynamicSupervisor routing, session lifecycle |
| `CodexSession` | Codex JSON-RPC GenServer |
| `CursorSession` | Cursor stream-json + tool mapping |
| `OpenCodeSession` | OpenCode HTTP+SSE + tool mapping |
| `ClaudeSession` | Claude CLI GenServer (stress tests only) |
| `MockSession` | Configurable mock for stress testing |
| `SnapshotServer` | In-memory event store + WAL replay + recovery |
| `Storage` | SQLite durability (sessions, events, pending reqs) |
| `Projector` | Pure event → snapshot projection |
| `ModelDiscovery` | CLI-based model listing with ETS cache |
| `HarnessChannel` | Phoenix Channel — single WS entry point |
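The `Projector` + `SnapshotServer` pairing implies a specific discipline: a snapshot is a pure left fold over the event log, so replaying the WAL after a crash reproduces the exact same state. A sketch of that idea (in TypeScript for illustration; the real `Projector` is Elixir, and these event and field names are invented for the example):

```typescript
// Sketch of pure event -> snapshot projection. Names are illustrative.
interface Snapshot {
  messages: string[];
  toolCalls: number;
  ended: boolean;
}

type SessionEvent =
  | { kind: "text"; text: string }
  | { kind: "tool-complete" }
  | { kind: "session-ended" };

const empty: Snapshot = { messages: [], toolCalls: 0, ended: false };

// Pure: no IO, no clock; same events in, same snapshot out.
function project(snap: Snapshot, ev: SessionEvent): Snapshot {
  switch (ev.kind) {
    case "text":
      return { ...snap, messages: [...snap.messages, ev.text] };
    case "tool-complete":
      return { ...snap, toolCalls: snap.toolCalls + 1 };
    case "session-ended":
      return { ...snap, ended: true };
  }
}

// Crash recovery is then just a replay of the durable log:
const replay = (events: SessionEvent[]) => events.reduce(project, empty);
```

Because `project` is deterministic, the in-memory store and the SQLite log can never disagree after a restart: the snapshot is always derivable from the events.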
| File | Role |
|---|---|
| `HarnessClientAdapter.ts` | Single adapter: all 13 `ProviderAdapterShape` methods |
| `HarnessClientManager.ts` | Phoenix Channel WS client with reconnection |
| `codexEventMapping.ts` | Canonical event mapping (shared with the existing CodexAdapter) |
| `ClaudeAdapter.ts` | Claude Agent SDK adapter (Node-native, not via harness) |
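Since the single WebSocket is the only path between the runtimes, the reconnection policy in `HarnessClientManager.ts` matters more than usual. A sketch of the standard shape of such a policy: exponential delay with a cap, reset once a join succeeds (the constants and class name here are illustrative, not the fork's actual tuning):

```typescript
// Illustrative reconnect-backoff policy for a channel client.
// Constants are assumptions, not the fork's real values.
class Backoff {
  private attempt = 0;
  constructor(
    private baseMs = 250,
    private maxMs = 10_000,
  ) {}

  // Delay before the next reconnect attempt: base * 2^attempt, capped.
  nextDelay(): number {
    const delay = Math.min(this.baseMs * 2 ** this.attempt, this.maxMs);
    this.attempt += 1;
    return delay;
  }

  reset(): void {
    this.attempt = 0; // call after the channel join succeeds
  }
}
```

A caller would schedule `setTimeout(connect, backoff.nextDelay())` on socket close and call `reset()` on a successful channel join, so a flapping harness is retried quickly at first but never hammered.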
- Create `apps/harness/lib/harness/providers/new_session.ex`
- Add a clause to `SessionManager.provider_module/1`
- Add the provider kind to `packages/contracts/src/orchestration.ts`
No new TypeScript adapter needed — HarnessClientAdapter handles all harness-routed providers generically.
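One way to picture that generic dispatch is a string-literal union of harness-routed kinds in the contracts package. The sketch below is purely illustrative; the actual shape of `orchestration.ts`, and the `"newprovider"` kind, are assumptions:

```typescript
// Hypothetical contracts-side union; extending it is the only
// TypeScript change a new harness-routed provider needs.
type HarnessProviderKind = "codex" | "cursor" | "opencode" | "newprovider";

const HARNESS_KINDS: readonly HarnessProviderKind[] =
  ["codex", "cursor", "opencode", "newprovider"];

// Type guard the adapter can use to decide whether a session
// is routed through the Elixir harness or handled Node-natively.
function isHarnessRouted(kind: string): kind is HarnessProviderKind {
  return (HARNESS_KINDS as readonly string[]).includes(kind);
}
```

Everything kind-specific (spawning, protocol, tool mapping) lives behind `SessionManager.provider_module/1` on the Elixir side, so the Node adapter stays provider-agnostic.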
- Sidebar stale state (KI-1): `ProviderRuntimeIngestion.ts` requires a thread to exist before processing events. Fix identified, not shipped. See known-issues.md.
- No streaming tool output for Cursor/OpenCode (KI-3): upstream protocol limitation — only tool start/complete events emitted.
- Cursor: no rollback API, no user-input question events (protocol limitations).
- Desktop packaging: BEAM runtime adds ~60–80 MB to Electron bundle.
- Single-run stress tests: architectural demonstrations, not statistically significant.
This is a fork of T3Code by Ping.gg. The Elixir harness is entirely additive — existing Node adapters (CodexAdapter, ClaudeAdapter) continue to work unchanged.
The T3Code core team has an open PR exploring the same space: #581: Centralize all harnesses with single websocket server. Both approaches work. The difference is in execution guarantees: behavioral isolation (Node) vs structural isolation (OTP).
This fork is offered as an architectural exploration, not a merge request.
- Why T3Code Might Want Two Runtimes — the companion writeup with interactive diagrams
- George Guimarães: "Your Agent Framework Is Just a Bad Clone of Elixir" — industry context on the BEAM/agent convergence
- codingsh: "Why Elixir is Perfect for AI Agents" — practical BEAM patterns for long-running agent nodes
- Full stress test analysis — methodology, caveats, adversarial review
Same as upstream T3Code. See LICENSE.
Bastian Venegas Arevalo (@ranvier2d2) · CTO, Ranvier Technologies