# feat: nanochat worker — Karpathy's LLM pipeline on iii-engine (#3)
**Open:** rohitg00 wants to merge 12 commits into `main` from `feat/nanochat-worker`.
Commits:

- `fad4d00` feat: add proof worker — AI-powered browser testing
- `0213bb2` feat: add nanochat worker — Karpathy's LLM pipeline on iii-engine
- `3d437be` docs: trim README, remove SDK internals section and em-dashes
- `ee2fe0e` feat: full nanochat pipeline coverage (20 functions)
- `2345132` feat: add nanochat as submodule, delegate training to real scripts
- `86a1c43` feat: real-time training progress via stdout parsing to iii state
- `270b6b8` fix: address all CodeRabbit review findings
- `48fbdcd` fix: address round 2 CodeRabbit findings
- `333f996` fix: image-resize manifest test uses CARGO_PKG_VERSION instead of har…
- `8c08610` fix: address all remaining CodeRabbit findings (round 3)
- `215b2a8` feat: pre-forked subprocess launcher + full E2E pipeline working
- `8cedc80` docs: rewrite README with E2E results, pre-forked launcher architectu…
rohitg00 File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
New `.gitmodules` entry:

```ini
[submodule "nanochat/nanochat-upstream"]
	path = nanochat/nanochat-upstream
	url = https://github.com/karpathy/nanochat.git
```
# nanochat worker

A Python worker that brings [Karpathy's nanochat](https://github.com/karpathy/nanochat) (the minimal full-stack ChatGPT clone) onto the iii engine. Train GPT models from scratch, fine-tune them, evaluate benchmarks, and serve chat completions, all as live iii functions that any connected worker can discover and call.

nanochat is ~7,000 lines of Python that trains a GPT-2 level model in ~2 hours on 8xH100 for ~$48. This worker wraps its entire pipeline (tokenizer, pretraining, SFT, evaluation, inference, tool use) into 20 registered functions with typed schemas and proper triggers.
## Why this exists

nanochat is a standalone Python script. You train a model, then serve it with FastAPI. Nothing else on the engine can talk to it.

This worker changes that. Once it connects to an iii engine, every capability becomes a function that any other worker (Rust, TypeScript, Python) can invoke via `trigger("nanochat.chat.complete", ...)`. Training runs report progress to iii state. Conversations persist across sessions. The model can be hot-swapped without restarting the worker.
## Prerequisites

- Python 3.10+
- iii-sdk 0.10.0+ (`pip install iii-sdk`)
- PyTorch 2.0+ (`pip install torch`)
- nanochat dependencies: `pip install tiktoken tokenizers rustbpe datasets pyarrow psutil`
- A running iii engine on `ws://localhost:49134` (or configure via `--engine-url`)
- For GPU inference/training: a CUDA-capable GPU with sufficient VRAM

The nanochat source is included as a git submodule. If you cloned without `--recurse-submodules`, run `git submodule update --init`. To use a different nanochat checkout, set `NANOCHAT_DIR` or pass `--nanochat-dir`.
## Quick start

```bash
# Clone the workers repo with the nanochat submodule
git clone --recurse-submodules https://github.com/iii-hq/workers.git
cd workers/nanochat

# Install dependencies
pip install iii-sdk torch tiktoken tokenizers rustbpe

# Install nanochat's own dependencies
cd nanochat-upstream && pip install -e . && cd ..

# Start without a model (for testing registration and non-GPU functions)
python worker.py --no-autoload

# Start with a trained SFT model on CUDA
python worker.py --source sft --device cuda

# Start with a base model on MPS (Apple Silicon)
python worker.py --source base --device mps
```

The nanochat source is included as a git submodule at `nanochat-upstream/`, pointing to [karpathy/nanochat](https://github.com/karpathy/nanochat). Training functions run the actual nanochat scripts as subprocesses from this directory, so you get 100% fidelity to the original implementation.
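The real-time progress reporting mentioned in the commit history works by parsing the subprocesses' stdout. A minimal sketch of such a parser, assuming a hypothetical `step N/M ... loss X` log-line format; the actual nanochat output may differ:

```python
import re

# Hypothetical training-log format, for illustration only.
STEP_RE = re.compile(r"step\s+(\d+)/(\d+).*?loss[:\s]+([0-9.]+)")

def parse_progress(line):
    """Extract step, total, and loss from one stdout line.

    Returns a dict suitable for writing to the nanochat:training state
    scope, or None when the line is not a progress line.
    """
    m = STEP_RE.search(line)
    if not m:
        return None
    return {
        "step": int(m.group(1)),
        "total": int(m.group(2)),
        "loss": float(m.group(3)),
    }
```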
## Functions

The worker registers 20 functions, each with an HTTP or queue trigger; the main ones are documented below. Every handler uses Pydantic type hints for automatic request/response schema extraction, so the engine knows the exact input/output shape of every function.
**nanochat.chat.complete** - `POST /nanochat/chat/completions`

Takes a list of messages (OpenAI-style `role`/`content` format) and generates a completion using the loaded model. Supports `temperature`, `top_k`, and `max_tokens`. Persists the full conversation to iii state under `nanochat:sessions` with the returned `session_id`.
**nanochat.chat.stream** - `POST /nanochat/chat/stream`

Same as `chat.complete` but generates tokens one at a time internally. Currently returns the full text (not SSE streaming). The token-by-token generation prevents the model from generating past `<|assistant_end|>` tokens, matching nanochat's original behavior.
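The stop condition can be sketched as follows. `ASSISTANT_END` and `generate_next` are hypothetical stand-ins for illustration, not the worker's real API:

```python
ASSISTANT_END = 65531  # hypothetical token id for <|assistant_end|>

def generate_reply(generate_next, prompt_ids, max_tokens=256):
    """Generate until <|assistant_end|> or max_tokens, whichever comes first.

    generate_next(ids) samples one token from the model given the context;
    the end marker itself is never emitted to the caller.
    """
    out = []
    ids = list(prompt_ids)
    for _ in range(max_tokens):
        tok = generate_next(ids)
        if tok == ASSISTANT_END:   # stop before emitting the end marker
            break
        out.append(tok)
        ids.append(tok)
    return out
```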
**nanochat.chat.history** - `GET /nanochat/chat/history`

Reads conversation history from iii state. Pass `session_id` to get a specific session, or omit it to list all sessions.
**nanochat.model.load** - `POST /nanochat/model/load`

Loads a nanochat checkpoint into GPU memory. Accepts `source` ("base", "sft", or "rl") and optional `model_tag`, `step`, and `device`. After loading, writes model metadata to the `nanochat:models` state scope. The loaded model is immediately available to all chat and eval functions.
**nanochat.model.status** - `GET /nanochat/model/status`

Returns current model state: whether a model is loaded, its source, device, architecture config (`n_layer`, `n_embd`, `vocab_size`, `sequence_len`), and total parameter count.
**nanochat.tokenizer.encode** - `POST /nanochat/tokenizer/encode`

Encodes text (a string or list of strings) to BPE token IDs using nanochat's RustBPE tokenizer. Prepends the BOS token automatically. Returns the token list and count.

**nanochat.tokenizer.decode** - `POST /nanochat/tokenizer/decode`

Decodes a list of token IDs back to text.
**nanochat.tools.execute** - `POST /nanochat/tools/execute`

Executes Python code in-process via `exec()`. Not sandboxed. Returns stdout, stderr, success status, and any errors. This mirrors nanochat's built-in tool use (calculator, code execution) that models learn during SFT training. Do not expose to untrusted input without additional isolation.
**nanochat.eval.core** - `POST /nanochat/eval/core`

Runs the CORE benchmark (DCLM paper) on the loaded model. Results are stored to the `nanochat:evals` state scope with timestamps.

**nanochat.eval.loss** - `POST /nanochat/eval/loss`

Evaluates bits-per-byte on the validation set. This is the vocab-size-invariant loss metric nanochat uses to compare models across different tokenizers.
**nanochat.train.sft** - Queue `nanochat-training`

Runs supervised fine-tuning. This is a long-running function designed to be triggered via queue (`TriggerAction.Enqueue(queue="nanochat-training")`). Reports step-by-step progress and loss values to the `nanochat:training` state scope. Other workers can poll `nanochat.train.status` to monitor progress.
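A caller can enqueue the run and then poll `nanochat.train.status` until it reaches a terminal state. A minimal polling helper, assuming the dict-style `trigger` call shape shown in the Python example later in this README and the running/complete/failed statuses of the `nanochat:training` scope:

```python
import time

def wait_for_run(trigger, run_id, poll_s=5.0, timeout_s=4 * 3600):
    """Poll nanochat.train.status until the run completes or fails.

    trigger is the SDK's trigger callable; poll_s and timeout_s are
    illustrative defaults, not worker settings.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        run = trigger({
            "function_id": "nanochat.train.status",
            "payload": {"run_id": run_id},
        })
        if run.get("status") in ("complete", "failed"):
            return run
        time.sleep(poll_s)
    raise TimeoutError(f"training run {run_id} still running after {timeout_s}s")
```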
**nanochat.train.status** - `GET /nanochat/train/status`

Reads training run status from iii state. Pass `run_id` to get a specific run, or omit it to list all runs.

**nanochat.health** - `GET /nanochat/health`

Returns worker health, model loaded status, device, and source.
## State scopes

All persistent state goes through iii `state::get/set` primitives. The worker uses four scopes:

- **nanochat:sessions** - Conversation history keyed by `session_id`. Each entry contains the full message list, the model source used, and the token count.
- **nanochat:models** - Model metadata. The `current` key always reflects the loaded model's config.
- **nanochat:training** - Training run progress keyed by `run_id`. Contains status (running/complete/failed), step count, loss values, and device info.
- **nanochat:evals** - Evaluation results keyed by `core-{timestamp}` or `loss-{timestamp}`. Contains metric values and model source.
## Testing

Tested against a live iii engine (v0.10.0) on macOS with Python 3.11. All registered functions and their triggers register on connect. Functions that need a loaded model return clear error messages when none is loaded. The worker stays alive through all error cases.

```text
OK   nanochat.health           {"status": "ok", "model_loaded": false}
OK   nanochat.model.status     {"loaded": false}
OK   nanochat.chat.history     {"sessions": []}
OK   nanochat.train.status     {"runs": []}
OK   nanochat.tools.execute    {"success": true, "stdout": "3628800\n"}
WARN nanochat.tokenizer.encode {"error": "tokenizer.pkl not found"}
WARN nanochat.tokenizer.decode {"error": "tokenizer.pkl not found"}
WARN nanochat.chat.complete    {"error": "No model loaded"}
WARN nanochat.eval.core       {"error": "No model loaded"}
OK   nanochat.health           {"status": "ok"} (still alive after errors)

10/10 responded, 0 crashes
```

The WARN results are expected. `tokenizer.encode`/`decode` need a trained tokenizer (run `tok_train.py` first or load a model), and `chat.complete`/`eval.core` need a loaded model via `nanochat.model.load`.
### Known issues

**Null payloads time out.** The iii-sdk v0.10.0 Python SDK drops invocations with `payload: None`. Always pass `payload: {}` for functions that don't need input.
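A caller-side guard is a one-liner; this sketch simply coerces a missing or `None` payload to `{}` before a request is sent (the helper name is ours, not part of the SDK):

```python
def with_safe_payload(request):
    """Return a copy of the request with payload=None replaced by {}.

    Works around the iii-sdk 0.10.0 behavior where payload=None
    invocations are dropped.
    """
    if request.get("payload") is None:
        request = {**request, "payload": {}}
    return request
```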
**Unhandled handler exceptions crash the WebSocket.** If a handler raises without catching, the SDK's connection state corrupts and all subsequent calls fail with `function_not_found` until the worker reconnects. Every handler in this worker is wrapped with `safe()` to prevent this.
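A minimal sketch of what such a `safe()` wrapper can look like; the worker's actual helper may differ in naming and error shape:

```python
import functools
import traceback

def safe(handler):
    """Turn handler exceptions into error payloads instead of letting
    them propagate into the SDK's WebSocket loop."""
    @functools.wraps(handler)
    def wrapped(payload):
        try:
            return handler(payload)
        except Exception as exc:
            return {"error": str(exc), "traceback": traceback.format_exc()}
    return wrapped

@safe
def divide(payload):
    # Example handler: raises ZeroDivisionError when b == 0.
    return {"result": payload["a"] / payload["b"]}
```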
**`multiprocessing.Process` breaks the connection.** nanochat's original code execution sandbox uses `multiprocessing.Process`, but `fork()` in a multi-threaded Python process corrupts the SDK's asyncio event loop. We use in-process `exec()` with stdout/stderr capture instead.
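The fork-free approach can be sketched with `exec()` plus `contextlib.redirect_stdout`. The response fields mirror the `tools.execute` description above, but this is an illustration, not the worker's exact implementation, and like the real function it is not sandboxed:

```python
import io
from contextlib import redirect_stdout, redirect_stderr

def execute_code(code):
    """Run Python source in-process, capturing stdout/stderr.

    Not sandboxed: the code runs with full interpreter access.
    """
    out, err = io.StringIO(), io.StringIO()
    try:
        with redirect_stdout(out), redirect_stderr(err):
            exec(code, {"__builtins__": __builtins__})
        return {"success": True, "stdout": out.getvalue(),
                "stderr": err.getvalue()}
    except Exception as exc:
        return {"success": False, "stdout": out.getvalue(),
                "stderr": err.getvalue(), "error": repr(exc)}
```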
## Calling from other workers

Any worker on the same engine can invoke nanochat functions:
```python
# Python
from iii import register_worker
iii = register_worker("ws://localhost:49134")

result = iii.trigger({
    "function_id": "nanochat.chat.complete",
    "payload": {
        "messages": [{"role": "user", "content": "What is the capital of France?"}],
        "temperature": 0.8,
    }
})
print(result["content"])
```
```typescript
// TypeScript
import { registerWorker } from 'iii-sdk'
const iii = registerWorker('ws://localhost:49134')

const result = await iii.trigger({
  function_id: 'nanochat.chat.complete',
  payload: {
    messages: [{ role: 'user', content: 'What is the capital of France?' }],
    temperature: 0.8,
  },
})
```
```rust
// Rust
let result = iii.trigger("nanochat.chat.complete", json!({
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "temperature": 0.8
})).await?;
```
## License

Apache-2.0
Submodule `nanochat-upstream` added at `a44514`.
```toml
[build-system]
requires = ["setuptools>=64", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "iii-nanochat"
version = "0.1.0"
description = "nanochat LLM worker for iii-engine"
license = "Apache-2.0"
requires-python = ">=3.10"
dependencies = [
    "iii-sdk>=0.10.0",
    "torch>=2.0",
    "pydantic>=2.0",
    "tiktoken",
    "tokenizers",
    "datasets",
    "pyarrow",
    "psutil",
]

[project.optional-dependencies]
train = ["wandb"]

[project.scripts]
iii-nanochat = "worker:main"
```
**CodeRabbit review:** Create `nanochat/__init__.py` and fix the console script entry point. The nanochat package is missing `__init__.py`, so nanochat is not a proper Python package and the console script entry point `"worker:main"` will fail at runtime. Create an `__init__.py` file in the nanochat directory (it can be empty or carry version info) and update the entry point to `"nanochat.worker:main"`. Additionally, add the missing `[build-system]` section for PEP 517/518 compliance.

**rohitg00:** Fixed. Added `[build-system]` section with setuptools backend (PEP 517/518).