From f1ac5cf2146264f733d1764fb0a067a8f96acd7f Mon Sep 17 00:00:00 2001
From: gb250e <71205769+gb250e@users.noreply.github.com>
Date: Fri, 20 Mar 2026 22:12:57 -0700
Subject: [PATCH 1/4] docs: add TPI-003 environment plan

---
 notes/tpi_003_environment_plan.md | 37 +++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)
 create mode 100644 notes/tpi_003_environment_plan.md

diff --git a/notes/tpi_003_environment_plan.md b/notes/tpi_003_environment_plan.md
new file mode 100644
index 0000000000..ce3c1559de
--- /dev/null
+++ b/notes/tpi_003_environment_plan.md
@@ -0,0 +1,37 @@
+# TPI-003 Environment Plan
+
+## Objective
+
+Select the smallest runnable environment path that can produce one real baseline/candidate evidence pair for the existing eval-first monkey-model policy.
+
+## Public-facing name
+
+`MonkeyModel_EvalFirst_MinRunnableEnv`
+
+## Candidate environment paths
+
+1. `local_repair`
+   - install missing dependencies locally
+   - acquire published dataset/tokenizer assets locally
+   - verify GPU/runtime availability locally
+
+2. `remote_gpu_small`
+   - use a minimal remote CUDA environment
+   - run one baseline/candidate pair with the same public-safe branch
+
+3. `runpod_parameter_golf`
+   - use the challenge-aligned Runpod path
+   - fetch the published dataset/tokenizer assets there
+   - execute one baseline/candidate pair under a more official environment shape
+
+## Selection criteria
+
+- shortest path to one real evidence pair
+- command simplicity
+- reproducibility
+- fit with current monkey-model eval-first branch
+- lowest setup overhead that still yields runtime + val_bpb
+
+## Current recommendation
+
+Prefer `runpod_parameter_golf` unless an already-usable remote CUDA environment exists. The local path is currently the weakest candidate because torch, assets, and GPU availability are all blocked at once.
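The three `local_repair` blockers named in the plan above (missing torch, missing assets, no GPU) can be probed mechanically before picking a path. A minimal sketch, assuming the `/workspace/parameter-golf/...` asset locations used later in this series; the `probe` helper is illustrative, not part of the repo:

```shell
# Report each local_repair blocker as ok/missing without aborting the script.
# DATA_PATH/TOKENIZER_PATH defaults are assumptions taken from the Runpod
# notes in this series; override them for a different machine.
probe() {
  name=$1; shift
  if "$@" >/dev/null 2>&1; then
    echo "$name: ok"
  else
    echo "$name: missing"
  fi
}

probe torch     python3 -c "import torch"
probe cuda      python3 -c "import torch; assert torch.cuda.is_available()"
probe dataset   test -d "${DATA_PATH:-/workspace/parameter-golf/data/datasets/fineweb10B_sp1024/}"
probe tokenizer test -f "${TOKENIZER_PATH:-/workspace/parameter-golf/data/tokenizers/fineweb_1024_bpe.model}"
```

If all four lines report `ok`, `local_repair` stops being the weakest candidate; any `missing` line is a concrete reason to stay on the Runpod path.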
From bc7fdc55a1764bdf583ca972b8904b6b7047e80f Mon Sep 17 00:00:00 2001
From: gb250e <71205769+gb250e@users.noreply.github.com>
Date: Fri, 20 Mar 2026 22:13:28 -0700
Subject: [PATCH 2/4] docs: add TPI-003 execution contract

---
 notes/tpi_003_execution_contract.md | 38 +++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)
 create mode 100644 notes/tpi_003_execution_contract.md

diff --git a/notes/tpi_003_execution_contract.md b/notes/tpi_003_execution_contract.md
new file mode 100644
index 0000000000..d4e853387d
--- /dev/null
+++ b/notes/tpi_003_execution_contract.md
@@ -0,0 +1,38 @@
+# TPI-003 Execution Contract
+
+## Objective
+
+Fix one executable baseline/candidate command contract for the existing eval-first monkey-model branch.
+
+## Baseline contract
+
+- branch: `exp/eval-first-003`
+- mode: non-sliding validation behavior
+- effective setting: `EVAL_STRIDE=TRAIN_SEQ_LEN`
+
+## Candidate contract
+
+- branch: `exp/eval-first-003`
+- mode: eval-first sliding validation
+- primary candidate: `EVAL_STRIDE=128`
+- optional secondary candidate: `EVAL_STRIDE=64`
+
+## Minimum required assets
+
+- Python environment with `torch`, `datasets`, and `sentencepiece`
+- accessible CUDA runtime for `train_gpt.py`
+- published FineWeb cached shards or equivalent challenge-provided dataset path
+- published tokenizer model path
+
+## Minimum required capture
+
+- commit SHA
+- command line
+- runtime notes
+- whether the eval path was reached
+- final runtime summary
+- final val_bpb summary
+
+## Environment decision rule
+
+Choose the environment path that can produce one real baseline/candidate pair with the least additional setup while remaining reproducible and public-safe.
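Most of the "Minimum required capture" list in this contract can be written out mechanically after a run. A sketch of such a helper, with the caveat that `capture_run`, the note filename, and the assumption that `train_gpt.py` prints lines containing `val_bpb` are all illustrative, not a fixed interface of the repo:

```shell
# Hypothetical capture helper: record commit SHA, run id, whether the eval
# path was reached, and the final val_bpb line from a run log into one note.
capture_run() {
  run_id=$1
  log=$2
  out="runs/TPI-003/${run_id}.capture.txt"
  mkdir -p runs/TPI-003
  # grep -c prints a match count; empty when the log file is missing.
  n=$(grep -c 'val_bpb' "$log" 2>/dev/null || true)
  {
    echo "commit: $(git rev-parse HEAD 2>/dev/null || echo unknown)"
    echo "run_id: ${run_id}"
    echo "eval_reached: ${n:-0}"
    echo "final_val_bpb: $(grep 'val_bpb' "$log" 2>/dev/null | tail -n 1)"
  } > "$out"
  echo "$out"
}
```

A call like `capture_run tpi003_baseline_stride1024 logs/tpi003_baseline_stride1024.txt` would cover the SHA, eval-reached, and val_bpb bullets; the command line and runtime notes still need to be recorded manually.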
From af2ce3dc6b9fdebacc2efa1837b6754bee2617e2 Mon Sep 17 00:00:00 2001
From: eb24516
Date: Sat, 21 Mar 2026 14:18:29 +0900
Subject: [PATCH 3/4] docs: fix TPI-003 environment contract

---
 notes/tpi_003_environment_decision.md | 71 ++++++++++++++++++++++++
 notes/tpi_003_environment_plan.md     |  7 +++
 notes/tpi_003_execution_contract.md   | 78 +++++++++++++++++++++++++++
 3 files changed, 156 insertions(+)
 create mode 100644 notes/tpi_003_environment_decision.md

diff --git a/notes/tpi_003_environment_decision.md b/notes/tpi_003_environment_decision.md
new file mode 100644
index 0000000000..fca3f82cd2
--- /dev/null
+++ b/notes/tpi_003_environment_decision.md
@@ -0,0 +1,71 @@
+# TPI-003 Environment Decision
+
+## Chosen environment
+
+- `runpod_parameter_golf`
+
+## Why chosen
+
+- The repository README already defines this path concretely.
+- It is the smallest runnable path with clear dependency assumptions.
+- It is closer to challenge conditions than an unspecified remote machine.
+- It keeps the same monkey-model eval-first branch and only changes environment readiness.
+
+## Rejected alternatives
+
+### `local_repair`
+
+- Rejected as primary path because three blockers stack at once:
+  - `torch` missing
+  - dataset/tokenizer assets missing
+  - GPU access blocked
+
+### `remote_gpu_small`
+
+- Rejected as primary path because it is less specific than the Runpod route.
+- It risks wasting time on ad hoc package, path, and logging setup that the documented Runpod path already solves more cleanly.
+
+## Required assets
+
+- repo checkout at `exp/eval-first-003`
+- Python environment with `torch`, `datasets`, `sentencepiece`
+- `/workspace/parameter-golf/data/datasets/fineweb10B_sp1024/`
+- `/workspace/parameter-golf/data/tokenizers/fineweb_1024_bpe.model`
+- writable `logs/` and `runs/TPI-003/`
+
+## Baseline env vars
+
+- `DATA_PATH=/workspace/parameter-golf/data/datasets/fineweb10B_sp1024/`
+- `TOKENIZER_PATH=/workspace/parameter-golf/data/tokenizers/fineweb_1024_bpe.model`
+- `VOCAB_SIZE=1024`
+- `TRAIN_SEQ_LEN=1024`
+- `EVAL_STRIDE=1024`
+- `MAX_WALLCLOCK_SECONDS=600`
+- `TRAIN_LOG_EVERY=50`
+- `VAL_LOSS_EVERY=200`
+
+## Candidate env vars
+
+- same as baseline except `EVAL_STRIDE=128`
+- optional second candidate: `EVAL_STRIDE=64`
+
+## Command skeleton
+
+```bash
+RUN_ID= \
+DATA_PATH=/workspace/parameter-golf/data/datasets/fineweb10B_sp1024/ \
+TOKENIZER_PATH=/workspace/parameter-golf/data/tokenizers/fineweb_1024_bpe.model \
+VOCAB_SIZE=1024 \
+TRAIN_SEQ_LEN=1024 \
+EVAL_STRIDE= \
+MAX_WALLCLOCK_SECONDS=600 \
+TRAIN_LOG_EVERY=50 \
+VAL_LOSS_EVERY=200 \
+torchrun --standalone --nproc_per_node=1 train_gpt.py
+```
+
+## First command to run next turn
+
+```bash
+python3 data/cached_challenge_fineweb.py --variant sp1024 --train-shards 1
+```
diff --git a/notes/tpi_003_environment_plan.md b/notes/tpi_003_environment_plan.md
index ce3c1559de..ea6290d850 100644
--- a/notes/tpi_003_environment_plan.md
+++ b/notes/tpi_003_environment_plan.md
@@ -35,3 +35,10 @@ Select the smallest runnable environment path that can produce one real baseline
 ## Current recommendation
 
 Prefer `runpod_parameter_golf` unless an already-usable remote CUDA environment exists. The local path is currently the weakest candidate because torch, assets, and GPU availability are all blocked at once.
+
+## Selection outcome for this turn
+
+- Chosen path: `runpod_parameter_golf`
+- Reason: it is the most explicit public-safe path already documented in the repo, with the least ambiguity about dependency readiness and challenge-compatible execution shape.
+- Deferred path: `remote_gpu_small`
+- Rejected primary path: `local_repair`
diff --git a/notes/tpi_003_execution_contract.md b/notes/tpi_003_execution_contract.md
index d4e853387d..81db30b9f3 100644
--- a/notes/tpi_003_execution_contract.md
+++ b/notes/tpi_003_execution_contract.md
@@ -9,6 +9,15 @@ Fix one executable baseline/candidate command contract for the existing eval-fir
 - branch: `exp/eval-first-003`
 - mode: non-sliding validation behavior
 - effective setting: `EVAL_STRIDE=TRAIN_SEQ_LEN`
+- chosen environment: `runpod_parameter_golf`
+- target host shape for first pass: `1xH100` Runpod pod
+- tokenizer path: `/workspace/parameter-golf/data/tokenizers/fineweb_1024_bpe.model`
+- dataset path: `/workspace/parameter-golf/data/datasets/fineweb10B_sp1024/`
+- logs:
+  - script-native log: `logs/${RUN_ID}.txt`
+  - turn note archive: `runs/TPI-003/`
+- commit SHA capture:
+  - `git rev-parse HEAD > runs/TPI-003/.commit.txt`
 
 ## Candidate contract
 
@@ -17,6 +26,75 @@ Fix one executable baseline/candidate command contract for the existing eval-fir
 - primary candidate: `EVAL_STRIDE=128`
 - optional secondary candidate: `EVAL_STRIDE=64`
 
+## Baseline command
+
+```bash
+cd /workspace
+git clone https://github.com/gb250e/parameter-golf.git
+cd parameter-golf
+git checkout exp/eval-first-003
+python3 data/cached_challenge_fineweb.py --variant sp1024 --train-shards 1
+mkdir -p runs/TPI-003
+git rev-parse HEAD > runs/TPI-003/tpi003_baseline.commit.txt
+RUN_ID=tpi003_baseline_stride1024 \
+DATA_PATH=/workspace/parameter-golf/data/datasets/fineweb10B_sp1024/ \
+TOKENIZER_PATH=/workspace/parameter-golf/data/tokenizers/fineweb_1024_bpe.model \
+VOCAB_SIZE=1024 \
+TRAIN_SEQ_LEN=1024 \
+EVAL_STRIDE=1024 \
+MAX_WALLCLOCK_SECONDS=600 \
+TRAIN_LOG_EVERY=50 \
+VAL_LOSS_EVERY=200 \
+torchrun --standalone --nproc_per_node=1 train_gpt.py | tee runs/TPI-003/tpi003_baseline.stdout.log
+```
+
+## Candidate command
+
+```bash
+cd /workspace/parameter-golf
+git checkout exp/eval-first-003
+mkdir -p runs/TPI-003
+git rev-parse HEAD > runs/TPI-003/tpi003_candidate_128.commit.txt
+RUN_ID=tpi003_candidate_stride128 \
+DATA_PATH=/workspace/parameter-golf/data/datasets/fineweb10B_sp1024/ \
+TOKENIZER_PATH=/workspace/parameter-golf/data/tokenizers/fineweb_1024_bpe.model \
+VOCAB_SIZE=1024 \
+TRAIN_SEQ_LEN=1024 \
+EVAL_STRIDE=128 \
+MAX_WALLCLOCK_SECONDS=600 \
+TRAIN_LOG_EVERY=50 \
+VAL_LOSS_EVERY=200 \
+torchrun --standalone --nproc_per_node=1 train_gpt.py | tee runs/TPI-003/tpi003_candidate_128.stdout.log
+```
+
+## Required env vars for both runs
+
+- `DATA_PATH`
+- `TOKENIZER_PATH`
+- `VOCAB_SIZE=1024`
+- `TRAIN_SEQ_LEN=1024`
+- `EVAL_STRIDE`
+- `MAX_WALLCLOCK_SECONDS=600`
+- `TRAIN_LOG_EVERY=50`
+- `VAL_LOSS_EVERY=200`
+
+## GPU assumption
+
+- first runnable path assumes `1xH100`
+- this is for evidence collection, not final record-track timing
+
+## Log policy
+
+- keep terminal stdout in `runs/TPI-003/*.stdout.log`
+- keep script-native logs in `logs/`
+- summarize runtime and `val_bpb` back into notes after the run
+
+## Next-turn first command
+
+```bash
+python3 data/cached_challenge_fineweb.py --variant sp1024 --train-shards 1
+```
+
 ## Minimum required assets
 
 - Python environment with `torch`, `datasets`, and `sentencepiece`

From 409b3491753aa7ec035b4cadd139861c30684027 Mon Sep 17 00:00:00 2001
From: eb24516
Date: Sat, 21 Mar 2026 14:18:35 +0900
Subject: [PATCH 4/4] docs: summarize PR #2 runnable path selection

---
 notes/pr2_update_summary.md | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)
 create mode 100644 notes/pr2_update_summary.md

diff --git a/notes/pr2_update_summary.md b/notes/pr2_update_summary.md
new file mode 100644
index 0000000000..4203893d1f
--- /dev/null
+++ b/notes/pr2_update_summary.md
@@ -0,0 +1,21 @@
+# PR #2 Update Summary
+
+## Chosen environment path
+
+- `runpod_parameter_golf`
+
+## Why chosen
+
+- It is the most concrete runnable path already documented in the repo.
+- It reduces ambiguity around dependencies, dataset placement, and GPU shape.
+
+## What remains before the evidence run
+
+- create or access the Runpod environment
+- clone the repo and check out `exp/eval-first-003`
+- download the published `sp1024` assets
+- execute the fixed baseline and candidate commands
+
+## Review state
+
+- review comments: none observed during this turn
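The log policy in patch 3 asks for runtime and val_bpb to be summarized back into the notes after both runs. A sketch of that comparison step, using the stdout log paths from the execution contract; the log line shape (`val_bpb <number>`) is an assumption about `train_gpt.py`'s output, not a documented format:

```shell
# Pull the last val_bpb from each stdout log and print a baseline/candidate
# delta. Paths follow the tee targets in the execution contract commands.
last_val_bpb() {
  [ -f "$1" ] || return 0
  grep -o 'val_bpb [0-9.]*' "$1" | tail -n 1 | awk '{print $2}'
}

base=$(last_val_bpb runs/TPI-003/tpi003_baseline.stdout.log)
cand=$(last_val_bpb runs/TPI-003/tpi003_candidate_128.stdout.log)
echo "baseline_val_bpb=${base:-none} candidate_val_bpb=${cand:-none}"
if [ -n "$base" ] && [ -n "$cand" ]; then
  # Negative delta means the sliding-eval candidate improved on the baseline.
  awk -v b="$base" -v c="$cand" 'BEGIN { printf "delta=%+.4f\n", c - b }'
fi
```

The printed pair and delta are exactly the two numbers the evidence notes need; anything beyond that (runtime, GPU shape) still comes from the capture notes.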