`notes/pr2_update_summary.md` (new file, +21 lines)

# PR #2 Update Summary

## Chosen environment path

- `runpod_parameter_golf`

## Why chosen

- It is the most concrete runnable path already documented in the repo.
- It reduces ambiguity around dependencies, dataset placement, and GPU shape.

## What remains before the evidence run

- create or access the Runpod environment
- clone the repo and check out `exp/eval-first-003`
- download the published `sp1024` assets
- execute the fixed baseline and candidate commands

## Review state

- review comments: none observed during this turn
`notes/tpi_003_environment_decision.md` (new file, +71 lines)

# TPI-003 Environment Decision

## Chosen environment

- `runpod_parameter_golf`

## Why chosen

- The repository README already defines this path concretely.
- It is the smallest runnable path with clear dependency assumptions.
- It is closer to challenge conditions than an unspecified remote machine.
- It keeps the same monkey-model eval-first branch and only changes environment readiness.

## Rejected alternatives

### `local_repair`

- Rejected as primary path because three blockers stack at once:
- `torch` missing
- dataset/tokenizer assets missing
- GPU access blocked

### `remote_gpu_small`

- Rejected as primary path because it is less specific than the Runpod route.
- It risks wasting time on ad hoc package, path, and logging setup that the documented Runpod path already solves more cleanly.

## Required assets

- repo checkout at `exp/eval-first-003`
- Python environment with `torch`, `datasets`, `sentencepiece`
- `/workspace/parameter-golf/data/datasets/fineweb10B_sp1024/`
- `/workspace/parameter-golf/data/tokenizers/fineweb_1024_bpe.model`
- writable `logs/` and `runs/TPI-003/`
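
The asset list above can be checked mechanically before a run is attempted. The sketch below is a minimal preflight, assuming the Runpod paths listed above; `check_assets` is a hypothetical helper, not something that exists in the repo.

```shell
# Preflight sketch: confirm the dataset directory and tokenizer model exist
# before spending pod time on a launch. check_assets is a hypothetical helper.
check_assets() {
  data_path="$1"
  tokenizer_path="$2"
  ok=1
  [ -d "$data_path" ]      || { echo "missing dataset dir: $data_path"; ok=0; }
  [ -f "$tokenizer_path" ] || { echo "missing tokenizer: $tokenizer_path"; ok=0; }
  if [ "$ok" -eq 1 ]; then echo "preflight: ok"; else echo "preflight: blocked"; fi
}

# Example against the documented Runpod paths; off-pod this prints the
# missing-asset lines followed by "preflight: blocked".
check_assets \
  /workspace/parameter-golf/data/datasets/fineweb10B_sp1024/ \
  /workspace/parameter-golf/data/tokenizers/fineweb_1024_bpe.model
```

Running this once at pod startup makes "asset missing" failures visible before `torchrun` is invoked rather than partway into the wallclock budget.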

## Baseline env vars

- `DATA_PATH=/workspace/parameter-golf/data/datasets/fineweb10B_sp1024/`
- `TOKENIZER_PATH=/workspace/parameter-golf/data/tokenizers/fineweb_1024_bpe.model`
- `VOCAB_SIZE=1024`
- `TRAIN_SEQ_LEN=1024`
- `EVAL_STRIDE=1024`
- `MAX_WALLCLOCK_SECONDS=600`
- `TRAIN_LOG_EVERY=50`
- `VAL_LOSS_EVERY=200`

## Candidate env vars

- same as baseline except `EVAL_STRIDE=128`
- optional second candidate: `EVAL_STRIDE=64`

## Command skeleton

```bash
RUN_ID=<run_id> \
DATA_PATH=/workspace/parameter-golf/data/datasets/fineweb10B_sp1024/ \
TOKENIZER_PATH=/workspace/parameter-golf/data/tokenizers/fineweb_1024_bpe.model \
VOCAB_SIZE=1024 \
TRAIN_SEQ_LEN=1024 \
EVAL_STRIDE=<stride> \
MAX_WALLCLOCK_SECONDS=600 \
TRAIN_LOG_EVERY=50 \
VAL_LOSS_EVERY=200 \
torchrun --standalone --nproc_per_node=1 train_gpt.py
```
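
The skeleton can be wrapped in a small dry-run helper that substitutes the two placeholders and prints the full command for inspection before anything executes. `launch_cmd` is a hypothetical name introduced here, not a script in the repo.

```shell
# Dry-run sketch: assemble the full env-var-prefixed command for a given
# run id and stride, and print it instead of executing it.
launch_cmd() {
  run_id="$1"
  stride="$2"
  cmd="RUN_ID=$run_id"
  cmd="$cmd DATA_PATH=/workspace/parameter-golf/data/datasets/fineweb10B_sp1024/"
  cmd="$cmd TOKENIZER_PATH=/workspace/parameter-golf/data/tokenizers/fineweb_1024_bpe.model"
  cmd="$cmd VOCAB_SIZE=1024 TRAIN_SEQ_LEN=1024 EVAL_STRIDE=$stride"
  cmd="$cmd MAX_WALLCLOCK_SECONDS=600 TRAIN_LOG_EVERY=50 VAL_LOSS_EVERY=200"
  cmd="$cmd torchrun --standalone --nproc_per_node=1 train_gpt.py"
  echo "$cmd"
}

# Print the baseline and primary-candidate commands for eyeballing.
launch_cmd tpi003_baseline_stride1024 1024
launch_cmd tpi003_candidate_stride128 128
```

Since baseline and candidate differ only in `RUN_ID` and `EVAL_STRIDE`, generating both from one helper removes the chance of the two commands drifting apart when copied by hand.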

## First command to run next turn

```bash
python3 data/cached_challenge_fineweb.py --variant sp1024 --train-shards 1
```
`notes/tpi_003_environment_plan.md` (new file, +44 lines)

# TPI-003 Environment Plan

## Objective

Select the smallest runnable environment path that can produce one real baseline/candidate evidence pair for the existing eval-first monkey-model policy.

## Public-facing name

`MonkeyModel_EvalFirst_MinRunnableEnv`

## Candidate environment paths

1. `local_repair`
- install missing dependencies locally
- acquire published dataset/tokenizer assets locally
- verify GPU/runtime availability locally

2. `remote_gpu_small`
- use a minimal remote CUDA environment
- run one baseline/candidate pair with the same public-safe branch

3. `runpod_parameter_golf`
- use the challenge-aligned Runpod path
- fetch the published dataset/tokenizer assets there
- execute one baseline/candidate pair under a more official environment shape

## Selection criteria

- shortest path to one real evidence pair
- command simplicity
- reproducibility
- fit with current monkey-model eval-first branch
- lowest setup overhead that still yields runtime + `val_bpb`

## Current recommendation

Prefer `runpod_parameter_golf` unless an already-usable remote CUDA environment exists. The local path is currently the weakest candidate because `torch`, the dataset/tokenizer assets, and GPU access are all blocked at once.

## Selection outcome for this turn

- Chosen path: `runpod_parameter_golf`
- Reason: it is the most explicit public-safe path already documented in the repo, with the least ambiguity about dependency readiness and challenge-compatible execution shape.
- Deferred path: `remote_gpu_small`
- Rejected primary path: `local_repair`
`notes/tpi_003_execution_contract.md` (new file, +116 lines)

# TPI-003 Execution Contract

## Objective

Fix one executable baseline/candidate command contract for the existing eval-first monkey-model branch.

## Baseline contract

- branch: `exp/eval-first-003`
- mode: non-sliding validation behavior
- effective setting: `EVAL_STRIDE=TRAIN_SEQ_LEN`
- chosen environment: `runpod_parameter_golf`
- target host shape for first pass: `1xH100` Runpod pod
- tokenizer path: `/workspace/parameter-golf/data/tokenizers/fineweb_1024_bpe.model`
- dataset path: `/workspace/parameter-golf/data/datasets/fineweb10B_sp1024/`
- logs:
- script-native log: `logs/${RUN_ID}.txt`
- turn note archive: `runs/TPI-003/`
- commit SHA capture:
- `git rev-parse HEAD > runs/TPI-003/<run_id>.commit.txt`

## Candidate contract

- branch: `exp/eval-first-003`
- mode: eval-first sliding validation
- primary candidate: `EVAL_STRIDE=128`
- optional secondary candidate: `EVAL_STRIDE=64`

## Baseline command

```bash
cd /workspace
git clone https://github.com/gb250e/parameter-golf.git
cd parameter-golf
git checkout exp/eval-first-003
python3 data/cached_challenge_fineweb.py --variant sp1024 --train-shards 1
mkdir -p runs/TPI-003
git rev-parse HEAD > runs/TPI-003/tpi003_baseline.commit.txt
RUN_ID=tpi003_baseline_stride1024 \
DATA_PATH=/workspace/parameter-golf/data/datasets/fineweb10B_sp1024/ \
TOKENIZER_PATH=/workspace/parameter-golf/data/tokenizers/fineweb_1024_bpe.model \
VOCAB_SIZE=1024 \
TRAIN_SEQ_LEN=1024 \
EVAL_STRIDE=1024 \
MAX_WALLCLOCK_SECONDS=600 \
TRAIN_LOG_EVERY=50 \
VAL_LOSS_EVERY=200 \
torchrun --standalone --nproc_per_node=1 train_gpt.py | tee runs/TPI-003/tpi003_baseline.stdout.log
```

## Candidate command

```bash
cd /workspace/parameter-golf
git checkout exp/eval-first-003
mkdir -p runs/TPI-003
git rev-parse HEAD > runs/TPI-003/tpi003_candidate_128.commit.txt
RUN_ID=tpi003_candidate_stride128 \
DATA_PATH=/workspace/parameter-golf/data/datasets/fineweb10B_sp1024/ \
TOKENIZER_PATH=/workspace/parameter-golf/data/tokenizers/fineweb_1024_bpe.model \
VOCAB_SIZE=1024 \
TRAIN_SEQ_LEN=1024 \
EVAL_STRIDE=128 \
MAX_WALLCLOCK_SECONDS=600 \
TRAIN_LOG_EVERY=50 \
VAL_LOSS_EVERY=200 \
torchrun --standalone --nproc_per_node=1 train_gpt.py | tee runs/TPI-003/tpi003_candidate_128.stdout.log
```

## Required env vars for both runs

- `DATA_PATH`
- `TOKENIZER_PATH`
- `VOCAB_SIZE=1024`
- `TRAIN_SEQ_LEN=1024`
- `EVAL_STRIDE`
- `MAX_WALLCLOCK_SECONDS=600`
- `TRAIN_LOG_EVERY=50`
- `VAL_LOSS_EVERY=200`
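
A guard over this variable list can fail fast if anything is unset before launch. This is a sketch; `require_env` is a hypothetical helper and is not part of `train_gpt.py`.

```shell
# Sketch: report any required env vars that are unset before launching.
# The variable list mirrors the bullets above; require_env is hypothetical.
require_env() {
  missing=""
  for name in DATA_PATH TOKENIZER_PATH VOCAB_SIZE TRAIN_SEQ_LEN \
              EVAL_STRIDE MAX_WALLCLOCK_SECONDS TRAIN_LOG_EVERY VAL_LOSS_EVERY; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || missing="$missing $name"
  done
  if [ -n "$missing" ]; then
    echo "unset:$missing"
  else
    echo "env: ok"
  fi
}

# Outside a configured pod this lists every unset variable.
require_env
```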

## GPU assumption

- first runnable path assumes `1xH100`
- this is for evidence collection, not final record-track timing

## Log policy

- keep terminal stdout in `runs/TPI-003/*.stdout.log`
- keep script-native logs in `logs/`
- summarize runtime and `val_bpb` back into notes after the run
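
The summarization step can be sketched as a small log extractor. The line format `... val_bpb <number> ...` is an assumption about how `train_gpt.py` logs, not something verified against the script; adjust the field name to match the real output.

```shell
# Sketch: pull the last reported val_bpb out of a stdout log, assuming the
# trainer prints lines containing "val_bpb <number>" (an unverified format).
last_val_bpb() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "val_bpb") v = $(i + 1) }
       END { if (v != "") print v }' "$1"
}

# Demo against a fabricated log; real runs would point at
# runs/TPI-003/*.stdout.log instead.
log=$(mktemp)
printf 'step 200 val_bpb 1.482\nstep 400 val_bpb 1.391\n' > "$log"
last_val_bpb "$log"   # prints 1.391
```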

## Next-turn first command

```bash
python3 data/cached_challenge_fineweb.py --variant sp1024 --train-shards 1
```

## Minimum required assets

- Python environment with `torch`, `datasets`, and `sentencepiece`
- accessible CUDA runtime for `train_gpt.py`
- published FineWeb cached shards or equivalent challenge-provided dataset path
- published tokenizer model path

## Minimum required capture

- commit SHA
- command line
- runtime notes
- whether eval path was reached
- final runtime summary
- final `val_bpb` summary
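
The first two capture items can be written mechanically at launch time. This is a minimal sketch; `capture_run_meta` is a hypothetical helper, and the SHA falls back to `unknown` when run outside a git checkout.

```shell
# Sketch: persist the commit SHA and command line for one run under an
# output directory (defaulting to runs/TPI-003). capture_run_meta is
# a hypothetical helper, not part of the repo.
capture_run_meta() {
  run_id="$1"
  cmdline="$2"
  outdir="${3:-runs/TPI-003}"
  mkdir -p "$outdir"
  git rev-parse HEAD > "$outdir/$run_id.commit.txt" 2>/dev/null \
    || echo unknown > "$outdir/$run_id.commit.txt"
  printf '%s\n' "$cmdline" > "$outdir/$run_id.cmdline.txt"
}

# Demo into a throwaway directory.
demo_dir=$(mktemp -d)
capture_run_meta tpi003_baseline_stride1024 \
  "EVAL_STRIDE=1024 torchrun --standalone --nproc_per_node=1 train_gpt.py" \
  "$demo_dir"
ls "$demo_dir"
```

Runtime notes and the `val_bpb` summary still have to be filled in after the run, but capturing SHA and command line up front means the evidence pair stays attributable even if the pod is torn down early.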

## Environment decision rule

Choose the environment path that can produce one real baseline/candidate pair with the least additional setup while remaining reproducible and public-safe.