# Competitive Entry

Base: PR #761 (Score-First TTT + Multi-Order N-gram Backoff, 0.9581 BPB)

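The base score is reported in bits per byte (BPB). As a minimal sketch, assuming the evaluation sums token-level negative log-likelihood in nats and divides by the UTF-8 byte count of the scored text, the conversion looks like this. The function name and arguments are illustrative and are not taken from `train_gpt.py`.

```python
import math


def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert a summed negative log-likelihood (nats) into bits per byte."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    # nats -> bits is a division by ln(2); then normalise by the byte count.
    return total_nll_nats / math.log(2) / total_bytes


# Toy check: 8 bytes of text costing 8 * ln(2) nats in total is 1 bit per byte.
example = bits_per_byte(8 * math.log(2), 8)
```

Normalising by bytes rather than tokens keeps scores comparable across tokenizers with different vocabulary sizes.
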
## Architecture
- 11L, 512d, GQA (8H/4KV), MLP 3x
- LeakyReLU(0.9)², XSA on all 11 layers
- Value Residual, Gated Attention, SmearGate
- BigramHash(4096), Partial RoPE (16/64), LN Scale
- EMA(0.997), warmdown=3000, int6 per-row + zstd-16
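
The "int6 per-row" item can be illustrated with a small sketch. It assumes symmetric quantization with one scale per weight row and a signed range of [-31, 31]; the actual scheme, bit packing, and the zstd step in `train_gpt.py` may differ. All function names here are hypothetical.

```python
def quantize_row_int6(row):
    """Quantize one row to integers in [-31, 31] with a single per-row scale."""
    max_abs = max((abs(x) for x in row), default=0.0)
    # Map the largest magnitude in the row to 31; an all-zero row keeps scale 1.0.
    scale = max_abs / 31 if max_abs > 0 else 1.0
    q = [max(-31, min(31, round(x / scale))) for x in row]
    return q, scale


def dequantize_row(q, scale):
    """Recover approximate float weights from the quantized row."""
    return [v * scale for v in q]


def quantize_matrix_int6(matrix):
    """Apply per-row quantization to every row of a 2-D list of weights."""
    return [quantize_row_int6(row) for row in matrix]


weights = [[0.5, -1.0, 0.25], [0.02, 0.01, -0.04]]
packed = quantize_matrix_int6(weights)
restored = [dequantize_row(q, s) for q, s in packed]
```

A per-row scale lets rows with small weights (the second row above) use the full 6-bit range instead of being crushed by the largest value in the whole matrix.
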

## Eval
- Sliding window stride=64
- Multi-order n-gram backoff (orders 2-7)
- Entropy-adaptive alpha
- Score-first TTT (4 epochs, AdamW lr=0.0001, freeze first 2 blocks)
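
The n-gram backoff and entropy-adaptive alpha items can be sketched as follows. This is an illustration under stated assumptions, not the code from PR #761: it assumes the cache backs off from order 7 down to order 2 until it finds a seen context, and that alpha grows linearly with the model's normalized entropy up to a hypothetical cap `alpha_max`. The class and function names are made up for the example.

```python
import math
from collections import defaultdict


class BackoffNgramCache:
    """Count-based n-gram cache that backs off from long to short contexts."""

    def __init__(self, min_order=2, max_order=7):
        self.min_order = min_order
        self.max_order = max_order
        # counts[n][context][next_token]; a context is the previous n-1 tokens.
        self.counts = {
            n: defaultdict(lambda: defaultdict(int))
            for n in range(min_order, max_order + 1)
        }

    def update(self, tokens):
        """Add every n-gram of each order found in an already-scored sequence."""
        for n in range(self.min_order, self.max_order + 1):
            for i in range(n - 1, len(tokens)):
                ctx = tuple(tokens[i - n + 1:i])
                self.counts[n][ctx][tokens[i]] += 1

    def predict(self, history):
        """Return (distribution, order) for the longest seen context, else (None, 0)."""
        for n in range(self.max_order, self.min_order - 1, -1):
            if len(history) < n - 1:
                continue
            nxt = self.counts[n].get(tuple(history[-(n - 1):]))
            if nxt:
                total = sum(nxt.values())
                return {t: c / total for t, c in nxt.items()}, n
        return None, 0


def entropy_bits(probs):
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


def adaptive_alpha(model_probs, vocab_size, alpha_max=0.5):
    # Assumption: trust the cache more when the model is uncertain.
    return alpha_max * entropy_bits(model_probs) / math.log2(vocab_size)


def mix(model_probs, cache, history, vocab_size):
    """Interpolate model and cache distributions with an entropy-dependent weight."""
    ngram_probs, _ = cache.predict(history)
    if ngram_probs is None:
        return dict(model_probs)
    a = adaptive_alpha(model_probs, vocab_size)
    keys = set(model_probs) | set(ngram_probs)
    return {t: (1 - a) * model_probs.get(t, 0.0) + a * ngram_probs.get(t, 0.0)
            for t in keys}


tokens = [1, 2, 3, 1, 2, 3, 1, 2]
cache = BackoffNgramCache()
cache.update(tokens)
ngram_probs, order = cache.predict(tokens)
uniform = {t: 0.25 for t in range(4)}
mixed = mix(uniform, cache, tokens, vocab_size=4)
```

For the scores to stay valid, a cache like this should only ever be updated with tokens that have already been scored, which matches the score-first ordering used for TTT.
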

## Run
```bash
SEED=1337 NGRAM_CACHE=1 TTT_ENABLED=1 \
torchrun --standalone --nproc_per_node=8 train_gpt.py
```
#!/bin/bash
# pipefail makes `set -e` catch failures in commands piped through `tail`
set -eo pipefail
echo "=== COMPETITIVE ENTRY DEPLOY ==="
echo "base: PR #761 (0.9581 BPB, 3-seed validated)"
echo "start: $(date)"

cd /workspace
git clone https://github.com/openai/parameter-golf.git pg 2>&1 | tail -1
cd pg
pip install sentencepiece flash-attn zstandard huggingface_hub -q 2>&1 | tail -1
python3 data/cached_challenge_fineweb.py --variant sp1024 2>&1 | tail -3

# use PR #761 train_gpt.py (fetched from our fork)
git clone -b ppmd-submission https://github.com/pablinga19/parameter-golf.git /workspace/ours 2>&1 | tail -1
cp /workspace/ours/competitive_entry/train_gpt.py .

echo "--- TRAINING (seed 1337) ---"
SEED=1337 NGRAM_CACHE=1 torchrun --standalone --nproc_per_node=8 train_gpt.py 2>&1 | tail -15

echo "--- TRAINING (seed 42) ---"
SEED=42 NGRAM_CACHE=1 torchrun --standalone --nproc_per_node=8 train_gpt.py 2>&1 | tail -15

echo "--- TRAINING (seed 7) ---"
SEED=7 NGRAM_CACHE=1 torchrun --standalone --nproc_per_node=8 train_gpt.py 2>&1 | tail -15

echo "=== DONE ==="
echo "end: $(date)"