Record: SP8192 + QK-Gain 5 + Legal Score-First TTT — val_bpb 1.08279 (3-seed mean) #1413
Open
dexhunter wants to merge 1 commit into openai:main from
…(3-seed mean)

On PR openai#1394 (@clarkkev): added single-knob QK_GAIN_INIT=5.0 and a legal score-first TTT eval pass (TTT_LR=0.005, epochs=3, freeze=0) on top of the clean sp8192 base.

Three independent seeds (0, 42, 1234) on 8xH100 SXM, all fitting 16MB with 7-11K margin.

Per-seed (post-TTT):
- seed 0   : 1.08210 (val_loss 2.79517)
- seed 42  : 1.08315 (val_loss 2.79788)
- seed 1234: 1.08314 (val_loss 2.79785)
- mean     : 1.08279 (2.79697 nats per token)

Improvement vs PR openai#1394 (1.08563 mean): -0.00284 bpb = -0.00731 nats/token, clearing the 0.005 nats record threshold by 0.00231 nats per seed.

No SLOT, no pre-quant TTT, no ETLB, no n-gram cache, no tokenizer change. Score-first TTT matches PR openai#549 precedent: every chunk scored under inference_mode() before any parameter update.
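The score-first ordering described in the commit message can be illustrated with a minimal sketch. This is not the submission's train_gpt.py: a toy softmax-over-logits unigram model in plain Python stands in for the network, the names `score_chunk`, `adapt_on_chunk` and `score_first_ttt` are hypothetical, and the learning rate is chosen only so the toy adapts visibly. In the real eval pass the scoring step would run under `torch.inference_mode()`.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def score_chunk(logits, chunk):
    """Sum of -log p(token) over the chunk. Reads the parameters, never writes them."""
    probs = softmax(logits)
    return sum(-math.log(probs[t]) for t in chunk)

def adapt_on_chunk(logits, chunk, lr, epochs):
    """In-place SGD on the mean NLL of a chunk that has already been scored."""
    for _ in range(epochs):
        probs = softmax(logits)
        grad = list(probs)  # d(mean NLL)/d(logit_i) = p_i - freq_i
        for t in chunk:
            grad[t] -= 1.0 / len(chunk)
        for i, g in enumerate(grad):
            logits[i] -= lr * g

def score_first_ttt(chunks, vocab_size, lr=0.5, epochs=3):
    logits = [0.0] * vocab_size  # toy "model": one logit per token
    loss_sum, n_tokens = 0.0, 0
    for chunk in chunks:
        # 1) Score the chunk with the parameters as they are right now.
        loss_sum += score_chunk(logits, chunk)
        n_tokens += len(chunk)
        # 2) Only after its loss is accumulated may the chunk be trained on.
        adapt_on_chunk(logits, chunk, lr, epochs)
    return loss_sum / n_tokens

chunks = [[0, 0, 1, 0], [0, 0, 0, 2], [0, 1, 0, 0]]
ttt_loss = score_first_ttt(chunks, vocab_size=3)
static_loss = score_first_ttt(chunks, vocab_size=3, lr=0.0)  # no adaptation
print(f"static {static_loss:.4f}  score-first TTT {ttt_loss:.4f}")
```

Because each chunk contributes to `loss_sum` before any update that has seen it, the reported loss never benefits from training on the tokens being scored; later chunks benefit only from earlier ones.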
Summary
On top of PR #1394 (@clarkkev), the current clean sp8192 benchmark, this submission adds a single-knob QK_GAIN_INIT=5.0 and a legal score-first TTT eval pass (TTT_LR=0.005, 3 epochs, freeze=0). Every chunk is scored under inference_mode() before any parameter update.

Hardware: 8×H100 80GB SXM, PyTorch 2.9.1+cu128. See the README in the new folder for the full two-table results + diagnostics layout per repo SUBMISSION_GUIDE.md.

Per-seed (post-TTT)
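As a cross-check, the headline numbers can be recomputed from the per-seed figures quoted in the commit message. The sketch below assumes that the value passed to `--merged-sota-nats` (2.80428) is the PR #1394 baseline in nats per token; the variable names are illustrative. The per-seed inputs are already rounded to five decimals, so recomputed values may differ from the quoted ones in the last digit.

```python
# Figures copied from this PR; seed -> (post-TTT val_bpb, val_loss in nats/token).
post_ttt = {
    0: (1.08210, 2.79517),
    42: (1.08315, 2.79788),
    1234: (1.08314, 2.79785),
}
BASE_BPB = 1.08563            # PR #1394 mean, as quoted
BASE_NATS = 2.80428           # assumed baseline, from --merged-sota-nats
RECORD_THRESHOLD_NATS = 0.005

mean_bpb = sum(bpb for bpb, _ in post_ttt.values()) / len(post_ttt)
mean_nats = sum(nats for _, nats in post_ttt.values()) / len(post_ttt)

delta_bpb = mean_bpb - BASE_BPB      # negative means improvement
delta_nats = mean_nats - BASE_NATS
margin = -delta_nats - RECORD_THRESHOLD_NATS

print(f"mean: {mean_bpb:.5f} bpb, {mean_nats:.5f} nats/token")
print(f"delta: {delta_bpb:+.5f} bpb, {delta_nats:+.5f} nats/token")
print(f"margin over threshold: {margin:.5f} nats")
```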
Lineage / change from PR #1394
(1) QK_GAIN_INIT raised from 4.0 → 5.0; (2) added a legal score-first TTT sliding pass (LR=0.005, 3 epochs, freeze_blocks=0) as an additional eval mode.

Compliance (Issue #1017 four conditions)
Every chunk is scored under torch.inference_mode() BEFORE any parameter update. Training on a chunk only happens AFTER its scoring has been accumulated into loss_sum. Matches the pattern of PR #549 (Record: LeakyReLU² + Legal Score-First TTT + Parallel Muon — val_bpb 1.1194 (3-seed mean)).

Additional flags: --frontier-bpp 1.08563 --merged-sota-nats 2.80428.

Reproduction
Credits
Files
Only adds records/track_10min_16mb/2026-04-06_SP8192_QK5_LegalTTT_1.0828/ with README, submission.json, train_gpt.py, and 3 seed logs.