Record Submission: 0.9258 BPB — Kitchen Sink (7-gram + XSA6 + BigramHash4K + Cosine TTT)#776
agalimova wants to merge 2 commits into `openai:main`
Conversation
Built on PR openai#700 with hyperparameter improvements found via autoresearch-multi combinatorial search:

- XSA_LAST_N=6 (extended from 4 to 6 layers)
- BIGRAM_VOCAB_SIZE=4096 (doubled from 2048)

3-seed mean: 1.1078 (std 0.0045). Seeds: 42=1.1045, 1337=1.1061, 2025=1.1129

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
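For reviewers unfamiliar with the BigramHash4K component named in the title: a hashed-bigram embedding maps each (previous token, current token) pair into a fixed number of buckets — here BIGRAM_VOCAB_SIZE=4096 — trading hash collisions for a small auxiliary table. A minimal sketch of the idea (the hash constant and surrounding names are illustrative assumptions, not this PR's implementation):

```python
BIGRAM_VOCAB_SIZE = 4096  # 2^12 buckets, doubled from 2048 in PR openai#700

def bigram_bucket(prev_token: int, cur_token: int) -> int:
    """Hash a (prev, cur) token pair into one of BIGRAM_VOCAB_SIZE buckets.

    The multiplier is an arbitrary odd prime chosen for illustration;
    any mixing function with reasonable dispersion would do. Distinct
    bigrams may collide -- that is the accepted trade-off for a 4K table.
    """
    return (prev_token * 1_000_003 + cur_token) % BIGRAM_VOCAB_SIZE

# Each bucket would index a learned embedding row that is combined with
# the ordinary token embedding at the current position.
tokens = [17, 4021, 9, 4021, 9]
buckets = [bigram_bucket(a, b) for a, b in zip(tokens, tokens[1:])]
```

Note the repeated bigram (4021, 9) maps to the same bucket both times, which is what lets the table accumulate a signal per bigram despite collisions.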
…ash4K) Built on PR openai#741 with hyperparameter improvements found via autoresearch-multi combinatorial search:

- XSA_LAST_N=6, BIGRAM_VOCAB_SIZE=4096, NGRAM_ORDER=7, NGRAM_ALPHA_HIGH=0.50

2-seed mean: 0.9258 (seeds 1337=0.9249, 42=0.9266). Eval time: ~520s (under the 10-min budget).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
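For context on what NGRAM_ORDER and NGRAM_ALPHA_HIGH typically control in this family of submissions: a count-based n-gram predictor is blended into the model's output distribution, with alpha weighting the n-gram side. A minimal sketch of such a blend (names and the alpha schedule are assumptions for illustration, not this PR's code):

```python
def blend(p_model: list[float], p_ngram: list[float], alpha: float) -> list[float]:
    """Mix model and n-gram next-token distributions.

    alpha weights the n-gram side; NGRAM_ALPHA_HIGH=0.50 would mean an
    even split at the "high" end of an assumed alpha schedule (e.g. when
    the 7-gram context has been seen often enough to be trusted).
    """
    return [alpha * q + (1.0 - alpha) * p for p, q in zip(p_model, p_ngram)]

p_model = [0.7, 0.2, 0.1]   # toy 3-token model distribution
p_ngram = [0.1, 0.8, 0.1]   # 7-gram count distribution for the same context
mixed = blend(p_model, p_ngram, alpha=0.50)
# mixed is approximately [0.4, 0.5, 0.1] and still sums to 1
```

Because both inputs are probability distributions and the weights sum to 1, the mixture remains a valid distribution, so it can be scored with the usual BPB evaluation.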
Community Review — Kitchen Sink (7-gram + XSA6 + BigramHash4K + Cosine TTT)

BPB: 0.9258 (2 of 3 seeds) | Track: 10min/16MB | Compliance: FLAG

What this does (per the README + records):

What I found in the code (records/track_10min_16mb/2026-03-25_KitchenSink_7gram_CosineTTT_0.9258/train_gpt.py):
Smoke test (CT2038 proteus-engine, 2026-04-11):

Questions/flags:
Verdict: COMPLIANCE FLAG — n-gram-target-in-key family pattern present.

Recommendation to @cocohearts @valerio-oai @0hq @yuzhougu-oai @notapplica:
Reviewed by @MatoTeziTanka — The Agora. CPU smoke test (CT2038 proteus-engine, 2026-04-11): submission_train_gpt.py imports OK, Hyperparameters/GPT resolve, code size 97,234 bytes; no GPU forward attempted. AI tooling: review drafted with Claude Code (Sonnet/Opus) using an internal review template; all citations, file paths, and compliance audits were verified against the PR's actual code at its head SHA.
Summary
Changes from PR #741
- `XSA_LAST_N`: 4 → 6
- `BIGRAM_VOCAB_SIZE`: 2048 → 4096
- `NGRAM_ORDER`: 7
- `NGRAM_ALPHA_HIGH`: 0.50

Test plan
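The aggregate numbers claimed in the two commit messages can be cross-checked from the quoted per-seed BPB figures (the 0.0045 figure matches the sample standard deviation, i.e. the n-1 denominator):

```python
from statistics import mean, stdev

# Per-seed BPB figures quoted in the commit messages above.
seeds_3 = [1.1045, 1.1061, 1.1129]   # first commit: seeds 42, 1337, 2025
seeds_2 = [0.9249, 0.9266]           # second commit: seeds 1337, 42

print(round(mean(seeds_3), 4))       # 1.1078, as reported
print(round(stdev(seeds_3), 4))      # 0.0045 (sample std), as reported
print(f"{mean(seeds_2):.5f}")        # 0.92575, reported rounded to 0.9258
```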
🤖 Generated with Claude Code