
Record: 11L EMA + BigramHash(12288) + Mixed Int5 + FA3 (1.1354) #466

Closed
simonbissonnette wants to merge 1 commit into openai:main from simonbissonnette:submission/11l-ema-bigram12288-mixed-int5-fa3

Conversation

@simonbissonnette

Summary

This PR adds a main-track submission attempt for the Parameter Golf challenge based on an 11-layer, 512-dim model with:

  • EMA (0.997)
  • BigramHash (12288, dim 128)
  • mixed low-bit quantization
  • stride-64 sliding evaluation
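For readers who haven't seen the hashed-bigram trick, here is a minimal sketch of what a BigramHash(12288, dim 128) lookup could look like. The function name, the hash choice, and the comment about how the row is used are illustrative assumptions, not the PR's actual `train_gpt.py` code:

```python
import hashlib

NUM_BUCKETS = 12288   # table size used in this submission
BIGRAM_DIM = 128      # per-bucket embedding width

def bigram_bucket(prev_token: int, token: int, num_buckets: int = NUM_BUCKETS) -> int:
    """Hash a (previous, current) token-id pair into a fixed bucket.
    A stable hash (not Python's salted hash()) keeps runs reproducible
    across processes and seeds."""
    key = f"{prev_token}:{token}".encode()
    digest = hashlib.blake2b(key, digest_size=8).digest()
    return int.from_bytes(digest, "little") % num_buckets

# At each position t, the model would add the row
# bigram_table[bigram_bucket(ids[t-1], ids[t])] of a learned
# [NUM_BUCKETS, BIGRAM_DIM] matrix to the token embedding.
```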

3-Seed Results

  • Seed 42: 1.13593695 val_bpb, 15,967,704 bytes total
  • Seed 471: 1.13389376 val_bpb, 15,663,365 bytes total
  • Seed 777: 1.13626774 val_bpb, 15,660,237 bytes total
  • Mean: 1.135366
  • Std: 0.001286
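The reported mean and std can be reproduced directly from the three per-seed numbers; the std matches the sample (n-1) estimator:

```python
from statistics import mean, stdev  # stdev = sample (n-1) standard deviation

seed_bpb = {42: 1.13593695, 471: 1.13389376, 777: 1.13626774}

m = mean(seed_bpb.values())
s = stdev(seed_bpb.values())

print(round(m, 6))  # 1.135366
print(round(s, 6))  # 0.001286
```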

Notes

  • All three runs use the same train_gpt.py snapshot and the same hyperparameter recipe, differing only by seed.
  • This draft explicitly discloses that the current FA3 path uses kernels-community/flash-attn3, which fetches the FA3 kernel package at runtime.
  • No external model weights, prompts, or user code are fetched; the concern is only the runtime acquisition of the FA3 kernel package itself.
  • All three archived logs end with the exact final metric line.
  • I understand this may not beat the current open-PR SOTA, but I still wanted to submit a clean, reproducible main-track attempt.
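The disclosed runtime fetch can be wrapped defensively. This is a hedged sketch, not the PR's actual code: the `load_fa3` helper, its fallback behavior, and the `importer` parameter (exposed only so the fallback is testable) are all assumptions around the `kernels.get_kernel` loading pattern the disclosure describes:

```python
def load_fa3(repo: str = "kernels-community/flash-attn3", importer=__import__):
    """Best-effort runtime fetch of the FA3 kernel package.

    Returns None when the `kernels` package is absent or the fetch fails
    (e.g. an offline eval box); the caller would then fall back to the
    standard torch SDPA attention path instead of FA3."""
    try:
        kernels = importer("kernels")      # the hub-kernel loader package
        return kernels.get_kernel(repo)    # may download on first use
    except Exception:
        return None
```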

@mohosy

mohosy commented Mar 23, 2026

12288 buckets is a nice bump over 10240, did you ablate that or just go bigger for the hell of it lol. also the fa3 kernel fetch disclosure is appreciated thats good practice

@simonbissonnette
Author

12288 buckets is a nice bump over 10240, did you ablate that or just go bigger for the hell of it lol. also the fa3 kernel fetch disclosure is appreciated thats good practice

Thanks!

After the initial quantization work I ended up with a bit of spare artifact budget, so I spent part of it on a larger BigramHash table to improve BPB.

12288 ended up being the best practical tradeoff for this submission after a few trial-and-error runs. It gave a real gain over the smaller setting while still keeping all 3 submission seeds under the 16 MB cap.
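Back-of-envelope on why the larger table still fits the cap: assuming the bigram table is one of the 5-bit-quantized artifacts (an assumption based on "Mixed Int5" in the title, not something stated in the thread), 12288 x 128 weights cost well under 1 MiB:

```python
buckets, dim = 12288, 128
params = buckets * dim              # bigram-table weight count

fp16_bytes = params * 2             # unquantized fp16 footprint
int5_bytes = params * 5 // 8        # packed 5-bit footprint

print(params)       # 1572864
print(fp16_bytes)   # 3145728  (3.0 MiB)
print(int5_bytes)   # 983040   (~0.94 MiB)
```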

@MatoTeziTanka

MatoTeziTanka commented Apr 11, 2026

[RETRACTED 2026-04-11] — This IMPORT_FAIL was a false positive. Root cause: sibling module exists in same records/ folder; runner sys.path bug. Your code is not broken. See correction below: #466 (comment)


Community Review — Record: 11L EMA + BigramHash(12288) + Mixed Int5 + FA3 (1.1354)

Compliance: NEEDS AUTHOR ACTION — train_gpt.py fails to import on CT2038 (Python 3.10 / torch 2.10.0+cpu)

What I found: The CPU smoke test on CT2038 (proteus-engine, 128 GB RAM, Triton 3.6.0, flash_attn stub, cutlass_evt_fusion stub) failed at the import step with:

ModuleNotFoundError: No module named 'env_utils'

A few of the common patterns I've seen for this class of error in the 2026-04-11 sweep:

* **PEP 701 f-string nesting** — e.g. `log(f"  {cat}: {", ".join(...)}")` is valid Python 3.12+ but invalid Python 3.10 because the inner `", "` re-enters the outer double-quote context. One-character fix: `', '` instead of `", "`. See PR #1541 / #1523 for reference.

* **Missing flash_attn variants** — e.g. `from flash_attn_interface import flash_attn_varlen_func` when the wrapper script only stubs `flash_attn_func`; the eval image / CPU preflight path needs a guarded import.

* **Local compiled extension** — e.g. `import cutlass_evt_fusion` from a `records/*/cutlass_evt_fusion/` subfolder that isn't on the import path at smoke time.

* **Actual syntax error** — typo, missing bracket, etc.

Recommendation: Could you run python3 -c "import py_compile; py_compile.compile('train_gpt.py')" on your records-folder train_gpt.py under Python 3.10 specifically? The eval image is Python 3.10 per Issue #17 / the README, so any parse error on 3.10 blocks the submission at import time before any of the scored-eval logic runs.

Once the parse/import issue is fixed, I'll re-run the compliance audit through the normal pipeline. No other flags identified yet because the audit halts at the import step.
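The 3.10-vs-3.12 parse failures behind that recommendation can be reproduced on any interpreter, without the eval image, by compiling the two quoting styles from strings:

```python
import sys

# Same line, two quoting styles. Nested *double* quotes inside an f-string
# replacement field are only legal from Python 3.12 (PEP 701).
GOOD = "line = f\"cats: {', '.join(cats)}\""   # parses on 3.10 and 3.12
BAD = 'line = f"cats: {", ".join(cats)}"'      # SyntaxError on 3.10/3.11

compile(GOOD, "<good>", "exec")  # never raises at parse time

try:
    compile(BAD, "<bad>", "exec")
    bad_parses = True
except SyntaxError:
    bad_parses = False

# True exactly when the interpreter is 3.12+:
print(bad_parses == (sys.version_info >= (3, 12)))
```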


Reviewed by @MatoTeziTanka (The Agora). CPU smoke test (CT2038 proteus-engine, 2026-04-11): IMPORT_FAIL — ModuleNotFoundError: No module named 'env_utils'. Classification via classify_prs.py AST-based classifier; full compliance audit deferred until the import issue is resolved. Auto-drafted from a template and spot-checked before posting.

@simonbissonnette
Author

Community Review — Record: 11L EMA + BigramHash(12288) + Mixed Int5 + FA3 (1.1354)

Compliance: NEEDS AUTHOR ACTION — train_gpt.py fails to import on CT2038 (Python 3.10 / torch 2.10.0+cpu)

What I found: The CPU smoke test on CT2038 (proteus-engine, 128 GB RAM, Triton 3.6.0, flash_attn stub, cutlass_evt_fusion stub) failed at the import step with:

IMPORT_FAIL error=ModuleNotFoundError("No module named 'env_utils'")

A few of the common patterns I've seen for this class of error in the 2026-04-11 sweep:

* **PEP 701 f-string nesting** — e.g. `log(f"  {cat}: {", ".join(...)}")` is valid Python 3.12+ but invalid Python 3.10 because the inner `", "` re-enters the outer double-quote context. One-character fix: `', '` instead of `", "`. See PR [Record: SP8192 + Improved Parallel Residuals + Muon 0.97 + LR 0.03 + Legal TTT — val_bpb 1.07785 (3-seed mean) #1541](https://github.com/openai/parameter-golf/pull/1541) / [Record: SP8192 + Triple Recurrence + Banking + Fused MLP + Muon 0.97 — val_bpb 1.0778 (3-seed mean) #1523](https://github.com/openai/parameter-golf/pull/1523) for reference.

* **Missing flash_attn variants** — e.g. `from flash_attn_interface import flash_attn_varlen_func` when the wrapper script only stubs `flash_attn_func`. Not a PR defect on H100s, but the eval image / CPU preflight path needs a guarded import.

* **Local compiled extension** — e.g. `import cutlass_evt_fusion` from a `records/*/cutlass_evt_fusion/` subfolder that isn't on the import path at smoke time. Usually an import-order issue inside the script.

* **Actual syntax error** — typo, missing bracket, etc.

Recommendation: Could you run python3 -c "import py_compile; py_compile.compile('train_gpt.py')" on your records-folder train_gpt.py under Python 3.10 specifically? The eval image is Python 3.10 per Issue #17 / the README, so any parse error on 3.10 blocks the submission at import time before any of the scored-eval logic runs.

Once the parse/import issue is fixed, I'll re-run the compliance audit through the normal pipeline. No other flags identified yet because the audit halts at the import step.

Reviewed by @MatoTeziTanka (The Agora). CPU smoke test (CT2038 proteus-engine, 2026-04-11): IMPORT_FAIL error=ModuleNotFoundError("No module named 'env_utils'"). Classification via classify_prs.py AST-based classifier; full compliance audit deferred until the import issue is resolved. Auto-drafted from a template and spot-checked before posting.

Hey, thanks for the review!
I used env_utils to help keep track of my parameters while iterating, and I forgot to remove it before submitting the PR.
I won’t be working on that version anymore since I’ve moved on to something new, but I’ll make sure to fix this issue before pushing my next PR.
Thanks again for pointing it out!

@MatoTeziTanka

Retraction — this IMPORT_FAIL was a bug in my smoke runner

Sorry @simonbissonnette, this one's on me. I re-audited the IMPORT_FAIL I posted above and it was a false positive — the fault is in how my CPU smoke runner set up sys.path, not in your code.

What happened:

The runner imported your records/track_10min_16mb/2026-03-22_11L_EMA_Bigram12288_MixedInt5_FA3/train_gpt.py without putting the script's own folder on sys.path, so when your file did from env_utils import ... it couldn't resolve the sibling env_utils.py that lives in the same 2026-03-22_11L_EMA_Bigram12288_MixedInt5_FA3/ directory. The error I reported — ModuleNotFoundError: No module named 'env_utils' — looked like a missing file, but I re-checked the head SHA bdd21c4 and records/track_10min_16mb/2026-03-22_11L_EMA_Bigram12288_MixedInt5_FA3/env_utils.py is right there, committed to the PR, next to train_gpt.py.

Verified at head bdd21c4:

records/track_10min_16mb/2026-03-22_11L_EMA_Bigram12288_MixedInt5_FA3/env_utils.py   ← sibling module, exists
records/track_10min_16mb/2026-03-22_11L_EMA_Bigram12288_MixedInt5_FA3/train_gpt.py   ← imports it

On the real eval image (Python 3.10, records/*/ as the working dir), this import resolves correctly because the records folder ends up on sys.path via the standard cwd-driven import or via the eval harness's per-record entry point.
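For anyone else writing a smoke runner, the fix is to put the record's folder on sys.path before importing. A sketch of a corrected runner step (the `import_record` helper name is hypothetical, not part of the eval harness):

```python
import importlib.util
import pathlib
import sys

def import_record(script_path: str):
    """Import a record's train_gpt.py so that sibling modules such as
    env_utils.py resolve: the script's own folder must be on sys.path
    before the module body executes."""
    path = pathlib.Path(script_path).resolve()
    sys.path.insert(0, str(path.parent))   # the step my runner was missing
    try:
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)    # sibling imports now resolve
        return module
    finally:
        sys.path.remove(str(path.parent))  # keep the runner's path clean
```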

Your PR is not broken by this error. I'm retracting the IMPORT_FAIL classification. I'll re-queue the full compliance audit (BPB check, n-gram / TTT / SLOT flags, etc.) on the current head and post findings separately.

Again — sorry for the noise. These community reviews only work if I actually read what I'm reviewing, and I didn't in this case.

