Record: 11L + order-adaptive 11-gram (mean val_bpb=0.8881) by hypery11 · Pull Request #795 · openai/parameter-golf

hypery11 · 2026-03-26T01:31:55Z

Results

Seed	val_bpb
42	0.8883
1337	0.8886
2024	0.8875
Mean	0.8881
Std	0.0006
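
As a quick arithmetic check, the reported mean and standard deviation can be reproduced from the three per-seed values. This sketch assumes the reported Std is the sample standard deviation (n-1 denominator). The source does not say which estimator was used, but the population estimator would round to 0.0005 rather than 0.0006.

```python
import statistics

# Per-seed val_bpb values copied from the results table.
val_bpb = {42: 0.8883, 1337: 0.8886, 2024: 0.8875}
scores = list(val_bpb.values())

mean = statistics.mean(scores)
sample_std = statistics.stdev(scores)       # n-1 denominator (assumed estimator)
population_std = statistics.pstdev(scores)  # n denominator, shown for comparison

print(f"mean={mean:.4f} sample_std={sample_std:.4f} population_std={population_std:.4f}")
# prints: mean=0.8881 sample_std=0.0006 population_std=0.0005
```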

Artifact: 13.99 MB
Train: 600s on 8xH100 SXM
Eval: ~160s

Method

An 11-layer XSA-all transformer with order-adaptive, entropy-gated n-gram backoff (orders 2-11). Higher-order matches use a lower entropy threshold. Quantization is GPTQ-lite int6 with zstd-22 compression. Score-first, deterministic, no TTT.
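
A minimal sketch of what an order-adaptive entropy gate could look like, not the code in train_gpt.py. It assumes the n-gram prediction is used only when the model's predictive entropy is at or above a per-order threshold, and that the threshold falls linearly from order 2 to order 11. The function names, the linear schedule, and the `hi`/`lo` values are all hypothetical.

```python
import math

MIN_ORDER, MAX_ORDER = 2, 11  # n-gram orders named in the method description


def entropy_bits(probs):
    """Shannon entropy of a probability vector, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def entropy_threshold(order, hi=3.0, lo=1.0):
    """Hypothetical schedule: threshold drops linearly from hi (order 2) to lo (order 11)."""
    frac = (order - MIN_ORDER) / (MAX_ORDER - MIN_ORDER)
    return hi + frac * (lo - hi)


def use_ngram(model_probs, matched_order):
    """Gate: trust the n-gram match only if the model is uncertain enough for this order."""
    return entropy_bits(model_probs) >= entropy_threshold(matched_order)


# A model distribution with exactly 2 bits of entropy.
probs = [0.25, 0.25, 0.25, 0.25]
print(use_ngram(probs, matched_order=2))   # False: 2.0 bits is below the order-2 threshold of 3.0
print(use_ngram(probs, matched_order=11))  # True: 2.0 bits clears the order-11 threshold of 1.0
```

Under these assumptions, a long match is trusted even when the model is fairly confident, while a short match is consulted only when the model is uncertain.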

8xH100 SXM, train <=600s
Eval <=600s (~160s)
Artifact <=16MB (13.99MB)
3-seed validation (std 0.0006)

Seeds: 0.8883 / 0.8886 / 0.8875 (std 0.0006). Order-adaptive entropy gating on 2-11 gram backoff. 13.99MB artifact. Train 600s, eval ~160s.

MatoTeziTanka · 2026-04-11T20:03:58Z

Community Review — Record: 11L + order-adaptive 11-gram (mean val_bpb=0.8881)

BPB: 0.8881 | Compliance: FLAG — hashed n-gram cache with target-in-key (PR #779 family pattern)

What I found in the code (head SHA aa7cb3c537d4, file records/track_10min_16mb/2026-03-26_11L_order_adaptive_11gram/train_gpt.py):

The n-gram lookup key at line 1154 is constructed by XOR-ing the target token into the hash:

line 1154: full_key = <hash> ^ (tgt_np * ng_primes[...]) & mask

This matches the full_key = ((ctx_hash ^ (target * primes[k])) & mask) construction that @valerio-oai ruled disallowed on PR #779 (comment 4145781641, 2026-03-27). Per the mechanism explanation, hashing the target token into the lookup key only reweights the correct token — in the hash-collision limit this drives P(correct) → 1 regardless of the data, which inflates the reported BPB without producing real compression.

Per Issue #1017 condition 1, p_t may depend only on the artifact and x_1...x_{t-1}. Because the lookup key at line 1154 is a function of the target token, the count read at scoring position t depends on x_t itself — which is the core violation the #779 ruling targets.
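
One way to see why a target-in-key lookup is not a valid predictor is to ask how much probability mass it implies across the whole vocabulary. The toy sketch below is not the PR's code: the hash, `PRIME`, the table size, and the count ratio used as a probability are all hypothetical. It builds a deliberately tiny hashed table with the `(ctx_hash ^ (target * prime)) & mask` key shape and sums `full_count / ctx_count` over every candidate token. A normalized distribution would sum to 1. Here collisions push the sum well above 1, so reading only the true target's bucket gives the target more mass than any real distribution could.

```python
import random

VOCAB = 16          # toy vocabulary
BUCKETS = 8         # deliberately tiny table so collisions dominate
MASK = BUCKETS - 1
PRIME = 1_000_003   # hypothetical odd multiplier


def ctx_hash(ctx):
    """Hypothetical rolling hash over the context tokens."""
    h = 0
    for tok in ctx:
        h = (h * 31 + tok + 1) & 0xFFFFFFFF
    return h


def build_tables(tokens, order):
    """Count contexts and (context, target) pairs in two hashed tables."""
    ctx_counts = [0] * BUCKETS
    full_counts = [0] * BUCKETS
    for i in range(order - 1, len(tokens)):
        h = ctx_hash(tokens[i - order + 1:i])
        ctx_counts[h & MASK] += 1
        # Target-in-key construction: the target token is XOR-ed into the key.
        full_counts[(h ^ (tokens[i] * PRIME)) & MASK] += 1
    return ctx_counts, full_counts


def implied_mass(ctx, ctx_counts, full_counts):
    """Sum of full_count / ctx_count over every candidate target token."""
    h = ctx_hash(ctx)
    denom = ctx_counts[h & MASK]
    return sum(full_counts[(h ^ (v * PRIME)) & MASK] for v in range(VOCAB)) / denom


rng = random.Random(0)
tokens = [rng.randrange(VOCAB) for _ in range(2000)]
ctx_counts, full_counts = build_tables(tokens, order=3)
mass = implied_mass(tokens[:2], ctx_counts, full_counts)
print(f"implied total mass over the vocabulary: {mass:.2f}")  # well above 1.0
```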

Cluster context: this same structural pattern has been closed on 15+ PRs under the #779 ruling as of 2026-04-11 (#779 itself, #770, #798, #808, #825, #786, #797, #909, #940, #761, #776, #788, #774, #778, #715, #758, #702 upstream, #1488). The base neural model is unaffected by this flag — in every case where the authors resubmitted without the n-gram cache, the base val_bpb has been in the ~1.10-1.15 range (standard for the SP1024 11L class).

CPU smoke test (CT2038 proteus-engine, 2026-04-11): import OK in 0.05s, dim=512, layers=11, vocab=1024, code=88116 B, SMOKE_TEST_PASS

Verdict: COMPLIANCE FLAG — target-in-key hashed n-gram cache, same family as PR #779.

Recommendation to @cocohearts @valerio-oai @0hq @yuzhougu-oai @notapplica: CLOSE under the same ruling as the rest of the family-bug cluster. A context-only resubmission (drop the target from the lookup key and use a full-vocabulary reweighting from a single context row, per @valerio-oai's suggested legal path on #779) would be welcomed.
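
The context-only path mentioned above can be sketched as follows. This is an illustration under stated assumptions, not @valerio-oai's reference design or the PR's code: the class name, hash, table size, mixing weight `alpha`, and the uniform stand-in for the neural model are all hypothetical. The key is computed from the context alone, the row holds counts for the full vocabulary, and each position is scored before the cache is updated with its token.

```python
import math

VOCAB = 8            # toy vocabulary; the submission reports vocab=1024
BUCKETS = 1 << 10    # hypothetical table size
MASK = BUCKETS - 1


def context_key(ctx):
    """Hash the context tokens only; the target never enters the key."""
    h = 0
    for tok in ctx:
        h = (h * 1_000_003 + tok + 1) & 0xFFFFFFFF
    return h & MASK


class ContextOnlyCache:
    def __init__(self):
        self.rows = {}  # bucket -> counts over the whole vocabulary

    def reweight(self, model_probs, ctx, alpha=0.3):
        """Mix the model distribution with the normalized count row for this context."""
        row = self.rows.get(context_key(ctx))
        if row is None:
            return list(model_probs)
        total = sum(row)
        return [(1 - alpha) * p + alpha * c / total for p, c in zip(model_probs, row)]

    def update(self, ctx, target):
        row = self.rows.setdefault(context_key(ctx), [0] * VOCAB)
        row[target] += 1


def score_stream(tokens, order=3, alpha=0.3):
    """Total bits under a score-first loop: predict x_t, score it, then update."""
    cache = ContextOnlyCache()
    uniform = [1.0 / VOCAB] * VOCAB  # stand-in for the neural model's output
    bits = 0.0
    for t in range(order - 1, len(tokens)):
        ctx = tokens[t - order + 1:t]
        probs = cache.reweight(uniform, ctx, alpha)  # uses x_1..x_{t-1} only
        bits -= math.log2(probs[tokens[t]])
        cache.update(ctx, tokens[t])                 # cache sees x_t after scoring
    return bits


tokens = [1, 2, 3] * 20
print(score_stream(tokens), (len(tokens) - 2) * math.log2(VOCAB))
```

Because every candidate token is reweighted from the same row, the mixed distribution still sums to 1, and any gain over the uniform baseline comes from repeated structure in the stream rather than from knowledge of the target.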

Reviewed by @MatoTeziTanka (The Agora). Classification via deterministic AST-based classify_prs.py (pattern bank derived from ~65 manually-reviewed PRs earlier in the 2026-04-11 sweep). This review was auto-drafted from a template and spot-checked before posting; if the template misread your code, please call it out so I can iterate the classifier.

Record: 11L + order-adaptive 11-gram (mean val_bpb=0.8881, 3 seeds)

aa7cb3c

notapplica mentioned this pull request Mar 26, 2026

Parameter Golf Formerly Live AI Commentary ⛳ + Analysis / Ideas | every 10 minutes. Now disabled #140

Closed

MatoTeziTanka mentioned this pull request Mar 26, 2026

PROTEUS+STYX — val_bpb 0.8495 (3-seed mean) — LeakyReLU(0.9)² + 5-gram Eval Cache #769

Closed

10 tasks

sofiabod mentioned this pull request Mar 26, 2026

Record: Order-Adaptive 9-gram Backoff + Distributed Prefill — val_bpb 0.4405 (3-seed mean) #890

Open

hypery11 wants to merge 1 commit into openai:main from hypery11:submission/2026-03-26_champion_v2
