Non-record: KNN Hidden State Retrieval — Scale Deception from Weak to Strong Models (8xH100) by himanshudongre · Pull Request #1259 · openai/parameter-golf

himanshudongre · 2026-04-02T16:13:22Z

Summary

Novel eval-time technique: KNN Hidden State Retrieval. It adds nothing to the artifact size and follows a score-first protocol.
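The PR text does not spell out the implementation, so the following is only a minimal sketch of what a kNN-LM-style hidden state retrieval loop could look like. It assumes "score-first" means each token is scored before its own hidden state and target become retrievable, and it assumes linear interpolation between the model and KNN distributions. The function name `knn_mixed_bits` and the parameters `k`, `lam`, and `temp` are hypothetical and are not taken from the PR.

```python
import numpy as np

def knn_mixed_bits(hidden, probs, targets, k=4, lam=0.2, temp=1.0):
    """Total bits for a sequence scored with a KNN-augmented distribution.

    hidden:  (T, D) array, one hidden state per position (assumed input)
    probs:   (T, V) array, the model's next-token distribution per position
    targets: (T,) int array, the true next token at each position
    k, lam, temp: hypothetical knobs (neighbours, mixing weight, temperature)
    """
    T, V = probs.shape
    total_bits = 0.0
    for t in range(T):
        p = probs[t]
        if t > 0:
            # The store holds only positions that were already scored.
            keys, vals = hidden[:t], targets[:t]
            d2 = np.sum((keys - hidden[t]) ** 2, axis=1)
            kk = min(k, t)
            nn = np.argpartition(d2, kk - 1)[:kk]
            # Softmax over negative squared distances of the neighbours.
            w = np.exp(-(d2[nn] - d2[nn].min()) / temp)
            w /= w.sum()
            p_knn = np.zeros(V)
            np.add.at(p_knn, vals[nn], w)
            # Assumed mixing rule: linear interpolation.
            p = (1.0 - lam) * p + lam * p_knn
        # Score first; position t becomes retrievable on the next iteration.
        total_bits += -np.log2(p[targets[t]])
    return total_bits
```

The store is built from the evaluation stream itself, which is consistent with the claim that KNN adds 0 bytes to the artifact. Dividing the returned bits by the number of bytes in the scored text would give BPB.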

Key finding: KNN helps weak models (-2 to -4% BPB) but hurts strong, competition-quality models (+1.5% BPB). Lower BPB is better.

Model quality	KNN effect (relative BPB change)
Weak (1xH100, 2K steps)	-1.57% (helps)
Strong (8xH100, 5665 steps, SOTA)	+1.47% (hurts)

Second confirmed case of scale deception in this competition, where a technique that helps weak models hurts at competition scale (first: SSM in PR #1013/PR #1227).

8xH100 Results

Neural roundtrip BPB: 1.1533
KNN BPB: 1.1702 (+1.47% worse)
KNN eval time: 168s (fits in 600s)
Artifact: 15.8MB (KNN adds 0 bytes)
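The +1.47% figure can be reproduced from the two BPB values above as a relative change. The helper name `relative_change_pct` is hypothetical and used only for illustration.

```python
def relative_change_pct(baseline_bpb, new_bpb):
    """Relative BPB change in percent; positive means worse (higher BPB)."""
    return (new_bpb - baseline_bpb) / baseline_bpb * 100.0

neural_bpb = 1.1533  # neural roundtrip BPB reported above
knn_bpb = 1.1702     # KNN BPB reported above
print(f"{relative_change_pct(neural_bpb, knn_bpb):+.2f}%")  # prints +1.47%
```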

Implication

Eval-time prediction mixing had negative returns on the strong model. Techniques that adapt the model itself, such as test-time training (TTT), may work better than mixing in external distributions.
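One way to see how mixing can help a weak model and hurt a strong one is a single-token toy calculation. This is an illustration only: it assumes linear interpolation with weight `lam`, and every probability below is made up for the example, not measured in this PR.

```python
import math

def mixed_bits(p_model, p_knn, lam):
    """Bits paid for the true token after linear interpolation (assumed rule)."""
    return -math.log2((1.0 - lam) * p_model + lam * p_knn)

# Hypothetical probabilities assigned to the true token.
p_knn = 0.5     # retrieval is moderately informative
p_weak = 0.3    # weak model is less sure than the retrieval
p_strong = 0.9  # strong model is already more accurate than the retrieval

for name, p_model in [("weak", p_weak), ("strong", p_strong)]:
    before = mixed_bits(p_model, p_knn, lam=0.0)
    after = mixed_bits(p_model, p_knn, lam=0.2)
    print(f"{name}: {before:.3f} -> {after:.3f} bits")
```

In this toy setting the retrieved distribution only helps when it assigns more probability to the true token than the model does. Once the model is better than the retrieval on most tokens, a fixed mixing weight pulls probability away from the right answer.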

Full scaling analysis in README.

… Strong Models Novel eval-time technique validated on 8xH100. Helps weak models (-2 to -4%), hurts competition-quality models (+1.5%). Definitive scale deception finding.

Non-record: KNN Hidden State Retrieval — Scale Deception from Weak to…

257b492

himanshudongre mentioned this pull request Apr 4, 2026

Non-Record: Learning Adapters on Random Linear Maps — Selective Freeze, Progressive Freeze, and 7 Architecture Variants (H100 + A40 Validated) #1301

Open

himanshudongre wants to merge 1 commit into openai:main from himanshudongre:nonrecord/knn-scale-deception-8xh100
