Commit cd7405e
FiLM+SLOT implementation + SLOT24 baseline for comparison
SLOT (Scored-position Learnable Optimization at Test-time):
- Per-sample delta [bsz,1,dim] + logit_bias [bsz,1,vocab]
- 24 AdamW steps with cosine LR on frozen hidden states
- Architecture-agnostic: works on any model exposing _encode()
PR openai#1313 (SLOT-24) achieves 0.8637 BPB on 8×H100.
PR openai#1229 achieves 0.9300 BPB. Both apply SLOT to the SOTA architecture.
Running SLOT24 baseline on our 1×H100 for fair comparison.
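The per-sample SLOT update described above can be sketched as follows. This is a minimal illustration, not the repo's actual code: `slot_adapt`, the linear `head`, the learning rate, and the self-supervised cross-entropy objective are assumptions; only the parameter shapes ([bsz,1,dim] delta, [bsz,1,vocab] logit bias), the 24 AdamW steps, cosine LR, and frozen hidden states come from the commit message.

```python
import torch
import torch.nn.functional as F

def slot_adapt(hidden, head, targets, steps=24, lr=1e-2):
    """Test-time SLOT sketch: optimize a per-sample delta and logit bias
    on the sample's own tokens while all model weights stay frozen."""
    bsz, seq, dim = hidden.shape
    vocab = head.out_features
    hidden = hidden.detach()           # frozen hidden states
    head.requires_grad_(False)         # frozen output head (weights not updated)
    delta = torch.zeros(bsz, 1, dim, requires_grad=True)         # [bsz,1,dim]
    logit_bias = torch.zeros(bsz, 1, vocab, requires_grad=True)  # [bsz,1,vocab]
    opt = torch.optim.AdamW([delta, logit_bias], lr=lr)
    sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=steps)
    for _ in range(steps):             # 24 AdamW steps with cosine LR decay
        # delta and logit_bias broadcast across the sequence dimension
        logits = head(hidden + delta) + logit_bias
        loss = F.cross_entropy(logits.reshape(-1, vocab), targets.reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()
        sched.step()
    return delta.detach(), logit_bias.detach()
```

Because only 2 small tensors are optimized and the backbone is frozen, the per-sample cost is 24 forward/backward passes through the head alone, which is what makes the scheme architecture-agnostic.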
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1 parent 6601b83
File tree (3 files changed: +3109 −0 lines)
- experiments
  - film_slot
  - slot24_baseline