Commit cd7405e

yuyeon and claude committed
FiLM+SLOT implementation + SLOT24 baseline for comparison
SLOT (Scored-position Learnable Optimization at Test-time):
- Per-sample delta [bsz,1,dim] + logit_bias [bsz,1,vocab]
- 24 AdamW steps with cosine LR on frozen hidden states
- Architecture-agnostic: works on any model with _encode()

PR openai#1313 (SLOT-24) achieves 0.8637 BPB on 8×H100.
PR openai#1229 achieves 0.9300 BPB. Both use SLOT on SOTA architecture.

Running SLOT24 baseline on our 1×H100 for fair comparison.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 6601b83 commit cd7405e
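
The commit message describes SLOT only in outline: a per-sample `delta` and `logit_bias`, optimized for 24 AdamW steps with a cosine learning rate while the hidden states stay frozen. The following is a minimal, dependency-free sketch of that idea for a single sample, not the code from this commit. Several details are assumptions made for illustration: the linear output head `W`, the mean cross-entropy loss over scored positions, zero initialization, the learning rate and weight decay values, and the function names `slot_loss_and_grads` and `slot_adapt`. The toy data at the end is hypothetical as well.

```python
import math


def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def slot_loss_and_grads(hidden, targets, W, delta, bias):
    """Mean cross-entropy over positions, with gradients w.r.t. delta and bias."""
    dim, vocab, n = len(delta), len(bias), len(hidden)
    loss, g_delta, g_bias = 0.0, [0.0] * dim, [0.0] * vocab
    for h, y in zip(hidden, targets):
        # delta is shared by every position (the "1" in [bsz,1,dim]).
        x = [h[d] + delta[d] for d in range(dim)]
        logits = [sum(W[v][d] * x[d] for d in range(dim)) + bias[v]
                  for v in range(vocab)]
        p = softmax(logits)
        loss -= math.log(p[y]) / n
        for v in range(vocab):
            g = (p[v] - (1.0 if v == y else 0.0)) / n  # dLoss/dlogit_v
            g_bias[v] += g
            for d in range(dim):
                g_delta[d] += g * W[v][d]
    return loss, g_delta, g_bias


def slot_adapt(hidden, targets, W, steps=24, lr=0.05, wd=0.0,
               b1=0.9, b2=0.999, eps=1e-8):
    """Fit delta and logit_bias with AdamW; hidden states and W stay frozen."""
    dim, vocab = len(W[0]), len(W)
    params = [0.0] * (dim + vocab)  # delta followed by logit_bias (assumed zero init)
    m = [0.0] * len(params)
    s = [0.0] * len(params)
    losses = []
    for t in range(1, steps + 1):
        loss, g_delta, g_bias = slot_loss_and_grads(
            hidden, targets, W, params[:dim], params[dim:])
        losses.append(loss)
        grads = g_delta + g_bias
        # Cosine decay from lr towards zero over the step budget.
        lr_t = 0.5 * lr * (1.0 + math.cos(math.pi * (t - 1) / steps))
        for i, g in enumerate(grads):
            m[i] = b1 * m[i] + (1 - b1) * g
            s[i] = b2 * s[i] + (1 - b2) * g * g
            m_hat = m[i] / (1 - b1 ** t)
            s_hat = s[i] / (1 - b2 ** t)
            params[i] -= lr_t * wd * params[i]  # decoupled weight decay
            params[i] -= lr_t * m_hat / (math.sqrt(s_hat) + eps)
    final_loss, _, _ = slot_loss_and_grads(
        hidden, targets, W, params[:dim], params[dim:])
    losses.append(final_loss)
    return params[:dim], params[dim:], losses


# Hypothetical toy problem: vocab=4, dim=3, four scored positions.
W = [[0.5, -0.2, 0.1], [-0.3, 0.4, 0.2], [0.1, 0.1, -0.5], [0.2, -0.4, 0.3]]
hidden = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.0], [-1.0, 1.0, 0.5], [0.0, -0.5, 1.0]]
targets = [0, 1, 0, 2]
delta, logit_bias, losses = slot_adapt(hidden, targets, W)
```

Here `losses` holds the loss before each of the 24 updates plus the loss after the last one, so comparing `losses[0]` with `losses[-1]` shows how far the two small per-sample parameters moved the sample's cross-entropy. A batched implementation would presumably keep one `delta` and one `logit_bias` per sample and compute gradients with an autograd framework rather than by hand.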

3 files changed: +3109 -0 lines changed

0 commit comments