Commit e391487
SLOT debugging: implementation works, needs 8×H100 for proper testing
FiLM+SLOT eval started successfully (39.6 GB VRAM, 100% GPU util).
However, SLOT eval with stride=64 over the full val set takes 30+ min on 1 GPU,
and models trained on 1 GPU are undertrained (EMA diverges, GPTQ results are poor).
Killed the run after confirming that SLOT executes; a proper test needs 8×H100.
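
To illustrate why a small stride makes the sliding eval slow, here is a minimal sketch of how strided windows might be enumerated. It is not the code in eval_val_sliding: the function name, the context length of 256, and the token count of 1000 are hypothetical values chosen only for the example.

```python
def sliding_windows(num_tokens, context_len, stride):
    """Return (start, end, score_from) triples for strided sliding-window eval.

    Each window covers tokens [start, end); only tokens in [score_from, end)
    are scored, so every token is scored exactly once across all windows.
    All names and sizes here are illustrative assumptions.
    """
    if not 0 < stride <= context_len:
        raise ValueError("stride must be in (0, context_len]")
    windows = []
    prev_end = 0  # first token that has not been scored yet
    start = 0
    while prev_end < num_tokens:
        end = min(start + context_len, num_tokens)
        windows.append((start, end, prev_end))
        prev_end = end
        start += stride
    return windows


if __name__ == "__main__":
    dense = sliding_windows(1000, 256, 64)    # overlapping windows
    coarse = sliding_windows(1000, 256, 256)  # non-overlapping windows
    print(len(dense), len(coarse))
```

With these hypothetical sizes, stride=64 needs 13 forward passes where non-overlapping windows need 4. Any extra per-window work is multiplied by the same factor, which is consistent with the long single-GPU runtime noted above.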
openai#1313's SLOT eval failed on 1 GPU due to double torch.compile
on eval_model (compiled_eval + compiled_logits inside eval_val_sliding).
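
One possible way to guard against compiling the same callable twice is a small compile-once cache, sketched below. This is an assumption-laden illustration, not the fix used in either repository: CompileCache and compile_fn are hypothetical names, and the sketch assumes both wrappers are built from the same callable object. In practice compile_fn could be torch.compile; a stand-in is used here so the sketch runs without PyTorch.

```python
class CompileCache:
    """Compile each callable at most once and reuse the result.

    Hypothetical helper: compile_fn is any function that takes a callable
    and returns a wrapped callable (for example torch.compile).
    """

    def __init__(self, compile_fn):
        self._compile_fn = compile_fn
        # id(fn) -> (fn, compiled); fn is kept alive so its id stays unique.
        self._cache = {}

    def get(self, fn):
        key = id(fn)
        if key not in self._cache:
            self._cache[key] = (fn, self._compile_fn(fn))
        return self._cache[key][1]


if __name__ == "__main__":
    compile_calls = []

    def fake_compile(fn):
        # Stand-in for a real compiler: records the call and wraps fn.
        compile_calls.append(fn)
        return lambda *args, **kwargs: fn(*args, **kwargs)

    def eval_step(x):
        return x + 1

    cache = CompileCache(fake_compile)
    first = cache.get(eval_step)
    second = cache.get(eval_step)  # reuses the first compiled wrapper
    print(first is second, len(compile_calls))
```

The cache is keyed on object identity, so it only helps when both call sites pass the same object; a bound method fetched twice yields two distinct objects and would bypass it.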
SLOT is architecture-agnostic. If FiLM provides better hidden states
(evidence: 0.090 BPP pre-quant advantage), FiLM+SLOT could beat openai#1313.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 156e623 commit e391487
1 file changed
Binary file not shown.