Record: Causal SLOT + Pre-quant TTT — val_bpb 1.0846 (3-seed)#3

Closed
resouer wants to merge 1 commit into main from submission/causal-slot-1.0846

resouer commented Apr 3, 2026

Summary

  • 3-seed mean val_bpb: 1.0846 (std 0.0007)
  • Beats merged SOTA (1.1147) by 0.030
  • Artifact: ~15.95 MB (all seeds < 16 MB)
  • Eval: ~551s / 600s budget

Novel Mechanism: Causal SLOT

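Causal SLOT is a provably causal, per-chunk delta optimization. Standard SLOT violates causality (PR openai#1240 showed a 100% causal violation). Here the delta is optimized using ONLY backward-looking loss from already-scored positions, so no token's score depends on its own target or on any later target. The method passes strict causality tests.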
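The score-then-adapt ordering described above can be illustrated with a toy sketch. This is not the submission's code: the delta is assumed here to be a single shared logit bias, and `causal_slot_eval`, `chunk_size`, `lr`, and `steps` are hypothetical names chosen for illustration. The only property it aims to show is that each chunk is scored before its targets are used for any update.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def causal_slot_eval(base_logits, targets, chunk_size=4, lr=0.5, steps=3):
    """Return per-position losses; delta only ever sees already-scored targets."""
    vocab = len(base_logits[0])
    delta = [0.0] * vocab  # assumption: delta is a shared logit bias
    losses = []
    for start in range(0, len(targets), chunk_size):
        end = min(start + chunk_size, len(targets))
        # 1) Score the chunk with the current delta, which has not seen these targets.
        for t in range(start, end):
            probs = softmax([b + d for b, d in zip(base_logits[t], delta)])
            losses.append(-math.log(probs[targets[t]]))
        if end == len(targets):
            break  # nothing left to score, so no further adaptation is needed
        # 2) Adapt delta on the mean cross-entropy over positions [0, end) only.
        for _ in range(steps):
            grad = [0.0] * vocab
            for t in range(end):
                probs = softmax([b + d for b, d in zip(base_logits[t], delta)])
                for v in range(vocab):
                    grad[v] += probs[v] - (1.0 if v == targets[t] else 0.0)
            delta = [d - lr * g / end for d, g in zip(delta, grad)]
    return losses
```

A simple causality check in the same spirit is to change one future target and confirm that the losses at every other position are unchanged.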

Stack

  • Coprime-stride multi-shard loader (-0.003)
  • 6-epoch pre-quant AdamW TTT (-0.022)
  • Causal SLOT (-0.009)
  • Training-data GPTQ calibration
  • Full Hessian GPTQ int6 + LZMA
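The PR does not spell out how the coprime-stride loader works. One common reading, sketched below as an assumption, is that items (shards or windows) are visited at indices `i * stride mod n` with `stride` coprime to `n`, which touches every item exactly once while spreading consecutive reads apart. The function name `coprime_stride_order` is hypothetical.

```python
from math import gcd

def coprime_stride_order(n_items, stride):
    """Visit order over n_items indices using a stride coprime to n_items."""
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    if gcd(n_items, stride) != 1:
        # A shared factor would make the walk revisit some indices and skip others.
        raise ValueError("stride must be coprime to n_items")
    return [(i * stride) % n_items for i in range(n_items)]

order = coprime_stride_order(5, 2)  # [0, 2, 4, 1, 3]
```

Because the stride and the item count share no factor, the walk is a permutation of the indices, so one pass covers all the data without repeats.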

Test plan

Generated with Claude Code

3-seed mean 1.0846 (std 0.0007). Beats merged SOTA (1.1147) by 0.030.

Novel: provably causal eval-time delta optimization (causal SLOT).
Unlike standard SLOT (PR openai#1240 proved 100% causal violation), delta
is optimized using only backward-looking loss from already-scored
positions. Combined with 6-epoch pre-quant AdamW TTT and
coprime-stride multi-shard data loading.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
resouer force-pushed the submission/causal-slot-1.0846 branch from 8930d5a to d43a0f3 on April 3, 2026 16:34

resouer closed this Apr 3, 2026