Commit 8490b8c
sota: Port SOTA PR openai#1493 code (1.0810 BPB) for 5090 testing
Decoded LZMA-compressed SOTA train_gpt.py. Replaced flash_attn_3_func
with PyTorch SDPA (transpose to B,H,T,D format + enable_gqa).
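The decode step can be sketched roughly as below. This is a minimal illustration, not the actual packing used by the upstream PR: the Base85 text wrapper and the helper names `pack_source` and `unpack_source` are assumptions made for this example. Only the use of LZMA comes from the message above.

```python
import base64
import lzma


def pack_source(source: str) -> str:
    """Compress source text with LZMA and wrap it as Base85 ASCII (assumed wrapper)."""
    raw = lzma.compress(source.encode("utf-8"))
    return base64.b85encode(raw).decode("ascii")


def unpack_source(payload: str) -> str:
    """Reverse pack_source: Base85-decode, then LZMA-decompress back to text."""
    raw = base64.b85decode(payload)
    return lzma.decompress(raw).decode("utf-8")


# Round-trip a stand-in for the real script (hypothetical content).
sample = "print('hello from train_gpt.py')\n"
payload = pack_source(sample)
restored = unpack_source(payload)
```

If the real payload is raw LZMA bytes or uses a different text encoding, only the `b85decode` line would need to change.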
Full stack: 11L, 4xMLP, LeakyReLU², XSA, depth recurrence, parallel
residuals, LN Scale, partial RoPE, EMA, GPTQ SDClip, TTT, brotli.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 8db9e14 commit 8490b8c
1 file changed, +461 -1375 lines changed