Commit ade3e2f
R20v20: Triple recurrence NUM_LOOPS=3 + Muon 0.97 + TTT LR 0.01
20 virtual layers from 11 physical (was 17 with NUM_LOOPS=2).
Layers 3-5 are looped 4x in total. Each step may be slower, but the model is deeper.
PR openai#1523 reports this as the biggest single improvement.
1 parent 17883bc commit ade3e2f
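The layer counts in the message can be checked with a short sketch. It assumes that NUM_LOOPS counts the extra passes over the looped block (so NUM_LOOPS=3 means 4 passes in total), which is inferred from the numbers above rather than taken from the training script. The helper names `virtual_depth` and `layer_schedule` and the 0-based layer indices are hypothetical, and the real execution order in the repository may differ.

```python
def virtual_depth(num_physical, loop_start, loop_end, num_loops):
    """Count layer applications in one forward pass.

    Assumption: layers loop_start..loop_end (inclusive) run once and are
    then repeated num_loops more times; every other layer runs once.
    """
    looped = loop_end - loop_start + 1
    return num_physical + looped * num_loops


def layer_schedule(num_physical, loop_start, loop_end, num_loops):
    """Return the order in which physical layers are applied (illustrative)."""
    before = list(range(loop_start))
    block = list(range(loop_start, loop_end + 1))
    after = list(range(loop_end + 1, num_physical))
    # The looped block appears num_loops + 1 times in total.
    return before + block * (num_loops + 1) + after


if __name__ == "__main__":
    print(virtual_depth(11, 3, 5, 2))   # previous setting
    print(virtual_depth(11, 3, 5, 3))   # this commit
    print(layer_schedule(11, 3, 5, 3))
```

Under these assumptions the two settings give 17 and 20 layer applications, matching the counts in the message.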
1 file changed: +1 -1 lines changed