Non-record: 5L MLP4x + SlidingWindow + SWA + QAT — val_bpb 1.33 (1xH100) #842
JUSTSUJAY wants to merge 2 commits into openai:main
…H100)
Autonomous AI-driven exploration of 16 experiments using the autoresearch framework. Key discovery: 5L MLP4x significantly outperforms deeper, narrower architectures on single-GPU compute budgets.
Techniques: BigramHash(4096), SmearGate, OrthoInit, QAT (int8 STE), SWA (18 checkpoints), sliding window eval (stride=64).
14.6MB artifact, 15.5M params, 4573 steps in 300s on 1xH100 80GB.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Doubled training time to 10 minutes (challenge limit). 9353 steps with 38 SWA checkpoints. val_bpb improved from 1.3827 to 1.3380.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
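For readers unfamiliar with SWA (stochastic weight averaging), the idea behind "38 SWA checkpoints" can be sketched as a uniform average over saved weight snapshots. This is a minimal illustration and not the PR's code: the checkpoint layout (a dict of parameter name to a flat list of floats), the function name `average_checkpoints`, and the uniform weighting are all assumptions made for the example.

```python
def average_checkpoints(checkpoints):
    """Uniformly average checkpoints given as dicts of name -> list of floats.

    Hypothetical helper for illustration; a real run would average tensors.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    n = len(checkpoints)
    averaged = {}
    for name in checkpoints[0]:
        # Walk the same parameter across all checkpoints, element by element.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        averaged[name] = [sum(col) / n for col in columns]
    return averaged


# Three toy snapshots standing in for checkpoints saved late in training.
ckpts = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
    {"w": [5.0, 6.0], "b": [2.0]},
]
swa = average_checkpoints(ckpts)
print(swa)  # {'w': [3.0, 4.0], 'b': [1.0]}
```

The averaged weights are then the ones evaluated, which is why a longer run that collects more snapshots could plausibly help the final val_bpb.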
Community Review — Non-record: 5L MLP4x + SlidingWindow + SWA + QAT — val_bpb 1.33 (1xH100)
Compliance: NEEDS AUTHOR ACTION
What I found: The CPU smoke test on CT2038 (proteus-engine, 128 GB RAM, Triton 3.6.0, flash_attn stub, cutlass_evt_fusion stub) failed at the import step with: ModuleNotFoundError: No module named 'prepare_pgolf'.
A few of the common patterns I've seen for this class of error in the 2026-04-11 sweep:
Recommendation: Could you run
Once the parse/import issue is fixed, I'll re-run the compliance audit through the normal pipeline. No other flags identified yet because the audit halts at the import step.
Reviewed by @MatoTeziTanka — The Agora.
CPU smoke test (CT2038 proteus-engine, 2026-04-11): IMPORT_FAIL — ModuleNotFoundError: No module named 'prepare_pgolf'.
Classification via |
Autonomous AI-driven exploration of 16 experiments using the autoresearch framework. Key discovery: 5L MLP4x significantly outperforms deeper, narrower architectures on single-GPU compute budgets.
Techniques: BigramHash(4096), SmearGate, OrthoInit, QAT (int8 STE), SWA (18 checkpoints), sliding window eval (stride=64).
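As a rough illustration of what "QAT (int8 STE)" usually refers to, the sketch below fake-quantizes weights to int8 in the forward pass and passes gradients through unchanged in the backward pass (the straight-through estimator). It assumes symmetric per-tensor scaling, which the PR does not state, and the names `fake_quant_int8` and `ste_backward` are hypothetical.

```python
def fake_quant_int8(weights):
    """Symmetric per-tensor int8 fake quantization (assumed scheme).

    Returns the integer codes, the dequantized weights, and the scale.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the int8 range used here.
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    dequantized = [c * scale for c in codes]
    return codes, dequantized, scale


def ste_backward(grad_output):
    """Straight-through estimator: treat rounding as identity for gradients."""
    return list(grad_output)


weights = [0.5, -1.27, 0.004, 1.0]
codes, dequantized, scale = fake_quant_int8(weights)
print(codes)  # [50, -127, 0, 100]

# The forward pass would use `dequantized`; the backward pass copies gradients.
grads = ste_backward([0.1, -0.2, 0.3, 0.0])
```

Training against the dequantized weights lets the model adapt to rounding error before the artifact is stored in int8, which is one way to fit a size budget such as the 14.6MB reported here.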
14.6MB artifact, 15.5M params, 4573 steps in 300s on 1xH100 80GB.
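The "sliding window eval (stride=64)" entry can be pictured with the sketch below: the evaluation window advances 64 tokens at a time, and each window scores only the tokens not already scored, so later tokens get more left context than with disjoint chunks. The context length of 128 and the function name `sliding_windows` are assumptions for the example; only the stride comes from the description above.

```python
def sliding_windows(num_tokens, context_len, stride):
    """Return (begin, end, score_from) triples for sliding-window evaluation.

    Each window feeds tokens [begin, end) to the model and scores only
    tokens [score_from, end), so every token is scored exactly once.
    """
    windows = []
    prev_end = 0
    for begin in range(0, num_tokens, stride):
        end = min(begin + context_len, num_tokens)
        windows.append((begin, end, prev_end))
        prev_end = end
        if end == num_tokens:
            break
    return windows


# Hypothetical sizes: 300 tokens, 128-token context, stride 64 as in the PR.
windows = sliding_windows(300, 128, 64)
for begin, end, score_from in windows:
    print(f"context [{begin}, {end}) scores [{score_from}, {end})")
```

With these sizes, the first window scores all 128 of its tokens, and each later window scores only its newest tokens while reusing the earlier ones as context. A smaller stride costs more forward passes but gives each scored token more context.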