Non-record: 10L MLP3x int6 baseline (MLX smoke test) #404

yashward001 wants to merge 1 commit into openai:main
Conversation
Non-record submission. Local MLX smoke test confirming the full pipeline works end-to-end on Apple Silicon: 10 layers, 3× MLP expansion, int6 quantization, zlib-9 compression. Only 200 iterations (val_bpb 2.3517), so not a competitive score. A full H100 run is planned once the compute grant is approved. Planned improvements: zstd-22, sliding window eval, Muon WD, SmearGate, BigramHash, SWA.
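The zlib-9 step in the description is plain stdlib compression at the maximum level. A minimal sketch of that stage (the function name and the zero-filled stand-in payload are illustrative, not the submission's actual code):

```python
import zlib

def compress_artifact(raw: bytes) -> bytes:
    # zlib at its maximum level (9), matching the submission's zlib-9 setting
    return zlib.compress(raw, level=9)

payload = b"\x00" * (16 * 1024)  # stand-in for serialized quantized weights
packed = compress_artifact(payload)
```

The artifact size reported by the leaderboard would be `len(packed)` after this step.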
Approach summary

- Architecture: 10 layers, 512 dim, 8 heads (4 KV heads, GQA), 3× MLP expansion (hidden = 1536), relu² activation, U-Net skip connections, tied embeddings.
- Compression: int6 per-row quantization on all block weights, fp16 tied embeddings, zlib-9. Estimated artifact ~14.6 MB.
- Current result: val_bpb 2.3517 at 200 iterations (MLX smoke test on Apple Silicon; not competitive, pipeline validation only).
- Planned improvements, in order: zstd-22, sliding window eval, Muon WD, SmearGate, BigramHash, SWA.
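For readers unfamiliar with the compression scheme above: symmetric per-row int6 quantization stores one scale per weight row and clips integer codes to [-31, 31]. A toy single-row sketch under those assumptions (function names and the sample row are hypothetical; a real artifact would bit-pack the 6-bit codes rather than keep them as Python ints):

```python
def quantize_int6_row(row):
    # One fp-scale per row; codes clipped to the symmetric int6 range [-31, 31]
    scale = max(abs(x) for x in row) / 31.0 or 1.0  # guard all-zero rows
    codes = [max(-31, min(31, round(x / scale))) for x in row]
    return codes, scale

def dequantize_row(codes, scale):
    return [c * scale for c in codes]

row = [0.8, -1.2, 0.05, 2.4]
codes, scale = quantize_int6_row(row)
approx = dequantize_row(codes, scale)
```

Per-element reconstruction error is bounded by half the row's scale, which is why a per-row (rather than per-tensor) scale helps when row magnitudes vary.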
Research directions: asymmetric MLP expansion (early layers 2×, late layers 4×); SWA+QAT interaction (whether averaged checkpoints quantize better than the final checkpoint). Full H100 runs pending the compute grant.
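The SWA half of the SWA+QAT question is just a uniform average of checkpoint parameters. A minimal sketch, with scalar floats standing in for tensors (the dict-of-floats checkpoint format is illustrative only):

```python
def average_checkpoints(checkpoints):
    # Uniform (SWA-style) average of parameter dicts sharing the same keys
    n = len(checkpoints)
    return {k: sum(ckpt[k] for ckpt in checkpoints) / n for k in checkpoints[0]}

ckpts = [{"w": 1.0}, {"w": 3.0}, {"w": 2.0}]
avg = average_checkpoints(ckpts)
```

The research question is then whether quantizing `avg` yields lower val_bpb than quantizing the final checkpoint alone.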
Community Review — Non-record: 10L MLP3x int6 baseline (MLX smoke test)

Compliance: NEEDS AUTHOR ACTION

What I found: the CPU smoke test on CT2038 (proteus-engine, 128 GB RAM, Triton 3.6.0, flash_attn stub, cutlass_evt_fusion stub) failed at the import step with `ModuleNotFoundError: No module named 'mlx'`. A few of the common patterns I've seen for this class of error in the 2026-04-11 sweep:
Recommendation: Could you run

Once the parse/import issue is fixed, I'll re-run the compliance audit through the normal pipeline. No other flags identified yet because the audit halts at the import step.

Reviewed by @MatoTeziTanka — The Agora.

CPU smoke test (CT2038 proteus-engine, 2026-04-11): IMPORT_FAIL — ModuleNotFoundError: No module named 'mlx'. Classification via
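One common stdlib fix for the failure mode above is to probe for `mlx` before importing it, so CPU-only hosts like the audit machine fall back instead of crashing at import time. A hedged sketch (the function name and the `"cpu-reference"` fallback label are hypothetical, not part of the submission):

```python
import importlib.util

def pick_backend() -> str:
    # find_spec returns None if mlx is not installed, so no import error
    # is raised on hosts (like the CT2038 CPU box) that lack it
    if importlib.util.find_spec("mlx") is not None:
        return "mlx"
    return "cpu-reference"
```

Gating the `import mlx` behind this check would let the compliance audit proceed past the import step even where MLX is unavailable.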