Non-record: Retrodiction Training (Petz Recovery Map) — val_bpb 1.508 #1183
akaiHuang wants to merge 1 commit into openai:main from
16L/512d/39M params, trained on M1 Max (not 8xH100). Retrodiction: reversed sequence auxiliary loss from quantum information theory. Int6 + lzma = 14.8MB (within 16MB limit).
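To make the "Int6 + lzma" size budget concrete, here is a minimal sketch of 6-bit weight quantization followed by LZMA compression. The PR does not describe its actual scheme, so everything here is an assumption for illustration: the function names are hypothetical, the quantization is symmetric with a single per-tensor scale, and each value is stored in its own byte (relying on lzma to absorb the unused bits) rather than being bit-packed.

```python
import lzma
import struct

def quantize_int6(weights):
    """Symmetric per-tensor quantization to integers in [-31, 31] (assumed scheme)."""
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale for all-zero tensors
    scale = max_abs / 31.0
    q = [max(-31, min(31, round(w / scale))) for w in weights]
    return q, scale

def pack_and_compress(q, scale):
    """Store the scale as a float64, then one signed byte per value, then lzma-compress."""
    payload = struct.pack("<d", scale) + bytes((v & 0xFF) for v in q)
    return lzma.compress(payload, preset=9)

def decompress_and_dequantize(blob):
    """Invert pack_and_compress and map the integers back to floats."""
    payload = lzma.decompress(blob)
    (scale,) = struct.unpack("<d", payload[:8])
    q = [b - 256 if b > 127 else b for b in payload[8:]]  # undo two's-complement byte
    return [v * scale for v in q]

# Hypothetical toy tensor; a real checkpoint would be processed tensor by tensor.
demo_weights = [0.02, -0.013, 0.0, 0.031, -0.031]
demo_q, demo_scale = quantize_int6(demo_weights)
demo_blob = pack_and_compress(demo_q, demo_scale)
demo_restored = decompress_and_dequantize(demo_blob)
```

The round-trip error per weight is bounded by half the scale, which is the usual trade-off when shrinking a checkpoint to fit a fixed size limit.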
Community Review — Non-record: Retrodiction Training (Petz Recovery Map) — val_bpb 1.508
Compliance: NEEDS AUTHOR ACTION
What I found: The CPU smoke test on CT2038 (proteus-engine, 128 GB RAM, Triton 3.6.0, flash_attn stub, cutlass_evt_fusion stub) failed at the import step with:
A few of the common patterns I've seen for this class of error in the 2026-04-11 sweep:
Recommendation: Could you run
Once the parse/import issue is fixed, I'll re-run the compliance audit through the normal pipeline. No other flags identified yet because the audit halts at the import step.
Reviewed by @MatoTeziTanka — The Agora.
CPU smoke test (CT2038 proteus-engine, 2026-04-11): IMPORT_FAIL — ModuleNotFoundError: No module named 'mlx'. Classification via
Thanks for the smoke-test details. This PR predates and is now superseded by #1255, which uses a unified PyTorch H100 stack ( The mlx import issue is fixed in #1255 commit
Closing as superseded by #1255 (see comment thread above).
No worries at all — consolidating into #1255 is the right move. I'll direct the re-audit there. Thanks for the clean handoff.
16L / 512d / 39M params, Retrodiction auxiliary loss (α=0.3)
Novel contribution
Retrodiction: reversed sequence auxiliary loss inspired by the Petz recovery map
from quantum information theory. The model trains on both forward and reversed
sequences, learning bidirectional representations while maintaining causal attention.
loss = AR_loss(forward) + 0.3 * AR_loss(reversed)
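The combined objective above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the PR's training code: `ar_loss`, `retrodiction_loss`, and the two toy models are hypothetical names, "reversed" is taken to mean reversing the token order of the whole sequence, and the same causal model scores both directions.

```python
import math

def ar_loss(tokens, next_token_probs):
    """Mean next-token negative log-likelihood (in nats) over one sequence."""
    total = 0.0
    for t in range(1, len(tokens)):
        probs = next_token_probs(tokens[:t])  # causal: the model sees only the prefix
        total += -math.log(probs[tokens[t]])
    return total / (len(tokens) - 1)

def retrodiction_loss(tokens, next_token_probs, alpha=0.3):
    """AR loss on the forward sequence plus alpha times AR loss on the reversed one."""
    forward = ar_loss(tokens, next_token_probs)
    backward = ar_loss(tokens[::-1], next_token_probs)
    return forward + alpha * backward

# Hypothetical stand-in models over a 4-token vocabulary.
def uniform_model(prefix, vocab_size=4):
    return [1.0 / vocab_size] * vocab_size

def ascending_model(prefix, vocab_size=4):
    # Puts most mass on (last token + 1), so it fits ascending runs but not descending ones.
    probs = [0.1] * vocab_size
    probs[(prefix[-1] + 1) % vocab_size] = 0.7
    return probs

uniform_total = retrodiction_loss([0, 1, 2, 3, 1], uniform_model)
ascending_total = retrodiction_loss([0, 1, 2, 3], ascending_model)
```

With `ascending_model`, the forward term is small and the reversed term is large, which shows how the auxiliary term penalizes a model that only captures one direction of the data. In a real training loop the two terms would be computed on batches and backpropagated together; at inference only the forward pass is used.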
Achieves 1–3.6% BPB improvement over pure AR at matched token counts. Zero
inference cost (training-only technique).
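For readers unfamiliar with the metric, the following sketch shows one common way to turn a mean token-level cross-entropy into bits per byte (BPB) and to compute an improvement percentage. It assumes the loss is measured in nats and that the quoted percentages are relative to the pure AR baseline; the PR does not state either, and the function names are hypothetical.

```python
import math

def bits_per_byte(mean_loss_nats, num_tokens, num_bytes):
    """Convert mean per-token cross-entropy (nats) to bits per byte of raw text."""
    bits_per_token = mean_loss_nats / math.log(2)
    return bits_per_token * num_tokens / num_bytes

def relative_improvement(baseline_bpb, new_bpb):
    """Fractional reduction in BPB relative to the baseline (lower BPB is better)."""
    return (baseline_bpb - new_bpb) / baseline_bpb

# Hypothetical numbers, chosen only to exercise the functions.
example_bpb = bits_per_byte(mean_loss_nats=math.log(2), num_tokens=10, num_bytes=10)
example_gain = relative_improvement(baseline_bpb=2.0, new_bpb=1.9)
```

Because BPB is normalized by bytes rather than tokens, it allows comparison across tokenizers, which is why matched token counts matter when comparing against the pure AR baseline.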
Why non-record
Trained on M1 Max (65K tokens/step), not 8xH100. Planning to submit a record-track
version once H100 access is available.
Files