
Minimal recurrent motif (sb1 rs2 g0.18) – non-record submission #323

Open

megnat05-tmm wants to merge 10 commits into openai:main from megnat05-tmm:main

Conversation

@megnat05-tmm

This is a non-record submission.

Summary

This submission introduces a minimal recurrent motif architecture that achieves improved compression under the 16MB constraint by emphasizing structural reuse over explicit depth.

The model uses a single shared block (shared_block_size=1) with limited recurrence (recurrence_steps=2) and soft gating (recurrence_gate_init=0.18). This design was motivated by the idea that large effective structures can be generated through a compact operator rather than stored explicitly.
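The submission's code isn't reproduced here, but the idea can be sketched in miniature. In the sketch below, all names and the stand-in block are hypothetical illustrations, not the actual architecture; the point is that one set of block parameters is stored once and reused at every recurrence step, with a soft gate blending each step's output back into the residual stream:

```python
def shared_block(h):
    """Stand-in for the single shared block (shared_block_size=1).
    In the real model this would be a transformer block; the same
    parameters are applied at every recurrence step."""
    return [x * 0.5 + 1.0 for x in h]

def recurrent_motif(h, recurrence_steps=2, gate=0.18):
    """Apply the shared block `recurrence_steps` times, blending each
    output into the hidden state through a soft gate
    (gate=0.18 mirrors recurrence_gate_init)."""
    for _ in range(recurrence_steps):
        update = shared_block(h)
        # Soft gating: h <- (1 - g) * h + g * block(h)
        h = [(1.0 - gate) * x + gate * u for x, u in zip(h, update)]
    return h

hidden = [1.0, -2.0, 0.5]
print(recurrent_motif(hidden))
```

The compression-relevant property is that depth (two effective applications) costs no extra parameters: only `shared_block`'s weights and the scalar gate are stored.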

Results

Final (roundtrip):

  • val_loss: 4.7062
  • val_bpb: 2.7873

Artifact:

  • compressed size: ~1.92 MB
  • raw size: ~8.47 MB

This configuration outperformed larger motif variants in both compression and efficiency.

Approach

The architecture explores recurrence as a structural closure mechanism. A compact shared operator is reused across steps to generate extended representations. This reduces parameter requirements while preserving model capacity.

Notes

  • Validation timing on local hardware reflects evaluation chunking and logging cadence; it does not affect the correctness of the reported metrics.
  • The submission is fully self-contained and reproducible.

@MatoTeziTanka

MatoTeziTanka commented Apr 11, 2026

[RETRACTED 2026-04-11] — This IMPORT_FAIL was a false positive. Root cause: runner fetched a path marked deleted in the PR diff. Your code is not broken. See correction below: #323 (comment)


Community Review — Minimal recurrent motif (sb1 rs2 g0.18) – non-record submission

Compliance: NEEDS AUTHOR ACTION — train_gpt.py fails to import on CT2038 (Python 3.10 / torch 2.10.0+cpu)

What I found: The CPU smoke test on CT2038 (proteus-engine, 128 GB RAM, Triton 3.6.0, flash_attn stub, cutlass_evt_fusion stub) failed at the import step with:

SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0x9e in position 0: invalid start byte (line 1)

I saw this class of error several times in the 2026-04-11 sweep.

Recommendation: Could you run python3 -c "import py_compile; py_compile.compile('train_gpt.py', doraise=True)" on your records-folder train_gpt.py under Python 3.10 specifically? (Without doraise=True, py_compile only prints the error to stderr and still exits 0.) The eval image is Python 3.10 per Issue #17 / the README, so any parse error on 3.10 blocks the submission at import time, before any of the scored-eval logic runs.
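For anyone scripting that check, here is a self-contained sketch. Note the `doraise=True`: by default `py_compile.compile` only writes a SyntaxError to stderr and returns normally, so failures would be invisible to a caller. Run it under Python 3.10 to match the eval image.

```python
import py_compile

def parse_check(path):
    """Return (True, "") if `path` parses under the running interpreter,
    else (False, <formatted error>). A sketch of the suggested check,
    not part of the actual audit pipeline."""
    try:
        # doraise=True makes a SyntaxError raise PyCompileError
        # instead of being printed to stderr and swallowed.
        py_compile.compile(path, doraise=True)
        return True, ""
    except py_compile.PyCompileError as err:
        return False, str(err.msg)
```

Usage: `parse_check("train_gpt.py")` on the records-folder file; a `(False, ...)` result under 3.10 means the submission would fail at import time.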

Once the parse/import issue is fixed, I'll re-run the compliance audit through the normal pipeline. No other flags identified yet because the audit halts at the import step.


Reviewed by @MatoTeziTanka · The Agora. CPU smoke test (CT2038 proteus-engine, 2026-04-11): IMPORT_FAIL — SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0x9e in position 0: invalid start byte (line 1). Classification via classify_prs.py AST-based classifier; full compliance audit deferred until the import issue is resolved. Auto-drafted from a template and spot-checked before posting.

@MatoTeziTanka

Retraction — this IMPORT_FAIL was a deleted-file artifact in my smoke runner

Sorry @megnat05-tmm, this one's on me. I re-audited the SyntaxError ((unicode error) 'utf-8' codec can't decode byte 0x9e in position 0) I reported above, and it was a false positive — the fault is in my smoke runner, not in your code.

What happened:

Your PR deletes 16 old records/*/train_gpt.py paths while editing a different file, and my bulk smoke runner iterated the diff's file list and fetched one of the paths already marked for deletion. The raw GitHub content endpoint returned either a binary stub or a non-UTF-8 response, and my runner tried to import it as Python source, producing the "byte 0x9e in position 0" error. That error was about the deleted, non-existent file, not the train_gpt.py you're actually submitting.

Verified at head bff5e2d:

The real train_gpt.py you're editing parses cleanly under Python 3.10:

py_compile.compile('train_gpt.py') → PARSES OK
71409 bytes

Your PR is not broken by this error. I'm retracting the IMPORT_FAIL classification. I'll re-queue the full compliance audit and post findings separately.

Again — sorry for the noise. I'm adding a "don't fetch paths marked deleted in the PR diff" guard to the runner so this doesn't hit other PRs that delete/rename records folders.
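A sketch of what such a guard could look like. The field names (`filename`, `status`) follow the GitHub "list pull request files" REST API, which reports `status: "removed"` for deleted paths; the runner integration itself is hypothetical:

```python
def files_to_smoke_test(diff_entries):
    """Given PR diff entries shaped like the GitHub 'list pull request
    files' API response, keep only Python files that still exist at the
    PR head. Illustrative guard, not the actual runner code."""
    keep = []
    for entry in diff_entries:
        if entry["status"] == "removed":
            continue  # deleted in this PR: nothing to fetch or import
        if not entry["filename"].endswith(".py"):
            continue  # the smoke test only imports Python sources
        keep.append(entry["filename"])
    return keep
```

With this filter in front of the fetch step, a PR that deletes old records folders never triggers a fetch of a path that returns a non-UTF-8 stub.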
