
Add Stack Integration + Legal TTT submission package#1050

Open
Taleef7 wants to merge 1 commit intoopenai:mainfrom
Taleef7:s05-submission

Conversation


@Taleef7 Taleef7 commented Mar 29, 2026

Summary

  • add the 2026-03-28 Stack Integration + Legal TTT + Parallel Muon submission folder
  • include README.md, submission.json, train.log, and the three audited train_seed*.log evidence files
  • keep the PR diff limited to records/track_10min_16mb/2026-03-28_StackIntegration_LegalTTT_ParallelMuon/

Verification

  • python experiments/audit_submission_package.py records/track_10min_16mb/2026-03-28_StackIntegration_LegalTTT_ParallelMuon/train_seed1337.log records/track_10min_16mb/2026-03-28_StackIntegration_LegalTTT_ParallelMuon/train_seed42.log records/track_10min_16mb/2026-03-28_StackIntegration_LegalTTT_ParallelMuon/train_seed2025.log
  • reviewed clean-worktree diff scope against origin/main before push

Copilot AI review requested due to automatic review settings March 29, 2026 02:02
Contributor

Copilot AI left a comment


Pull request overview

Adds a new Track (10min / 16MB) submission package under records/track_10min_16mb/2026-03-28_StackIntegration_LegalTTT_ParallelMuon/, bundling the promoted training script plus audited 3-seed evidence logs and metadata for reviewer consumption.

Changes:

  • Add promoted train_gpt.py submission script and canonical train.log plus 3 audited train_seed*.log evidence logs.
  • Add README.md describing provenance, run contract, and evidence inventory.
  • Add submission.json capturing audited metrics and artifact size metadata.

Reviewed changes

Copilot reviewed 3 out of 7 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| records/track_10min_16mb/2026-03-28_StackIntegration_LegalTTT_ParallelMuon/README.md | Reviewer-facing provenance and run/verification instructions for the promoted submission package. |
| records/track_10min_16mb/2026-03-28_StackIntegration_LegalTTT_ParallelMuon/submission.json | Audit-derived metadata (metrics, sizes, seeds) for the submission package. |
| records/track_10min_16mb/2026-03-28_StackIntegration_LegalTTT_ParallelMuon/train_gpt.py | The promoted training/eval script included as the submission's executable artifact. |
| records/track_10min_16mb/2026-03-28_StackIntegration_LegalTTT_ParallelMuon/train.log | Canonical log (intended alias of seed 2025) for quick review. |
| records/track_10min_16mb/2026-03-28_StackIntegration_LegalTTT_ParallelMuon/train_seed42.log | Seeded evidence log (42) for the audited 3-seed package. |
| records/track_10min_16mb/2026-03-28_StackIntegration_LegalTTT_ParallelMuon/train_seed2025.log | Seeded evidence log (2025) used as the canonical train.log content. |
| records/track_10min_16mb/2026-03-28_StackIntegration_LegalTTT_ParallelMuon/train_seed1337.log | Seeded evidence log (1337) for the audited 3-seed package. |


Comment on lines +54 to +56
```bash
python experiments/verify_run.py \
  records/track_10min_16mb/2026-03-28_StackIntegration_LegalTTT_ParallelMuon/train.log
```

Copilot AI Mar 29, 2026


README references experiments/verify_run.py, but there is no experiments/ directory in this repo. This makes the documented run/verification command unusable; please update the instructions to point at an existing verifier in the repository, or include the verifier script in this submission folder.

Comment on lines +77 to +82
```bash
python experiments/audit_submission_package.py \
records/track_10min_16mb/2026-03-28_StackIntegration_LegalTTT_ParallelMuon/train_seed1337.log \
records/track_10min_16mb/2026-03-28_StackIntegration_LegalTTT_ParallelMuon/train_seed42.log \
records/track_10min_16mb/2026-03-28_StackIntegration_LegalTTT_ParallelMuon/train_seed2025.log
```

Copilot AI Mar 29, 2026


README suggests regenerating metadata via python experiments/audit_submission_package.py, but that script/path does not exist in this repository. Either update the documentation to the correct script location/name or add the referenced audit helper to keep the package reproducible.

```json
"val_bpb": 1.1193666666666668,
"val_bpb_std": 0.0004642796092395205,
"bytes_total": 15990006,
"bytes_code": 91356,
```

Copilot AI Mar 29, 2026


bytes_code appears inconsistent with the evidence logs: in train_seed2025.log, Total submission size int6+lzma (15,990,006) minus Serialized model int6+lzma (15,900,548) implies a code size of 89,458 bytes, but this JSON records 91,356. Please correct bytes_code (and any derived fields) to match the audited artifact size computation.

Suggested change:

```diff
-"bytes_code": 91356,
+"bytes_code": 89458,
```
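The discrepancy Copilot flags is plain arithmetic over the two audited figures it quotes from train_seed2025.log, which can be checked directly:

```python
# Re-derive bytes_code from the two audited sizes quoted in the review comment.
total_int6_lzma = 15_990_006  # "Total submission size int6+lzma" (train_seed2025.log)
model_int6_lzma = 15_900_548  # "Serialized model int6+lzma" (train_seed2025.log)

bytes_code = total_int6_lzma - model_int6_lzma
print(bytes_code)  # 89458, not the 91356 recorded in submission.json
```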

Taleef7 added a commit to Taleef7/parameter-golf that referenced this pull request Apr 6, 2026
…th a dif…

- ".git/config"
- "records/track_10min_16mb/2026-03-28_StackIntegration_LegalTTT_ParallelMuon/README.md"
- ".gsd/KNOWLEDGE.md"
- ".gsd/milestones/M001/slices/S05/tasks/T03-SUMMARY.md"

GSD-Task: S05/T03
@MatoTeziTanka

Community Review — Add Stack Integration + Legal TTT submission package

BPB: (not parsed — see PR title) | Compliance: LOOKS CLEAN — score-first-per-chunk TTT (legal #1416/#1423 pattern)

What I found in the code (head SHA ef4d3f3ae3ee, file records/track_10min_16mb/2026-03-28_StackIntegration_LegalTTT_ParallelMuon/train_gpt.py):

The TTT path at line 1074 implements the score-first-per-chunk pattern: each chunk is scored under torch.no_grad() / inference_mode() before the base_model.train() + SGD adaptation runs on that same chunk, with an is_last_chunk guard so the final chunk gets no adaptation pass. This is the structural shape the legal frontier uses (PRs #1416 erichroepke, #1423 aryanbhosale).

Per Issue #402 and Issue #677, TTT is legal when each token is scored before the adapter updates on it, and that's what the code does here — chunk ci is scored under weights adapted only on chunks 0..ci-1. No prequant_ttt_adapt_adamw(val_tokens, ...) multi-epoch fine-tune, no scored-region SLOT, no target-in-key n-gram cache.
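As a rough illustration only (this is not the submission's actual code, and `loss_fn` and the toy model below are hypothetical stand-ins), the score-first-per-chunk shape described above can be sketched as:

```python
import torch
import torch.nn as nn

def score_first_per_chunk_ttt(model, loss_fn, chunks, lr=0.1):
    """Score each chunk under frozen weights, THEN adapt on it.

    Chunk ci is therefore always scored under weights adapted only on
    chunks 0..ci-1, and the final chunk gets no adaptation pass at all
    (the is_last_chunk guard).
    """
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    scores = []
    for ci, chunk in enumerate(chunks):
        model.eval()
        with torch.no_grad():  # score BEFORE any update on this chunk
            scores.append(float(loss_fn(model, chunk)))
        is_last_chunk = ci == len(chunks) - 1
        if not is_last_chunk:  # no adaptation after the final score
            model.train()
            opt.zero_grad()
            loss_fn(model, chunk).backward()
            opt.step()
    return scores

# Toy usage: a linear model adapting toward a constant target.
torch.manual_seed(0)
model = nn.Linear(4, 1)
target = torch.ones(8, 1)
loss_fn = lambda m, x: ((m(x) - target) ** 2).mean()
chunks = [torch.randn(8, 4) for _ in range(3)]
scores = score_first_per_chunk_ttt(model, loss_fn, chunks)
print(len(scores))  # one score per chunk, each taken before adapting on it
```

The structural point is that the `no_grad()` scoring always precedes the `backward()`/`step()` for the same chunk, which is what makes each token's score independent of any update computed from that token.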

CPU smoke test (CT2038 proteus-engine, 2026-04-11): import OK in 0.05s, dim=512, layers=11, vocab=1024, code=89458 B, SMOKE_TEST_PASS

Verdict: LOOKS CLEAN.

Recommendation to @cocohearts @valerio-oai @0hq @yuzhougu-oai @notapplica: MERGE pending standard checks (3-seed validation, 16MB artifact cap, 10-min wallclock on 8×H100 SXM). The compliance picture matches the legal reference frontier and no flags were raised by the classification pass.

Auto-classification caveat: this review was drafted by the AST-based classifier against a template derived from manually-reviewed cluster PRs (#1420, #1450, #1487, #1541, #1529, #1533, #1518). If I've misread a subtlety in your eval path — e.g., multi-epoch TTT that I mistook for single-pass, or a target-in-key lookup I missed in a helper function — please flag it and I'll re-run the audit manually.


Reviewed by @MatoTeziTanka (The Agora). Classification via deterministic AST-based classify_prs.py (pattern bank derived from ~65 manually-reviewed PRs earlier in the 2026-04-11 sweep). This review was auto-drafted from a template and spot-checked before posting; if the template misread your code, please call it out so I can iterate the classifier.
