Record: GDN-Hybrid + Sliding Window Attention + compressed-code warmdown1000 (cold-cache, 1.01671 BPB)#1575

Closed
joshkmartinez wants to merge 1 commit into openai:main from joshkmartinez:submission-run051-safe031-1.0167

@joshkmartinez

Summary

Joshua-owned SAFE_SUBMISSION update for the GDN-Hybrid family. This supersedes upstream PR #1564 with a better clean-lane mean from the repaired compressed-code warmdown1000 bundle.

Authoritative result

  • evaluation: quantized_bpb
  • 3-seed mean: 1.01671233 BPB
  • 3-seed std: 0.00134386 BPB
  • best seed: 1.015700 BPB (seed 1337)
  • artifact bytes max: 15,903,365
  • legality lane: SAFE_SUBMISSION (all pulled artifacts under 16,000,000 bytes)

Per-seed authoritative results

| Seed | Steps | EMA BPB | Quantized BPB | XSA BPB | Artifact bytes |
| --- | --- | --- | --- | --- | --- |
| 42 | 2227 | 1.007164 | 1.016200 | 1.021202 | 15,733,879 |
| 1337 | 2242 | 1.007164 | 1.015700 | 1.020105 | 15,903,365 |
| 2024 | 2227 | 1.009032 | 1.018237 | 1.024111 | 15,713,422 |
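
The headline figures can be cross-checked against the per-seed rows. The snippet below is a minimal sketch, not part of the submission tooling: it recomputes the 3-seed mean, the standard deviation, the best seed, and the largest artifact size from the Quantized BPB and Artifact bytes columns. The variable names are illustrative, and the use of a sample (n - 1) standard deviation is an assumption.

```python
import statistics

# Per-seed values copied from the Quantized BPB and Artifact bytes columns.
quantized_bpb = {42: 1.016200, 1337: 1.015700, 2024: 1.018237}
artifact_bytes = {42: 15_733_879, 1337: 15_903_365, 2024: 15_713_422}
BYTE_LIMIT = 16_000_000  # limit quoted in the legality-lane bullet

scores = list(quantized_bpb.values())
mean_bpb = statistics.mean(scores)
std_bpb = statistics.stdev(scores)  # sample std (n - 1); assumed convention
best_seed = min(quantized_bpb, key=quantized_bpb.get)
max_bytes = max(artifact_bytes.values())

print(f"mean={mean_bpb:.8f} std={std_bpb:.8f}")
print(f"best seed={best_seed} ({quantized_bpb[best_seed]:.6f})")
print(f"max artifact bytes={max_bytes:,} under limit={max_bytes < BYTE_LIMIT}")
```

With these inputs the recomputed mean and standard deviation agree with the reported 1.01671233 and 0.00134386 to eight decimal places, which is consistent with a sample rather than population standard deviation.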

Delta vs PR #1564

Notes

  • same SAFE_SUBMISSION / fixed-predictor Track-A legality lane
  • same GDN-Hybrid family, but with the compressed-code warmdown1000 repair that already proved itself on seed 1337 in run050-safe030
  • final authority comes from pulled TensorPool artifacts, not live logs

@joshkmartinez

Closing this stale submission after run055-reeval-gdn-bpbfix showed the advertised 1.01671233 BPB used non-canonical SentencePiece byte-accounting. A corrected clean restage of the strongest still-authoritative Joshua-owned artifact is now open as PR #1622 (run039-safe019, 1.01710033 BPB).
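
For context on why byte accounting matters: bits per byte divides the summed token negative log-likelihood (converted to bits) by the number of bytes in the evaluated text, so any change in how bytes are counted changes the score even when the model's predictions are identical. The sketch below uses hypothetical numbers and a hypothetical `bits_per_byte` helper. It does not reproduce the SentencePiece accounting used in any run, and the PR does not say in which direction the original count was off.

```python
import math


def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert a summed token negative log-likelihood (in nats) to bits per byte."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return total_nll_nats / (math.log(2) * total_bytes)


# Hypothetical values for illustration only; not taken from any run.
text = "héllo wörld"
total_nll_nats = 20.0

canonical_bytes = len(text.encode("utf-8"))  # UTF-8 byte length of the raw text
miscounted_bytes = canonical_bytes + 2       # e.g. if extra marker bytes were counted

print(f"canonical:  {bits_per_byte(total_nll_nats, canonical_bytes):.4f} BPB")
print(f"miscounted: {bits_per_byte(total_nll_nats, miscounted_bytes):.4f} BPB")
```

The same total loss yields a lower BPB when the byte count is inflated, so comparisons between submissions only hold when every run uses the same byte count for the same evaluation text.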
