With 1,500+ submissions now in the queue, I built a small static-analysis tool that might help narrow down review workloads: parameter-golf-checker.
Repo: https://github.com/Bortlesboat/parameter-golf-checker
What it is
A pure-Python AST-based checker that reads a submission's train_gpt.py and surfaces patterns that commonly correspond to compliance issues from #1017. No GPU, no execution — it just parses the file and walks the tree. It handles the lzma+base85 compressed wrapper shape used in several recent records.
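To make the "parse, don't execute" idea concrete, here is a minimal sketch of how a compressed wrapper could be unwrapped statically. It is not the checker's actual code: `unwrap_source` is a hypothetical helper, and it assumes the wrapper stores its payload as a single string or bytes literal that is base85-encoded LZMA data.

```python
import ast
import base64
import lzma


def unwrap_source(source: str) -> str:
    """Return the inner program if `source` embeds an lzma+base85 payload.

    Hypothetical helper: tries every string/bytes literal in the file and
    returns the first one that decodes cleanly. Nothing is executed.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not isinstance(node, ast.Constant):
            continue
        if not isinstance(node.value, (str, bytes)):
            continue
        try:
            raw = base64.b85decode(node.value)
            return lzma.decompress(raw).decode("utf-8")
        except (ValueError, lzma.LZMAError, UnicodeDecodeError):
            # Not a base85+LZMA payload; keep looking.
            continue
    # No payload found: treat the file as plain source.
    return source
```

The returned text can then be passed to `ast.parse` again, so the same checks run on wrapped and unwrapped submissions alike.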
What it checks (8 checks total)
- C1 Causality, C2 Normalization, C3 Score-before-update, C4 Single-pass
- N-gram cache / oracle-selection patterns
- SLOT / LBFGS / torch.linalg.solve / second-order updates on val
- 16 MB artifact estimate
- Network imports
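For a sense of what one of these checks can look like, below is a rough sketch of a network-import check built on `ast.walk`. The function name `find_network_imports` and the module list are assumptions for illustration; the real tool may use a different list and different logic.

```python
import ast

# Illustrative list only; an assumption, not the checker's actual set.
NETWORK_MODULES = {"socket", "requests", "urllib", "http", "httpx", "aiohttp"}


def find_network_imports(source: str) -> list[tuple[int, str]]:
    """Return sorted (line number, module name) pairs for network-ish imports."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports such as `from . import x` have module=None.
            names = [node.module or ""]
        else:
            continue
        for name in names:
            # Compare on the top-level package, e.g. "urllib" in "urllib.request".
            if name.split(".")[0] in NETWORK_MODULES:
                found.append((node.lineno, name))
    return sorted(found)
```

A hit here is only a prompt for a human to look: an import alone does not show that anything is fetched at run time.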
What it is NOT
It is explicitly a reviewer aid, not an authoritative PASS/FAIL judge. Every PASS is a weak signal, every WARN means "human should look at this," and every FAIL is a hypothesis. Aliasing, helper-indirected scoring, and metaprogramming can all fool it. I tightened it after a harsh code review until it emits zero false FAILs on the 10 most recent merged records I tested (including #1555, #1560, #1561, #1572, #1576), but I'm sure there are still patterns it misses in both directions.
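To show why a PASS is only a weak signal, here is a small sketch of a name-based check being evaded. `mentions_lbfgs` is a deliberately naive, hypothetical check, not the tool's implementation; the two source strings are made-up examples and are only parsed, never run.

```python
import ast


def mentions_lbfgs(source: str) -> bool:
    """Naive check: is the identifier LBFGS referenced anywhere in the tree?"""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Attribute) and node.attr == "LBFGS":
            return True
        if isinstance(node, ast.Name) and node.id == "LBFGS":
            return True
    return False


# Direct use: the attribute name appears literally in the AST.
direct = "import torch\nopt = torch.optim.LBFGS(params)\n"

# Indirect use: the name is assembled at run time, so no node carries it.
indirect = "import torch\nopt = getattr(torch.optim, 'LB' + 'FGS')(params)\n"

direct_flagged = mentions_lbfgs(direct)      # True
indirect_flagged = mentions_lbfgs(indirect)  # False
```

Both snippets would construct the same optimizer, but only the first is visible to a purely syntactic check, which is why results should be read as hints for a reviewer.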
Usage
```bash
python pgolf_checker.py --pr 1576
python pgolf_checker.py --file path/to/train_gpt.py
python pgolf_checker.py --pr 1576 --format json
```
MIT licensed. Happy to fold in whatever the organizers find useful — if there's a violation pattern I'm missing or a check you'd want added, issues and PRs welcome. And if you'd rather it not exist, just say so and I'll take it down.