[Non-Record] Whirlpool v5b — Non-Euclidean Lorentzian Attention on the Hyperboloid Manifold#1239

Open
tmancino wants to merge 1 commit into openai:main from tmancino:whirlpool-v5b-submission

Conversation


@tmancino tmancino commented Apr 1, 2026

Summary

Non-record submission exploring non-Euclidean geometry in attention. To our knowledge, the first submission to replace dot-product attention with Minkowski inner products on a hyperboloid manifold.

  • val_bpb: 1.5918 (SEED=314) | 12.2 MB artifact | 8xH100 SXM
  • Key finding: Lorentzian attention is trainable with proper stabilization (scale clamping plus a warmup over the first 20% of training together improve val BPB by 0.88)
  • Custom Flash Lorentz Attention Triton kernel — fused hyperboloid projection + Minkowski inner + centroid aggregation
  • 3 shared blocks x 8 curvature-progressive orbits (23.9M stored params, 8x depth via weight sharing)
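The stabilization recipe above can be sketched as a schedule on the attention-scale clamp. This is a minimal illustration, not the submission's code: the ceiling value of 10.0 and the linear warmup shape are assumptions; only the "clamp + 20% warmup" combination comes from the PR.

```python
def attn_scale(step, total_steps, max_scale=10.0, warmup_frac=0.2):
    # Warm the clamp ceiling linearly over the first 20% of steps,
    # then hold it at max_scale for the rest of training.
    # max_scale and the linear shape are illustrative assumptions.
    warmup_steps = max(1, int(total_steps * warmup_frac))
    if step < warmup_steps:
        return max_scale * (step + 1) / warmup_steps
    return max_scale
```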

What's Novel

  1. Lorentzian attention: Queries/keys projected onto the hyperboloid, scored via Minkowski inner product instead of dot product
  2. Flash Lorentz Attention kernel: Custom Triton kernel fusing the full non-Euclidean pipeline — registered as custom_op for torch.compile compatibility
  3. Progressive curvature orbits: Each orbit sees a different curvature (0.1→2.0), from nearly-flat local patterns to highly-curved hierarchical dependencies
  4. Parallel GPU eval: After training, each GPU independently runs a different TTT hyperparameter — best reported
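To make item 1 concrete: in the hyperboloid model, a Euclidean vector is lifted to the manifold by solving for its time coordinate, and pairs are scored with the Minkowski inner product instead of the dot product. The sketch below is a plain-Python illustration of that math, not the fused Triton kernel; the clamp bound is an assumption carried over from the stabilization note above.

```python
import math

def lift_to_hyperboloid(v, c=1.0):
    # Lift Euclidean v onto the hyperboloid <x,x>_L = -1/c by solving
    # for the time coordinate x0 = sqrt(1/c + ||v||^2).
    x0 = math.sqrt(1.0 / c + sum(vi * vi for vi in v))
    return [x0] + list(v)

def minkowski_inner(x, y):
    # Minkowski (Lorentzian) inner product: -x0*y0 + <space, space>.
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def lorentz_score(q, k, c=1.0, max_scale=10.0):
    # Attention logit: Minkowski inner of the lifted q and k, clamped
    # for stability (the clamp bound is illustrative, not from the PR).
    s = minkowski_inner(lift_to_hyperboloid(q, c), lift_to_hyperboloid(k, c))
    return max(-max_scale, min(max_scale, s))
```

A sanity check on the geometry: every lifted point satisfies ⟨x, x⟩_L = -1/c, so the self-score of a token is constant regardless of its Euclidean norm, which is one reason the score scale behaves differently from dot-product attention.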

Architecture

| Component  | Setting |
| ---------- | ------- |
| d_model    | 768 |
| Attention  | Lorentzian (Minkowski inner product on hyperboloid) |
| Heads      | 12 (GQA 6:1, head_dim=64) |
| MLP        | 5x LeakyReLU(0.5)² (fused Triton kernel) |
| Depth      | 8 orbits through 3 shared blocks |
| Curvature  | 0.1 to 2.0 (progressive) |
| Optimizer  | MuonAdamW (lr=0.04, wd=0.12) |
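The Depth and Curvature rows combine as follows: the 3 stored blocks are reused on each of 8 orbits, and each orbit runs at its own curvature. A hedged sketch, assuming a linear curvature ramp (only the 0.1 and 2.0 endpoints are stated in the PR):

```python
N_ORBITS, N_BLOCKS = 8, 3

def orbit_curvatures(c_min=0.1, c_max=2.0, n=N_ORBITS):
    # Interpolate from nearly flat (0.1) to highly curved (2.0);
    # the linear shape is an assumption, only the endpoints are from the PR.
    return [c_min + (c_max - c_min) * i / (n - 1) for i in range(n)]

def run_orbits(x, blocks, curvatures):
    # blocks: the 3 weight-shared blocks, reused on every orbit and
    # called with that orbit's curvature -> 8 * 3 = 24 effective layers
    # from 3 blocks' worth of stored parameters.
    for c in curvatures:
        for block in blocks:
            x = block(x, c)
    return x
```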

Files

  • submission.json — leaderboard metadata
  • README.md — full architecture writeup, results, limitations, development journey
  • train_gpt.py — single-file submission (73KB)
  • train_seed314.log — full training + eval log

