Add Muonremez and test #2285
base: main
Conversation
Pull request overview
This PR adds the MuonRemez optimizer, a variant of the Muon optimizer that uses a coupled Newton-Schulz iteration to compute matrix square roots (U·Σ^(1/2)·V^T) of the update instead of the Newton-Schulz orthogonalization used by Muon (a sketch of the iteration follows the key changes below). The implementation includes configuration, state management, and supporting mathematical functions, along with an experiment script to test the optimizer through learning rate sweeps on small Llama models.
Key changes:
- Implemented MuonRemez optimizer with coupled Newton-Schulz iteration for computing matrix square roots
- Integrated optimizer into the public API with proper registration
- Created experiment script for learning rate sweep testing on 300M Llama model
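For reviewers unfamiliar with the technique, here is a minimal sketch in JAX (the framework Levanter is built on) of a coupled Newton-Schulz square-root iteration and how it can produce a U·Σ^(1/2)·V^T-style update. This is not the PR's code: the PR describes a quintic coupled iteration, whereas the sketch uses the classic cubic one, and the function names, iteration count, and Frobenius normalization are illustrative assumptions. The link to the update is the identity U·Σ^(1/2)·V^T = G·(GᵀG)^(-1/4) for G = U·Σ·Vᵀ, so a routine that returns both A^(1/2) and A^(-1/2) can simply be applied twice to GᵀG.

```python
import jax
import jax.numpy as jnp


def coupled_newton_schulz_sqrt(a, num_iters: int = 15, eps: float = 1e-7):
    """Approximate A^(1/2) and A^(-1/2) for a symmetric PSD matrix A.

    Classic cubic coupled Newton-Schulz iteration (not the PR's quintic variant):
        Y_0 = A / ||A||_F,  Z_0 = I
        T_k = (3 I - Z_k Y_k) / 2
        Y_{k+1} = Y_k T_k,  Z_{k+1} = T_k Z_k
    Then Y_k -> (A/||A||_F)^(1/2) and Z_k -> (A/||A||_F)^(-1/2); rescale at the end.
    """
    dim = a.shape[-1]
    identity = jnp.eye(dim, dtype=a.dtype)
    norm = jnp.linalg.norm(a) + eps  # Frobenius norm keeps the spectrum in the convergence region
    y, z = a / norm, identity

    def body(_, carry):
        y, z = carry
        t = 0.5 * (3.0 * identity - z @ y)
        return y @ t, t @ z

    y, z = jax.lax.fori_loop(0, num_iters, body, (y, z))
    return y * jnp.sqrt(norm), z / jnp.sqrt(norm)  # ≈ A^(1/2), A^(-1/2)


def sqrt_sign_update(grad):
    """Map a 2D gradient G = U Σ Vᵀ to U Σ^(1/2) Vᵀ via G @ (GᵀG)^(-1/4).

    Two applications of the square-root routine give the inverse fourth root:
    the first yields (GᵀG)^(1/2), the second the inverse square root of that.
    """
    gram = grad.T @ grad
    gram_sqrt, _ = coupled_newton_schulz_sqrt(gram)
    _, gram_inv_quarter = coupled_newton_schulz_sqrt(gram_sqrt)  # ≈ (GᵀG)^(-1/4)
    return grad @ gram_inv_quarter
```

In an optimizer, something like `sqrt_sign_update` would replace the orthogonalization step applied to the (momentum-averaged) gradient of each 2D parameter; the sketch assumes GᵀG is reasonably well conditioned, and a small ridge term may be needed when G is rank deficient.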
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| lib/levanter/src/levanter/optim/muonremez.py | Complete MuonRemez optimizer implementation with config class, gradient transformation, and coupled Newton-Schulz quintic algorithm for matrix square root computation |
| lib/levanter/src/levanter/optim/__init__.py | Registered and exported MuonRemezConfig in the optimizer module's public interface |
| experiments/exp2284_test_remez.py | Experiment setup for testing MuonRemez with learning rate sweeps on a 300M parameter Llama model |
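To make the experiment row above concrete, here is a heavily simplified, hypothetical sketch of a learning rate sweep: `TrainConfig`, `run_training`, and the config fields are illustrative stand-ins, not the actual exp2284_test_remez.py script or the repository's experiment API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MuonRemezConfig:
    # Hypothetical fields for illustration; the real config lives in muonremez.py.
    learning_rate: float = 0.02
    weight_decay: float = 0.0


@dataclass(frozen=True)
class TrainConfig:
    # Stand-in for a full training configuration (model, data, optimizer, ...).
    model_size: str = "llama-300m"
    optimizer: MuonRemezConfig = MuonRemezConfig()


def run_training(config: TrainConfig) -> float:
    """Placeholder for launching a training run; returns the final eval loss."""
    raise NotImplementedError


def lr_sweep(base: TrainConfig, learning_rates: list[float]) -> dict[float, float]:
    # One run per learning rate, identical otherwise, so differences in the
    # final loss can be attributed to the learning rate alone.
    results = {}
    for lr in learning_rates:
        cfg = replace(base, optimizer=replace(base.optimizer, learning_rate=lr))
        results[lr] = run_training(cfg)
    return results


# Typical usage: sweep a log-spaced grid and pick the best-performing run.
# results = lr_sweep(TrainConfig(), [1e-3, 3e-3, 1e-2, 3e-2])
```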
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.
This pull request introduces a new optimizer, MuonRemez, proposed by @mahyarjn80, and sets up an experiment to perform learning rate sweeps on small Llama models using this optimizer. The changes span the addition of the optimizer implementation, its integration into the codebase, and the creation of an experiment script to test its performance.
Addition of MuonRemez optimizer:
- Added the MuonRemezConfig optimizer, which uses a coupled Newton-Schulz iteration to compute matrix square roots for weight updates, along with supporting functions and state management in muonremez.py.
- Registered MuonRemezConfig in the optimizer module's public interface (__init__.py). [1] [2]

Experiment setup for MuonRemez:
- Added exp2284_test_remez.py, which defines and runs a learning rate sweep experiment comparing MuonRemez and non-MuP variants on a small Llama model using the new optimizer configuration.

## Description

Fixes #2284