
add basic support for the optimi adamw optimizer #1727

Merged: 7 commits from optimi-optimizer into main on Jul 14, 2024
Conversation

@winglian (Collaborator) commented on Jul 5, 2024

https://optimi.benjaminwarner.dev/kahan_summation/

Kahan Summation

Kahan summation is a technique to reduce the numerical error of adding multiple low precision numbers by accumulating errors in a separate compensation buffer. The addition of the compensation buffer increases the effective summation precision by the precision of the compensation buffer.
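For illustration only (this is not optimi's implementation), here is a minimal sketch of compensated summation in PyTorch with bfloat16 buffers; the helper name `kahan_add_` is hypothetical:

```python
import torch

def kahan_add_(total: torch.Tensor, compensation: torch.Tensor, update: torch.Tensor) -> None:
    # Fold the low-order bits lost on previous steps back into the update.
    corrected = update - compensation
    new_total = total + corrected
    # (new_total - total) is what was actually added after rounding; its
    # difference from `corrected` is the rounding error, saved for next step.
    compensation.copy_((new_total - total) - corrected)
    total.copy_(new_total)

total = torch.zeros(1, dtype=torch.bfloat16)
comp = torch.zeros(1, dtype=torch.bfloat16)
naive = torch.zeros(1, dtype=torch.bfloat16)
step = torch.tensor([1e-4], dtype=torch.bfloat16)
for _ in range(10_000):
    kahan_add_(total, comp, step)
    naive += step
print(total, naive)  # the compensated sum stays near 1.0; the naive sum stalls early
```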

Using Kahan summation to improve low precision model training was first introduced by Zamirai et al. in Revisiting BFloat16 Training. Zamirai et al. found that the primary source of numerical error in low precision training is the optimizer's model weight update step. They added Kahan summation to the SGD and AdamW weight update steps to reduce the update's numerical inaccuracy, bringing low precision training to parity with full precision training across the models they tested.
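As a hedged sketch of the underlying library API this PR wires in (the `kahan_sum` keyword and its default behavior are taken from the linked optimi docs, not from this diff):

```python
import torch
from optimi import AdamW

model = torch.nn.Linear(256, 256, dtype=torch.bfloat16)

# Per the optimi docs, kahan_sum defaults to on for low precision parameters;
# it is spelled out here for clarity.
optimizer = AdamW(model.parameters(), lr=1e-3, weight_decay=0.01, kahan_sum=True)
```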

@winglian force-pushed the optimi-optimizer branch 3 times, most recently from dfb58f9 to 5c36692 on July 13, 2024 01:24
@winglian merged commit 78e12f8 into main on Jul 14, 2024
8 checks passed
@winglian deleted the optimi-optimizer branch on July 14, 2024 23:14