DeepSpeed v0.17.0
What's Changed
- Update next version in version.txt after 0.16.9 release. by @loadams in #7306
- Update COMMITTERS.md by @PKUWZP in #7305
- Fix AutoTP gathering replaced layer params when bias is not None by @HollowMan6 in #7257
- Fix the GPU memory usage of ZeRO-Offload (only update stage_1_and_2.py) by @arminzhu in #7309
- Fix: Update grad norm calculation for CPU offload by @therealnaveenkamal in #7302
- CI: prefer bf16 over fp16 by @stas00 in #7304
- tests/conftest.py: automatically add local deepspeed repo when running tests by @stas00 in #7317
- Update gaudi2 nightly,ci to latest 1.21.0 build by @raza-sikander in #7313
- anchor transformers version by @stas00 in #7316
- fix asymmetric in dequantize by @pencil-hub in #7283
- Ulysses SP for HF Integration by @stas00 in #7268
- Fix ci hang in torch2.7 & improve ut by @inkcherry in #7321
- Bump to v0.17.0 by @sfc-gh-mwyatt in #7324
New Contributors
- @PKUWZP made their first contribution in #7305
- @arminzhu made their first contribution in #7309
- @therealnaveenkamal made their first contribution in #7302
- @pencil-hub made their first contribution in #7283
- @sfc-gh-mwyatt made their first contribution in #7324
Full Changelog: v0.16.9...v0.17.0