v0.14.2
Release date: 2024-04-24 07:25:20
Latest microsoft/DeepSpeed release: v0.15.1 (2024-09-05 09:30:51)
What's Changed
- Update version.txt after 0.14.1 release by @mrwyattii in https://github.com/microsoft/DeepSpeed/pull/5413
- Remove dtype(fp16) condition check for residual_add unit test by @raza-sikander in https://github.com/microsoft/DeepSpeed/pull/5329
- [XPU] Use non_daemonic_proc by default on XPU device by @ys950902 in https://github.com/microsoft/DeepSpeed/pull/5412
- Fix a convergence issue in TP topology caused by incorrect grad_norm. by @inkcherry in https://github.com/microsoft/DeepSpeed/pull/5411
- Update 'create-pr' action in release workflow to latest by @loadams in https://github.com/microsoft/DeepSpeed/pull/5415
- Update engine.py to avoid torch warning by @etiennebonnafoux in https://github.com/microsoft/DeepSpeed/pull/5408
- Update _sidebar.scss by @fasterinnerlooper in https://github.com/microsoft/DeepSpeed/pull/5293
- Add more tests into XPU CI by @Liangliang-Ma in https://github.com/microsoft/DeepSpeed/pull/5427
- [CPU] Support SHM based inference_all_reduce in TorchBackend by @delock in https://github.com/microsoft/DeepSpeed/pull/5391
- Add required paths to trigger AMD tests on PRs by @loadams in https://github.com/microsoft/DeepSpeed/pull/5406
- Bug fix in `split_index` method by @bm-synth in https://github.com/microsoft/DeepSpeed/pull/5292
- Parallel map step for `DistributedDataAnalyzer` map-reduce by @bm-synth in https://github.com/microsoft/DeepSpeed/pull/5291
- Selective dequantization by @RezaYazdaniAminabadi in https://github.com/microsoft/DeepSpeed/pull/5375
- Fix sorting of shard optimizer states files for universal checkpoint by @tohtana in https://github.com/microsoft/DeepSpeed/pull/5395
- add device config env for the accelerator by @shiyuan680 in https://github.com/microsoft/DeepSpeed/pull/5396
- 64bit indexing fused adam by @garrett4wade in https://github.com/microsoft/DeepSpeed/pull/5187
- Improve parallel process of universal checkpoint conversion by @tohtana in https://github.com/microsoft/DeepSpeed/pull/5343
- set the default to use set_to_none for clearing gradients in BF16 optimizer. by @inkcherry in https://github.com/microsoft/DeepSpeed/pull/5434
- OptimizedLinear implementation by @jeffra in https://github.com/microsoft/DeepSpeed/pull/5355
- Update README.md by @Jhonso7393 in https://github.com/microsoft/DeepSpeed/pull/5453
- Update PyTest torch version to match PyTorch latest official (2.3.0) by @loadams in https://github.com/microsoft/DeepSpeed/pull/5454
New Contributors
- @etiennebonnafoux made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5408
- @fasterinnerlooper made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5293
- @shiyuan680 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5396
- @garrett4wade made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5187
- @Jhonso7393 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/5453
Full Changelog: https://github.com/microsoft/DeepSpeed/compare/v0.14.1...v0.14.2