v0.15.1
Release date: 2024-09-05 09:30:51
Latest microsoft/DeepSpeed release: v0.15.1 (2024-09-05 09:30:51)
What's Changed
- Update version.txt after 0.15.0 release by @loadams in https://github.com/microsoft/DeepSpeed/pull/6403
- Fix Type Mismatch by @jomayeri in https://github.com/microsoft/DeepSpeed/pull/6410
- Fix redundant seq data parallel grp argument in Z3/MiCS by @samadejacobs in https://github.com/microsoft/DeepSpeed/pull/5352
- add Huawei Ascend NPU setup guide by @xuedinge233 in https://github.com/microsoft/DeepSpeed/pull/6445
- Add documentation for launcher without SSH by @dogacancolak-kensho in https://github.com/microsoft/DeepSpeed/pull/6455
- Dtype support check for accelerator in UTs by @raza-sikander in https://github.com/microsoft/DeepSpeed/pull/6360
- Store/Load CIFAR from local/offline by @raza-sikander in https://github.com/microsoft/DeepSpeed/pull/6390
- Add the accelerator setup guide link in Getting Started page by @rogerxfeng8 in https://github.com/microsoft/DeepSpeed/pull/6452
- Allow triton==3.0.x for fp_quantizer by @siddartha-RE in https://github.com/microsoft/DeepSpeed/pull/6447
- Change GDS to 1 AIO thread by @jomayeri in https://github.com/microsoft/DeepSpeed/pull/6459
- [CCL] fix condition issue in ccl.py by @YizhouZ in https://github.com/microsoft/DeepSpeed/pull/6443
- Avoid gds build errors on ROCm by @rraminen in https://github.com/microsoft/DeepSpeed/pull/6456
- TestLowCpuMemUsage UT get device by device_name by @raza-sikander in https://github.com/microsoft/DeepSpeed/pull/6397
- Add workflow to build DS without torch to better test before releases by @loadams in https://github.com/microsoft/DeepSpeed/pull/6450
- Fix patch for parameter partitioning in zero.Init() by @tohtana in https://github.com/microsoft/DeepSpeed/pull/6388
- Add default value to "checkpoint_folder" in "load_state_dict" of bf16_optimizer by @ljcc0930 in https://github.com/microsoft/DeepSpeed/pull/6446
- DeepNVMe tutorial by @tjruwase in https://github.com/microsoft/DeepSpeed/pull/6449
- bf16_optimizer: fixes to different grad acc dtype by @nelyahu in https://github.com/microsoft/DeepSpeed/pull/6485
- print warning if actual triton cache dir is on NFS, not just for default by @jrandall in https://github.com/microsoft/DeepSpeed/pull/6487
- DS_BUILD_OPS should build only compatible ops by @tjruwase in https://github.com/microsoft/DeepSpeed/pull/6489
- Safe usage of popen by @tjruwase in https://github.com/microsoft/DeepSpeed/pull/6490
- Handle an edge case where CUDA_HOME is not defined on ROCm systems by @amorehead in https://github.com/microsoft/DeepSpeed/pull/6488
New Contributors
- @xuedinge233 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/6445
- @siddartha-RE made their first contribution in https://github.com/microsoft/DeepSpeed/pull/6447
- @ljcc0930 made their first contribution in https://github.com/microsoft/DeepSpeed/pull/6446
- @jrandall made their first contribution in https://github.com/microsoft/DeepSpeed/pull/6487
- @amorehead made their first contribution in https://github.com/microsoft/DeepSpeed/pull/6488
Full Changelog: https://github.com/microsoft/DeepSpeed/compare/v0.15.0...v0.15.1