v0.7.4

microsoft/DeepSpeed

Released: 2022-10-22 05:11:28

Latest microsoft/DeepSpeed release: v0.15.1 (2024-09-05 09:30:51)

What's Changed

MOE residual matmult unit test by @samadejacobs in https://github.com/microsoft/DeepSpeed/pull/2323
MOE matmult with memaccess by @samadejacobs in https://github.com/microsoft/DeepSpeed/pull/2336
Refactor residual add kernels by @arashb in https://github.com/microsoft/DeepSpeed/pull/2333
mem access for quantize kernel by @GuanhuaWang in https://github.com/microsoft/DeepSpeed/pull/2331
increase min pre-commit versions by @jeffra in https://github.com/microsoft/DeepSpeed/pull/2346
Extend scratch buffer for long prompts by @cmikeh2 in https://github.com/microsoft/DeepSpeed/pull/2212
[docs] fix zero docs by @jeffra in https://github.com/microsoft/DeepSpeed/pull/2350
Staging profile inference v1 (#2348) by @awan-10 in https://github.com/microsoft/DeepSpeed/pull/2349
Kernel Data Conversion Utility by @cmikeh2 in https://github.com/microsoft/DeepSpeed/pull/2327
Add Onebit Optimizers in init by @l4d2boomer in https://github.com/microsoft/DeepSpeed/pull/2340
docs(mixture-of-experts-inference): fix typo in tuto by @jqueguiner in https://github.com/microsoft/DeepSpeed/pull/2345
Use blob storage for datasets in unit tests by @mrwyattii in https://github.com/microsoft/DeepSpeed/pull/2342
Refactor gptj_residual_add kernels for better readability by @arashb in https://github.com/microsoft/DeepSpeed/pull/2358
Updated issue templates by @jeffra in https://github.com/microsoft/DeepSpeed/pull/2363
fix cuda invalid config error in dequant kernel by @GuanhuaWang in https://github.com/microsoft/DeepSpeed/pull/2362
Add missing pytest fixture scope by @arashb in https://github.com/microsoft/DeepSpeed/pull/2353
Extend residual_add kernel tests to cover pre_attn_norm by @arashb in https://github.com/microsoft/DeepSpeed/pull/2354
Refactor fused_bias_residual kernels for better readability by @arashb in https://github.com/microsoft/DeepSpeed/pull/2356
Capture error message during sweep tests by @molly-smith in https://github.com/microsoft/DeepSpeed/pull/2351
Fix an exception when auto-casting dicts to fp16 by @mjksmith in https://github.com/microsoft/DeepSpeed/pull/2370
Refactor remaining distributed tests by @mrwyattii in https://github.com/microsoft/DeepSpeed/pull/2216
Fix the MLP output tensor's shape by @arashb in https://github.com/microsoft/DeepSpeed/pull/2380
add 11.8 to cuda_minor_mismatch_ok to allow building with current CUDA by @Thomas-MMJ in https://github.com/microsoft/DeepSpeed/pull/2390
Pin Transformers test version by @mrwyattii in https://github.com/microsoft/DeepSpeed/pull/2402
Change type to tuple in replace_wo_policy isinstance check by @lekurile in https://github.com/microsoft/DeepSpeed/pull/2387
Checkpoint backwards-compatibility workaround by @tjruwase in https://github.com/microsoft/DeepSpeed/pull/2384
Add Predicated Global Load to Memory Access Utils by @cmikeh2 in https://github.com/microsoft/DeepSpeed/pull/2373
MII blog post by @jeffra in https://github.com/microsoft/DeepSpeed/pull/2418
Fix figure reference by @awan-10 in https://github.com/microsoft/DeepSpeed/pull/2419
Add SLURM Multinode Runner by @dashstander in https://github.com/microsoft/DeepSpeed/pull/2404
Fix issue with corrupted output on long generation for GPT by @andrewchernyh in https://github.com/microsoft/DeepSpeed/pull/2359
Fix GPT Neo-X multi-gpu inference by @andrewchernyh in https://github.com/microsoft/DeepSpeed/pull/2401
CI fixes related to triton by @jeffra in https://github.com/microsoft/DeepSpeed/pull/2422
[docs] update mii blog title by @jeffra in https://github.com/microsoft/DeepSpeed/pull/2423
add SD injection policy by @jeffra in https://github.com/microsoft/DeepSpeed/pull/2381
Fix checkpoint loading when it is a dictionary by @RezaYazdaniAminabadi in https://github.com/microsoft/DeepSpeed/pull/2425
Make error regex more generic in collect_results.py by @molly-smith in https://github.com/microsoft/DeepSpeed/pull/2415
fixes #2389 by @clumsy in https://github.com/microsoft/DeepSpeed/pull/2411
Fix for inference gpt-j test by @mrwyattii in https://github.com/microsoft/DeepSpeed/pull/2430
Fixing bug 2361 by @jomayeri in https://github.com/microsoft/DeepSpeed/pull/2410
Universal checkpoint for zero stage 1 by @tjruwase in https://github.com/microsoft/DeepSpeed/pull/2284
only add deps if extra is explicitly called by @jeffra in https://github.com/microsoft/DeepSpeed/pull/2432
Add TestInjectionPolicy inference unittest class for testing custom injection policies by @lekurile in https://github.com/microsoft/DeepSpeed/pull/2426
[memory estimators] new config args sync by @stas00 in https://github.com/microsoft/DeepSpeed/pull/2431
parallelize writing of layer checkpoint files across data parallel instances by @adammoody in https://github.com/microsoft/DeepSpeed/pull/1419
Fix broken link to DeepSpeed Megatron fork by @lekurile in https://github.com/microsoft/DeepSpeed/pull/2440

New Contributors

@l4d2boomer made their first contribution in https://github.com/microsoft/DeepSpeed/pull/2340
@jqueguiner made their first contribution in https://github.com/microsoft/DeepSpeed/pull/2345
@mjksmith made their first contribution in https://github.com/microsoft/DeepSpeed/pull/2370
@Thomas-MMJ made their first contribution in https://github.com/microsoft/DeepSpeed/pull/2390
@lekurile made their first contribution in https://github.com/microsoft/DeepSpeed/pull/2387
@dashstander made their first contribution in https://github.com/microsoft/DeepSpeed/pull/2404
@andrewchernyh made their first contribution in https://github.com/microsoft/DeepSpeed/pull/2359
@clumsy made their first contribution in https://github.com/microsoft/DeepSpeed/pull/2411
@jomayeri made their first contribution in https://github.com/microsoft/DeepSpeed/pull/2410

Full Changelog: https://github.com/microsoft/DeepSpeed/compare/v0.7.3...v0.7.4
