v2.4.0

modelscope/ms-swift

Release date: 2024-09-13 12:50:56

Latest release of modelscope/ms-swift: v2.5.0 (2024-10-10 10:21:04)

English Version

New Features:

Support for training with Liger on models such as LLaMA, Qwen, and Mistral, reducing memory usage by 10% to 60%.
Support for custom loss function training using a registration mechanism.
Training now supports pushing models to ModelScope and HuggingFace.
Support for the freeze_vit parameter, which controls the behavior of full-parameter training for multimodal models.
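
The registration mechanism for custom loss functions can be pictured with a small sketch. The code below is not ms-swift's API: `LOSS_REGISTRY`, `register_loss`, `get_loss`, and the two example losses are hypothetical names chosen for illustration, and the losses work on plain Python lists rather than tensors. It only shows the general pattern, in which a decorator stores a function under a string name so that a trainer can look it up from a configuration value.

```python
from typing import Callable, Dict, List

# Hypothetical registry for illustration only; ms-swift's real registry,
# decorator name, and loss signature may differ.
LossFn = Callable[[List[float], List[float]], float]
LOSS_REGISTRY: Dict[str, LossFn] = {}


def register_loss(name: str) -> Callable[[LossFn], LossFn]:
    """Return a decorator that stores a loss function under `name`."""
    def decorator(fn: LossFn) -> LossFn:
        if name in LOSS_REGISTRY:
            raise ValueError(f"loss {name!r} is already registered")
        LOSS_REGISTRY[name] = fn
        return fn
    return decorator


@register_loss("mse")
def mse_loss(preds: List[float], targets: List[float]) -> float:
    """Mean squared error over two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


@register_loss("mae")
def mae_loss(preds: List[float], targets: List[float]) -> float:
    """Mean absolute error over two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)


def get_loss(name: str) -> LossFn:
    """Look up a registered loss by name, as a trainer might from a config value."""
    if name not in LOSS_REGISTRY:
        raise KeyError(f"unknown loss {name!r}; registered: {sorted(LOSS_REGISTRY)}")
    return LOSS_REGISTRY[name]
```

With this pattern, selecting a loss reduces to passing a string such as `"mse"` to `get_loss`, and adding a new loss does not require touching the training loop. Registering the same name twice raises an error instead of silently replacing the earlier function.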

New Models:

Qwen2-VL series, including GPTQ/AWQ quantized models. For best practices, see the Qwen2-VL documentation.
InternVL2 AWQ quantized models.

New Datasets:

qwen2-pro series

What's Changed

compat with vllm==0.5.5 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1812
Support zero2 offload by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1814
fix mp+ddp & resume_from_checkpoint by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1815
fix preprocess_num_proc by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1818
Support liger by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1819
fix dora deployment by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1821
Support register loss func by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1822
use default-lora by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1823
fix minicpm-v 2.6 infer device_map by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1832
Fix code by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1824
fix inject by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1835
support qwen2-pro dataset by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1834
add ddp_timeout parameter by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1836
fix internlm-xcomposer rlhf by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1838
Support eval_nproc by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1843
support qwen2-vl by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1842
Add internvl2 awq models by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1846
Fix some datasets for streaming by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1848
Fix Pissa and OLoRA by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1852
Support qwen2 vl grounding by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1854
support qwen2-vl & video finetune by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1849
Update new datasets by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1855
update qwen2-vl docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1856
update qwen2-vl docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1858
fix qwen2-vl docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1861
fix requirements by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1864
update docs qwen2-vl by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1869
Support faster data map by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1871
[TorchAcc] fix several bugs for torchacc FSDP. by @baoleai in https://github.com/modelscope/ms-swift/pull/1872
Add train record by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1873
Fix num_proc by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1874
Fix neftune doc by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1875
add duet by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1877
use model.generation_config by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1850
Support freeze vit by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1880
support qwen2-vl gptq awq by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1884
Refactor push_to_hub by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1883
Fix push to hub logic by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1888
add vllm lmdeploy benchmark by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1889
Add some warnings and fix RLHF by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1890

Full Changelog: https://github.com/modelscope/ms-swift/compare/v2.3.2...v2.4.0
