v2.4.0
Released: 2024-09-13 12:50:56
Latest modelscope/ms-swift release: v2.5.0 (2024-10-10 10:21:04)
English Version
New Features:
- Support for Liger with models such as LLaMA, Qwen, and Mistral, reducing memory usage by 10% to 60%.
- Support for custom loss function training using a registration mechanism.
- Training now supports pushing models to ModelScope and HuggingFace.
- Support for the `freeze_vit` parameter to control the behavior of full-parameter training for multimodal models.
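To illustrate the custom loss item above, here is a minimal sketch of what a registration mechanism for loss functions can look like in plain Python. The names `LOSS_REGISTRY`, `register_loss`, and `get_loss` are hypothetical and are not the ms-swift API; the actual implementation is in PR #1822.

```python
from typing import Callable, Dict, List

# Hypothetical registry mapping a loss name to a callable.
# Illustration of the pattern only; not the ms-swift API.
LossFn = Callable[[List[float], List[float]], float]
LOSS_REGISTRY: Dict[str, LossFn] = {}


def register_loss(name: str) -> Callable[[LossFn], LossFn]:
    """Return a decorator that stores the decorated function under `name`."""
    def decorator(fn: LossFn) -> LossFn:
        if name in LOSS_REGISTRY:
            raise ValueError(f"loss {name!r} is already registered")
        LOSS_REGISTRY[name] = fn
        return fn
    return decorator


@register_loss("mse")
def mse_loss(preds: List[float], targets: List[float]) -> float:
    # Mean squared error over paired predictions and targets.
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


@register_loss("mae")
def mae_loss(preds: List[float], targets: List[float]) -> float:
    # Mean absolute error over paired predictions and targets.
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)


def get_loss(name: str) -> LossFn:
    """Look up a registered loss by name, e.g. from a config value."""
    try:
        return LOSS_REGISTRY[name]
    except KeyError:
        available = sorted(LOSS_REGISTRY)
        raise KeyError(f"unknown loss {name!r}; available: {available}") from None


if __name__ == "__main__":
    loss_fn = get_loss("mse")
    print(loss_fn([1.0, 2.0], [1.0, 4.0]))
```

The point of the pattern is that a training script can select a loss by a string name while user code adds new losses without touching the trainer.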
New Models:
- Qwen2-VL series, including GPTQ/AWQ quantized models. For best practices, see the Qwen2-VL documentation.
- InternVL2 AWQ quantized models.
New Datasets:
- qwen2-pro series
What's Changed
- compat with vllm==0.5.5 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1812
- Support zero2 offload by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1814
- fix mp+ddp & resume_from_checkpoint by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1815
- fix preprocess_num_proc by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1818
- Support liger by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1819
- fix dora deployment by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1821
- Support register loss func by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1822
- use default-lora by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1823
- fix minicpm-v 2.6 infer device_map by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1832
- Fix code by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1824
- fix inject by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1835
- support qwen2-pro dataset by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1834
- add ddp_timeout parameter by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1836
- fix internlm-xcomposer rlhf by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1838
- Support eval_nproc by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1843
- support qwen2-vl by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1842
- Add internvl2 awq models by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1846
- Fix some datasets for streaming by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1848
- Fix Pissa and OLoRA by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1852
- Support qwen2 vl grounding by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1854
- support qwen2-vl & video finetune by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1849
- Update new datasets by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1855
- update qwen2-vl docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1856
- update qwen2-vl docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1858
- fix qwen2-vl docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1861
- fix requirements by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1864
- update docs qwen2-vl by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1869
- Support faster data map by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1871
- [TorchAcc] fix several bugs for torchacc FSDP. by @baoleai in https://github.com/modelscope/ms-swift/pull/1872
- Add train record by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1873
- Fix num_proc by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1874
- Fix neftune doc by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1875
- add duet by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1877
- use model.generation_config by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1850
- Support freeze vit by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1880
- support qwen2-vl gptq awq by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1884
- Refactor push_to_hub by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1883
- Fix push to hub logic by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1888
- add vllm lmdeploy benchmark by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1889
- Add some warnings and fix RLHF by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1890
Full Changelog: https://github.com/modelscope/ms-swift/compare/v2.3.2...v2.4.0