v2.3.2
Release date: 2024-08-24 12:42:26
Latest modelscope/ms-swift release: v2.5.0 (2024-10-10 10:21:04)
English Version
New Features:
- ReFT support: achieves parameter efficiency that is 15× to 65× greater than LoRA.
- Multimodal models now support zero3 (DeepSpeed ZeRO stage 3).
- Supports using environment variables to control model-specific parameters such as hd_num, max_num, and video_segments.
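
To illustrate the environment-variable feature above, here is a minimal sketch of how such overrides might be read. It is not ms-swift's actual implementation: the uppercase variable names (`HD_NUM`, `MAX_NUM`, `VIDEO_SEGMENTS`), the default values, and the helper names `get_env_int` and `load_model_params` are all assumptions made for illustration.

```python
import os


def get_env_int(name, default, environ=None):
    """Read an integer setting from the environment, falling back to a default."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        return int(raw)
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None


# Hypothetical defaults; the real values depend on the model and are not
# stated in these release notes.
DEFAULTS = {"HD_NUM": 24, "MAX_NUM": 12, "VIDEO_SEGMENTS": 8}


def load_model_params(environ=None):
    """Collect the model-specific parameters, applying any overrides found."""
    return {
        name.lower(): get_env_int(name, default, environ)
        for name, default in DEFAULTS.items()
    }
```

Under these assumptions, setting `MAX_NUM=6` in the environment before launching would change only `max_num`, leaving the other parameters at their defaults. Check the ms-swift documentation for the variable names each model actually reads.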
New Models:
- longwriter-glm4-9b, longwriter-llama3_1-8b
- phi3_5-mini-instruct, phi3_5-moe-instruct, phi3_5-vision-instruct
- llava-onevision-qwen2-0_5b-ov, llava-onevision-qwen2-7b-ov, llava-onevision-qwen2-72b-ov
New Datasets:
- longwriter-6k
- rlaif-v
- latex-ocr-print, latex-ocr-handwrite
What's Changed
- fix imports by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1748
- compat with torch=1.12/1.13 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1752
- update rlaif-v hf dataset by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1755
- fix lmdeploy: AssertionError: failed to match chat template, please explicit set chat_template_config by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1759
- use eager -> sdpa by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1764
- Fix GLM4 agent toolcall by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1767
- Support LongWriter-llama3.1-8b and LongWriter-glm4-9b. by @DaozeZhang in https://github.com/modelscope/ms-swift/pull/1762
- Support llava onevision by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1761
- [TorchAcc] fix: fix saving and loading checkpoint for full sft FSDP by @baoleai in https://github.com/modelscope/ms-swift/pull/1765
- Fix deepseek-coder-v2-lite template by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1771
- Fix qwen2-audio & zero3 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1774
- Fix zero3 & minicpm-v/internvl2/xcomposer by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1772
- fix infer dataset_test_ratio by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1779
- fix moe & gradient_checkpointing by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1782
- support phi3.5-vision by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1780
- ReFT by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1785
- update doc by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1789
- support qwen-vl & base64 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1790
- fix yi-vl template by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1793
- fix bugs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1794
- fix imports by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1796
- fix history_roles by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1798
- fix mllm rlhf with full sft type by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1800
- fix CI by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1797
- fix megatron_patch_path by @wning13 in https://github.com/modelscope/ms-swift/pull/1804
- Support hd num by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1801
- Support Latex OCR dataset by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1810
- fix offline export by @wning13 in https://github.com/modelscope/ms-swift/pull/1805
- fix by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1811
New Contributors
- @wning13 made their first contribution in https://github.com/modelscope/ms-swift/pull/1804
Full Changelog: https://github.com/modelscope/ms-swift/compare/v2.3.1...v2.3.2