v2.3.0
Release date: 2024-08-09 23:43:07
Latest modelscope/ms-swift release: v2.5.0 (2024-10-10 10:21:04)
English Version
New Features
- Added a readthedocs documentation site: https://swift.readthedocs.io/en/latest
- Support Megatron architecture training for Qwen series models, and added a new `pt` command for pretraining. See docs: https://swift.readthedocs.io/en/latest/LLM/Megatron-training.html
- Support LMDeploy for inference and deployment, improving inference acceleration for multi-modal models. See: https://swift.readthedocs.io/en/latest/Multi-Modal/LmDeploy-inference-acceleration.html
- Support passing lora target modules via regular expressions
- Support configuring max_memory usage for each GPU in device_map
- The `export` command supports BitsAndBytes quantization
- The `export` command supports Ollama export: https://swift.readthedocs.io/en/latest/LLM/OLLaMA-Export.html
- Support Q-GaLore algorithm
- Support RLHF training for multi-modal models: https://swift.readthedocs.io/en/latest/Multi-Modal/human-preference-alignment-training-documentation.html
- Support evaluation on 100+ datasets for multi-modal models: https://swift.readthedocs.io/en/latest/LLM/LLM-eval.html
- Support resizing input images when memory usage is too high for multi-modal models
- Changed the default LoRA injection for multi-modal model training: it now applies to the LLM and the projector, which gives better results without significantly increasing training memory.
- Support PEFT 0.12, and added new tuner: fourierft
- Support rope-scaling for multi-modal models
- Support streaming processing of datasets to reduce memory usage; enable with `--streaming`
- Support vLLM multi-modal inference and deployment
- Support grounding task for popular multi-modal models.
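
As a rough illustration of what selecting LoRA target modules by regular expression means, the sketch below filters a list of module names with `re.fullmatch`. The module names, the pattern, and the helper `select_lora_targets` are all made up for illustration; this is not ms-swift's implementation, and the actual argument name and matching semantics should be taken from the ms-swift documentation.

```python
import re
from typing import List

# Hypothetical module names, loosely modeled on a multi-modal transformer.
MODULE_NAMES = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.self_attn.v_proj",
    "model.layers.0.mlp.down_proj",
    "vision_tower.blocks.0.attn.q_proj",
]


def select_lora_targets(pattern: str, names: List[str]) -> List[str]:
    """Return the module names that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


# Target only q_proj and v_proj inside the language model layers,
# leaving the (hypothetical) vision tower untouched.
targets = select_lora_targets(
    r"model\.layers\.\d+\.self_attn\.(q|v)_proj", MODULE_NAMES
)
print(targets)
```

A single pattern like this can express "attention projections in the LLM only", which would otherwise need an explicit list of module names.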
New Models
- qwen2-audio series
- qwen2-math
- codegeex4
- internvl2 series
- llava video
- xcomposer2.5
- cogvlm2-video
- numina-math
- mistral-nemo
- llama3.1 series
- mistral-large
- gemma-2-2b
- internlm2.5 1.8b 20b
- minicpm-v-v2_6-chat
Check: https://swift.readthedocs.io/en/latest/LLM/Supported-models-datasets.html
New Datasets
- zhihu-kol and zhihu-kol-filtered
- SA1B series multi-modal zh datasets
Check: https://swift.readthedocs.io/en/latest/LLM/Supported-models-datasets.html
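Chinese Version
New Features
- Added a readthedocs documentation site: https://swift.readthedocs.io/zh-cn/latest
- Support Megatron architecture training for Qwen series models, and added a new `pt` command for pretraining. See docs: https://swift.readthedocs.io/zh-cn/latest/LLM/Megatron%E8%AE%AD%E7%BB%83%E6%96%87%E6%A1%A3.html
- Support LMDeploy inference and deployment, with better inference acceleration for multi-modal models. See: https://swift.readthedocs.io/zh-cn/latest/Multi-Modal/LmDeploy%E6%8E%A8%E7%90%86%E5%8A%A0%E9%80%9F%E6%96%87%E6%A1%A3.html
- Support passing LoRA target modules as regular expressions
- Support configuring the per-GPU max_memory used by device_map
- The `export` command supports BitsAndBytes quantization
- The `export` command supports Ollama export: https://swift.readthedocs.io/zh-cn/latest/LLM/OLLAMA%E5%AF%BC%E5%87%BA%E6%96%87%E6%A1%A3.html
- Support the Q-GaLore algorithm
- Support RLHF training for multi-modal models: https://swift.readthedocs.io/zh-cn/latest/Multi-Modal/%E4%BA%BA%E7%B1%BB%E5%81%8F%E5%A5%BD%E5%AF%B9%E9%BD%90%E8%AE%AD%E7%BB%83%E6%96%87%E6%A1%A3.html
- Support evaluation on 100+ datasets for multi-modal models: https://swift.readthedocs.io/zh-cn/latest/LLM/LLM%E8%AF%84%E6%B5%8B%E6%96%87%E6%A1%A3.html
- Support resizing input images when GPU memory usage is too high for multi-modal models
- Changed the default LoRA injection for multi-modal model training: it now applies to the LLM and the projector, which gives better results without significantly increasing training GPU memory
- Support PEFT 0.12, and added a new tuner: fourierft
- Support rope-scaling for multi-modal models
- Support streaming processing of datasets to reduce GPU memory usage; enable with `--streaming`
- Support vLLM multi-modal inference and deployment
- Support the grounding task for some multi-modal models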
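
The saving from `--streaming` comes from reading samples one at a time instead of materializing the whole dataset. The sketch below shows that idea with a plain generator over a JSONL file. The function name `stream_jsonl`, the field names, and the file layout are assumptions for illustration only, not ms-swift internals.

```python
import json
import os
import tempfile
from typing import Dict, Iterator


def stream_jsonl(path: str) -> Iterator[Dict]:
    """Yield one parsed record per line without loading the whole file."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


# Build a tiny throwaway dataset so the example is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".jsonl", delete=False, encoding="utf-8"
) as tmp:
    for i in range(3):
        tmp.write(json.dumps({"query": f"q{i}", "response": f"r{i}"}) + "\n")
    demo_path = tmp.name

records = stream_jsonl(demo_path)  # nothing has been read yet
first = next(records)              # reads only the first line
remaining = list(records)          # consumes the rest and closes the file
os.remove(demo_path)

print(first, len(remaining))
```

Because a stream has no known length up front, training loops built on this pattern typically need an explicit step budget rather than an epoch count.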
New Models
- qwen2-audio series
- qwen2-math
- codegeex4
- internvl2 series
- llava video
- xcomposer2.5
- cogvlm2-video
- numina-math
- mistral-nemo
- llama3.1 series
- mistral-large
- gemma-2-2b
- internlm2.5 1.8b 20b
- minicpm-v-v2_6-chat
See: https://swift.readthedocs.io/zh-cn/latest/LLM/%E6%94%AF%E6%8C%81%E7%9A%84%E6%A8%A1%E5%9E%8B%E5%92%8C%E6%95%B0%E6%8D%AE%E9%9B%86.html
New Datasets
- zhihu-kol and zhihu-kol-filtered
- SA1B series Chinese multi-modal datasets
See: https://swift.readthedocs.io/zh-cn/latest/LLM/%E6%94%AF%E6%8C%81%E7%9A%84%E6%A8%A1%E5%9E%8B%E5%92%8C%E6%95%B0%E6%8D%AE%E9%9B%86.html
What's Changed
- fix dependency by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1306
- support codegeex4 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1305
- support internvl2 by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1304
- support llava video by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1307
- fix docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1309
- support lr_scheduler_kwargs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1310
- Fix internvl2 template by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1308
- Fix bugs by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1311
- support warmup_stable_decay by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1312
- Support xcomposer2.5 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1287
- Fix bugs by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1319
- fix bug by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1320
- fix template by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1321
- support cogvlm2-video by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1318
- Fix bugs by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1325
- fix web-ui by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1327
- compatible with trl 0.9.6 by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1326
- compat with vllm==0.5.1 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1329
- Update qrcode by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1332
- fix florence model by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1334
- Relaxing requirements for trl by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1342
- fix xcomposer2.5 device_map by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1343
- support generation_info by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1344
- fix requirements by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1347
- readthedocs by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1345
- fix sequence parallel get labels by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1352
- fix filelock by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1354
- Add pt command by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1356
- fix generation_info efficiency by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1359
- fix sh ddp_backend by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1360
- support LLM & lmdeploy by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1272
- fix a file path by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1363
- Internvl2 support video by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1366
- fix openai api by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1367
- fix internvl2-40b by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1369
- fix vlm deploy lora & agent by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1371
- Support lora regex by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1375
- Fix docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1380
- Fix FSDP; Add training percentage to jsonl logging; Add a web-ui component by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1381
- Support max memory args by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1382
- fix max_memory by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1383
- Fix gpu assert calculation by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1384
- fix dataset_sample & deploy stop_words by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1385
- fix internvl doc by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1394
- Fix link by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1397
- fix vllm==0.5.1 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1404
- [TorchAcc] update accelerate API and add llama3-70B by @baoleai in https://github.com/modelscope/ms-swift/pull/1400
- Support Ollama and BNB for export by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1407
- Fix glm4v merge lora by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1410
- [TorchAcc] fix model download when using TorchAcc distributed training by @baoleai in https://github.com/modelscope/ms-swift/pull/1408
- Support padding left by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1414
- Fix ollama export by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1416
- fix web-ui params by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1417
- fix hub_token by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1420
- Update ms hub token by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1424
- Add numina math model by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1421
- fix internvl template by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1433
- Internvl series models update by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1426
- fix internvl2 template by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1436
- Fix bug and make lazydataset more stable by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1438
- Fix llava-hf by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1439
- [WIP]Support Q-Galore by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1440
- support deepspeed on ui; add tools to client_utils by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1446
- fix read csv (float) by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1447
- fix dataset by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1448
- update internvl doc by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1449
- Support api key by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1452
- Support mistral nemo series models by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1454
- fix minicpm-v2.5 lora_target_modules by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1455
- Add two datasets by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1459
- Update trl dependency version by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1463
- fix bugs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1464
- fix yi1.5 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1465
- Fix yi1.5 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1467
- add activate and deactivate for part tuner by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1470
- support llama3.1 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1475
- support megatron by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1365
- fix megatron by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1476
- Support internvl2 grounding by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1473
- update doc by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1477
- Support alignment algorithm for vision MLLM by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1474
- fix doc by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1481
- Fix visual cpo by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1482
- support llama3.1-quant by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1478
- fix part tuner by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1483
- fix import by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1488
- Fix GLM4V by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1490
- support mistral large by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1485
- fix resume_only_model & zero3 & full by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1497
- Fix resume_from_checkpoint & full by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1498
- fix part tuner by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1495
- fix cogvlm2-video by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1500
- [TorchAcc] add script for qwen2 in torchacc by @Zhikaiiii in https://github.com/modelscope/ms-swift/pull/1492
- Fix CI by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1501
- fix vlm template by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1503
- fix internvl-4b by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1505
- support zero3 & freeze by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1508
- fix part mix with lora by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1509
- fix docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1511
- Update README.md by @ArtificialZeng in https://github.com/modelscope/ms-swift/pull/1516
- fix kto custom data by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1515
- Fix KTO doc by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1517
- Rescale image by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1512
- fix pretrain dataset by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1518
- fix deepseek-vl template by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1521
- Support exporting of llama3.1, and awq-batch-size by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1522
- support lmdeploy & vlm by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1364
- fix tf 4.43 llava by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1525
- fix llamapro by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1527
- fix template & docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1529
- fix lmdeploy & vlm by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1530
- update doc by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1531
- fix lmdeploy & minicpm-v-2.5 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1534
- fix internvl-phi3 batch infer by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1539
- Support SA1B series datasets by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1542
- fix bug in _prepare_inputs by @guihonghao in https://github.com/modelscope/ms-swift/pull/1543
- Support lmdeploy infer deploy by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1541
- add lmdeploy link by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1545
- support lmdeploy & app-ui by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1546
- fix lmdeploy bug by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1550
- support more models by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1552
- fix multi node by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1554
- support lmdeploy awq by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1555
- support quant_policy by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1556
- fix xcomposer lora by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1559
- Update docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1558
- fix minicpm-v by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1562
- add result_dir parameter to InferArgument & fix a minor bug by @starxhong in https://github.com/modelscope/ms-swift/pull/1561
- fix some bugs in dpo by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1565
- Fix bugs 0801 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1566
- fix dataset copy by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1569
- fix qwen-vl-merged lmdeploy by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1572
- Change multi-modal default lora to llm&projector by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1571
- fix quant by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1573
- fix kto by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1575
- update docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1578
- Fix huge model saving by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1579
- Fix/0802 by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1581
- Peft 0.12 by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1586
- fix bugs in gemma2-2b-it by @DaozeZhang in https://github.com/modelscope/ms-swift/pull/1587
- [TorchAcc] Update patch for transformers>=4.41.0 by @baoleai in https://github.com/modelscope/ms-swift/pull/1584
- fix agent deployment by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1592
- support swift deploy stats by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1593
- Fix megatron convert by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1597
- add gemma-2-2b by @DaozeZhang in https://github.com/modelscope/ms-swift/pull/1595
- support max_batch_size by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1599
- update docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1600
- support multi modal evaluation by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1540
- support internlm2.5 1.8b 20b by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1551
- support qwen1.5 megatron by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1564
- compat with peft==0.11 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1604
- Fix InternVL2 doc by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1607
- Fix rope scaling by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1610
- Fix/rope by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1612
- support minicpm-v-v2_6-chat by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1609
- Fix InternVL2-LLaMA3 by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1614
- fix rope scaling bug by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1620
- fix florence template by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1618
- support internlm_xcomposer2_4khd by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1622
- support vllm & vlm by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1630
- Fix ci by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1634
- Compat transformers 4.44 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1640
- fix xcomposer lora_target_modules by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1645
- fix TypeError: 'NoneType' object is not iterable, when only have video data the image is none by @Wondersui in https://github.com/modelscope/ms-swift/pull/1637
- support qwen2-math by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1644
- fix peft patch by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1647
- fix oom test in rlhf by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1651
- Fix peft patch by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1650
- [TorchAcc] fix bcast of output_dir by @baoleai in https://github.com/modelscope/ms-swift/pull/1652
- fix tp lmdeploy by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1654
- fix transformers==4.33 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1655
- support qwen2-audio by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1633
- update aishell1 dataset by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1657
- Add OLLaMA doc by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1660
- Support IterableDataset by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1596
New Contributors
- @ArtificialZeng made their first contribution in https://github.com/modelscope/ms-swift/pull/1516
- @guihonghao made their first contribution in https://github.com/modelscope/ms-swift/pull/1543
- @DaozeZhang made their first contribution in https://github.com/modelscope/ms-swift/pull/1587
- @Wondersui made their first contribution in https://github.com/modelscope/ms-swift/pull/1637
Full Changelog: https://github.com/modelscope/ms-swift/compare/v2.2.0...v2.3.0