v2.3.0

modelscope/ms-swift

Release date: 2024-08-09 23:43:07

Latest release of modelscope/ms-swift: v2.5.0 (2024-10-10 10:21:04)

English Version

New Features

Support for readthedocs documentation site at: https://swift.readthedocs.io/en/latest
Support Megatron architecture training for Qwen series models, and add a new pt command for pretraining. See docs: https://swift.readthedocs.io/en/latest/LLM/Megatron-training.html
Support LMDeploy for inference and deployment, improving inference acceleration for multi-modal models. See: https://swift.readthedocs.io/en/latest/Multi-Modal/LmDeploy-inference-acceleration.html
Support passing lora target modules via regular expressions
Support configuring max_memory usage for each GPU in device_map
export command supports BitsAndBytes quantization
export command supports Ollama export: https://swift.readthedocs.io/en/latest/LLM/OLLaMA-Export.html
Support Q-GaLore algorithm
Support RLHF training for multi-modal models: https://swift.readthedocs.io/en/latest/Multi-Modal/human-preference-alignment-training-documentation.html
Support evaluation on 100+ datasets for multi-modal models: https://swift.readthedocs.io/en/latest/LLM/LLM-eval.html
Support resizing input images when memory usage is too high for multi-modal models
Changed the default lora injection for multi-modal model training: it now applies to the LLM and the projector, which gives better results without significantly increasing training memory.
Support PEFT 0.12 and add a new tuner: fourierft
Support rope-scaling for multi-modal models
Support streaming processing of datasets to reduce memory usage, enable with --streaming
Support vLLM multi-modal inference and deployment
Support grounding task for popular multi-modal models.
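
To illustrate the regular-expression lora target modules item above, here is a minimal sketch of how a single pattern can select modules by name. It is not taken from the ms-swift code base: the module names, the pattern, and the select_target_modules helper are all hypothetical, and the sketch assumes the pattern must match the whole module name (the full-match convention PEFT uses for string target_modules). Check the ms-swift documentation for the exact argument and matching rules.

```python
import re
from typing import List

# Hypothetical module names, shaped like those of a multi-modal model.
# They are illustrative only and do not come from any specific checkpoint.
MODULE_NAMES = [
    "language_model.layers.0.self_attn.q_proj",
    "language_model.layers.0.self_attn.k_proj",
    "language_model.layers.0.self_attn.v_proj",
    "language_model.layers.0.mlp.down_proj",
    "vision_tower.blocks.0.attn.q_proj",
    "projector.linear_1",
]


def select_target_modules(pattern: str, names: List[str]) -> List[str]:
    """Return the names that the pattern matches in full (assumed convention)."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


# Select only the query and value projections inside the language model,
# leaving the vision tower and the projector untouched.
PATTERN = r"language_model\..*\.(q_proj|v_proj)"
selected = select_target_modules(PATTERN, MODULE_NAMES)
print(selected)
```

A single pattern like this can replace a long explicit list of module names, which is most useful when the same projection name (here q_proj) appears in both the vision tower and the language model but only one of them should be tuned.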

New Models

qwen2-audio series
qwen2-math
codegeex4
internvl2 series
llava video
xcomposer2.5
cogvlm2-video
numina-math
mistral-nemo
llama3.1 series
mistral-large
gemma-2-2b
internlm2.5 1.8b 20b
minicpm-v-v2_6-chat

Check: https://swift.readthedocs.io/en/latest/LLM/Supported-models-datasets.html

New Datasets

zhihu-kol and zhihu-kol-filtered
SA1B series multi-modal zh datasets

Check: https://swift.readthedocs.io/en/latest/LLM/Supported-models-datasets.html

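Chinese Version

New Features

Support for readthedocs documentation site at: https://swift.readthedocs.io/zh-cn/latest
Support Megatron architecture training for Qwen series models, and add a new pt command for pretraining. See docs: https://swift.readthedocs.io/zh-cn/latest/LLM/Megatron%E8%AE%AD%E7%BB%83%E6%96%87%E6%A1%A3.html
Support LMDeploy for inference and deployment, with better support for inference acceleration of multi-modal models. See: https://swift.readthedocs.io/zh-cn/latest/Multi-Modal/LmDeploy%E6%8E%A8%E7%90%86%E5%8A%A0%E9%80%9F%E6%96%87%E6%A1%A3.html
Support passing lora target modules via regular expressions
Support configuring max_memory usage for each GPU in device_map
export command supports BitsAndBytes quantization
export command supports Ollama export: https://swift.readthedocs.io/zh-cn/latest/LLM/OLLAMA%E5%AF%BC%E5%87%BA%E6%96%87%E6%A1%A3.html
Support Q-GaLore algorithm
Support RLHF training for multi-modal models: https://swift.readthedocs.io/zh-cn/latest/Multi-Modal/%E4%BA%BA%E7%B1%BB%E5%81%8F%E5%A5%BD%E5%AF%B9%E9%BD%90%E8%AE%AD%E7%BB%83%E6%96%87%E6%A1%A3.html
Support evaluation on 100+ datasets for multi-modal models: https://swift.readthedocs.io/zh-cn/latest/LLM/LLM%E8%AF%84%E6%B5%8B%E6%96%87%E6%A1%A3.html
Support resizing input images when GPU memory usage is too high for multi-modal models
Changed the default lora injection for multi-modal model training: it now applies to the LLM and the projector, which gives better results without significantly increasing training GPU memory
Support PEFT 0.12 and add a new tuner: fourierft
Support rope-scaling for multi-modal models
Support streaming processing of datasets to reduce GPU memory consumption, enable with --streaming
Support vLLM multi-modal inference and deployment
Support grounding task for some multi-modal models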

New Models

qwen2-audio series
qwen2-math
codegeex4
internvl2 series
llava video
xcomposer2.5
cogvlm2-video
numina-math
mistral-nemo
llama3.1 series
mistral-large
gemma-2-2b
internlm2.5 1.8b 20b
minicpm-v-v2_6-chat

See: https://swift.readthedocs.io/zh-cn/latest/LLM/%E6%94%AF%E6%8C%81%E7%9A%84%E6%A8%A1%E5%9E%8B%E5%92%8C%E6%95%B0%E6%8D%AE%E9%9B%86.html

New Datasets

zhihu-kol and zhihu-kol-filtered datasets
SA1B series Chinese multi-modal datasets

See: https://swift.readthedocs.io/zh-cn/latest/LLM/%E6%94%AF%E6%8C%81%E7%9A%84%E6%A8%A1%E5%9E%8B%E5%92%8C%E6%95%B0%E6%8D%AE%E9%9B%86.html

What's Changed

fix dependency by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1306
support codegeex4 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1305
support internvl2 by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1304
support llava video by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1307
fix docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1309
support lr_scheduler_kwargs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1310
Fix internvl2 template by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1308
Fix bugs by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1311
support warmup_stable_decay by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1312
Support xcomposer2.5 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1287
Fix bugs by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1319
fix bug by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1320
fix template by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1321
support cogvlm2-video by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1318
Fix bugs by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1325
fix web-ui by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1327
compatible with trl 0.9.6 by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1326
compat with vllm==0.5.1 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1329
Update qrcode by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1332
fix florence model by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1334
Relaxing requirements for trl by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1342
fix xcomposer2.5 device_map by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1343
support generation_info by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1344
fix requirements by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1347
readthedocs by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1345
fix sequence parallel get labels by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1352
fix filelock by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1354
Add pt command by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1356
fix generation_info efficiency by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1359
fix sh ddp_backend by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1360
support LLM & lmdeploy by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1272
fix a file path by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1363
Internvl2 support video by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1366
fix openai api by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1367
fix internvl2-40b by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1369
fix vlm deploy lora & agent by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1371
Support lora regex by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1375
Fix docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1380
Fix FSDP; Add training percentage to jsonl logging; Add a web-ui component by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1381
Support max memory args by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1382
fix max_memory by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1383
Fix gpu assert calculation by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1384
fix dataset_sample & deploy stop_words by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1385
fix internvl doc by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1394
Fix link by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1397
fix vllm==0.5.1 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1404
[TorchAcc] update accelerate API and add llama3-70B by @baoleai in https://github.com/modelscope/ms-swift/pull/1400
Support Ollama and BNB for export by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1407
Fix glm4v merge lora by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1410
[TorchAcc] fix model download when using TorchAcc distributed training by @baoleai in https://github.com/modelscope/ms-swift/pull/1408
Support padding left by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1414
Fix ollama export by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1416
fix web-ui params by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1417
fix hub_token by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1420
Update ms hub token by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1424
Add numina math model by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1421
fix internvl template by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1433
Internvl series models update by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1426
fix internvl2 template by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1436
Fix bug and make lazydataset more stable by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1438
Fix llava-hf by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1439
[WIP]Support Q-Galore by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1440
1. support deepspeed on ui 2. add tools to client_utils by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1446
fix read csv (float) by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1447
fix dataset by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1448
update internvl doc by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1449
Support api key by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1452
Support mistral nemo series models by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1454
fix minicpm-v2.5 lora_target_modules by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1455
Add two datasets by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1459
Update trl dependency version by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1463
fix bugs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1464
fix yi1.5 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1465
Fix yi1.5 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1467
add activate and deactivate for part tuner by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1470
support llama3.1 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1475
support megatron by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1365
fix megatron by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1476
Support internvl2 grounding by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1473
update doc by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1477
Support alignment algorithm for vision MLLM by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1474
fix doc by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1481
Fix visual cpo by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1482
support llama3.1-quant by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1478
fix part tuner by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1483
fix import by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1488
Fix GLM4V by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1490
support mistral large by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1485
fix resume_only_model & zero3 & full by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1497
Fix resume_from_checkpoint & full by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1498
fix part tuner by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1495
fix cogvlm2-video by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1500
[TorchAcc] add script for qwen2 in torchacc by @Zhikaiiii in https://github.com/modelscope/ms-swift/pull/1492
Fix CI by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1501
fix vlm template by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1503
fix internvl-4b by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1505
support zero3 & freeze by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1508
fix part mix with lora by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1509
fix docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1511
Update README.md by @ArtificialZeng in https://github.com/modelscope/ms-swift/pull/1516
fix kto custom data by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1515
Fix KTO doc by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1517
Rescale image by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1512
fix pretrain dataset by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1518
fix deepseek-vl template by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1521
Support exporting of llama3.1, and awq-batch-size by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1522
support lmdeploy & vlm by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1364
fix tf 4.43 llava by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1525
fix llamapro by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1527
fix template & docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1529
fix lmdeploy & vlm by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1530
update doc by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1531
fix lmdeploy & minicpm-v-2.5 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1534
fix internvl-phi3 batch infer by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1539
Support SA1B series datasets by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1542
fix bug in _prepare_inputs by @guihonghao in https://github.com/modelscope/ms-swift/pull/1543
Support lmdeploy infer deploy by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1541
add lmdeploy link by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1545
support lmdeploy & app-ui by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1546
fix lmdeploy bug by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1550
support more models by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1552
fix multi node by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1554
support lmdeploy awq by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1555
support quant_policy by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1556
fix xcomposer lora by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1559
Update docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1558
fix minicpm-v by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1562
add result_dir paramerter to InferArgument & fix a minor bug by @starxhong in https://github.com/modelscope/ms-swift/pull/1561
fix some bugs in dpo by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1565
Fix bugs 0801 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1566
fix dataset copy by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1569
fix qwen-vl-merged lmdeploy by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1572
Change multi-modal default lora to llm&projector by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1571
fix quant by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1573
fix kto by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1575
update docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1578
Fix huge model saving by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1579
Fix/0802 by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1581
Peft 0.12 by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1586
fix bugs in gemma2-2b-it by @DaozeZhang in https://github.com/modelscope/ms-swift/pull/1587
[TorchAcc] Update patch for transformers>=4.41.0 by @baoleai in https://github.com/modelscope/ms-swift/pull/1584
fix agent deployment by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1592
support swift deploy stats by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1593
Fix megatron convert by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1597
add gemma-2-2b by @DaozeZhang in https://github.com/modelscope/ms-swift/pull/1595
support max_batch_size by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1599
update docs by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1600
support multi modal evaluation by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1540
support internlm2.5 1.8b 20b by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1551
support qwen1.5 megatron by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1564
compat with peft==0.11 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1604
Fix InternVL2 doc by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1607
Fix rope scaling by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1610
Fix/rope by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1612
support minicpm-v-v2_6-chat by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1609
Fix InternVL2-LLaMA3 by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1614
fix rope scaling bug by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1620
fix florence template by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1618
support internlm_xcomposer2_4khd by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1622
support vllm & vlm by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1630
Fix ci by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1634
Compat transformers 4.44 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1640
fix xcomposer lora_target_modules by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1645
fix TypeError: 'NoneType' object is not iterable, when only have video data the image is none by @Wondersui in https://github.com/modelscope/ms-swift/pull/1637
support qwen2-math by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1644
fix peft patch by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1647
fix oom test in rlhf by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1651
Fix peft patch by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1650
[TorchAcc] fix bcast of output_dir by @baoleai in https://github.com/modelscope/ms-swift/pull/1652
fix tp lmdeploy by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1654
fix transformers==4.33 by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1655
support qwen2-audio by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1633
update aishell1 dataset by @Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1657
Add OLLaMA doc by @tastelikefeet in https://github.com/modelscope/ms-swift/pull/1660
Support IterableDataset by @hjh0119 in https://github.com/modelscope/ms-swift/pull/1596

New Contributors

@ArtificialZeng made their first contribution in https://github.com/modelscope/ms-swift/pull/1516
@guihonghao made their first contribution in https://github.com/modelscope/ms-swift/pull/1543
@DaozeZhang made their first contribution in https://github.com/modelscope/ms-swift/pull/1587
@Wondersui made their first contribution in https://github.com/modelscope/ms-swift/pull/1637

Full Changelog: https://github.com/modelscope/ms-swift/compare/v2.2.0...v2.3.0
