v0.7.11

huggingface/trl

Release date: 2024-02-16 16:22:47

Latest huggingface/trl release: v0.11.1 (2024-09-25 00:13:05)

DPO important fixes

We fixed issues with the IPO loss, leading to consistent results according to the newest experiments:

[DPO] average_log_prob when loss is IPO by @kashif in https://github.com/huggingface/trl/pull/1265
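
To illustrate what the fix is about, here is a minimal sketch in plain Python, not the actual TRL code. It assumes the IPO loss takes the form (logits - 1/(2*beta))^2 on the difference of policy and reference log-ratios, and that the fix amounts to averaging per-token log-probabilities over the unmasked tokens instead of summing them when the loss type is IPO. The function names `sequence_logp` and `ipo_loss` are hypothetical and chosen only for this example.

```python
def sequence_logp(token_logps, mask, average=False):
    """Reduce per-token log-probs to one value per sequence.

    token_logps: list of per-token log-probabilities.
    mask: list of 0/1 flags, 1 for tokens that count (e.g. completion tokens).
    average: if True, divide by the number of unmasked tokens (assumed IPO behavior).
    """
    total = sum(lp for lp, m in zip(token_logps, mask) if m)
    if average:
        return total / sum(mask)
    return total


def ipo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Squared-error IPO objective on the log-ratio difference (sketch)."""
    logits = (policy_chosen - policy_rejected) - (ref_chosen - ref_rejected)
    return (logits - 1.0 / (2.0 * beta)) ** 2


# The summed log-prob grows in magnitude with sequence length; the average does not.
summed = sequence_logp([-1.0, -2.0, -3.0], [1, 1, 0], average=False)
averaged = sequence_logp([-1.0, -2.0, -3.0], [1, 1, 0], average=True)
```

With summed log-probabilities, a long completion contributes a larger magnitude than a short one, which shifts the squared-error target; averaging keeps sequences of different lengths on a comparable scale.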

We also fixed important bugs with respect to DPO / PEFT and Flash Attention:

[DPOTrainer] Fix DPO trainer + mistral + FA2 by @younesbelkada in https://github.com/huggingface/trl/pull/1290

Data processing is now faster in multi-GPU environments:

[DPOTrainer] Load data only on main process + fix dpo example test by @younesbelkada in https://github.com/huggingface/trl/pull/1291
Add multiprocessing in the DPO trainer. by @imraviagrawal in https://github.com/huggingface/trl/pull/1286

Other DPO bugfixes:

[PEFT + DPO] Raise value error if one passes a ref_model and a peft_config by @younesbelkada in https://github.com/huggingface/trl/pull/1289
Fix wrong variable name in DPOTrainer documentation example by @ouhenio in https://github.com/huggingface/trl/pull/1280
fix padding in dpo trainer by @pacman100 in https://github.com/huggingface/trl/pull/1284
Fix AttributeError in dpo_trainer for reference_free case in dpo_loss function by @maliozer in https://github.com/huggingface/trl/pull/1313
[DPOTrainer] Add multiprocessing for the eval_dataset map by @esceptico in https://github.com/huggingface/trl/pull/1307
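
The ref_model / peft_config check listed above can be pictured with the following sketch. It is an assumption-laden illustration, not the trainer's source: the helper name `check_ref_model_and_peft` and the error message wording are hypothetical. The rationale assumed here is that when training PEFT adapters the reference behavior can be obtained from the base model with the adapters disabled, so passing a separate reference model as well is most likely a mistake.

```python
from typing import Any, Optional


def check_ref_model_and_peft(ref_model: Optional[Any], peft_config: Optional[Any]) -> None:
    """Reject the ambiguous combination of an explicit ref_model and a peft_config.

    Hypothetical helper: mirrors the idea of the check, not TRL's exact code.
    """
    if ref_model is not None and peft_config is not None:
        raise ValueError(
            "Both a ref_model and a peft_config were passed. "
            "When training PEFT adapters, pass ref_model=None."
        )


# Either argument alone is accepted; only the combination raises.
check_ref_model_and_peft(ref_model=None, peft_config={"r": 8})
check_ref_model_and_peft(ref_model=object(), peft_config=None)
```

Failing early at construction time surfaces the misconfiguration before any data is tokenized or any model weights are moved to the GPU.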

Faster data processing and other enhancements:

Only load data on main process by @JohnGiorgi in https://github.com/huggingface/trl/pull/1255
Remove tyro by @vwxyzjn in https://github.com/huggingface/trl/pull/1176

Automatic tagging for all models

Models now get tagged correctly even if users do not call trainer.push_to_hub()

[core / xxxTrainer] Automatic tagging by @younesbelkada in https://github.com/huggingface/trl/pull/1329

What's Changed

set dev version by @younesbelkada in https://github.com/huggingface/trl/pull/1254
Update Model Generation config to reflect new special tokens by @philschmid in https://github.com/huggingface/trl/pull/1256
Fix a typo in variable name by @otlaitil in https://github.com/huggingface/trl/pull/1269
Fix SFTTrainer bugs on TRL main by @younesbelkada in https://github.com/huggingface/trl/pull/1276
Fix SFT tuner in CI by @vwxyzjn in https://github.com/huggingface/trl/pull/1278
Fix sft ci by @vwxyzjn in https://github.com/huggingface/trl/pull/1279
Fix DPO slow tests by @younesbelkada in https://github.com/huggingface/trl/pull/1292
Fix sft trainer when args is None by @younesbelkada in https://github.com/huggingface/trl/pull/1295
Fix DPOTrainer docstrings by @alvarobartt in https://github.com/huggingface/trl/pull/1298
Types: Fix PEP 484 implicit-optional compliance by @akx in https://github.com/huggingface/trl/pull/1297
Update sft_trainer.mdx to add note on launching DDP training by @johnowhitaker in https://github.com/huggingface/trl/pull/1308
Codemod Unittest assertions to bare asserts by @akx in https://github.com/huggingface/trl/pull/1301
ENH: Run CI only if relevant files are modified by @younesbelkada in https://github.com/huggingface/trl/pull/1309
Fix typos in docs for Multi Adapter RL (MARL). by @elhusseiniali in https://github.com/huggingface/trl/pull/1312
Fix doc snippet PPOTrainer argument train_dataset -> dataset by @j-cb in https://github.com/huggingface/trl/pull/1321
Best practice recommendation update for dpo_trainer.mdx by @R-seny in https://github.com/huggingface/trl/pull/1325
pre-commit: replace linters + formatters with Ruff; fix some issues by @akx in https://github.com/huggingface/trl/pull/1300
Update README.md to clarify model requirement by @markstur in https://github.com/huggingface/trl/pull/1315
[core / DDPO] Fix diffusers import issue by @younesbelkada in https://github.com/huggingface/trl/pull/1314
[CI] Add tests on transformers peft main on push main by @younesbelkada in https://github.com/huggingface/trl/pull/1328
Release: v0.7.11 by @younesbelkada in https://github.com/huggingface/trl/pull/1331

New Contributors

@otlaitil made their first contribution in https://github.com/huggingface/trl/pull/1269
@JohnGiorgi made their first contribution in https://github.com/huggingface/trl/pull/1255
@ouhenio made their first contribution in https://github.com/huggingface/trl/pull/1280
@imraviagrawal made their first contribution in https://github.com/huggingface/trl/pull/1286
@akx made their first contribution in https://github.com/huggingface/trl/pull/1297
@esceptico made their first contribution in https://github.com/huggingface/trl/pull/1307
@johnowhitaker made their first contribution in https://github.com/huggingface/trl/pull/1308
@elhusseiniali made their first contribution in https://github.com/huggingface/trl/pull/1312
@maliozer made their first contribution in https://github.com/huggingface/trl/pull/1313
@j-cb made their first contribution in https://github.com/huggingface/trl/pull/1321
@R-seny made their first contribution in https://github.com/huggingface/trl/pull/1325
@markstur made their first contribution in https://github.com/huggingface/trl/pull/1315

Full Changelog: https://github.com/huggingface/trl/compare/v0.7.10...v0.7.11
