v0.7.11
Release date: 2024-02-16 16:22:47
Latest huggingface/trl release: v0.11.1 (2024-09-25 00:13:05)
DPO important fixes
We fixed issues with the IPO loss, which now gives consistent results in the newest experiments:
- [DPO] average_log_prob when loss is IPO by @kashif in https://github.com/huggingface/trl/pull/1265
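As a rough illustration of what averaging the log-probabilities means for the IPO loss, here is a minimal plain-Python sketch. It is not the TRL implementation: the helper names `sequence_logp` and `ipo_loss` are hypothetical, and the loss uses the standard IPO form, a squared distance between the policy/reference log-ratio gap and `1 / (2 * beta)`.

```python
from typing import Sequence


def sequence_logp(token_logps: Sequence[float], loss_mask: Sequence[int], average: bool) -> float:
    """Reduce per-token log-probs to one value per sequence, skipping masked tokens.

    Hypothetical helper: with average=True the sum is divided by the number of
    unmasked tokens, so long and short completions are on the same scale.
    """
    kept = [lp for lp, m in zip(token_logps, loss_mask) if m]
    if not kept:
        raise ValueError("loss_mask selects no tokens")
    total = sum(kept)
    return total / len(kept) if average else total


def ipo_loss(
    policy_chosen: float,
    policy_rejected: float,
    ref_chosen: float,
    ref_rejected: float,
    beta: float,
) -> float:
    """Standard IPO objective for one preference pair (illustrative only)."""
    # Gap between the policy log-ratio and the reference log-ratio.
    logits = (policy_chosen - policy_rejected) - (ref_chosen - ref_rejected)
    # IPO regresses that gap toward 1 / (2 * beta).
    return (logits - 1.0 / (2.0 * beta)) ** 2
```

With summed log-probs the gap grows with completion length, while the target `1 / (2 * beta)` does not, which is one way to see why averaging matters for this loss.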
We also fixed important bugs related to DPO / PEFT and Flash Attention:
- [DPOTrainer] Fix DPO trainer + mistral + FA2 by @younesbelkada in https://github.com/huggingface/trl/pull/1290
Data processing is now faster for multi-GPU envs
- [DPOTrainer] Load data only on main process + fix dpo example test by @younesbelkada in https://github.com/huggingface/trl/pull/1291
- Add multiprocessing in the DPO trainer. by @imraviagrawal in https://github.com/huggingface/trl/pull/1286
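The "main process first" idea behind these changes can be sketched with the standard library alone: the main worker does the expensive preprocessing (and would typically populate a cache), and the other workers wait at a barrier before entering the same block. This is an assumption-laden illustration using threads, not the trainer's code; `main_process_first` and `run_demo` are hypothetical names, and in real multi-GPU runs a library such as accelerate provides an equivalent context manager.

```python
import threading
from contextlib import contextmanager
from typing import Callable, Iterator, List


@contextmanager
def main_process_first(is_main: bool, barrier: Callable[[], object]) -> Iterator[None]:
    """Let the main worker run the block first; the others wait at the barrier."""
    if not is_main:
        barrier()  # block until the main worker has finished its turn
    try:
        yield
    finally:
        if is_main:
            barrier()  # release the waiting workers


def run_demo() -> List[str]:
    """Run one main and one non-main worker and record who processed data first."""
    events: List[str] = []
    sync = threading.Barrier(2)  # stands in for a cross-process barrier

    def worker(name: str, is_main: bool) -> None:
        with main_process_first(is_main, sync.wait):
            # Placeholder for dataset preprocessing; only the order matters here.
            events.append(name)

    threads = [
        threading.Thread(target=worker, args=("worker", False)),
        threading.Thread(target=worker, args=("main", True)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events
```

Calling `run_demo()` records the main worker before the other one, regardless of which thread starts first.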
Other DPO bugfixes:
- [PEFT + DPO] Raise value error if one passes a ref_model and a peft_config by @younesbelkada in https://github.com/huggingface/trl/pull/1289
- Fix wrong variable name in DPOTrainer documentation example by @ouhenio in https://github.com/huggingface/trl/pull/1280
- fix padding in dpo trainer by @pacman100 in https://github.com/huggingface/trl/pull/1284
- Fix AttributeError in dpo_trainer for reference_free case in dpo_loss function by @maliozer in https://github.com/huggingface/trl/pull/1313
- [DPOTrainer] Add multiprocessing for the eval_dataset map by @esceptico in https://github.com/huggingface/trl/pull/1307
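The `ref_model` / `peft_config` check listed above amounts to rejecting an ambiguous argument combination early. A minimal sketch of that kind of guard follows; the function name `check_reference_setup` and the error message are hypothetical and not taken from the TRL source. The presumed rationale is that with a PEFT adapter the base model can act as the reference, so a separate `ref_model` is redundant.

```python
from typing import Optional


def check_reference_setup(ref_model: Optional[object], peft_config: Optional[object]) -> None:
    """Raise if both an explicit reference model and a PEFT config are supplied.

    Hypothetical guard for illustration; the real trainer's checks may differ.
    """
    if ref_model is not None and peft_config is not None:
        raise ValueError(
            "Pass either ref_model or peft_config, not both: with a PEFT "
            "adapter the base model is assumed to serve as the reference."
        )
```

Failing at construction time keeps the mistake from surfacing later as a silently ignored argument or doubled memory use.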
Faster data processing and other enhancements:
- Only load data on main process by @JohnGiorgi in https://github.com/huggingface/trl/pull/1255
- Remove tyro by @vwxyzjn in https://github.com/huggingface/trl/pull/1176
Automatic tagging for all models
Models now get tagged correctly even if users do not call trainer.push_to_hub()
- [core / xxxTrainer] Automatic tagging by @younesbelkada in https://github.com/huggingface/trl/pull/1329
What's Changed
- set dev version by @younesbelkada in https://github.com/huggingface/trl/pull/1254
- Update Model Generation config to reflect new special tokens by @philschmid in https://github.com/huggingface/trl/pull/1256
- Fix a typo in variable name by @otlaitil in https://github.com/huggingface/trl/pull/1269
- Fix SFTTrainer bugs on TRL main by @younesbelkada in https://github.com/huggingface/trl/pull/1276
- Fix SFT tuner in CI by @vwxyzjn in https://github.com/huggingface/trl/pull/1278
- Fix sft ci by @vwxyzjn in https://github.com/huggingface/trl/pull/1279
- Fix DPO slow tests by @younesbelkada in https://github.com/huggingface/trl/pull/1292
- Fix sft trainer when args is None by @younesbelkada in https://github.com/huggingface/trl/pull/1295
- Fix DPOTrainer docstrings by @alvarobartt in https://github.com/huggingface/trl/pull/1298
- Types: Fix PEP 484 implicit-optional compliance by @akx in https://github.com/huggingface/trl/pull/1297
- Update sft_trainer.mdx to add note on launching DDP training by @johnowhitaker in https://github.com/huggingface/trl/pull/1308
- Codemod Unittest assertions to bare asserts by @akx in https://github.com/huggingface/trl/pull/1301
- ENH: Run CI only if relevant files are modified by @younesbelkada in https://github.com/huggingface/trl/pull/1309
- Fix typos in docs for Multi Adapter RL (MARL). by @elhusseiniali in https://github.com/huggingface/trl/pull/1312
- Fix doc snippet PPOTrainer argument train_dataset -> dataset by @j-cb in https://github.com/huggingface/trl/pull/1321
- Best practice recommendation update for dpo_trainer.mdx by @R-seny in https://github.com/huggingface/trl/pull/1325
- pre-commit: replace linters + formatters with Ruff; fix some issues by @akx in https://github.com/huggingface/trl/pull/1300
- Update README.md to clarify model requirement by @markstur in https://github.com/huggingface/trl/pull/1315
- [core / DDPO] Fix diffusers import issue by @younesbelkada in https://github.com/huggingface/trl/pull/1314
- [CI] Add tests on transformers peft main on push main by @younesbelkada in https://github.com/huggingface/trl/pull/1328
- Release: v0.7.11 by @younesbelkada in https://github.com/huggingface/trl/pull/1331
New Contributors
- @otlaitil made their first contribution in https://github.com/huggingface/trl/pull/1269
- @JohnGiorgi made their first contribution in https://github.com/huggingface/trl/pull/1255
- @ouhenio made their first contribution in https://github.com/huggingface/trl/pull/1280
- @imraviagrawal made their first contribution in https://github.com/huggingface/trl/pull/1286
- @akx made their first contribution in https://github.com/huggingface/trl/pull/1297
- @esceptico made their first contribution in https://github.com/huggingface/trl/pull/1307
- @johnowhitaker made their first contribution in https://github.com/huggingface/trl/pull/1308
- @elhusseiniali made their first contribution in https://github.com/huggingface/trl/pull/1312
- @maliozer made their first contribution in https://github.com/huggingface/trl/pull/1313
- @j-cb made their first contribution in https://github.com/huggingface/trl/pull/1321
- @R-seny made their first contribution in https://github.com/huggingface/trl/pull/1325
- @markstur made their first contribution in https://github.com/huggingface/trl/pull/1315
Full Changelog: https://github.com/huggingface/trl/compare/v0.7.10...v0.7.11