v0.8.4
Release date: 2024-04-17 23:22:10
Latest huggingface/trl release: v0.11.1 (2024-09-25 00:13:05)
This patch release includes important fixes for the CLI and for the KTO and CPO trainers.
What's Changed
- set dev version by @younesbelkada in https://github.com/huggingface/trl/pull/1529
- [CPO] fix memory leak due to retained value by @kashif in https://github.com/huggingface/trl/pull/1531
- VSFT hotfix - adds gen prompt to template and processor to hub by @edbeeching in https://github.com/huggingface/trl/pull/1532
- save_model -> save_pretrained in ppo_trainer.mdx by @ejmejm in https://github.com/huggingface/trl/pull/1537
- [KTO] support to load the adapter twice by @claralp in https://github.com/huggingface/trl/pull/1542
- CLI: Set `dataset_text_field` to `None` to allow ChatML automatic template by @younesbelkada in https://github.com/huggingface/trl/pull/1545
- FIX: Fix slow test by @younesbelkada in https://github.com/huggingface/trl/pull/1546
- Fixed ref model not used in PPO generation by @ejmejm in https://github.com/huggingface/trl/pull/1534
- Release: v0.8.4 by @younesbelkada in https://github.com/huggingface/trl/pull/1547
New Contributors
- @ejmejm made their first contribution in https://github.com/huggingface/trl/pull/1537
Full Changelog: https://github.com/huggingface/trl/compare/v0.8.3...v0.8.4