v0.7.9

Released: 2024-01-09 20:06:13

Latest huggingface/trl release: v0.11.1 (2024-09-25 00:13:05)

v0.7.9: Patch release for DPO & SFTTrainer

This patch release fixes critical issues in SFTTrainer and DPOTrainer, along with minor fixes for PPOTrainer and DataCollatorForCompletionOnlyLM.

What's Changed

Release: v0.7.8 by @younesbelkada in https://github.com/huggingface/trl/pull/1200
set dev version by @younesbelkada in https://github.com/huggingface/trl/pull/1201
Fix instruction token masking by @mgerstgrasser in https://github.com/huggingface/trl/pull/1185
Fix reported KL in PPO trainer by @mgerstgrasser in https://github.com/huggingface/trl/pull/1180
[DPOTrainer] Fix peft + DPO + bf16 if one uses generate_during_eval or pre-computed logits by @younesbelkada in https://github.com/huggingface/trl/pull/1203
Revert "Address issue #1122" by @younesbelkada in https://github.com/huggingface/trl/pull/1205
Release: v0.7.9 by @younesbelkada in https://github.com/huggingface/trl/pull/1206

Full Changelog: https://github.com/huggingface/trl/compare/v0.7.8...v0.7.9
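
For readers unfamiliar with what the instruction token masking fix concerns, here is a minimal sketch of the general idea behind completion-only training: prompt tokens get the label -100 so that only response tokens contribute to the loss. This is not the code from the pull request or from DataCollatorForCompletionOnlyLM. The helper names `find_subsequence` and `mask_prompt_tokens` and the token ids are hypothetical, chosen only for illustration, and the choice to mask the whole example when the template is missing is an assumption of this sketch.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def find_subsequence(seq, pattern):
    """Return the start index of the first occurrence of pattern in seq, or -1."""
    n, m = len(seq), len(pattern)
    if m == 0:
        return -1
    for i in range(n - m + 1):
        if seq[i:i + m] == pattern:
            return i
    return -1


def mask_prompt_tokens(input_ids, response_template_ids):
    """Hypothetical helper: copy input_ids into labels and mask everything
    up to and including the response template, so only the completion
    tokens are left as training targets."""
    labels = list(input_ids)
    start = find_subsequence(labels, response_template_ids)
    if start == -1:
        # Assumption of this sketch: with no template found, mask the whole example.
        return [IGNORE_INDEX] * len(labels)
    end = start + len(response_template_ids)
    labels[:end] = [IGNORE_INDEX] * end
    return labels


# Made-up token ids: [5, 6, 7] is the prompt, [90, 91] the response
# template, and [11, 12, 13] the completion.
example_ids = [5, 6, 7, 90, 91, 11, 12, 13]
template = [90, 91]
labels = mask_prompt_tokens(example_ids, template)
print(labels)
```

In this sketch the first five labels are masked and the last three keep their token ids, so a loss that ignores -100 would be computed over the completion only.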
