v0.3.0
Release date: 2023-03-01 20:45:06
Latest huggingface/trl release: v0.11.1 (2024-09-25 00:13:05)
What's Changed
- fix style, typos, license by @natolambert in https://github.com/lvwerra/trl/pull/103
- fix re-added file by @natolambert in https://github.com/lvwerra/trl/pull/116
- add citation by @natolambert in https://github.com/lvwerra/trl/pull/124
- add manual seeding for RL experiments by @natolambert in https://github.com/lvwerra/trl/pull/118
- add `set_seed` to `__init__.py` by @lvwerra in https://github.com/lvwerra/trl/pull/127
- update docs with Seq2seq models, set_seed, and create_reference_model by @lvwerra in https://github.com/lvwerra/trl/pull/128
- [bug] Update gpt2-sentiment.py by @younesbelkada in https://github.com/lvwerra/trl/pull/132
- Fix Sentiment control notebook by @lvwerra in https://github.com/lvwerra/trl/pull/126
- realign values by @lvwerra in https://github.com/lvwerra/trl/pull/137
- Change unclear variables & fix typos by @natolambert in https://github.com/lvwerra/trl/pull/134
- Feat/reward summarization example by @TristanThrush in https://github.com/lvwerra/trl/pull/115
- [core] Small refactor of forward pass by @younesbelkada in https://github.com/lvwerra/trl/pull/136
- [tests] Add correct repo name by @younesbelkada in https://github.com/lvwerra/trl/pull/138
- fix forward batching for seq2seq and right padding models. by @lvwerra in https://github.com/lvwerra/trl/pull/139
- fix bug in batched_forward_pass by @ArvinZhuang in https://github.com/lvwerra/trl/pull/144
- [core] Add `torch_dtype` support by @younesbelkada in https://github.com/lvwerra/trl/pull/147
- [core] Fix dataloader issue by @younesbelkada in https://github.com/lvwerra/trl/pull/154
- [core] enable `bf16` training by @younesbelkada in https://github.com/lvwerra/trl/pull/156
- [core] fix saving multi-gpu by @younesbelkada in https://github.com/lvwerra/trl/pull/157
- Added imports by @BirgerMoell in https://github.com/lvwerra/trl/pull/159
- Add CITATION.cff by @kashif in https://github.com/lvwerra/trl/pull/169
- [Doc] Add how to use Lion optimizer by @younesbelkada in https://github.com/lvwerra/trl/pull/152
- policy kl [old | new] by @kashif in https://github.com/lvwerra/trl/pull/168
- add minibatching by @lvwerra in https://github.com/lvwerra/trl/pull/153
- fix bugs in tutorial by @shizhediao in https://github.com/lvwerra/trl/pull/175
- [core] Add `max_grad_norm` support by @younesbelkada in https://github.com/lvwerra/trl/pull/177
- Add toxicity example by @younesbelkada in https://github.com/lvwerra/trl/pull/162
- [Docs] Fix barplot by @younesbelkada in https://github.com/lvwerra/trl/pull/181
New Contributors
- @natolambert made their first contribution in https://github.com/lvwerra/trl/pull/103
- @ArvinZhuang made their first contribution in https://github.com/lvwerra/trl/pull/144
- @BirgerMoell made their first contribution in https://github.com/lvwerra/trl/pull/159
- @kashif made their first contribution in https://github.com/lvwerra/trl/pull/169
- @shizhediao made their first contribution in https://github.com/lvwerra/trl/pull/175
Full Changelog: https://github.com/lvwerra/trl/compare/v0.2.1...v0.3.0