v0.5.0

CarperAI/trlx

Release date: 2023-02-23 07:50:34

Latest CarperAI/trlx release: v0.7.0 (2023-06-24 06:21:52)

Highlights

Initial NeMo ILQL integration, paving the way for large-scale RLHF efforts. See https://github.com/CarperAI/trlx/blob/main/trlx/models/README.md to get started.
In-depth example showcasing trlx usage on AnthropicAI's Helpful & Harmless dataset: https://github.com/CarperAI/trlx/tree/main/examples/hh
Improved ILQL modeling integration with Hugging Face transformers. Users can now work with AutoModelForCausalLMWithILQLHeads objects to generate samples and save/load fine-tuned ILQL models that can be quickly pushed to the Hub.

What's Changed

Add wandb group naming by @jon-tow in https://github.com/CarperAI/trlx/pull/188
Update reward_fn signatures in examples by @jon-tow in https://github.com/CarperAI/trlx/pull/190
Add tokenizer config by @reciprocated in https://github.com/CarperAI/trlx/pull/189
Fix extraction of mixed_precision option for deepspeed by @reciprocated in https://github.com/CarperAI/trlx/pull/197
Fix summarize_rlhf inference checkpoint paths by @jon-tow in https://github.com/CarperAI/trlx/pull/194
Make the config loading consistent across all example scripts by @shermansiu in https://github.com/CarperAI/trlx/pull/192
Make Trainer.save_pretrained sub-directory optional by @jon-tow in https://github.com/CarperAI/trlx/pull/201
Update Readme to include T5 models by @aaronrmm in https://github.com/CarperAI/trlx/pull/198
Make make_head accept dtype parameter by @reciprocated in https://github.com/CarperAI/trlx/pull/213
Enable training with Tensorboard tracking by @marcobellagente93 in https://github.com/CarperAI/trlx/pull/209
Support nested updates in merge by @cat-state in https://github.com/CarperAI/trlx/pull/219
Fix typo reward normalize summarize by @PhungVanDuy in https://github.com/CarperAI/trlx/pull/221
Update stale comment from results table by @jon-tow in https://github.com/CarperAI/trlx/pull/222
Fix undefined trackers property by @alan-cooney in https://github.com/CarperAI/trlx/pull/224
Fix tokenizer missing from config.to_dict() by @alan-cooney in https://github.com/CarperAI/trlx/pull/228
Make experiment tracking optional by @jon-tow in https://github.com/CarperAI/trlx/pull/226
read tokenizer path from config correctly by @JustinAWei in https://github.com/CarperAI/trlx/pull/230
Add devcontainer support by @alan-cooney in https://github.com/CarperAI/trlx/pull/196
fix: change lora_a:float to lora_r:int by @aaronrmm in https://github.com/CarperAI/trlx/pull/235
Bump isort to hotfix CI code quality workflow by @jon-tow in https://github.com/CarperAI/trlx/pull/237
Fix optional tracking in accelerator.log by @jon-tow in https://github.com/CarperAI/trlx/pull/233
Improve documentation/comments on the random walk example by @alan-cooney in https://github.com/CarperAI/trlx/pull/208
Update link to "Learning to Summarize from Human Feedback" by @jon-tow in https://github.com/CarperAI/trlx/pull/241
Fix deepspeed state saving under save_best condition by @reciprocated in https://github.com/CarperAI/trlx/pull/242
added colab notebook by @smellslikeml in https://github.com/CarperAI/trlx/pull/244
[style] Increase black's line length by @reciprocated in https://github.com/CarperAI/trlx/pull/250
Add help string to get_advantages_and_returns by @pesvut in https://github.com/CarperAI/trlx/pull/225
Filter out empty responses by @reciprocated in https://github.com/CarperAI/trlx/pull/265
NeMo Integrate by @cat-state in https://github.com/CarperAI/trlx/pull/125
Add multi-process logger utility for status monitoring by @jon-tow in https://github.com/CarperAI/trlx/pull/254
Add NeMo support info to README by @jon-tow in https://github.com/CarperAI/trlx/pull/275
Fix distributed dataloaders & deduplicate eval by @reciprocated in https://github.com/CarperAI/trlx/pull/276
Improve PPO readability by @alan-cooney in https://github.com/CarperAI/trlx/pull/210
Add T5 to delta modifier map by @aaronrmm in https://github.com/CarperAI/trlx/pull/234
[fix] Set deepspeed's fp16 auto_cast to false by @reciprocated in https://github.com/CarperAI/trlx/pull/279
Rename remaining logprobs_from_logits call by @jon-tow in https://github.com/CarperAI/trlx/pull/281
[feat] Add Accelerate SFT Trainer by @reciprocated in https://github.com/CarperAI/trlx/pull/280
Add Colab Notebook for Sentiment by @zswitten in https://github.com/CarperAI/trlx/pull/285
Remove pylance installs from devcontainer by @jon-tow in https://github.com/CarperAI/trlx/pull/296
Move notebooks to examples dir by @jon-tow in https://github.com/CarperAI/trlx/pull/294
[fix] Summarize config discrepancy by @reciprocated in https://github.com/CarperAI/trlx/pull/293
Make Git check optional by @cat-state in https://github.com/CarperAI/trlx/pull/299
refactor: remove orchestrator abstraction from API by @jon-tow in https://github.com/CarperAI/trlx/pull/289
Set add_special_tokens=False to not add EOS unexpectedly by @cat-state in https://github.com/CarperAI/trlx/pull/287
[feat] Gather experience samples by @reciprocated in https://github.com/CarperAI/trlx/pull/305
[fix] Make gather_for_metrics usage more strict by @reciprocated in https://github.com/CarperAI/trlx/pull/315
Add helpful and harmless example by @reciprocated in https://github.com/CarperAI/trlx/pull/128
Adopt PreTrainedModelWrapper for Hugging Face models by @jon-tow in https://github.com/CarperAI/trlx/pull/215
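
For readers unfamiliar with the term, the "Support nested updates in merge" entry (#219) concerns applying overrides to nested configuration values rather than replacing whole sections. The sketch below only illustrates that general idea in plain Python. The helper name `merge_nested` and the config keys are hypothetical and are not taken from trlx, whose actual implementation may differ.

```python
from copy import deepcopy
from typing import Any, Dict


def merge_nested(base: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `base` with `update` applied recursively.

    Hypothetical helper for illustration; not trlx's own merge function.
    """
    merged = deepcopy(base)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: descend instead of overwriting the section.
            merged[key] = merge_nested(merged[key], value)
        else:
            # Leaf value, or a key that is not a dict in `base`: replace it.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical config fragments, for illustration only.
defaults = {"train": {"batch_size": 32, "seq_length": 64}, "method": {"name": "ppo"}}
overrides = {"train": {"batch_size": 8}}
config = merge_nested(defaults, overrides)
```

With a flat `dict.update`, the override would replace the whole `train` section and drop `seq_length`. The recursive version changes only `batch_size` and leaves `defaults` unmodified.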

New Contributors

@shermansiu made their first contribution in https://github.com/CarperAI/trlx/pull/192
@aaronrmm made their first contribution in https://github.com/CarperAI/trlx/pull/198
@marcobellagente93 made their first contribution in https://github.com/CarperAI/trlx/pull/209
@alan-cooney made their first contribution in https://github.com/CarperAI/trlx/pull/224
@JustinAWei made their first contribution in https://github.com/CarperAI/trlx/pull/230
@smellslikeml made their first contribution in https://github.com/CarperAI/trlx/pull/244
@pesvut made their first contribution in https://github.com/CarperAI/trlx/pull/225
@zswitten made their first contribution in https://github.com/CarperAI/trlx/pull/285

Full Changelog: https://github.com/CarperAI/trlx/compare/v0.4...v0.5.0
