v0.6.3
Released: 2024-04-21 23:43:07
New features
- Support Meta Llama-3 (8B/70B) models
- Support UnslothAI's long-context QLoRA optimization (56,000-token context length for Llama-2-7B within 24GB of VRAM)
- Support previewing local datasets from directories in LlamaBoard by @codemayq in #3291
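The long-context QLoRA run above can be sketched as a CLI invocation. This is a minimal sketch, not the project's official recipe: the flag names (`--use_unsloth`, `--quantization_bit`, `--cutoff_len`) are assumed from this release's training CLI, and the model path, dataset name, and hyperparameter values are placeholders.

```shell
# Hypothetical sketch: QLoRA fine-tuning of Llama-2-7B with Unsloth's
# long-context optimization. Flag names assumed from this release's CLI;
# dataset and output paths are placeholders.
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --do_train \
    --model_name_or_path meta-llama/Llama-2-7b-hf \
    --dataset alpaca_gpt4_en \
    --template default \
    --finetuning_type lora \
    --lora_target q_proj,v_proj \
    --quantization_bit 4 \
    --use_unsloth \
    --cutoff_len 56000 \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --learning_rate 2e-4 \
    --num_train_epochs 1.0 \
    --output_dir saves/llama2-7b-unsloth
```

`--quantization_bit 4` provides the 4-bit quantization in QLoRA, while `--use_unsloth` swaps in Unsloth's optimized kernels, which is what makes the 56K `--cutoff_len` fit in 24GB.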
New algorithms
- Support BAdam algorithm by @Ledzy in #3287
- Support Mixture-of-Depths training by @mlinmg in #3338
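A BAdam run could be sketched along the same lines. Again a hedged sketch: the `--use_badam`, `--badam_switch_mode`, and `--badam_switch_interval` flag names are assumptions based on PR #3287, and all paths and values are placeholders.

```shell
# Hypothetical sketch: full-parameter fine-tuning with the BAdam
# block-coordinate optimizer (flag names assumed; see PR #3287).
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --do_train \
    --model_name_or_path meta-llama/Meta-Llama-3-8B \
    --dataset alpaca_gpt4_en \
    --template default \
    --finetuning_type full \
    --use_badam \
    --badam_switch_mode ascending \
    --badam_switch_interval 50 \
    --per_device_train_batch_size 1 \
    --learning_rate 1e-5 \
    --num_train_epochs 1.0 \
    --output_dir saves/llama3-8b-badam
```

BAdam updates one block of parameters at a time (switching blocks every `--badam_switch_interval` steps), which is what lets `--finetuning_type full` fit in memory budgets that full Adam would not.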
New models
- Base models
  - CodeGemma (2B/7B)
  - CodeQwen1.5-7B
  - Llama-3 (8B/70B)
  - Mixtral-8x22B-v0.1
- Instruct/Chat models
  - CodeGemma-7B-it
  - CodeQwen1.5-7B-Chat
  - Llama-3-Instruct (8B/70B)
  - Command R (35B) by @marko1616 in #3254
  - Command R+ (104B) by @marko1616 in #3254
  - Mixtral-8x22B-Instruct-v0.1
Bug fixes
- Fix full-tuning batch prediction examples by @khazic in #3261
- Fix output_router_logits of Mixtral by @liu-zichen in #3276
- Fix `AutoModel.from_pretrained` with the `attn_implementation` argument (see https://github.com/huggingface/transformers/issues/30298)
- Fix non-convergence issue in the layer-wise GaLore optimizer (see https://github.com/huggingface/transformers/issues/30371)
- Fix #3184 #3238 #3247 #3273 #3316 #3317 #3324 #3348 #3352 #3365 #3366