February-Gemma-2024

版本发布时间: 2024-02-27 00:10:54

unslothai/unsloth最新发布版本:September-2024(2024-09-24 05:32:53)

You can now finetune Gemma 7b 2.43x faster than HF + Flash Attention 2 with 57.5% less VRAM use. When compared to vanilla HF, Unsloth is 2.53x faster and uses 70% less VRAM. Blog post: https://unsloth.ai/blog/gemma. On local machines, update Unsloth via pip install --upgrade --force-reinstall --no-cache-dir git+https://github.com/unslothai/unsloth.git

On 1x A100 80GB GPU, Unsloth can fit 40K total tokens (8192 * bsz of 5), whilst FA2 can fit ~15K tokens and vanilla HF can fit 9K tokens. gemma reddit

Gemma 7b Colab Notebook free Tesla T4: https://colab.research.google.com/drive/10NbwlsRChbma1v55m8LAPYG15uQv6HLo?usp=sharing

Gemma 2b Colab Notebook free Tesla T4: https://colab.research.google.com/drive/15gGm7x_jTm017_Ic8e317tdIpDG53Mtu?usp=sharing

To use Gemma, simply use FastLanguageModel:

# Load Llama model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/gemma-7b-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = None,
    load_in_4bit = True,
)

相关地址：原始地址下载(tar) 下载(zip)

查看：2024-02-27发行的版本