v1.9
版本发布时间: 2024-07-05 11:24:05
oobabooga/text-generation-webui最新发布版本:v1.14(2024-08-20 12:29:43)
Backend updates
-
4-bit and 8-bit kv cache options have been added to llama.cpp and llamacpp_HF. They reuse the existing
--cache_8bit
and--cache_4bit
flags. Thanks @GodEmperor785 for figuring out what values to pass to llama-cpp-python. - Transformers:
- Add eager attention option to make Gemma-2 work correctly (#6188). Thanks @GralchemOz.
- Automatically detect bfloat16/float16 precision when loading models in 16-bit precision.
- Automatically apply eager attention to models with
Gemma2ForCausalLM
architecture. -
Gemma-2 support: Automatically detect and apply the optimal settings for this model with the two changes above. No need to set
--bf16 --use_eager_attention
manually.
- Automatically obtain the EOT token from Jinja2 templates and add it to the stopping strings, fixing Llama-3-Instruct not stopping. No need to add
<eot>
to the custom stopping strings anymore.
UI updates
- Whisper STT overhaul: this extension has been rewritten, replacing the Gradio microphone component with a custom microphone element that is much more reliable (#6194). Thanks @RandomInternetPreson, @TimStrauven, and @mamei16.
- Make the character dropdown menu coexist in the "Chat" tab and the "Parameters > Character" tab, after some people pointed out that moving it entirely to the Chat tab makes it harder to edit characters.
- Colors in the light theme have been improved, making it a bit more aesthetic.
- Increase the chat area on mobile devices.
Bug fixes
- Fix the API request to AUTOMATIC1111 in the sd-api-pictures extension.
- Fix a glitch when switching tabs with "Show controls" unchecked in the chat tab and extensions loaded.
Library updates
- llama-cpp-python: bump to 0.2.81 (adds Gemma-2 support).
- Transformers: bump to 4.42 (adds Gemma-2 support).
Support
- GitHub Sponsors: https://github.com/sponsors/oobabooga
- ko-fi: https://ko-fi.com/oobabooga