v0.9.1
Release date: 2023-07-06 22:09:32
Latest release of huggingface/text-generation-inference: v2.3.1 (2024-10-03 21:01:49)
Highlights
- server: Non flash MPT
- server: decrease memory fragmentation
Features
- server: use latest flash attention
- router: add argument for hostname in router
- docs: Adding some help for the options in text-generation-benchmark
Fix
- makefile: Update server/Makefile to include Makefile-vllm
- server: Handle loading from local files for MPT
- server: avoid errors for very small top_p values
Full Changelog: https://github.com/huggingface/text-generation-inference/compare/v0.9.0...v0.9.1