v0.4.0
Release date: 2023-03-09 23:10:44
Latest release of huggingface/text-generation-inference: v2.3.1 (2024-10-03 21:01:49)
Features
- router: support best_of sampling
- router: support left truncation
- server: support typical sampling
- launcher: allow local models
- clients: add text-generation Python client
- launcher: allow parsing num_shard from CUDA_VISIBLE_DEVICES
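As a rough illustration of the last item, deriving a shard count from CUDA_VISIBLE_DEVICES can be sketched as below. This is not the launcher's actual code (the launcher is written in Rust); the function name `num_shard_from_env` and its handling of an unset or empty variable are assumptions made for this example.

```python
import os


def num_shard_from_env(env=None):
    """Hypothetical helper: count the devices listed in CUDA_VISIBLE_DEVICES.

    Returns None when the variable is unset or empty, so the caller can
    fall back to another way of choosing the number of shards.
    """
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES", "").strip()
    if not value:
        return None
    # The variable is a comma-separated list of device ids, e.g. "0,1,3".
    devices = [d.strip() for d in value.split(",") if d.strip()]
    return len(devices) if devices else None


if __name__ == "__main__":
    print(num_shard_from_env({"CUDA_VISIBLE_DEVICES": "0,1,3"}))
```

Passing the environment as a plain dict keeps the sketch easy to test without touching the real process environment.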
Fix
- server: do not warp prefill logits
- server: fix formatting issues in generate_stream tokens
- server: fix galactica batch
- server: fix index out of range issue with watermarking