v1.4.4
Release date: 2024-03-23 01:45:55
Highlights
- CohereForAI/c4ai-command-r-v01 model support
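With this release the model can be served directly by its Hub ID. A minimal launch sketch, assuming a CUDA host and the standard TGI container workflow (the image tag, port mapping, and volume path are assumptions — adjust for your environment):

```shell
# Sketch: serve the newly supported Command-R model with TGI.
# Image tag, port, and cache volume are assumptions, not prescriptions.
docker run --gpus all -p 8080:80 \
  -v "$PWD/data:/data" \
  ghcr.io/huggingface/text-generation-inference:1.4.4 \
  --model-id CohereForAI/c4ai-command-r-v01
```

Once the server is up, the usual `/generate` and OpenAI-compatible chat endpoints apply; nothing Command-R-specific is needed on the client side.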
What's Changed
- Handle concurrent grammar requests by @drbh in https://github.com/huggingface/text-generation-inference/pull/1610
- Fix idefics default. by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1614
- Fix async client timeout by @hugoabonizio in https://github.com/huggingface/text-generation-inference/pull/1617
- accept legacy request format and response by @drbh in https://github.com/huggingface/text-generation-inference/pull/1527
- add missing stop parameter for chat request by @drbh in https://github.com/huggingface/text-generation-inference/pull/1619
- correctly index into mask when applying grammar by @drbh in https://github.com/huggingface/text-generation-inference/pull/1618
- Use a better model for the quick tour by @lewtun in https://github.com/huggingface/text-generation-inference/pull/1639
- Upgrade nix version from 0.27.1 to 0.28.0 by @yuanwu2017 in https://github.com/huggingface/text-generation-inference/pull/1638
- Update peft + transformers + accelerate + bnb + safetensors by @abhishekkrthakur in https://github.com/huggingface/text-generation-inference/pull/1646
- Fix index in ChatCompletionChunk by @Wauplin in https://github.com/huggingface/text-generation-inference/pull/1648
- Fixing minor typo in documentation: supported hardware section by @SachinVarghese in https://github.com/huggingface/text-generation-inference/pull/1632
- bump minijinja and add test for core templates by @drbh in https://github.com/huggingface/text-generation-inference/pull/1626
- support force downcast after FastRMSNorm multiply for Gemma by @drbh in https://github.com/huggingface/text-generation-inference/pull/1658
- prefer spaces url over temp url by @drbh in https://github.com/huggingface/text-generation-inference/pull/1662
- improve tool type, bump pydantic and outlines by @drbh in https://github.com/huggingface/text-generation-inference/pull/1650
- Remove unnecessary CUDA graph by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1664
- Repair idefics integration tests. by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1663
- fix: LlamaTokenizerFast to AutoTokenizer at flash_mistral.py by @SeongBeomLEE in https://github.com/huggingface/text-generation-inference/pull/1637
- Inline images for multimodal models. by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1666
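The inline-image change above lets multimodal prompts carry image data directly instead of a fetchable URL, using markdown-style image syntax with a data URI. A minimal sketch of building such a prompt client-side (the helper name and exact payload framing are illustrative assumptions, not TGI API surface):

```python
import base64


def inline_image_prompt(image_bytes: bytes, question: str,
                        mime: str = "image/png") -> str:
    """Embed an image inline in a multimodal prompt.

    Sketch only: encodes the raw bytes as a base64 data URI and wraps
    it in markdown image syntax, followed by the text question.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"![](data:{mime};base64,{encoded}){question}"


# Hypothetical usage: image_bytes would normally come from a file.
prompt = inline_image_prompt(b"\x89PNG-demo-bytes", "What is in this picture?")
```

The resulting string is sent as the `inputs` field of a normal generate request; no separate upload step or temporary URL is required.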
New Contributors
- @hugoabonizio made their first contribution in https://github.com/huggingface/text-generation-inference/pull/1617
- @yuanwu2017 made their first contribution in https://github.com/huggingface/text-generation-inference/pull/1638
- @abhishekkrthakur made their first contribution in https://github.com/huggingface/text-generation-inference/pull/1646
- @Wauplin made their first contribution in https://github.com/huggingface/text-generation-inference/pull/1648
- @SachinVarghese made their first contribution in https://github.com/huggingface/text-generation-inference/pull/1632
- @SeongBeomLEE made their first contribution in https://github.com/huggingface/text-generation-inference/pull/1637
Full Changelog: https://github.com/huggingface/text-generation-inference/compare/v1.4.3...v1.4.4