v2.0.4
Release date: 2024-05-24 18:55:41
Main changes
- AMD MI300 compatibility by @fxmarty in https://github.com/huggingface/text-generation-inference/pull/1764
- Many bugfixes.
What's Changed
- OpenAI-compatible function calling support by @phangiabao98 in https://github.com/huggingface/text-generation-inference/pull/1888 (see the example sketch after this list)
- Fixing types. by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1906
- Types. by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1909
- Fixing signals. by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1910
- Removing some unused code. by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1915
- MI300 compatibility by @fxmarty in https://github.com/huggingface/text-generation-inference/pull/1764
- Add TGI monitoring guide through Grafana and Prometheus by @fxmarty in https://github.com/huggingface/text-generation-inference/pull/1908 (a metrics sanity-check sketch follows this list)
- Update grafana template by @fxmarty in https://github.com/huggingface/text-generation-inference/pull/1918
- Fix TunableOp bug by @fxmarty in https://github.com/huggingface/text-generation-inference/pull/1920
- Fix TGI issues with ROCm by @fxmarty in https://github.com/huggingface/text-generation-inference/pull/1921
- Fixing the download strategy for ibm-fms by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1917
- ROCm: make CK FA2 default instead of Triton by @fxmarty in https://github.com/huggingface/text-generation-inference/pull/1924
- docs: Fix grafana dashboard url by @edwardzjl in https://github.com/huggingface/text-generation-inference/pull/1925
- feat: include token in client test like server tests by @drbh in https://github.com/huggingface/text-generation-inference/pull/1932
- Creating doc automatically for supported models. by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1929
- fix: use path inside of speculator config by @drbh in https://github.com/huggingface/text-generation-inference/pull/1935
- feat: add train medusa head tutorial by @drbh in https://github.com/huggingface/text-generation-inference/pull/1934
- reenable xpu for tgi by @sywangyi in https://github.com/huggingface/text-generation-inference/pull/1939
- Fixing some legacy behavior (big swapout of serverless on legacy stuff). by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1937
- Add completion route to client and add stop parameter where it's missing by @thomas-schillaci in https://github.com/huggingface/text-generation-inference/pull/1869 (see the sketch after this list)
- Improving the logging system. by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1938
- Fixing codellama loads by using purely `AutoTokenizer`. by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1947 (see the sketch after this list)
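PR #1888 lets TGI's OpenAI-compatible chat route accept OpenAI-style tool definitions. A minimal sketch using the `openai` Python client, assuming a TGI v2.0.4 server is already running locally on port 8080; the dummy `model` and `api_key` values and the example tool are placeholders, not part of the release:

```python
from openai import OpenAI

# Point the OpenAI client at the local TGI server's OpenAI-compatible route.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="tgi",  # TGI serves a single model, so the name is only a label
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    tools=tools,
    tool_choice="auto",
)
print(response.choices[0].message.tool_calls)
```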
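The monitoring guide added in #1908 (and the dashboard updated in #1918 and #1925) wires TGI's Prometheus metrics into Grafana. As a quick sanity check that metrics are being exported before building the dashboard, a hedged sketch: it assumes the metrics are reachable at `/metrics` on the serving port (8080 here); adjust the URL if your deployment exposes a dedicated metrics port.

```python
import requests

# Fetch the raw Prometheus exposition text from a running TGI instance.
metrics = requests.get("http://localhost:8080/metrics", timeout=5).text

# List the TGI-specific metric names (prefixed "tgi_") to confirm the exporter is live.
names = {
    line.split("{")[0].split(" ")[0]
    for line in metrics.splitlines()
    if line.startswith("tgi_")
}
print("\n".join(sorted(names)))
```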
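PR #1869 adds a completion route to the client and threads the `stop` parameter through where it was missing. A sketch of the same idea over the server's OpenAI-compatible `/v1/completions` route rather than the TGI Python client itself, again assuming a local server on port 8080:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

# Request a plain completion that stops early at the given stop sequence.
completion = client.completions.create(
    model="tgi",
    prompt="def fibonacci(n):",
    max_tokens=64,
    stop=["\n\n"],
)
print(completion.choices[0].text)
```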
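The codellama fix in #1947 amounts to loading the tokenizer through `AutoTokenizer` instead of a model-specific class. For reference, the corresponding `transformers` call; the checkpoint name is only an illustrative example:

```python
from transformers import AutoTokenizer

# AutoTokenizer resolves the right tokenizer class from the checkpoint's config.
tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")
print(type(tokenizer).__name__)
```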
New Contributors
- @phangiabao98 made their first contribution in https://github.com/huggingface/text-generation-inference/pull/1888
- @edwardzjl made their first contribution in https://github.com/huggingface/text-generation-inference/pull/1925
- @thomas-schillaci made their first contribution in https://github.com/huggingface/text-generation-inference/pull/1869
Full Changelog: https://github.com/huggingface/text-generation-inference/compare/v2.0.3...v2.0.4