v1.4.0
Release date: 2024-01-27 02:07:45
Highlights
- OpenAI compatible API #1427
- exllama v2 Tensor Parallel #1490
- GPTQ support for AMD GPUs #1489
- Phi support #1442
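The headline change is the OpenAI-compatible chat completions route (#1427), served at `POST /v1/chat/completions`. A minimal sketch of calling it with only the Python standard library, assuming a TGI server running locally on port 8080 (the `model` value is a placeholder; TGI serves a single model and does not route by name):

```python
import json
import urllib.request

# Assumed local TGI endpoint; adjust host/port to your deployment.
CHAT_URL = "http://localhost:8080/v1/chat/completions"

def build_chat_request(messages, max_tokens=64, stream=False):
    """Build the JSON payload for the OpenAI-compatible chat route."""
    return {
        "model": "tgi",        # placeholder; TGI ignores it for routing
        "messages": messages,  # [{"role": "user", "content": "..."}, ...]
        "max_tokens": max_tokens,
        "stream": stream,
    }

def chat(messages):
    """POST a chat request and return the decoded JSON response."""
    req = urllib.request.Request(
        CHAT_URL,
        data=json.dumps(build_chat_request(messages)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example usage (requires a running server):
#   out = chat([{"role": "user", "content": "What is deep learning?"}])
#   print(out["choices"][0]["message"]["content"])
```

Because the route follows the OpenAI schema, existing OpenAI client libraries can also be pointed at the TGI base URL instead of hand-building requests like this.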
What's Changed
- fix: fix local loading for .bin models by @OlivierDehaene in https://github.com/huggingface/text-generation-inference/pull/1419
- Fix missing make target platform for local install: 'install-flash-attention-v2' by @deepily in https://github.com/huggingface/text-generation-inference/pull/1414
- fix: follow base model for tokenizer in router by @OlivierDehaene in https://github.com/huggingface/text-generation-inference/pull/1424
- Fix local load for Medusa by @PYNing in https://github.com/huggingface/text-generation-inference/pull/1420
- Return prompt vs generated tokens. by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1436
- feat: supports openai chat completions API by @drbh in https://github.com/huggingface/text-generation-inference/pull/1427
- feat: support raise_exception, bos and eos tokens by @drbh in https://github.com/huggingface/text-generation-inference/pull/1450
- chore: bump rust version and annotate/fix all clippy warnings by @drbh in https://github.com/huggingface/text-generation-inference/pull/1455
- feat: conditionally toggle chat on invocations route by @drbh in https://github.com/huggingface/text-generation-inference/pull/1454
- Disable `decoder_input_details` on OpenAI-compatible chat streaming, pass temp and top-k from API by @EndlessReform in https://github.com/huggingface/text-generation-inference/pull/1470
- Fixing non divisible embeddings. by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1476
- Add messages api compatibility docs by @drbh in https://github.com/huggingface/text-generation-inference/pull/1478
- Add a new `/tokenize` route to get the tokenized input by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1471
- feat: adds phi model by @drbh in https://github.com/huggingface/text-generation-inference/pull/1442
- fix: read stderr in download by @OlivierDehaene in https://github.com/huggingface/text-generation-inference/pull/1486
- fix: show warning with tokenizer config parsing error by @drbh in https://github.com/huggingface/text-generation-inference/pull/1488
- fix: launcher doc typos by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1473
- Reinstate exl2 with tp by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1490
- Add sealion mpt support by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1477
- Trying to fix that flaky test. by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1491
- fix: launcher doc typos by @thelinuxkid in https://github.com/huggingface/text-generation-inference/pull/1462
- Update the docs to include newer models. by @Narsil in https://github.com/huggingface/text-generation-inference/pull/1492
- GPTQ support on ROCm by @fxmarty in https://github.com/huggingface/text-generation-inference/pull/1489
- feat: add tokenizer-config-path to launcher args by @drbh in https://github.com/huggingface/text-generation-inference/pull/1495
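Among the changes above, the new `/tokenize` route (#1471) returns the tokenization of an input without running generation. A hedged sketch of calling it with the Python standard library, assuming a local server on port 8080 and that the route accepts an `inputs` field like `/generate` does:

```python
import json
import urllib.request

# Assumed local TGI endpoint; adjust host/port to your deployment.
TOKENIZE_URL = "http://localhost:8080/tokenize"

def build_tokenize_request(text):
    """Build the JSON payload for the /tokenize route (assumed shape)."""
    return {"inputs": text}

def tokenize(text):
    """POST text to /tokenize and return the server's token list."""
    req = urllib.request.Request(
        TOKENIZE_URL,
        data=json.dumps(build_tokenize_request(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example usage (requires a running server):
#   for tok in tokenize("Hello world"):
#       print(tok)
```

This is useful for counting prompt tokens client-side before submitting a generation request, e.g. to stay under a context-length budget.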
New Contributors
- @deepily made their first contribution in https://github.com/huggingface/text-generation-inference/pull/1414
- @PYNing made their first contribution in https://github.com/huggingface/text-generation-inference/pull/1420
- @drbh made their first contribution in https://github.com/huggingface/text-generation-inference/pull/1427
- @EndlessReform made their first contribution in https://github.com/huggingface/text-generation-inference/pull/1470
- @thelinuxkid made their first contribution in https://github.com/huggingface/text-generation-inference/pull/1462
Full Changelog: https://github.com/huggingface/text-generation-inference/compare/v1.3.4...v1.4.0