v1.5.0
Release date: 2024-07-10 23:34:40
Latest release of huggingface/text-embeddings-inference: v1.5.0 (2024-07-10 23:34:40)
Notable Changes
- ONNX runtime for CPU deployments: greatly improves CPU deployment throughput
- Add `/similarity` route
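
As a rough illustration of what a similarity route typically computes, the sketch below scores one source embedding against several candidate embeddings. It assumes the score is cosine similarity, which these release notes do not state; the function names `cosine_similarity` and `similarity_scores` are hypothetical and are not part of text-embeddings-inference. Check the server's API documentation for the actual request and response schema.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Convention chosen for this sketch: a zero vector scores 0.0.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_scores(
    source: Sequence[float], candidates: Sequence[Sequence[float]]
) -> List[float]:
    """Score one source embedding against each candidate embedding."""
    return [cosine_similarity(source, c) for c in candidates]


if __name__ == "__main__":
    # Toy 2-dimensional vectors standing in for real embeddings.
    print(similarity_scores([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
```

In practice the embeddings would come from the deployed model, and a server-side route saves the client from fetching embeddings and computing scores itself.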
What's Changed
- tokenizer max limit on input size by @ErikKaum in https://github.com/huggingface/text-embeddings-inference/pull/324
- docs: air-gapped deployments by @OlivierDehaene in https://github.com/huggingface/text-embeddings-inference/pull/326
- feat(onnx): add onnx runtime for better CPU perf by @OlivierDehaene in https://github.com/huggingface/text-embeddings-inference/pull/328
- feat: add `/similarity` route by @OlivierDehaene in https://github.com/huggingface/text-embeddings-inference/pull/331
- fix(ort): fix mean pooling by @OlivierDehaene in https://github.com/huggingface/text-embeddings-inference/pull/332
- chore(candle): update flash attn by @OlivierDehaene in https://github.com/huggingface/text-embeddings-inference/pull/335
- v1.5.0 by @OlivierDehaene in https://github.com/huggingface/text-embeddings-inference/pull/336
New Contributors
- @ErikKaum made their first contribution in https://github.com/huggingface/text-embeddings-inference/pull/324
Full Changelog: https://github.com/huggingface/text-embeddings-inference/compare/v1.4.0...v1.5.0