v0.1.0
Release date: 2023-10-13 21:46:09
Latest release of huggingface/text-embeddings-inference: v1.5.0 (2024-07-10 23:34:40)
- No compilation step
- Dynamic shapes
- Small Docker images and fast boot times. Get ready for true serverless!
- Token-based dynamic batching
- Optimized transformers code for inference using Flash Attention, Candle and cuBLASLt
- Safetensors weight loading
- Production-ready (distributed tracing with OpenTelemetry, Prometheus metrics)
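
To illustrate what "token-based dynamic batching" in the list above means, here is a minimal Python sketch. It is not the project's actual implementation, which these notes do not describe. The `Request` type, the `batch_by_tokens` function, and the `max_batch_tokens` parameter are names assumed for this illustration. The idea shown is that a batch is closed when its total token count would exceed a budget, rather than when it reaches a fixed number of requests.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Request:
    """A pending embedding request (hypothetical type for this sketch)."""
    request_id: int
    num_tokens: int  # token count of the input after tokenization


def batch_by_tokens(
    queue: Iterable[Request], max_batch_tokens: int
) -> Iterator[List[Request]]:
    """Group queued requests so each batch stays within a token budget.

    Requests are consumed in arrival order. A single request that is
    larger than the budget is rejected instead of being truncated.
    """
    batch: List[Request] = []
    used = 0
    for req in queue:
        if req.num_tokens > max_batch_tokens:
            raise ValueError(
                f"request {req.request_id} has {req.num_tokens} tokens, "
                f"which exceeds the budget of {max_batch_tokens}"
            )
        # Close the current batch if adding this request would overflow it.
        if batch and used + req.num_tokens > max_batch_tokens:
            yield batch
            batch, used = [], 0
        batch.append(req)
        used += req.num_tokens
    if batch:
        yield batch


if __name__ == "__main__":
    pending = [Request(i, n) for i, n in enumerate([300, 200, 600, 100, 900])]
    batches = list(batch_by_tokens(pending, max_batch_tokens=1000))
    print([[r.request_id for r in b] for b in batches])
    # prints [[0, 1], [2, 3], [4]]
```

Budgeting by tokens rather than by request count keeps the work per batch roughly bounded even when input lengths vary widely, which is the usual motivation for this style of batching.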