v0.1.0b4
Release date: 2024-03-21 22:29:06
Latest huggingface/optimum-nvidia release: v0.1.0b8 (2024-09-17 21:09:22)
# Highlights
- Update to TensorRT-LLM version 03-19-2024
- Installation via pip
- Float8 quantization workflow updated to be more robust
- Save and restore prebuilt engines from the Hugging Face Hub or locally on the machine
What's Changed
- Add ability to save local prebuilt engines by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/87
- Make float8 quantization back in the game. by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/92
- Fixed Repetition Penalty default value by @leopra in https://github.com/huggingface/optimum-nvidia/pull/66
- Update instructions for pip install by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/97
- Update to TensorRT-LLM v031224 by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/98
New Contributors
- @leopra made their first contribution in https://github.com/huggingface/optimum-nvidia/pull/66
Full Changelog: https://github.com/huggingface/optimum-nvidia/compare/v0.1.0b3...v0.1.0b4