v0.1.0b8
Release date: 2024-09-17 21:09:22
Latest release of huggingface/optimum-nvidia: v0.1.0b8 (2024-09-17 21:09:22)
Optimum-Nvidia v0.1.0 Beta 8
Highlight
- Exporting a model is now more robust and better defined than in the previous version. All export parameters are now exposed through `optimum.nvidia.ExportConfig`.
- Brought back quantization and sparsity through integration of Nvidia's ModelOpt.
- Added examples of quantization and sparsification recipes under `examples/quantization`.
- Integrated `optimum-nvidia` with the latest `optimum-cli` interface to support exporting engines without any code through `optimum-cli export trtllm`.
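As a quick way to explore the new command-line entry point, the sketch below asks the `optimum-cli export trtllm` subcommand for its own help text from Python. Only the subcommand name comes from these release notes. The helper names `export_help_command` and `show_export_help` are hypothetical, and no export flags are assumed, since the authoritative option list is whatever `--help` prints for your installed version.

```python
import shutil
import subprocess


def export_help_command():
    """Return the argv list that asks the TRT-LLM export subcommand for help.

    Only the `optimum-cli export trtllm` subcommand is taken from the release
    notes; `--help` is the standard help flag, and no other flags are assumed.
    """
    return ["optimum-cli", "export", "trtllm", "--help"]


def show_export_help():
    """Run the help command and return its stdout, or None if the CLI is absent.

    Hypothetical helper for illustration: it never starts an export, it only
    queries the installed CLI for its documented options.
    """
    cmd = export_help_command()
    if shutil.which(cmd[0]) is None:
        # optimum-cli is not on PATH (optimum is probably not installed).
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    help_text = show_export_help()
    if help_text is None:
        print("optimum-cli not found on PATH")
    else:
        print(help_text)
```

Reading the help output first avoids guessing at option names, which may differ between beta releases.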
Known Issues
- ModelOpt v0.15, as integrated in optimum-nvidia, has an issue when quantizing with the AWQ scheme. The issue is fixed in ModelOpt v0.17, and this dependency will be upgraded in the next release.
What's Changed
- feat(package): make sure we dont have init as optimum level by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/132
- Enable trufflehog scanner CI on GA by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/136
- Enable automatic build of container at each release by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/137
- Refactor the overall Hugging Face -> TRTLLM export workflow by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/133
- feat(tests) : Update CI to use new workflow and silicon. by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/145
- move to new cluster by @glegendre01 in https://github.com/huggingface/optimum-nvidia/pull/150
- Bring back quantization with Nvidia ModelOpt by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/147
- (misc) disable xQA kernels for now as they seem to hang by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/152
- Add CLI quantization option by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/153
- tests(cli): uncomment out tests for CLI by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/154
- Fix license detection path by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/155
- Fix test again by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/156
- chore: remove invalid examples by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/157
- Bump version to 0.1.0b8 by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/158
- chore: update README badges by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/159
Full Changelog: https://github.com/huggingface/optimum-nvidia/compare/v0.1.0b7...v0.1.0b8