v0.1.0b3
Release date: 2024-02-29 05:44:48
Latest release of huggingface/optimum-nvidia: v0.1.0b8 (2024-09-17 21:09:22)
Highlights
- This release adds support for Gemma, the model recently released by Google
- optimum-nvidia went through a major refactor, which will make it much easier to support new models and integrate the latest ones in the long run
TensorRT-LLM
- Update underlying TensorRT-LLM dependency to b7c309d1c9baa9c030680988cb73e461f6253b98 (v0.9.0)
Known issues
- The current float8 flow is disabled until the next release in order to support the new calibration workflow
What's Changed
- Bug fixes in readme. by @Anindyadeep in https://github.com/huggingface/optimum-nvidia/pull/63
- Bump TRTLLM to latest version #d879430 by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/65
- Ability to build Whisper encoder/decoder TRT engine by @fxmarty in https://github.com/huggingface/optimum-nvidia/pull/70
- Refactoring of the overall structure to better align with the new TRTLLM workflow moving forward by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/74
- Fix gemma 7b by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/77
- Update license by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/78
- Make pipelines compatible with the new workflow by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/79
- Fix repo code quality by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/80
- Bring back CI to a normal state by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/82
- Fix hardcoded embedding scale with value from config by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/85
- Make overall optimum-nvidia pip installable by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/83
New Contributors
- @Anindyadeep made their first contribution in https://github.com/huggingface/optimum-nvidia/pull/63
- @fxmarty made their first contribution in https://github.com/huggingface/optimum-nvidia/pull/70
Full Changelog: https://github.com/huggingface/optimum-nvidia/compare/v0.1.0b2...v0.1.0b3