v0.1.0b3

Release date: 2024-02-29 05:44:48

Latest huggingface/optimum-nvidia release: v0.1.0b8 (2024-09-17 21:09:22)

Highlights

This release adds support for Gemma, Google's recently released model
optimum-nvidia went through a major refactor that should make it much easier to support new models and integrate the latest ones in the long run

Updated the underlying TensorRT-LLM dependency to b7c309d1c9baa9c030680988cb73e461f6253b98 (v0.9.0)

The current float8 flow is disabled until the next release in order to support the new calibration workflow

Bug fixes in readme. by @Anindyadeep in https://github.com/huggingface/optimum-nvidia/pull/63
Bump TRTLLM to latest version #d879430 by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/65
Ability to build Whisper encoder/decoder TRT engine by @fxmarty in https://github.com/huggingface/optimum-nvidia/pull/70
Refactoring of the overall structure to better align with the new TRTLLM workflow moving forward by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/74
Fix gemma 7b by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/77
Update license by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/78
Make pipelines compatible with the new workflow by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/79
Fix repo code quality by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/80
Bring back CI to a normal state by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/82
Fix hardcoded embedding scale with value from config by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/85
Make overall optimum-nvidia pip installable by @mfuntowicz in https://github.com/huggingface/optimum-nvidia/pull/83

Full Changelog: https://github.com/huggingface/optimum-nvidia/compare/v0.1.0b2...v0.1.0b3