v0.1.6
Released: 2023-06-17 21:07:57
Features
Quantization can now be enabled at serving time:
openllm start stablelm --quantize int8
This loads the model in 8-bit mode with bitsandbytes.
On CPU-only machines, use --bettertransformer instead:
openllm start stablelm --bettertransformer
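Under the hood, the --quantize int8 flag corresponds to loading the model in 8-bit with bitsandbytes via the transformers load_in_8bit option. A minimal sketch of that flag-to-kwargs mapping (a hypothetical helper for illustration, not OpenLLM's actual internals):

```python
from typing import Optional


def quantization_kwargs(quantize: Optional[str]) -> dict:
    """Map a --quantize CLI value to transformers from_pretrained kwargs.

    Hypothetical helper illustrating the int8 path described above;
    OpenLLM's real implementation may differ.
    """
    if quantize is None:
        # No quantization requested: load weights as-is.
        return {}
    if quantize == "int8":
        # 8-bit loading via bitsandbytes, placing layers automatically.
        return {"load_in_8bit": True, "device_map": "auto"}
    raise ValueError(f"unsupported quantization mode: {quantize!r}")
```

These kwargs would then be passed through to `AutoModelForCausalLM.from_pretrained(...)`.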
Roadmap
- GPTQ support is under development and will be added soon
Installation
pip install openllm==0.1.6
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.1.6
Usage
List all available models: python -m openllm.models
To start an LLM: python -m openllm start dolly-v2
Find more information about this release in the CHANGELOG.md
What's Changed
- refactor: toplevel CLI by @aarnphm in https://github.com/bentoml/OpenLLM/pull/26
- docs: add LangChain and BentoML Examples by @parano in https://github.com/bentoml/OpenLLM/pull/25
- feat: fine-tuning [part 1] by @aarnphm in https://github.com/bentoml/OpenLLM/pull/23
- feat: quantization by @aarnphm in https://github.com/bentoml/OpenLLM/pull/27
- perf: build quantization and better transformer behaviour by @aarnphm in https://github.com/bentoml/OpenLLM/pull/28
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.1.5...v0.1.6
Download: openllm-0.1.6.tar.gz (1.58 MB)