v0.1.6
Released: 2023-06-17 21:07:57
Features
Quantization can now be enabled at serving time:
openllm start stablelm --quantize int8
This loads the model in 8-bit mode with bitsandbytes.
On CPU-only machines, use --bettertransformer instead:
openllm start stablelm --bettertransformer
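Under the hood, the --quantize int8 flag corresponds to loading the model in 8-bit with bitsandbytes via the transformers load_in_8bit option. A minimal sketch of that flag-to-kwargs mapping (a hypothetical helper for illustration, not OpenLLM's actual internals):

```python
from typing import Optional


def quantization_kwargs(quantize: Optional[str]) -> dict:
    """Map a --quantize CLI value to transformers from_pretrained kwargs.

    Hypothetical helper illustrating the int8 path described above;
    OpenLLM's real implementation may differ.
    """
    if quantize is None:
        # No quantization requested: load weights as-is.
        return {}
    if quantize == "int8":
        # 8-bit loading via bitsandbytes, placing layers automatically.
        return {"load_in_8bit": True, "device_map": "auto"}
    raise ValueError(f"unsupported quantization mode: {quantize!r}")
```

These kwargs would then be passed through to `AutoModelForCausalLM.from_pretrained(...)`.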
Roadmap
- GPTQ support is under development and will be added soon
Installation
pip install openllm==0.1.6
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.1.6
Usage
List all available models: python -m openllm.models
To start an LLM: python -m openllm start dolly-v2
Find more information about this release in the CHANGELOG.md
What's Changed
- refactor: toplevel CLI by @aarnphm in https://github.com/bentoml/OpenLLM/pull/26
- docs: add LangChain and BentoML Examples by @parano in https://github.com/bentoml/OpenLLM/pull/25
- feat: fine-tuning [part 1] by @aarnphm in https://github.com/bentoml/OpenLLM/pull/23
- feat: quantization by @aarnphm in https://github.com/bentoml/OpenLLM/pull/27
- perf: build quantization and better transformer behaviour by @aarnphm in https://github.com/bentoml/OpenLLM/pull/28
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.1.5...v0.1.6
Download: openllm-0.1.6.tar.gz (1.58 MB)