v0.4.1
Release date: 2023-11-08 21:35:23
Latest bentoml/OpenLLM release: v0.6.13 (2024-10-17 21:17:35)
OpenLLM version 0.4.0 introduces several enhancements:
- Unified API and continuous batching support: 0.4.0 brings a simplified API for OpenLLM. Users can now run LLMs with two new APIs:

  - `await llm.generate(prompt, stop, **kwargs)`: one-shot generation for any given prompt.

    ```python
    import asyncio

    import openllm

    llm = openllm.LLM("HuggingFaceH4/zephyr-7b-beta")

    async def infer(prompt, **kwargs):
        return await llm.generate(prompt, **kwargs)

    asyncio.run(infer("Time is a definition of"))
    ```

  - `await llm.generate_iterator(prompt, stop, **kwargs)`: streaming generation that returns tokens as they become ready.

    ```python
    import bentoml
    import openllm

    llm = openllm.LLM("HuggingFaceH4/zephyr-7b-beta")
    svc = bentoml.Service(name='zephyr-instruct', runners=[llm.runner])

    @svc.api(input=bentoml.io.Text(), output=bentoml.io.Text(media_type='text/event-stream'))
    async def prompt(input_text: str) -> str:
        async for generation in llm.generate_iterator(input_text):
            yield f"data: {generation.outputs[0].text}\n\n"
    ```

  Under an async context, calls to both `llm.generate` and `llm.generate_iterator` now support continuous batching for optimal throughput (see the concurrency sketch after this list).
- The backend is now automatically inferred based on the presence of `vllm` in the environment. However, if you prefer to manually specify the backend, you can do so with the `backend` argument: `openllm.LLM("HuggingFaceH4/zephyr-7b-beta", backend='pt')`

- Quantization can also be passed directly to this new `LLM` API: `openllm.LLM("TheBloke/Mistral-7B-Instruct-v0.1-AWQ", quantize='awq')`
- Mistral model: OpenLLM now supports Mistral. To start a Mistral server, simply run `openllm start mistral` (see the Mistral sketch after this list).
- AWQ and SqueezeLLM quantization: AWQ and SqueezeLLM are now supported with the vLLM backend. Simply pass `--quantize awq` or `--quantize squeezellm` to `openllm start` to use AWQ or SqueezeLLM quantization. IMPORTANT: to use AWQ, the model weights must already be quantized with AWQ. Look for the AWQ variant of the model you want to use on the HuggingFace Hub. Currently, only AWQ with vLLM is fully tested and supported.
- General bug fixes: fixed a bug related to tag generation. Standalone Bentos that use this new API should work as expected if the model already exists in the model store. For consistency, make sure to run `openllm prune -y --include-bentos`.
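To make the continuous batching note above concrete, here is a minimal sketch of dispatching several requests concurrently from an async context so the backend can batch them. It reuses the `llm` object from the examples above; the prompts are purely illustrative:

```python
import asyncio

import openllm

llm = openllm.LLM("HuggingFaceH4/zephyr-7b-beta")

async def main():
    prompts = [
        "Time is a definition of",
        "Explain continuous batching in one sentence.",
        "Write a haiku about GPUs.",
    ]
    # Firing the requests concurrently lets the backend interleave them into
    # a single batched decoding loop instead of serving them one by one.
    results = await asyncio.gather(*(llm.generate(p) for p in prompts))
    for result in results:
        print(result.outputs[0].text)

asyncio.run(main())
```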
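The Mistral item above shows the CLI path; the same family of models can also be used through the new Python API. A minimal streaming sketch, assuming `mistralai/Mistral-7B-Instruct-v0.1` purely as an example checkpoint:

```python
import asyncio

import openllm

# 'mistralai/Mistral-7B-Instruct-v0.1' is an illustrative checkpoint, not one named in these notes.
llm = openllm.LLM("mistralai/Mistral-7B-Instruct-v0.1")

async def main():
    # Stream tokens directly, without wrapping the LLM in a BentoML service.
    async for generation in llm.generate_iterator("Write one sentence about the Mistral wind."):
        # Mirrors the SSE example above: each yielded GenerationOutput carries the latest text chunk.
        print(generation.outputs[0].text)

asyncio.run(main())
```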
Installation
pip install openllm==0.4.1
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.1
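To confirm that the pinned version is the one actually installed, a quick check using only the standard library:

```python
from importlib.metadata import version

# Should print 0.4.1 after a successful install or upgrade.
print(version("openllm"))
```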
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.1 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.1
Find more information about this release in CHANGELOG.md
What's Changed
- chore(runner): yield the outputs directly by @aarnphm in https://github.com/bentoml/OpenLLM/pull/573
- chore(openai): simplify client examples by @aarnphm in https://github.com/bentoml/OpenLLM/pull/574
- fix(examples): correct dependencies in requirements.txt [skip ci] by @aarnphm in https://github.com/bentoml/OpenLLM/pull/575
- refactor: cleanup typing to expose correct API by @aarnphm in https://github.com/bentoml/OpenLLM/pull/576
- fix(stubs): update initialisation types by @aarnphm in https://github.com/bentoml/OpenLLM/pull/577
- refactor(strategies): move logics into openllm-python by @aarnphm in https://github.com/bentoml/OpenLLM/pull/578
- chore(service): cleanup API by @aarnphm in https://github.com/bentoml/OpenLLM/pull/579
- infra: disable npm updates and correct python packages by @aarnphm in https://github.com/bentoml/OpenLLM/pull/580
- chore(deps): bump aquasecurity/trivy-action from 0.13.1 to 0.14.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/583
- chore(deps): bump taiki-e/install-action from 2.21.7 to 2.21.8 by @dependabot in https://github.com/bentoml/OpenLLM/pull/581
- chore(deps): bump sigstore/cosign-installer from 3.1.2 to 3.2.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/582
- fix: device imports using strategies by @aarnphm in https://github.com/bentoml/OpenLLM/pull/584
- fix(gptq): update config fields by @aarnphm in https://github.com/bentoml/OpenLLM/pull/585
- fix: unbound variable for completion client by @aarnphm in https://github.com/bentoml/OpenLLM/pull/587
- fix(awq): correct awq detection for support by @aarnphm in https://github.com/bentoml/OpenLLM/pull/586
- feat(vllm): squeezellm by @aarnphm in https://github.com/bentoml/OpenLLM/pull/588
- docs: update quantization notes by @aarnphm in https://github.com/bentoml/OpenLLM/pull/589
- fix(cli): append model-id instruction to build by @aarnphm in https://github.com/bentoml/OpenLLM/pull/590
- container: update tracing dependencies by @aarnphm in https://github.com/bentoml/OpenLLM/pull/591
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.0...v0.4.1
1、 openllm-0.4.1-aarch64-apple-darwin.tar.gz 1.9MB
2、 openllm-0.4.1-aarch64-unknown-linux-gnu.tar.gz 1.96MB
3、 openllm-0.4.1-i686-unknown-linux-gnu.tar.gz 2.02MB
4、 openllm-0.4.1-powerpc64le-unknown-linux-gnu.tar.gz 2.22MB
5、 openllm-0.4.1-py3-none-any.whl 114.06KB
6、 openllm-0.4.1-x86_64-apple-darwin.tar.gz 1.98MB
7、 openllm-0.4.1-x86_64-unknown-linux-gnu.tar.gz 2.16MB
8、 openllm-0.4.1-x86_64-unknown-linux-musl.tar.gz 2.19MB
9、 openllm-0.4.1.tar.gz 107.24KB
10、 openllm-build-base-container-0.4.1-aarch64-apple-darwin.tar.gz 1.89MB
11、 openllm-build-base-container-0.4.1-aarch64-unknown-linux-gnu.tar.gz 1.96MB
12、 openllm-build-base-container-0.4.1-i686-unknown-linux-gnu.tar.gz 2.02MB
13、 openllm-build-base-container-0.4.1-powerpc64le-unknown-linux-gnu.tar.gz 2.22MB
14、 openllm-build-base-container-0.4.1-x86_64-apple-darwin.tar.gz 1.98MB
15、 openllm-build-base-container-0.4.1-x86_64-unknown-linux-gnu.tar.gz 2.16MB
16、 openllm-build-base-container-0.4.1-x86_64-unknown-linux-musl.tar.gz 2.19MB
17、 openllm-dive-bentos-0.4.1-aarch64-apple-darwin.tar.gz 1.9MB
18、 openllm-dive-bentos-0.4.1-aarch64-unknown-linux-gnu.tar.gz 1.96MB
19、 openllm-dive-bentos-0.4.1-i686-unknown-linux-gnu.tar.gz 2.02MB
20、 openllm-dive-bentos-0.4.1-powerpc64le-unknown-linux-gnu.tar.gz 2.22MB
21、 openllm-dive-bentos-0.4.1-x86_64-apple-darwin.tar.gz 1.98MB
22、 openllm-dive-bentos-0.4.1-x86_64-unknown-linux-gnu.tar.gz 2.16MB
23、 openllm-dive-bentos-0.4.1-x86_64-unknown-linux-musl.tar.gz 2.19MB
24、 openllm-get-containerfile-0.4.1-aarch64-apple-darwin.tar.gz 1.89MB
25、 openllm-get-containerfile-0.4.1-aarch64-unknown-linux-gnu.tar.gz 1.96MB
26、 openllm-get-containerfile-0.4.1-i686-unknown-linux-gnu.tar.gz 2.02MB
27、 openllm-get-containerfile-0.4.1-powerpc64le-unknown-linux-gnu.tar.gz 2.22MB
28、 openllm-get-containerfile-0.4.1-x86_64-apple-darwin.tar.gz 1.98MB
29、 openllm-get-containerfile-0.4.1-x86_64-unknown-linux-gnu.tar.gz 2.16MB
30、 openllm-get-containerfile-0.4.1-x86_64-unknown-linux-musl.tar.gz 2.19MB
31、 openllm-get-prompt-0.4.1-aarch64-apple-darwin.tar.gz 1.9MB
32、 openllm-get-prompt-0.4.1-aarch64-unknown-linux-gnu.tar.gz 1.96MB
33、 openllm-get-prompt-0.4.1-i686-unknown-linux-gnu.tar.gz 2.02MB
34、 openllm-get-prompt-0.4.1-powerpc64le-unknown-linux-gnu.tar.gz 2.22MB
35、 openllm-get-prompt-0.4.1-x86_64-apple-darwin.tar.gz 1.98MB
36、 openllm-get-prompt-0.4.1-x86_64-unknown-linux-gnu.tar.gz 2.16MB
37、 openllm-get-prompt-0.4.1-x86_64-unknown-linux-musl.tar.gz 2.19MB
38、 openllm-list-bentos-0.4.1-aarch64-apple-darwin.tar.gz 1.9MB
39、 openllm-list-bentos-0.4.1-aarch64-unknown-linux-gnu.tar.gz 1.96MB
40、 openllm-list-bentos-0.4.1-i686-unknown-linux-gnu.tar.gz 2.02MB
41、 openllm-list-bentos-0.4.1-powerpc64le-unknown-linux-gnu.tar.gz 2.22MB
42、 openllm-list-bentos-0.4.1-x86_64-apple-darwin.tar.gz 1.98MB
43、 openllm-list-bentos-0.4.1-x86_64-unknown-linux-gnu.tar.gz 2.16MB
44、 openllm-list-bentos-0.4.1-x86_64-unknown-linux-musl.tar.gz 2.19MB
45、 openllm-list-models-0.4.1-aarch64-apple-darwin.tar.gz 1.9MB
46、 openllm-list-models-0.4.1-aarch64-unknown-linux-gnu.tar.gz 1.96MB
47、 openllm-list-models-0.4.1-i686-unknown-linux-gnu.tar.gz 2.02MB
48、 openllm-list-models-0.4.1-powerpc64le-unknown-linux-gnu.tar.gz 2.22MB
49、 openllm-list-models-0.4.1-x86_64-apple-darwin.tar.gz 1.98MB
50、 openllm-list-models-0.4.1-x86_64-unknown-linux-gnu.tar.gz 2.16MB
51、 openllm-list-models-0.4.1-x86_64-unknown-linux-musl.tar.gz 2.19MB
52、 openllm-playground-0.4.1-aarch64-apple-darwin.tar.gz 1.9MB
53、 openllm-playground-0.4.1-aarch64-unknown-linux-gnu.tar.gz 1.96MB
54、 openllm-playground-0.4.1-i686-unknown-linux-gnu.tar.gz 2.02MB
55、 openllm-playground-0.4.1-powerpc64le-unknown-linux-gnu.tar.gz 2.22MB
56、 openllm-playground-0.4.1-x86_64-apple-darwin.tar.gz 1.98MB
57、 openllm-playground-0.4.1-x86_64-unknown-linux-gnu.tar.gz 2.16MB
58、 openllm-playground-0.4.1-x86_64-unknown-linux-musl.tar.gz 2.19MB
59、 openllm_client-0.4.1-py3-none-any.whl 30.53KB
60、 openllm_client-0.4.1.tar.gz 28.8KB
61、 openllm_core-0.4.1-py3-none-any.whl 84.54KB
62、 openllm_core-0.4.1.tar.gz 66.3KB