v0.4.1
Release date: 2023-11-08 21:35:23
Latest bentoml/OpenLLM release: v0.6.13 (2024-10-17 21:17:35)
OpenLLM version 0.4.0 introduces several enhancements:
- Unified API and continuous batching support: 0.4.0 brings a simplified API for OpenLLM. Users can now run LLMs with two new APIs:

  - `await llm.generate(prompt, stop, **kwargs)`: one-shot generation for any given prompt.

    ```python
    import asyncio

    import openllm

    llm = openllm.LLM("HuggingFaceH4/zephyr-7b-beta")

    async def infer(prompt, **kwargs):
        return await llm.generate(prompt, **kwargs)

    asyncio.run(infer("Time is a definition of"))
    ```

  - `await llm.generate_iterator(prompt, stop, **kwargs)`: streaming generation that returns tokens as they become ready.

    ```python
    import bentoml
    import openllm

    llm = openllm.LLM("HuggingFaceH4/zephyr-7b-beta")
    svc = bentoml.Service(name='zephyr-instruct', runners=[llm.runner])

    @svc.api(input=bentoml.io.Text(), output=bentoml.io.Text(media_type='text/event-stream'))
    async def prompt(input_text: str) -> str:
        async for generation in llm.generate_iterator(input_text):
            yield f"data: {generation.outputs[0].text}\n\n"
    ```

  Under an async context, calls to both `llm.generate` and `llm.generate_iterator` now support continuous batching for optimal throughput (see the concurrency sketch after this list).
- The backend is now automatically inferred based on the presence of `vllm` in the environment. However, if you prefer to manually specify the backend, you can do so with the `backend` argument: `openllm.LLM("HuggingFaceH4/zephyr-7b-beta", backend='pt')`

- Quantization can also be passed directly to this new `LLM` API: `openllm.LLM("TheBloke/Mistral-7B-Instruct-v0.1-AWQ", quantize='awq')`
- Mistral model: OpenLLM now supports Mistral. To start a Mistral server, simply run `openllm start mistral` (see the Mistral sketch after this list).
- AWQ and SqueezeLLM quantization: AWQ and SqueezeLLM are now supported with the vLLM backend. Simply pass `--quantize awq` or `--quantize squeezellm` to `openllm start` to use AWQ or SqueezeLLM quantization. IMPORTANT: to use AWQ, the model weights must already be quantized with AWQ. Look for the AWQ variant of the model you want to use on the HuggingFace Hub. Currently, only AWQ with vLLM is fully tested and supported.
- General bug fixes: fixed a bug related to tag generation. Standalone Bentos that use this new API should work as expected if the model already exists in the model store. For consistency, make sure to run `openllm prune -y --include-bentos`.
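To make the continuous batching note above concrete, here is a minimal sketch of dispatching several requests concurrently from an async context so the backend can batch them. It reuses the `llm` object from the examples above; the prompts are purely illustrative:

```python
import asyncio

import openllm

llm = openllm.LLM("HuggingFaceH4/zephyr-7b-beta")

async def main():
    prompts = [
        "Time is a definition of",
        "Explain continuous batching in one sentence.",
        "Write a haiku about GPUs.",
    ]
    # Firing the requests concurrently lets the backend interleave them into
    # a single batched decoding loop instead of serving them one by one.
    results = await asyncio.gather(*(llm.generate(p) for p in prompts))
    for result in results:
        print(result.outputs[0].text)

asyncio.run(main())
```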
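The Mistral item above shows the CLI path; the same family of models can also be used through the new Python API. A minimal streaming sketch, assuming `mistralai/Mistral-7B-Instruct-v0.1` purely as an example checkpoint:

```python
import asyncio

import openllm

# 'mistralai/Mistral-7B-Instruct-v0.1' is an illustrative checkpoint, not one named in these notes.
llm = openllm.LLM("mistralai/Mistral-7B-Instruct-v0.1")

async def main():
    # Stream tokens directly, without wrapping the LLM in a BentoML service.
    async for generation in llm.generate_iterator("Write one sentence about the Mistral wind."):
        # Mirrors the SSE example above: each yielded GenerationOutput carries the latest text chunk.
        print(generation.outputs[0].text)

asyncio.run(main())
```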
Installation
pip install openllm==0.4.1
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.4.1
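To confirm that the pinned version is the one actually installed, a quick check using only the standard library:

```python
from importlib.metadata import version

# Should print 0.4.1 after a successful install or upgrade.
print(version("openllm"))
```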
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.1 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.1
Find more information about this release in CHANGELOG.md
What's Changed
- chore(runner): yield the outputs directly by @aarnphm in https://github.com/bentoml/OpenLLM/pull/573
- chore(openai): simplify client examples by @aarnphm in https://github.com/bentoml/OpenLLM/pull/574
- fix(examples): correct dependencies in requirements.txt [skip ci] by @aarnphm in https://github.com/bentoml/OpenLLM/pull/575
- refactor: cleanup typing to expose correct API by @aarnphm in https://github.com/bentoml/OpenLLM/pull/576
- fix(stubs): update initialisation types by @aarnphm in https://github.com/bentoml/OpenLLM/pull/577
- refactor(strategies): move logics into openllm-python by @aarnphm in https://github.com/bentoml/OpenLLM/pull/578
- chore(service): cleanup API by @aarnphm in https://github.com/bentoml/OpenLLM/pull/579
- infra: disable npm updates and correct python packages by @aarnphm in https://github.com/bentoml/OpenLLM/pull/580
- chore(deps): bump aquasecurity/trivy-action from 0.13.1 to 0.14.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/583
- chore(deps): bump taiki-e/install-action from 2.21.7 to 2.21.8 by @dependabot in https://github.com/bentoml/OpenLLM/pull/581
- chore(deps): bump sigstore/cosign-installer from 3.1.2 to 3.2.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/582
- fix: device imports using strategies by @aarnphm in https://github.com/bentoml/OpenLLM/pull/584
- fix(gptq): update config fields by @aarnphm in https://github.com/bentoml/OpenLLM/pull/585
- fix: unbound variable for completion client by @aarnphm in https://github.com/bentoml/OpenLLM/pull/587
- fix(awq): correct awq detection for support by @aarnphm in https://github.com/bentoml/OpenLLM/pull/586
- feat(vllm): squeezellm by @aarnphm in https://github.com/bentoml/OpenLLM/pull/588
- docs: update quantization notes by @aarnphm in https://github.com/bentoml/OpenLLM/pull/589
- fix(cli): append model-id instruction to build by @aarnphm in https://github.com/bentoml/OpenLLM/pull/590
- container: update tracing dependencies by @aarnphm in https://github.com/bentoml/OpenLLM/pull/591
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.0...v0.4.1
1、 openllm-0.4.1-aarch64-apple-darwin.tar.gz 1.9MB
2、 openllm-0.4.1-aarch64-unknown-linux-gnu.tar.gz 1.96MB
3、 openllm-0.4.1-i686-unknown-linux-gnu.tar.gz 2.02MB
4、 openllm-0.4.1-powerpc64le-unknown-linux-gnu.tar.gz 2.22MB
5、 openllm-0.4.1-py3-none-any.whl 114.06KB
6、 openllm-0.4.1-x86_64-apple-darwin.tar.gz 1.98MB
7、 openllm-0.4.1-x86_64-unknown-linux-gnu.tar.gz 2.16MB
8、 openllm-0.4.1-x86_64-unknown-linux-musl.tar.gz 2.19MB
9、 openllm-0.4.1.tar.gz 107.24KB
10、 openllm-build-base-container-0.4.1-aarch64-apple-darwin.tar.gz 1.89MB
11、 openllm-build-base-container-0.4.1-aarch64-unknown-linux-gnu.tar.gz 1.96MB
12、 openllm-build-base-container-0.4.1-i686-unknown-linux-gnu.tar.gz 2.02MB
13、 openllm-build-base-container-0.4.1-powerpc64le-unknown-linux-gnu.tar.gz 2.22MB
14、 openllm-build-base-container-0.4.1-x86_64-apple-darwin.tar.gz 1.98MB
15、 openllm-build-base-container-0.4.1-x86_64-unknown-linux-gnu.tar.gz 2.16MB
16、 openllm-build-base-container-0.4.1-x86_64-unknown-linux-musl.tar.gz 2.19MB
17、 openllm-dive-bentos-0.4.1-aarch64-apple-darwin.tar.gz 1.9MB
18、 openllm-dive-bentos-0.4.1-aarch64-unknown-linux-gnu.tar.gz 1.96MB
19、 openllm-dive-bentos-0.4.1-i686-unknown-linux-gnu.tar.gz 2.02MB
20、 openllm-dive-bentos-0.4.1-powerpc64le-unknown-linux-gnu.tar.gz 2.22MB
21、 openllm-dive-bentos-0.4.1-x86_64-apple-darwin.tar.gz 1.98MB
22、 openllm-dive-bentos-0.4.1-x86_64-unknown-linux-gnu.tar.gz 2.16MB
23、 openllm-dive-bentos-0.4.1-x86_64-unknown-linux-musl.tar.gz 2.19MB
24、 openllm-get-containerfile-0.4.1-aarch64-apple-darwin.tar.gz 1.89MB
25、 openllm-get-containerfile-0.4.1-aarch64-unknown-linux-gnu.tar.gz 1.96MB
26、 openllm-get-containerfile-0.4.1-i686-unknown-linux-gnu.tar.gz 2.02MB
27、 openllm-get-containerfile-0.4.1-powerpc64le-unknown-linux-gnu.tar.gz 2.22MB
28、 openllm-get-containerfile-0.4.1-x86_64-apple-darwin.tar.gz 1.98MB
29、 openllm-get-containerfile-0.4.1-x86_64-unknown-linux-gnu.tar.gz 2.16MB
30、 openllm-get-containerfile-0.4.1-x86_64-unknown-linux-musl.tar.gz 2.19MB
31、 openllm-get-prompt-0.4.1-aarch64-apple-darwin.tar.gz 1.9MB
32、 openllm-get-prompt-0.4.1-aarch64-unknown-linux-gnu.tar.gz 1.96MB
33、 openllm-get-prompt-0.4.1-i686-unknown-linux-gnu.tar.gz 2.02MB
34、 openllm-get-prompt-0.4.1-powerpc64le-unknown-linux-gnu.tar.gz 2.22MB
35、 openllm-get-prompt-0.4.1-x86_64-apple-darwin.tar.gz 1.98MB
36、 openllm-get-prompt-0.4.1-x86_64-unknown-linux-gnu.tar.gz 2.16MB
37、 openllm-get-prompt-0.4.1-x86_64-unknown-linux-musl.tar.gz 2.19MB
38、 openllm-list-bentos-0.4.1-aarch64-apple-darwin.tar.gz 1.9MB
39、 openllm-list-bentos-0.4.1-aarch64-unknown-linux-gnu.tar.gz 1.96MB
40、 openllm-list-bentos-0.4.1-i686-unknown-linux-gnu.tar.gz 2.02MB
41、 openllm-list-bentos-0.4.1-powerpc64le-unknown-linux-gnu.tar.gz 2.22MB
42、 openllm-list-bentos-0.4.1-x86_64-apple-darwin.tar.gz 1.98MB
43、 openllm-list-bentos-0.4.1-x86_64-unknown-linux-gnu.tar.gz 2.16MB
44、 openllm-list-bentos-0.4.1-x86_64-unknown-linux-musl.tar.gz 2.19MB
45、 openllm-list-models-0.4.1-aarch64-apple-darwin.tar.gz 1.9MB
46、 openllm-list-models-0.4.1-aarch64-unknown-linux-gnu.tar.gz 1.96MB
47、 openllm-list-models-0.4.1-i686-unknown-linux-gnu.tar.gz 2.02MB
48、 openllm-list-models-0.4.1-powerpc64le-unknown-linux-gnu.tar.gz 2.22MB
49、 openllm-list-models-0.4.1-x86_64-apple-darwin.tar.gz 1.98MB
50、 openllm-list-models-0.4.1-x86_64-unknown-linux-gnu.tar.gz 2.16MB
51、 openllm-list-models-0.4.1-x86_64-unknown-linux-musl.tar.gz 2.19MB
52、 openllm-playground-0.4.1-aarch64-apple-darwin.tar.gz 1.9MB
53、 openllm-playground-0.4.1-aarch64-unknown-linux-gnu.tar.gz 1.96MB
54、 openllm-playground-0.4.1-i686-unknown-linux-gnu.tar.gz 2.02MB
55、 openllm-playground-0.4.1-powerpc64le-unknown-linux-gnu.tar.gz 2.22MB
56、 openllm-playground-0.4.1-x86_64-apple-darwin.tar.gz 1.98MB
57、 openllm-playground-0.4.1-x86_64-unknown-linux-gnu.tar.gz 2.16MB
58、 openllm-playground-0.4.1-x86_64-unknown-linux-musl.tar.gz 2.19MB
59、 openllm_client-0.4.1-py3-none-any.whl 30.53KB
60、 openllm_client-0.4.1.tar.gz 28.8KB
61、 openllm_core-0.4.1-py3-none-any.whl 84.54KB
62、 openllm_core-0.4.1.tar.gz 66.3KB