v0.2.5
Release date: 2023-12-14 15:58:13
Latest release of vllm-project/vllm: v0.4.1 (2024-04-24 10:28:08)
Major changes
- Optimize Mixtral performance with expert parallelism (thanks to @Yard1)
- [BugFix] Fix input positions for long context with sliding window
What's Changed
- Update Dockerfile to support Mixtral by @simon-mo in https://github.com/vllm-project/vllm/pull/2027
- Remove python 3.10 requirement by @WoosukKwon in https://github.com/vllm-project/vllm/pull/2040
- [CI/CD] Upgrade PyTorch version to v2.1.1 by @WoosukKwon in https://github.com/vllm-project/vllm/pull/2045
- Upgrade transformers version to 4.36.0 by @WoosukKwon in https://github.com/vllm-project/vllm/pull/2046
- Remove einops from dependencies by @WoosukKwon in https://github.com/vllm-project/vllm/pull/2049
- gqa added to mpt attn by @megha95 in https://github.com/vllm-project/vllm/pull/1938
- Update Dockerfile to build Megablocks by @simon-mo in https://github.com/vllm-project/vllm/pull/2042
- Fix peak memory profiling by @WoosukKwon in https://github.com/vllm-project/vllm/pull/2031
- Implement lazy model loader by @WoosukKwon in https://github.com/vllm-project/vllm/pull/2044
- [ROCm] Upgrade xformers version dependency for ROCm; update documentations by @tjtanaa in https://github.com/vllm-project/vllm/pull/2079
- Update installation instruction for CUDA 11.8 by @WoosukKwon in https://github.com/vllm-project/vllm/pull/2086
- [Docs] Add notes on ROCm-supported models by @WoosukKwon in https://github.com/vllm-project/vllm/pull/2087
- [BugFix] Fix input positions for long context with sliding window by @WoosukKwon in https://github.com/vllm-project/vllm/pull/2088
- Mixtral expert parallelism by @Yard1 in https://github.com/vllm-project/vllm/pull/2090
- Bump up to v0.2.5 by @WoosukKwon in https://github.com/vllm-project/vllm/pull/2095
Full Changelog: https://github.com/vllm-project/vllm/compare/v0.2.4...v0.2.5
1. vllm-0.2.5+cu118-cp310-cp310-manylinux1_x86_64.whl 9.36 MB
2. vllm-0.2.5+cu118-cp311-cp311-manylinux1_x86_64.whl 9.36 MB
3. vllm-0.2.5+cu118-cp38-cp38-manylinux1_x86_64.whl 9.36 MB
4. vllm-0.2.5+cu118-cp39-cp39-manylinux1_x86_64.whl 9.37 MB
5. vllm-0.2.5-cp310-cp310-manylinux1_x86_64.whl 9.37 MB
6. vllm-0.2.5-cp311-cp311-manylinux1_x86_64.whl 9.39 MB
7. vllm-0.2.5-cp38-cp38-manylinux1_x86_64.whl 9.38 MB
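
The wheel filenames above follow the standard wheel naming scheme (`name-version-python_tag-abi_tag-platform_tag.whl`), and the `+cu118` suffix is a local version label that here appears to mark the CUDA 11.8 builds. As a minimal sketch of how to choose among these assets, the following splits a filename into its parts and selects the one matching a given interpreter tag and CUDA label. The helper names `parse_wheel_filename` and `pick_wheel` are hypothetical and are not part of vLLM or pip.

```python
import sys


def parse_wheel_filename(filename):
    """Split a wheel filename into its components (hypothetical helper)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, py_tag, abi_tag, platform_tag = parts
    elif len(parts) == 6:
        # An optional build tag may sit between the version and the Python tag.
        name, version, _build, py_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected wheel filename: {filename}")
    # A local version label such as "cu118" follows a "+" in the version.
    public, _, local = version.partition("+")
    return {
        "name": name,
        "version": public,
        "local": local or None,
        "python": py_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def pick_wheel(filenames, cuda_label=None, python_tag=None):
    """Return the first wheel matching the Python tag and local label, or None."""
    if python_tag is None:
        # Default to the running interpreter, e.g. "cp310" for CPython 3.10.
        python_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
    for filename in filenames:
        info = parse_wheel_filename(filename)
        if info["python"] == python_tag and info["local"] == cuda_label:
            return filename
    return None


wheels = [
    "vllm-0.2.5+cu118-cp310-cp310-manylinux1_x86_64.whl",
    "vllm-0.2.5+cu118-cp39-cp39-manylinux1_x86_64.whl",
    "vllm-0.2.5-cp310-cp310-manylinux1_x86_64.whl",
]
print(pick_wheel(wheels, cuda_label="cu118", python_tag="cp39"))
```

Passing `cuda_label=None` selects a wheel without a local version label. Which CUDA version those unlabeled wheels target is not stated in this list, so check the installation instructions for this release before relying on them.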