v0.5.3
Release date: 2024-08-07 11:38:56
Latest InternLM/lmdeploy release: v0.6.0a0 (2024-08-26 17:12:19)
What's Changed
🚀 Features
- PyTorch Engine AWQ support by @grimoire in https://github.com/InternLM/lmdeploy/pull/1913
- Phi3 awq by @grimoire in https://github.com/InternLM/lmdeploy/pull/1984
- Fix chunked prefill by @lzhangzz in https://github.com/InternLM/lmdeploy/pull/2201
- support VLMs with Qwen as the language model by @irexyc in https://github.com/InternLM/lmdeploy/pull/2207
💥 Improvements
- Support specifying a prefix of assistant response by @AllentDan in https://github.com/InternLM/lmdeploy/pull/2172
- Strict check for `name_map` in `InternLM2Chat7B` by @SamuraiBUPT in https://github.com/InternLM/lmdeploy/pull/2156
- Check errors for attention kernels by @lzhangzz in https://github.com/InternLM/lmdeploy/pull/2206
- update base image to support cuda12.4 in dockerfile by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/2182
- Stop synchronizing for `length_criterion` by @lzhangzz in https://github.com/InternLM/lmdeploy/pull/2202
- adapt MiniCPM-Llama3-V-2_5 new code by @irexyc in https://github.com/InternLM/lmdeploy/pull/2139
- Remove duplicate code by @cmpute in https://github.com/InternLM/lmdeploy/pull/2133
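The assistant-prefix improvement (#2172) lets a request seed the beginning of the model's reply so generation continues from it. The client-side shape below is an assumption for illustration, not the exact lmdeploy API: an OpenAI-style message list whose final entry is a partial assistant turn.

```python
# Hypothetical request payload for continuing a seeded assistant reply.
# The field names follow the common OpenAI-style chat schema; whether
# lmdeploy reads the prefix from a trailing assistant message is an
# assumption made here for illustration.
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Name three prime numbers."},
    # Partial assistant turn: generation should continue from this prefix.
    {"role": "assistant", "content": "Sure: 2, "},
]

prefix = messages[-1]["content"]  # the text the reply is forced to start with
```

The generated text is then appended to `prefix` rather than starting a fresh assistant turn.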
🐞 Bug fixes
- [Hotfix] missing parentheses when calculating the coef of llama3 rope by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/2157
- support logit softcap by @grimoire in https://github.com/InternLM/lmdeploy/pull/2158
- Fix gmem to smem WAW conflict in awq gemm kernel by @foreverrookie in https://github.com/InternLM/lmdeploy/pull/2111
- Fix gradio serve using a wrong chat template by @AllentDan in https://github.com/InternLM/lmdeploy/pull/2131
- fix runtime error when using dynamic scale rotary embed for InternLM2… by @CyCle1024 in https://github.com/InternLM/lmdeploy/pull/2212
- Add peer-access-enabled allocator by @lzhangzz in https://github.com/InternLM/lmdeploy/pull/2218
- Fix typos in profile_generation.py by @jiajie-yang in https://github.com/InternLM/lmdeploy/pull/2233
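The hotfix in #2157 concerns operator grouping in the Llama 3 RoPE scaling coefficient. As a rough illustration (this sketches Meta's published Llama 3.1 frequency-scaling rule, not the lmdeploy kernel itself), note how the smoothing coefficient depends on correct parenthesization:

```python
import math

def scale_llama3_freq(freq: float,
                      scale_factor: float = 8.0,
                      low_freq_factor: float = 1.0,
                      high_freq_factor: float = 4.0,
                      old_context_len: int = 8192) -> float:
    """Llama 3.1-style RoPE frequency scaling (illustrative sketch)."""
    wavelen = 2 * math.pi / freq
    low_freq_wavelen = old_context_len / low_freq_factor
    high_freq_wavelen = old_context_len / high_freq_factor
    if wavelen < high_freq_wavelen:   # high-frequency band: left unchanged
        return freq
    if wavelen > low_freq_wavelen:    # low-frequency band: fully scaled
        return freq / scale_factor
    # Interpolation band: the parentheses around the numerator matter --
    # (x - low) / (high - low), not x - low / (high - low).
    smooth = (old_context_len / wavelen - low_freq_factor) / (
        high_freq_factor - low_freq_factor)
    return (1 - smooth) * freq / scale_factor + smooth * freq
```

Dropping the numerator's parentheses silently changes `smooth`, which is exactly the class of bug a missing-parentheses hotfix addresses.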
📚 Documentations
- docs: fix Qwen typo by @ArtificialZeng in https://github.com/InternLM/lmdeploy/pull/2136
- docs: fix a wrong expression by @ArtificialZeng in https://github.com/InternLM/lmdeploy/pull/2165
- clarify the model type (LLM or MLLM) in the supported model matrix by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/2209
- docs: add Japanese README by @eltociear in https://github.com/InternLM/lmdeploy/pull/2237
🌐 Other
- bump version to 0.5.2.post1 by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/2159
- update news about cooperation with modelscope/swift by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/2200
- bump version to v0.5.3 by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/2242
New Contributors
- @ArtificialZeng made their first contribution in https://github.com/InternLM/lmdeploy/pull/2136
- @foreverrookie made their first contribution in https://github.com/InternLM/lmdeploy/pull/2111
- @SamuraiBUPT made their first contribution in https://github.com/InternLM/lmdeploy/pull/2156
- @CyCle1024 made their first contribution in https://github.com/InternLM/lmdeploy/pull/2212
- @jiajie-yang made their first contribution in https://github.com/InternLM/lmdeploy/pull/2233
- @cmpute made their first contribution in https://github.com/InternLM/lmdeploy/pull/2133
Full Changelog: https://github.com/InternLM/lmdeploy/compare/v0.5.2...v0.5.3
Assets
1. lmdeploy-0.5.3+cu118-cp310-cp310-manylinux2014_x86_64.whl (68.16 MB)
2. lmdeploy-0.5.3+cu118-cp310-cp310-win_amd64.whl (45.21 MB)
3. lmdeploy-0.5.3+cu118-cp311-cp311-manylinux2014_x86_64.whl (68.17 MB)
4. lmdeploy-0.5.3+cu118-cp311-cp311-win_amd64.whl (45.21 MB)
5. lmdeploy-0.5.3+cu118-cp312-cp312-manylinux2014_x86_64.whl (68.19 MB)
6. lmdeploy-0.5.3+cu118-cp312-cp312-win_amd64.whl (45.21 MB)
7. lmdeploy-0.5.3+cu118-cp38-cp38-manylinux2014_x86_64.whl (68.16 MB)
8. lmdeploy-0.5.3+cu118-cp38-cp38-win_amd64.whl (45.21 MB)
9. lmdeploy-0.5.3+cu118-cp39-cp39-manylinux2014_x86_64.whl (68.15 MB)
10. lmdeploy-0.5.3+cu118-cp39-cp39-win_amd64.whl (45.21 MB)
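All wheels above target CUDA 11.8 and follow the standard PEP 427 naming scheme (`{dist}-{version}+{local}-{python tag}-{abi tag}-{platform tag}.whl`). A small sketch of how the filename for a given CPython version and platform is assembled (the helper name is hypothetical, introduced only to illustrate the pattern):

```python
def lmdeploy_wheel_name(py: str, platform: str = "manylinux2014_x86_64") -> str:
    """Build the v0.5.3 cu118 wheel filename for a CPython tag like '310'.

    The python tag and ABI tag coincide here (cpXY-cpXY), as in the
    asset list above.
    """
    tag = f"cp{py}"
    return f"lmdeploy-0.5.3+cu118-{tag}-{tag}-{platform}.whl"

# Example: the Linux wheel for CPython 3.10
print(lmdeploy_wheel_name("310"))
# lmdeploy-0.5.3+cu118-cp310-cp310-manylinux2014_x86_64.whl
```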