v0.5.1
Release date: 2024-07-16 18:05:55
What's Changed
🚀 Features
- Support phi3-vision by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/1845
- Support internvl2 chat template by @AllentDan in https://github.com/InternLM/lmdeploy/pull/1911
- support gemma2 in pytorch engine by @grimoire in https://github.com/InternLM/lmdeploy/pull/1924
- Add tools to api_server for InternLM2 model by @AllentDan in https://github.com/InternLM/lmdeploy/pull/1763
- support internvl2-1b by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/1983
- feat: support llama2 and internlm2 on 910B by @yao-fengchen in https://github.com/InternLM/lmdeploy/pull/2011
- Support glm 4v by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/1947
- support internlm-xcomposer2d5-7b by @irexyc in https://github.com/InternLM/lmdeploy/pull/1932
- add chat template for codegeex4 by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/2013
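
Several of the entries above add new vision-language models (phi3-vision, InternVL2, GLM-4V, internlm-xcomposer2d5-7b). A minimal sketch of querying one of them through the high-level `pipeline` API; the model ID and image URL are illustrative:

```python
# Minimal VLM inference sketch; model ID and image URL are illustrative.
from lmdeploy import pipeline
from lmdeploy.vl import load_image

pipe = pipeline('OpenGVLab/InternVL2-1B')  # one of the newly supported VLMs
image = load_image('https://example.com/sample.jpg')  # placeholder URL
response = pipe(('describe this image', image))
print(response.text)
```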
💥 Improvements
- misc: rm unnecessary files by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1875
- drop stop words by @grimoire in https://github.com/InternLM/lmdeploy/pull/1823
- Add usage in stream response by @fbzhong in https://github.com/InternLM/lmdeploy/pull/1876
- Optimize sampling on pytorch engine. by @grimoire in https://github.com/InternLM/lmdeploy/pull/1853
- Remove deprecated chat cli and vl examples by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/1899
- vision model uses tp number of GPUs by @irexyc in https://github.com/InternLM/lmdeploy/pull/1854
- misc: add default api_server_url for api_client by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1922
- misc: add transformers version check for TurboMind Tokenizer by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1917
- fix: append _stats when size > 0 by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1809
- refactor: update awq linear and rm legacy by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1940
- feat: add gpu topo for check_env by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1944
- fix transformers version check for InternVL2 by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1952
- Upgrade gradio by @AllentDan in https://github.com/InternLM/lmdeploy/pull/1930
- refactor sampling layer setup by @irexyc in https://github.com/InternLM/lmdeploy/pull/1912
- Add exception handler to image encoder by @irexyc in https://github.com/InternLM/lmdeploy/pull/2010
- Avoid the same session id for openai endpoint by @AllentDan in https://github.com/InternLM/lmdeploy/pull/1995
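
With #1876, streamed responses from the OpenAI-compatible api_server now carry token usage. A hedged sketch of consuming it with the `openai` client; the server address, model name, and which chunk carries the usage payload are assumptions:

```python
# Reading token usage from a streamed response (sketch; endpoint and model assumed).
from openai import OpenAI

client = OpenAI(base_url='http://0.0.0.0:23333/v1', api_key='none')
stream = client.chat.completions.create(
    model='internlm2',
    messages=[{'role': 'user', 'content': 'Hello'}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end='', flush=True)
    if getattr(chunk, 'usage', None):  # usage may ride on the final chunk
        print('\n', chunk.usage)
```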
🐞 Bug fixes
- Fix error link reference by @zihaomu in https://github.com/InternLM/lmdeploy/pull/1881
- Fix internlm-xcomposer2-vl awq search scale by @AllentDan in https://github.com/InternLM/lmdeploy/pull/1890
- fix SamplingDecodeTest and SamplingDecodeTest2 unittest failure by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1874
- Fix smem size for fused split-kv reduction by @lzhangzz in https://github.com/InternLM/lmdeploy/pull/1909
- fix llama3 chat template by @AllentDan in https://github.com/InternLM/lmdeploy/pull/1956
- fix: set PYTHONIOENCODING to UTF-8 before starting tritonserver by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1971
- Fix internvl2-40b model export by @irexyc in https://github.com/InternLM/lmdeploy/pull/1979
- fix logprobs by @irexyc in https://github.com/InternLM/lmdeploy/pull/1968
- fix unexpected argument error when deploying "cogvlm-chat-hf" by @AllentDan in https://github.com/InternLM/lmdeploy/pull/1982
- fix mixtral and mistral cache_position by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1941
- Fix the session_len assignment logic by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/2007
- Fix logprobs openai api by @irexyc in https://github.com/InternLM/lmdeploy/pull/1985
- Fix internvl2-40b awq inference by @AllentDan in https://github.com/InternLM/lmdeploy/pull/2023
- Fix side effect of #1995 by @AllentDan in https://github.com/InternLM/lmdeploy/pull/2033
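
#1968 and #1985 fix logprobs, including through the OpenAI-compatible endpoint. A quick way to exercise the fixed path; the server address and model name are assumptions:

```python
# Requesting logprobs through the OpenAI-compatible endpoint (sketch).
from openai import OpenAI

client = OpenAI(base_url='http://0.0.0.0:23333/v1', api_key='none')
resp = client.chat.completions.create(
    model='internlm2',
    messages=[{'role': 'user', 'content': 'Say hi'}],
    logprobs=True,
    top_logprobs=3,
    max_tokens=8,
)
for item in resp.choices[0].logprobs.content:
    print(item.token, item.logprob)
```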
📚 Documentations
- docs: update faq for turbomind so not found by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1877
- [Doc]: Change to sphinx-book-theme in readthedocs by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/1880
- docs: update compatibility section in README by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1946
- docs: update kv quant doc by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1977
- docs: sync the core features in README to index.rst by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1988
- Fix table rendering for readthedocs by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/1998
- docs: fix Ada compatibility by @zhyncs in https://github.com/InternLM/lmdeploy/pull/2016
- update xcomposer2d5 docs by @irexyc in https://github.com/InternLM/lmdeploy/pull/2037
🌐 Other
- [ci] add internlm2.5 models into testcase by @zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/1928
- bump version to v0.5.1 by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/2022
New Contributors
- @zihaomu made their first contribution in https://github.com/InternLM/lmdeploy/pull/1881
- @fbzhong made their first contribution in https://github.com/InternLM/lmdeploy/pull/1876
Full Changelog: https://github.com/InternLM/lmdeploy/compare/v0.5.0...v0.5.1
Assets:
- lmdeploy-0.5.1+cu118-cp310-cp310-manylinux2014_x86_64.whl (67.86 MB)
- lmdeploy-0.5.1+cu118-cp310-cp310-win_amd64.whl (45.14 MB)
- lmdeploy-0.5.1+cu118-cp311-cp311-manylinux2014_x86_64.whl (67.87 MB)
- lmdeploy-0.5.1+cu118-cp311-cp311-win_amd64.whl (45.14 MB)
- lmdeploy-0.5.1+cu118-cp312-cp312-manylinux2014_x86_64.whl (67.89 MB)
- lmdeploy-0.5.1+cu118-cp312-cp312-win_amd64.whl (45.14 MB)
- lmdeploy-0.5.1+cu118-cp38-cp38-manylinux2014_x86_64.whl (67.87 MB)
- lmdeploy-0.5.1+cu118-cp38-cp38-win_amd64.whl (45.14 MB)
- lmdeploy-0.5.1+cu118-cp39-cp39-manylinux2014_x86_64.whl (67.86 MB)
- lmdeploy-0.5.1+cu118-cp39-cp39-win_amd64.whl (45.13 MB)
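
All of the wheels above target CUDA 11.8; pick the one matching your Python version, e.g. `pip install lmdeploy-0.5.1+cu118-cp310-cp310-manylinux2014_x86_64.whl`. A quick post-install sanity check, assuming torch was installed alongside:

```python
# Post-install sanity check (assumes torch is available as a dependency).
import lmdeploy
import torch

print(lmdeploy.__version__)       # expect 0.5.1
print(torch.version.cuda)         # wheels above are built for CUDA 11.8
print(torch.cuda.is_available())  # True if a compatible GPU is visible
```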