v0.5.0
Released: 2024-07-01 15:22:00
What's Changed
🚀 Features
- support MiniCPM-Llama3-V 2.5 by @irexyc in https://github.com/InternLM/lmdeploy/pull/1708
- [Feature]: Support llava for pytorch engine by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/1641
- Device dispatcher by @grimoire in https://github.com/InternLM/lmdeploy/pull/1775
- Add GLM-4-9B-Chat by @lzhangzz in https://github.com/InternLM/lmdeploy/pull/1724
- Torch deepseek v2 by @grimoire in https://github.com/InternLM/lmdeploy/pull/1621
- Support internvl-chat for pytorch engine by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/1797
- Add interfaces to the pipeline to obtain logits and ppl by @irexyc in https://github.com/InternLM/lmdeploy/pull/1652
- [Feature]: Support cogvlm-chat by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/1502
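Among the features above, PR #1652 adds pipeline interfaces for obtaining logits and perplexity (ppl). As a generic illustration of what a perplexity value means — this is plain math, not the lmdeploy API itself — ppl is the exponential of the mean negative log-likelihood over the sequence:

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood).

    token_logprobs: natural-log probabilities the model assigned to each
    observed token (one float per token).
    """
    if not token_logprobs:
        raise ValueError("need at least one token")
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# A model that assigns probability 0.5 to every token has ppl == 2.
print(round(perplexity([math.log(0.5)] * 4), 6))  # → 2.0
```

Lower perplexity means the model found the sequence more predictable; the actual pipeline interface returns this kind of value computed from the engine's own logits.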
💥 Improvements
- support mistral and llava_mistral in turbomind by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/1579
- Add health endpoint by @AllentDan in https://github.com/InternLM/lmdeploy/pull/1679
- upgrade the version of the dependency package peft by @grimoire in https://github.com/InternLM/lmdeploy/pull/1687
- Follow the conventional model_name by @AllentDan in https://github.com/InternLM/lmdeploy/pull/1677
- API Image URL fetch timeout by @vody-am in https://github.com/InternLM/lmdeploy/pull/1684
- Support internlm-xcomposer2-4khd-7b awq by @AllentDan in https://github.com/InternLM/lmdeploy/pull/1666
- update dockerfile and docs by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/1715
- lazy import VLAsyncEngine to avoid bringing in VLMs dependencies when deploying LLMs by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/1714
- feat: align with OpenAI temperature range by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1733
- feat: align with OpenAI temperature range in api server by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1734
- Refactor converter about get_input_model_registered_name and get_output_model_registered_name_and_config by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/1702
- Refine max_new_tokens logic to improve user experience by @AllentDan in https://github.com/InternLM/lmdeploy/pull/1705
- Refactor loading weights by @grimoire in https://github.com/InternLM/lmdeploy/pull/1603
- refactor config by @grimoire in https://github.com/InternLM/lmdeploy/pull/1751
- Add anomaly handler by @lzhangzz in https://github.com/InternLM/lmdeploy/pull/1780
- Encode raw image file to base64 by @irexyc in https://github.com/InternLM/lmdeploy/pull/1773
- skip inference for oversized inputs by @grimoire in https://github.com/InternLM/lmdeploy/pull/1769
- fix: prevent numpy breakage by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1791
- More accurate time logging for ImageEncoder and fix concurrent image processing corruption by @irexyc in https://github.com/InternLM/lmdeploy/pull/1765
- Optimize kernel launch for triton2.2.0 and triton2.3.0 by @grimoire in https://github.com/InternLM/lmdeploy/pull/1499
- feat: auto set awq model_format from hf by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1799
- check driver mismatch by @grimoire in https://github.com/InternLM/lmdeploy/pull/1811
- PyTorchEngine adapts to the latest internlm2 modeling. by @grimoire in https://github.com/InternLM/lmdeploy/pull/1798
- AsyncEngine create cancel task in exception. by @grimoire in https://github.com/InternLM/lmdeploy/pull/1807
- compat internlm2 for pytorch engine by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/1825
- Add model revision & download_dir to cli by @irexyc in https://github.com/InternLM/lmdeploy/pull/1814
- fix image encoder request queue by @irexyc in https://github.com/InternLM/lmdeploy/pull/1837
- Harden stream callback by @lzhangzz in https://github.com/InternLM/lmdeploy/pull/1838
- Support Qwen2-1.5b awq by @AllentDan in https://github.com/InternLM/lmdeploy/pull/1793
- remove chat template config in turbomind engine by @irexyc in https://github.com/InternLM/lmdeploy/pull/1161
- misc: align PyTorch Engine temperature with TurboMind by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1850
- docs: update cache-max-entry-count help message by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1892
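Several items above (#1733, #1734, #1850) align the sampling temperature with OpenAI's accepted range of [0, 2]. A minimal sketch of that style of validation — the function and constant names here are illustrative, not lmdeploy's actual internals:

```python
# OpenAI's chat API accepts temperature in [0, 2]; engines that divide
# logits by temperature must also guard against temperature == 0.
OPENAI_TEMP_MIN, OPENAI_TEMP_MAX = 0.0, 2.0

def normalize_temperature(temperature: float) -> float:
    """Validate a user-supplied temperature against the OpenAI range."""
    if not OPENAI_TEMP_MIN <= temperature <= OPENAI_TEMP_MAX:
        raise ValueError(
            f"temperature must be in [{OPENAI_TEMP_MIN}, {OPENAI_TEMP_MAX}], "
            f"got {temperature}"
        )
    # Map 0 to a tiny positive value so logits/temperature never divides
    # by zero; the result is effectively greedy decoding.
    return max(temperature, 1e-6)

print(normalize_temperature(0.7))  # → 0.7
```

Rejecting out-of-range values at the API boundary keeps OpenAI-compatible clients from silently getting different sampling behavior than they asked for.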
🐞 Bug fixes
- fix typos by @irexyc in https://github.com/InternLM/lmdeploy/pull/1690
- [Bugfix] fix internvl-1.5-chat vision model preprocess and freeze weights by @DefTruth in https://github.com/InternLM/lmdeploy/pull/1741
- lock setuptools version in dockerfile by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/1770
- Fix openai package can not use proxy stream mode by @AllentDan in https://github.com/InternLM/lmdeploy/pull/1692
- Fix finish_reason by @AllentDan in https://github.com/InternLM/lmdeploy/pull/1768
- fix uncached stop words by @grimoire in https://github.com/InternLM/lmdeploy/pull/1754
- [side-effect] Fix param `--cache-max-entry-count` not taking effect (#1758) by @QwertyJack in https://github.com/InternLM/lmdeploy/pull/1778
- support qwen2 1.5b by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/1782
- fix falcon attention by @grimoire in https://github.com/InternLM/lmdeploy/pull/1761
- Refine AsyncEngine exception handler by @AllentDan in https://github.com/InternLM/lmdeploy/pull/1789
- [side-effect] fix weight_type caused by PR #1702 by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/1795
- fix best_match_model by @irexyc in https://github.com/InternLM/lmdeploy/pull/1812
- Fix Request completed log by @irexyc in https://github.com/InternLM/lmdeploy/pull/1821
- fix qwen-vl-chat hung by @irexyc in https://github.com/InternLM/lmdeploy/pull/1824
- Detokenize with prompt token ids by @AllentDan in https://github.com/InternLM/lmdeploy/pull/1753
- Update engine.py to fix small typos by @WANGSSSSSSS in https://github.com/InternLM/lmdeploy/pull/1829
- [side-effect] bring back "--cap" argument in chat cli by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/1859
- Fix vl session-len by @AllentDan in https://github.com/InternLM/lmdeploy/pull/1860
- fix gradio vl "stop_words" by @irexyc in https://github.com/InternLM/lmdeploy/pull/1873
- fix qwen2 cache_position for PyTorch Engine when transformers>4.41.2 by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1886
- fix model name matching for internvl by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/1867
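Several of the fixes above touch generation termination (`finish_reason` in #1768, stop words in #1754). For readers new to OpenAI-style APIs: `finish_reason` is `"stop"` when generation ends on an EOS token or configured stop word, and `"length"` when the token budget runs out first. A hedged, engine-agnostic sketch of that decision (not lmdeploy's actual code path):

```python
def finish_reason(generated_tokens: int, max_new_tokens: int,
                  hit_stop_word: bool):
    """Return an OpenAI-style finish_reason, or None while still generating.

    "stop"   -> the model produced an EOS token or a configured stop word.
    "length" -> the max_new_tokens budget was exhausted first.
    """
    if hit_stop_word:
        return "stop"
    if generated_tokens >= max_new_tokens:
        return "length"
    return None  # generation still in progress

print(finish_reason(128, 128, hit_stop_word=False))  # → length
```

Reporting the wrong reason matters in practice: clients often retry or continue a response only when they see `"length"`, so a truncated answer mislabeled `"stop"` looks complete.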
📚 Documentations
- docs: add BentoLMDeploy in README by @zhyncs in https://github.com/InternLM/lmdeploy/pull/1736
- [Doc]: Update docs for internlm2.5 by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/1887
🌐 Other
- add longtext generation benchmark by @zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/1694
- add qwen2 model into testcase by @zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/1772
- fix pr test for newest internlm2 model by @zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/1806
- refactor test evaluation config by @zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/1861
- bump version to v0.5.0 by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/1852
New Contributors
- @DefTruth made their first contribution in https://github.com/InternLM/lmdeploy/pull/1741
- @QwertyJack made their first contribution in https://github.com/InternLM/lmdeploy/pull/1778
- @WANGSSSSSSS made their first contribution in https://github.com/InternLM/lmdeploy/pull/1829
Full Changelog: https://github.com/InternLM/lmdeploy/compare/v0.4.2...v0.5.0
1. lmdeploy-0.5.0+cu118-cp310-cp310-manylinux2014_x86_64.whl 71.37MB
2. lmdeploy-0.5.0+cu118-cp310-cp310-win_amd64.whl 48.84MB
3. lmdeploy-0.5.0+cu118-cp311-cp311-manylinux2014_x86_64.whl 71.39MB
4. lmdeploy-0.5.0+cu118-cp311-cp311-win_amd64.whl 48.85MB
5. lmdeploy-0.5.0+cu118-cp312-cp312-manylinux2014_x86_64.whl 71.39MB
6. lmdeploy-0.5.0+cu118-cp312-cp312-win_amd64.whl 48.85MB
7. lmdeploy-0.5.0+cu118-cp38-cp38-manylinux2014_x86_64.whl 71.38MB
8. lmdeploy-0.5.0+cu118-cp38-cp38-win_amd64.whl 48.84MB
9. lmdeploy-0.5.0+cu118-cp39-cp39-manylinux2014_x86_64.whl 71.37MB
10. lmdeploy-0.5.0+cu118-cp39-cp39-win_amd64.whl 48.85MB