v0.1.17
Release date: 2024-06-08 10:58:55
Latest sgl-project/sglang release: v0.3.0 (2024-09-04 19:50:29)
Highlights
- Add data parallelism #480
- Add speculative execution for OpenAI API #250
- Update vllm to v0.4.3 for new quantization features #511
- Better error handling (#457, #449, #514)
What's Changed
- [Feat] Add llava qwen, llava mistral by @kcz358 in https://github.com/sgl-project/sglang/pull/419
- Format code by @hnyls2002 in https://github.com/sgl-project/sglang/pull/441
- Add finish_reason to OpenAI API by @mgerstgrasser in https://github.com/sgl-project/sglang/pull/446
- Simplify port allocation by @merrymercy in https://github.com/sgl-project/sglang/pull/447
- Add PUT for generate api by @Ying1123 in https://github.com/sgl-project/sglang/pull/448
- Improve error handling & abort disconnected requests by @merrymercy in https://github.com/sgl-project/sglang/pull/449
- Fix the broken --disable-radix-cache by @hnyls2002 in https://github.com/sgl-project/sglang/pull/451
- openai chat speculative execution by @ChuyueSun in https://github.com/sgl-project/sglang/pull/250
- Fix openai speculative execution by @Ying1123 in https://github.com/sgl-project/sglang/pull/456
- Abort disconnected requests by @merrymercy in https://github.com/sgl-project/sglang/pull/457
- Rename api_num_spec_tokens -> num_api_spec_tokens by @merrymercy in https://github.com/sgl-project/sglang/pull/458
- Use model loader from vllm by @merrymercy in https://github.com/sgl-project/sglang/pull/459
- port fp8 mixtral by @merrymercy in https://github.com/sgl-project/sglang/pull/460
- fix test bug in srt_llava_next_test.py by @bingwork in https://github.com/sgl-project/sglang/pull/470
- Add the instruction link to the LLaVA-NeXT-Video at README by @ZhangYuanhan-AI in https://github.com/sgl-project/sglang/pull/463
- Improve logging & add logit cap by @merrymercy in https://github.com/sgl-project/sglang/pull/471
- Optimize retract by @hnyls2002 in https://github.com/sgl-project/sglang/pull/440
- Add benchmark scripts by @Ying1123 in https://github.com/sgl-project/sglang/pull/476
- [Feat/Fix] Refactoring Llava models into single file by @Luodian in https://github.com/sgl-project/sglang/pull/475
- Improve benchmark scripts & rename some scripts by @merrymercy in https://github.com/sgl-project/sglang/pull/477
- Improve benchmark scripts & add more models by @merrymercy in https://github.com/sgl-project/sglang/pull/484
- Support data parallelism (static) by @Ying1123 in https://github.com/sgl-project/sglang/pull/480
- Make the server random by default by @merrymercy in https://github.com/sgl-project/sglang/pull/488
- Revert "Make the server random by default" by @Ying1123 in https://github.com/sgl-project/sglang/pull/492
- update the script: examples/usage/llava_video/srt_example_llava_v.sh by @ZhangYuanhan-AI in https://github.com/sgl-project/sglang/pull/491
- Make the server random by default by @merrymercy in https://github.com/sgl-project/sglang/pull/493
- Update vllm to v0.4.3 by @merrymercy in https://github.com/sgl-project/sglang/pull/511
- remove redundant pad_input_ids function by @amosyou in https://github.com/sgl-project/sglang/pull/500
- Litellm Backend by @huyiwen in https://github.com/sgl-project/sglang/pull/502
- Fix rid state map leak + Refactor .finished by @Qubitium in https://github.com/sgl-project/sglang/pull/505
- Crash the server when error or OOM happens by @merrymercy in https://github.com/sgl-project/sglang/pull/514
- Update version to 0.1.17 by @merrymercy in https://github.com/sgl-project/sglang/pull/515
New Contributors
- @kcz358 made their first contribution in https://github.com/sgl-project/sglang/pull/419
- @mgerstgrasser made their first contribution in https://github.com/sgl-project/sglang/pull/446
- @bingwork made their first contribution in https://github.com/sgl-project/sglang/pull/470
- @amosyou made their first contribution in https://github.com/sgl-project/sglang/pull/500
- @huyiwen made their first contribution in https://github.com/sgl-project/sglang/pull/502
Full Changelog: https://github.com/sgl-project/sglang/compare/v0.1.16...v0.1.17