v0.1.16

sgl-project/sglang

Release date: 2024-05-14 08:36:05

Latest release of sgl-project/sglang: v0.3.0 (2024-09-04 19:50:29)

Highlights

Support more models: DBRX, Command-R, Gemma
Support llava-video (#423, https://llava-vl.github.io/blog/2024-04-30-llava-next-video/)
Cache performance improvements (#418, #364)
Marlin quantization kernels
Many bug fixes
Update dependencies to be compatible with their latest versions

What's Changed

Fix Runtime missing some ServerArgs options by @Qubitium in https://github.com/sgl-project/sglang/pull/281
adding the triton docker build minimal example by @amirarsalan90 in https://github.com/sgl-project/sglang/pull/242
Fix flashinfer >= 0.0.3 compat by @Qubitium in https://github.com/sgl-project/sglang/pull/282
Fix Incorrect CURL Request Example in README by @amirarsalan90 in https://github.com/sgl-project/sglang/pull/287
enable marlin kernels by @qeternity in https://github.com/sgl-project/sglang/pull/286
Fix env (docker) compat due to file usage by @Qubitium in https://github.com/sgl-project/sglang/pull/288
Fix marlin model loading compat with autogptq by @Liurl21 in https://github.com/sgl-project/sglang/pull/290
Fix outlines-0.0.35 incompatibility by @ZhouGongZaiShi in https://github.com/sgl-project/sglang/pull/291
[Fix/Potential Bugs] Can not correctly import models in python/sglang/srt/models by @Luodian in https://github.com/sgl-project/sglang/pull/311
Use Anthropic messages API by @janimo in https://github.com/sgl-project/sglang/pull/304
Add StableLM model. by @janimo in https://github.com/sgl-project/sglang/pull/301
Support oai in benchmark/mmlu by @merrymercy in https://github.com/sgl-project/sglang/pull/323
Update version to v0.1.14 by @merrymercy in https://github.com/sgl-project/sglang/pull/324
Cleanup codebase: removed unnecessary code/logic by @Qubitium in https://github.com/sgl-project/sglang/pull/298
Update dependencies by @janimo in https://github.com/sgl-project/sglang/pull/326
Openrouter usage example by @janimo in https://github.com/sgl-project/sglang/pull/327
model_rpc style improvement by @hnyls2002 in https://github.com/sgl-project/sglang/pull/293
model_runner simplify by @hnyls2002 in https://github.com/sgl-project/sglang/pull/329
Logprobs Refactor by @hnyls2002 in https://github.com/sgl-project/sglang/pull/331
DBRX support by @hnyls2002 in https://github.com/sgl-project/sglang/pull/337
Add support for new autogptq quant_config.checkpoint_format by @Qubitium in https://github.com/sgl-project/sglang/pull/332
Fix llava parallelism/fork bug by @lockon-n in https://github.com/sgl-project/sglang/pull/315
Eliminate 2 gpu ops during sampling when logit_bias is zero by @hnyls2002 in https://github.com/sgl-project/sglang/pull/343
Revert "Eliminate 2 gpu ops during sampling when logit_bias is zero" by @hnyls2002 in https://github.com/sgl-project/sglang/pull/345
Eliminate 2 gpu ops during sampling when logit_bias is zero by @Qubitium in https://github.com/sgl-project/sglang/pull/338
Add timeout to get_meta_info by @SimoneRaponi in https://github.com/sgl-project/sglang/pull/346
Fix typos in infer_batch.py by @tom-doerr in https://github.com/sgl-project/sglang/pull/354
Time cost utils by @hnyls2002 in https://github.com/sgl-project/sglang/pull/355
Update README.md by @eltociear in https://github.com/sgl-project/sglang/pull/358
support command-r by @ZhouXingg in https://github.com/sgl-project/sglang/pull/369
Fix issue #367 – System message not supported for Anthropic (anthropic.BadRequestError) by @fronx in https://github.com/sgl-project/sglang/pull/368
Update model support in readme by @Ying1123 in https://github.com/sgl-project/sglang/pull/370
Optimize radix tree matching by @ispobock in https://github.com/sgl-project/sglang/pull/364
Reduce overhead when fork(1) by @hnyls2002 in https://github.com/sgl-project/sglang/pull/375
llama3 instruct template by @qeternity in https://github.com/sgl-project/sglang/pull/372
add .isort.cfg by @hnyls2002 in https://github.com/sgl-project/sglang/pull/378
Revert removing the unused imports by @hnyls2002 in https://github.com/sgl-project/sglang/pull/385
Benchmark Updates by @hnyls2002 in https://github.com/sgl-project/sglang/pull/382
Improve performance when running with full parallel by @hnyls2002 in https://github.com/sgl-project/sglang/pull/394
Minor: style improvement of radix_cache and memory_pool by @hnyls2002 in https://github.com/sgl-project/sglang/pull/395
Format Benchmark Code by @hnyls2002 in https://github.com/sgl-project/sglang/pull/399
Fix chatml template by @merrymercy in https://github.com/sgl-project/sglang/pull/406
Adding RAG tracing & eval cookbook using Parea by @joschkabraun in https://github.com/sgl-project/sglang/pull/390
SamplingParams add "spaces_between_special_tokens" argument by @ZhouXingg in https://github.com/sgl-project/sglang/pull/392
Organize Benchmark by @hnyls2002 in https://github.com/sgl-project/sglang/pull/381
Add Cohere Command R chat template by @noah-kim-theori in https://github.com/sgl-project/sglang/pull/411
Fix sync() when fork(1) by @hnyls2002 in https://github.com/sgl-project/sglang/pull/412
Include finish reason in meta info response by @qeternity in https://github.com/sgl-project/sglang/pull/415
Make public APIs more standard. by @hnyls2002 in https://github.com/sgl-project/sglang/pull/416
Compat with latest VLLM 0.4.2 main + fork.number rename + Flashinfer 0.0.4 by @Qubitium in https://github.com/sgl-project/sglang/pull/380
Optimize the memory usage of logits processor by @merrymercy in https://github.com/sgl-project/sglang/pull/420
Clean up by @merrymercy in https://github.com/sgl-project/sglang/pull/422
Fix logit processor bugs by @merrymercy in https://github.com/sgl-project/sglang/pull/427
Minor fix for the import path by @merrymercy in https://github.com/sgl-project/sglang/pull/428
Move openai api server into a separate file by @merrymercy in https://github.com/sgl-project/sglang/pull/429
Fix flashinfer by @merrymercy in https://github.com/sgl-project/sglang/pull/430
Update version to 0.1.15 by @merrymercy in https://github.com/sgl-project/sglang/pull/431
Misc fixes by @merrymercy in https://github.com/sgl-project/sglang/pull/432
Allow input_ids in the input of the /generate endpoint by @lolipopshock in https://github.com/sgl-project/sglang/pull/363
Improve error handling by @merrymercy in https://github.com/sgl-project/sglang/pull/433
Cache optimizations by @hnyls2002 in https://github.com/sgl-project/sglang/pull/418
Update readme by @merrymercy in https://github.com/sgl-project/sglang/pull/434
Raise errors for prompts that are too long by @merrymercy in https://github.com/sgl-project/sglang/pull/436
support llava video by @ZhangYuanhan-AI in https://github.com/sgl-project/sglang/pull/426
Fix streaming by @merrymercy in https://github.com/sgl-project/sglang/pull/437
Update version to 0.1.16 by @merrymercy in https://github.com/sgl-project/sglang/pull/438
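
Two of the entries above change what a client can send: #363 allows `input_ids` in the input of the `/generate` endpoint, and #392 adds a `spaces_between_special_tokens` argument to SamplingParams. The following is a minimal sketch of a request body that uses both, not the project's reference client. The helper name `build_generate_payload`, the `sampling_params` and `max_new_tokens` field names, and the example token IDs are assumptions for illustration; check them against the server version you run.

```python
import json


def build_generate_payload(input_ids, max_new_tokens=16,
                           spaces_between_special_tokens=True):
    """Build a JSON-serializable body for POST /generate from token IDs.

    Hypothetical helper: only `input_ids` and `spaces_between_special_tokens`
    are named in the release notes; the other field names are assumptions.
    """
    if not input_ids or not all(isinstance(t, int) for t in input_ids):
        raise ValueError("input_ids must be a non-empty list of ints")
    return {
        # Pre-tokenized prompt, sent instead of a text prompt (#363).
        "input_ids": list(input_ids),
        "sampling_params": {
            "max_new_tokens": max_new_tokens,
            # Detokenization option added in #392.
            "spaces_between_special_tokens": spaces_between_special_tokens,
        },
    }


# Example token IDs are placeholders, not output of a real tokenizer.
payload = build_generate_payload([1, 15043, 3186])
body = json.dumps(payload)
```

The resulting `body` string can then be sent as an HTTP POST with a JSON content type to the `/generate` route of a running server. Building the payload separately from the network call keeps the sketch runnable without a server.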

New Contributors

@Qubitium made their first contribution in https://github.com/sgl-project/sglang/pull/281
@amirarsalan90 made their first contribution in https://github.com/sgl-project/sglang/pull/242
@Liurl21 made their first contribution in https://github.com/sgl-project/sglang/pull/290
@ZhouGongZaiShi made their first contribution in https://github.com/sgl-project/sglang/pull/291
@Luodian made their first contribution in https://github.com/sgl-project/sglang/pull/311
@janimo made their first contribution in https://github.com/sgl-project/sglang/pull/304
@lockon-n made their first contribution in https://github.com/sgl-project/sglang/pull/315
@SimoneRaponi made their first contribution in https://github.com/sgl-project/sglang/pull/346
@tom-doerr made their first contribution in https://github.com/sgl-project/sglang/pull/354
@ZhouXingg made their first contribution in https://github.com/sgl-project/sglang/pull/369
@fronx made their first contribution in https://github.com/sgl-project/sglang/pull/368
@ispobock made their first contribution in https://github.com/sgl-project/sglang/pull/364
@joschkabraun made their first contribution in https://github.com/sgl-project/sglang/pull/390
@noah-kim-theori made their first contribution in https://github.com/sgl-project/sglang/pull/411
@lolipopshock made their first contribution in https://github.com/sgl-project/sglang/pull/363
@ZhangYuanhan-AI made their first contribution in https://github.com/sgl-project/sglang/pull/426

Full Changelog: https://github.com/sgl-project/sglang/compare/v0.1.13...v0.1.16
