0.2.5

open-compass/opencompass

Release date: 2024-05-30 00:35:04

Latest release of open-compass/opencompass: 0.3.5 (2024-11-04 10:56:18)

The OpenCompass team is thrilled to announce the release of OpenCompass v0.2.5!

🌟 Highlights

Simplify the huggingface / vllm / lmdeploy model wrappers. meta_template no longer needs to be hand-crafted in model configs.
Add evaluation results READMEs to ~20 dataset config folders.
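
To illustrate the first highlight, here is a minimal sketch of what a model config might look like after this release, assuming the HuggingFacewithChatTemplate wrapper named in #1163. The field names (abbr, path, max_out_len, batch_size, run_cfg) and the model path are assumptions chosen for illustration and do not come from these notes, so check the configs shipped with the release for the exact form. The point is only that no meta_template entry appears, presumably because the wrapper relies on the tokenizer's own chat template (#1098).

```python
# Sketch of a simplified model config; not copied from the release.
try:
    from opencompass.models import HuggingFacewithChatTemplate
except ImportError:
    # Fallback so the sketch can still be read and executed without
    # OpenCompass installed. A real config needs the actual class.
    HuggingFacewithChatTemplate = 'HuggingFacewithChatTemplate'

models = [
    dict(
        type=HuggingFacewithChatTemplate,
        abbr='llama-3-8b-instruct-hf',              # assumed short name for result tables
        path='meta-llama/Meta-Llama-3-8B-Instruct',  # illustrative Hugging Face model id
        max_out_len=1024,                            # assumed generation length cap
        batch_size=8,                                # assumed batch size
        run_cfg=dict(num_gpus=1),                    # assumed resource request
        # Note: no hand-written meta_template entry here.
    )
]
```

Before this release, a chat model config of this kind would also have carried a hand-written meta_template describing the role markers for each turn.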

🚀 New Features

#1065 Add LLaMA-3 Series Configs
#1048 Add TheoremQA with 5-shot
#1094 Support Math evaluation via judgemodel
#1080 Add gpqa prompt from simple_evals, openai
#1074 Add mmlu prompt from simple_evals, openai
#1123 Add Qwen1.5 MoE 7b and Mixtral 8x22b model configs

📖 Documentation

#1053 Update readme
#1102 Update NeedleInAHaystack Docs
#1110 Update README.md
#1205 Remove --no-batch-padding and Use --hf-num-gpus

🐛 Bug Fixes

#1036 Update setup.py install_requires
#1051 Fixed the issue caused by the repeated loading of VLLM model dur…
#1043 Fix multiround
#1070 Fix sequential runner
#1079 Fix Llama-3 meta template

⚙ Enhancements and Refactors

#1163 Enable HuggingFacewithChatTemplate with --accelerator via CLI
#1104 Fix prompt template
#1109 Update performance of common benchmarks

🎉 Welcome New Contributors

@liuwei130, @IcyFeather233, @VVVenus1212, @binary-husky, @dmitrysarov, @eltociear, @acylam, @lfy79001, @JuhaoLiang1997, @yaoyingyy, and @jxd0712 made their first contributions. Welcome to the OpenCompass community!

🔗 Full Change Logs

[Fix] Update setup.py install_requires by @Leymore in https://github.com/open-compass/opencompass/pull/1036
add ChemBench by @liuwei130 in https://github.com/open-compass/opencompass/pull/1032
[Fix] logger.error -> logger.debug in OpenAI by @Leymore in https://github.com/open-compass/opencompass/pull/1050
[Sync] Bump version to 0.2.4 by @Leymore in https://github.com/open-compass/opencompass/pull/1052
[Doc] Update readme by @tonysy in https://github.com/open-compass/opencompass/pull/1053
[fix]Fixed the issue caused by the repeated loading of VLLM model dur… by @IcyFeather233 in https://github.com/open-compass/opencompass/pull/1051
[Sync] Sync with internal code 2024.04.19 by @Leymore in https://github.com/open-compass/opencompass/pull/1064
[Fix] fix multiround by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1043
[Feature] Add LLaMA-3 Series Configs by @Leymore in https://github.com/open-compass/opencompass/pull/1065
[Feature] Add TheoremQA with 5-shot by @Leymore in https://github.com/open-compass/opencompass/pull/1048
[Fix] Fix sequential runner by @Leymore in https://github.com/open-compass/opencompass/pull/1070
Add lmdeploy tis python backend model by @ispobock in https://github.com/open-compass/opencompass/pull/1014
Fix Llama-3 meta template by @liushz in https://github.com/open-compass/opencompass/pull/1079
Add humaneval prompt from simple_evals, openai by @jingmingzhuo in https://github.com/open-compass/opencompass/pull/1076
[Feature] Support Math evaluation via judgemodel by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1094
[Feature] support arenahard evaluation by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1096
Update CIBench by @kleinzcy in https://github.com/open-compass/opencompass/pull/1089
[Feature] Add gpqa prompt from simple_evals, openai by @Francis-llgg in https://github.com/open-compass/opencompass/pull/1080
[Deprecate] Remove multi-modal related stuff by @kennymckormick in https://github.com/open-compass/opencompass/pull/1072
add vllm get_ppl by @VVVenus1212 in https://github.com/open-compass/opencompass/pull/1003
fix: python path bug by @binary-husky in https://github.com/open-compass/opencompass/pull/1063
fix output typing, change mutable list to immutable tuple by @dmitrysarov in https://github.com/open-compass/opencompass/pull/989
[Doc] Update NeedleInAHaystack Docs by @DseidLi in https://github.com/open-compass/opencompass/pull/1102
[Feature] add support for Flames datasets by @Yggdrasill7D6 in https://github.com/open-compass/opencompass/pull/1093
adapt to lmdeploy v0.4.0 by @lvhan028 in https://github.com/open-compass/opencompass/pull/1073
[Fix] fix prompt template by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1104
[Fix] Fix Math Evaluation with Judge Model Evaluator & Add README by @liushz in https://github.com/open-compass/opencompass/pull/1103
[Update] Update performance of common benchmarks by @tonysy in https://github.com/open-compass/opencompass/pull/1109
[Fix] fix cmb dataset by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1106
[Docs] Update README.md by @eltociear in https://github.com/open-compass/opencompass/pull/1110
[Feature] Adding support for LLM Compression Evaluation by @acylam in https://github.com/open-compass/opencompass/pull/1108
[Fix] remove redundant pre-commit check by @Leymore in https://github.com/open-compass/opencompass/pull/891
fix LightllmApi workers bug by @helloyongyang in https://github.com/open-compass/opencompass/pull/1113
[Feature] Add mmlu prompt from simple_evals, openai by @Leymore in https://github.com/open-compass/opencompass/pull/1074
[Feature] update drop dataset from openai simple eval by @kleinzcy in https://github.com/open-compass/opencompass/pull/1092
add mgsm datasets by @Yggdrasill7D6 in https://github.com/open-compass/opencompass/pull/1081
[Fix] Fix AGIEval chinese sets by @xu-song in https://github.com/open-compass/opencompass/pull/972
S3Eval Dataset by @lfy79001 in https://github.com/open-compass/opencompass/pull/916
[Feature] Add AceGPT-MMLUArabic benchmark by @JuhaoLiang1997 in https://github.com/open-compass/opencompass/pull/1099
[Fix] fix links by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1120
[Fix] Fix NeedleBench Summarizer Typo by @DseidLi in https://github.com/open-compass/opencompass/pull/1125
[Feature] Add Qwen1.5 MoE 7b and Mixtral 8x22b model configs by @acylam in https://github.com/open-compass/opencompass/pull/1123
[Sync] Update accelerator by @Leymore in https://github.com/open-compass/opencompass/pull/1122
[Fix] fix alpacaeval while add caching path by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1139
[Fix] fix multiround by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1146
[Fix] Fix Needlebench Summarizer by @DseidLi in https://github.com/open-compass/opencompass/pull/1143
[Feature] Add huggingface apply_chat_template by @Leymore in https://github.com/open-compass/opencompass/pull/1098
[Feat] Support dataset_suffix check for mixed configs by @xu-song in https://github.com/open-compass/opencompass/pull/973
[Format] Add some config lints by @Leymore in https://github.com/open-compass/opencompass/pull/892
[Sync] Sync with internal codes 2024.05.14 by @Leymore in https://github.com/open-compass/opencompass/pull/1156
[Fix] fix arenahard summarizer by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1154
[Fix] use ProcessPoolExecutor during mbpp eval by @Leymore in https://github.com/open-compass/opencompass/pull/1159
[Fix] Update stop_words in huggingface_above_v4_33 by @Leymore in https://github.com/open-compass/opencompass/pull/1160
Update accelerator by @liushz in https://github.com/open-compass/opencompass/pull/1152
[Feat] enable HuggingFacewithChatTemplate with --accelerator via cli by @Leymore in https://github.com/open-compass/opencompass/pull/1163
update test workflow by @zhulinJulia24 in https://github.com/open-compass/opencompass/pull/1167
[Sync] Sync with internal codes 2024.05.17 by @Leymore in https://github.com/open-compass/opencompass/pull/1171
add dependency in daily test workflow by @zhulinJulia24 in https://github.com/open-compass/opencompass/pull/1173
[Sync] Sync with internal codes 2024.05.21.1 by @Leymore in https://github.com/open-compass/opencompass/pull/1175
Update MathBench by @liushz in https://github.com/open-compass/opencompass/pull/1176
[Fix] fix template by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1178
Fix a bug in drop_gen.py by @kleinzcy in https://github.com/open-compass/opencompass/pull/1191
[Fix] temporary files using tempfile by @yaoyingyy in https://github.com/open-compass/opencompass/pull/1186
[Fix] add support for lmdeploy api judge by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1193
[Fix] fix length by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1180
support CHARM (https://github.com/opendatalab/CHARM) reasoning tasks by @jxd0712 in https://github.com/open-compass/opencompass/pull/1190
[Feat] Update charm summary by @Leymore in https://github.com/open-compass/opencompass/pull/1194
Update accelerator by @liushz in https://github.com/open-compass/opencompass/pull/1195
[Sync] Sync with internal codes 2024.05.28 by @Leymore in https://github.com/open-compass/opencompass/pull/1204
Fix VLLM argument error by @xu-song in https://github.com/open-compass/opencompass/pull/1207
[Docs] Remove --no-batch-padding and Use --hf-num-gpus by @Leymore in https://github.com/open-compass/opencompass/pull/1205
[Fix] Rollback opt model configs by @Leymore in https://github.com/open-compass/opencompass/pull/1213
Update running command readme by @Leymore in https://github.com/open-compass/opencompass/pull/1206
[Sync] Sync with internal code 2024.05.30 by @Leymore in https://github.com/open-compass/opencompass/pull/1214

For a detailed overview of all changes, check out our Full Changelog.
