0.2.5
版本发布时间: 2024-05-30 00:35:04
open-compass/opencompass最新发布版本:0.3.6(2024-11-19 11:54:28)
The OpenCompass team is thrilled to announce the release of OpenCompass v0.2.5!
🌟 Highlights
- Simplify the huggingface / vllm / lmdeploy model wrapper.
meta_template
is no longer needed to be hand-crafted in model configs - Introduce evaluation results README in ~20 dataset config folders.
🚀 New Features
- #1065 Add LLaMA-3 Series Configs
- #1048 Add TheoremQA with 5-shot
- #1094 Support Math evaluation via judgemodel
- #1080 Add gpqa prompt from simple_evals, openai
- #1074 Add mmlu prompt from simple_evals, openai
- #1123 Add Qwen1.5 MoE 7b and Mixtral 8x22b model configs
📖 Documentation
- #1053 Update readme
- #1102 Update NeedleInAHaystack Docs
- #1110 Update README.md
- #1205 Remove --no-batch-padding and Use --hf-num-gpus
🐛 Bug Fixes
- #1036 Update setup.py install_requires
- #1051 Fixed the issue caused
- #1043 fix multiround
- #1070 Fix sequential runner
- #1079 Fix Llama-3 meta template
⚙ Enhancements and Refactors
- #1163 enable HuggingFacewithChatTemplate with --accelerator via cli
- #1104 fix prompt template
- #1109 Update performance of common benchmarks
🎉 Welcome New Contributors
- @liuwei130, @IcyFeather233, @VVVenus1212, @binary-husky, @dmitrysarov, @eltociear, @acylam, @lfy79001, @JuhaoLiang1997, @yaoyingyy, and @jxd0712 made their first contributions. Welcome to the OpenCompass community!
🔗 Full Change Logs
- [Fix] Update setup.py install_requires by @Leymore in https://github.com/open-compass/opencompass/pull/1036
- add ChemBench by @liuwei130 in https://github.com/open-compass/opencompass/pull/1032
- [Fix] logger.error -> logger.debug in OpenAI by @Leymore in https://github.com/open-compass/opencompass/pull/1050
- [Sync] Bump version to 0.2.4 by @Leymore in https://github.com/open-compass/opencompass/pull/1052
- [Doc] Update readme by @tonysy in https://github.com/open-compass/opencompass/pull/1053
- [fix]Fixed the issue caused by the repeated loading of VLLM model dur… by @IcyFeather233 in https://github.com/open-compass/opencompass/pull/1051
- [Sync] Sync with internal code 2024.04.19 by @Leymore in https://github.com/open-compass/opencompass/pull/1064
- [Fix] fix multiround by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1043
- [Feature] Add LLaMA-3 Series Configs by @Leymore in https://github.com/open-compass/opencompass/pull/1065
- [Feature] Add TheoremQA with 5-shot by @Leymore in https://github.com/open-compass/opencompass/pull/1048
- [Fix] Fix sequential runner by @Leymore in https://github.com/open-compass/opencompass/pull/1070
- Add lmdeploy tis python backend model by @ispobock in https://github.com/open-compass/opencompass/pull/1014
- Fix Llama-3 meta template by @liushz in https://github.com/open-compass/opencompass/pull/1079
- Add humaneval prompt from simple_evals, openai by @jingmingzhuo in https://github.com/open-compass/opencompass/pull/1076
- [Feature] Support Math evaluation via judgemodel by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1094
- [Feature] support arenahard evaluation by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1096
- Update CIBench by @kleinzcy in https://github.com/open-compass/opencompass/pull/1089
- [Feature] Add gpqa prompt from simple_evals, openai by @Francis-llgg in https://github.com/open-compass/opencompass/pull/1080
- [Deperecate] Remove multi-modal related stuff by @kennymckormick in https://github.com/open-compass/opencompass/pull/1072
- add vllm get_ppl by @VVVenus1212 in https://github.com/open-compass/opencompass/pull/1003
- fix: python path bug by @binary-husky in https://github.com/open-compass/opencompass/pull/1063
- fix output typing, change mutable list to immutable tuple by @dmitrysarov in https://github.com/open-compass/opencompass/pull/989
- [Doc] Update NeedleInAHaystack Docs by @DseidLi in https://github.com/open-compass/opencompass/pull/1102
- [Feature] add support for Flames datasets by @Yggdrasill7D6 in https://github.com/open-compass/opencompass/pull/1093
- adapt to lmdeploy v0.4.0 by @lvhan028 in https://github.com/open-compass/opencompass/pull/1073
- [Fix] fix prompt template by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1104
- [Fix] Fix Math Evaluation with Judge Model Evaluator & Add README by @liushz in https://github.com/open-compass/opencompass/pull/1103
- [Update] Update performance of common benchmarks by @tonysy in https://github.com/open-compass/opencompass/pull/1109
- [Fix] fix cmb dataset by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1106
- [Docs] Update README.md by @eltociear in https://github.com/open-compass/opencompass/pull/1110
- [Feature] Adding support for LLM Compression Evaluation by @acylam in https://github.com/open-compass/opencompass/pull/1108
- [Fix] remove redundant pre-commit check by @Leymore in https://github.com/open-compass/opencompass/pull/891
- fix LightllmApi workers bug by @helloyongyang in https://github.com/open-compass/opencompass/pull/1113
- [Feature] Add mmlu prompt from simple_evals, openai by @Leymore in https://github.com/open-compass/opencompass/pull/1074
- [Feature] update drop dataset from openai simple eval by @kleinzcy in https://github.com/open-compass/opencompass/pull/1092
- add mgsm datasets by @Yggdrasill7D6 in https://github.com/open-compass/opencompass/pull/1081
- [Fix] Fix AGIEval chinese sets by @xu-song in https://github.com/open-compass/opencompass/pull/972
- S3Eval Dataset by @lfy79001 in https://github.com/open-compass/opencompass/pull/916
- [Feature] Add AceGPT-MMLUArabic benchmark by @JuhaoLiang1997 in https://github.com/open-compass/opencompass/pull/1099
- [Fix] fix links by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1120
- [Fix] Fix NeedleBench Summarizer Typo by @DseidLi in https://github.com/open-compass/opencompass/pull/1125
- [Feature] Add Qwen1.5 MoE 7b and Mixtral 8x22b model configs by @acylam in https://github.com/open-compass/opencompass/pull/1123
- [Sync] Update accelerator by @Leymore in https://github.com/open-compass/opencompass/pull/1122
- [Fix] fix alpacaeval while add caching path by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1139
- [Fix] fix multiround by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1146
- [Fix] Fix Needlebench Summarizer by @DseidLi in https://github.com/open-compass/opencompass/pull/1143
- [Feature] Add huggingface apply_chat_template by @Leymore in https://github.com/open-compass/opencompass/pull/1098
- [Feat] Support dataset_suffix check for mixed configs by @xu-song in https://github.com/open-compass/opencompass/pull/973
- [Format] Add some config lints by @Leymore in https://github.com/open-compass/opencompass/pull/892
- [Sync] Sync with internal codes 2024.05.14 by @Leymore in https://github.com/open-compass/opencompass/pull/1156
- [Fix] fix arenahard summarizer by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1154
- [Fix] use ProcessPoolExecutor during mbpp eval by @Leymore in https://github.com/open-compass/opencompass/pull/1159
- [Fix] Update stop_words in huggingface_above_v4_33 by @Leymore in https://github.com/open-compass/opencompass/pull/1160
- Update accelerator by @liushz in https://github.com/open-compass/opencompass/pull/1152
- [Feat] enable HuggingFacewithChatTemplate with --accelerator via cli by @Leymore in https://github.com/open-compass/opencompass/pull/1163
- update test workflow by @zhulinJulia24 in https://github.com/open-compass/opencompass/pull/1167
- [Sync] Sync with internal codes 2024.05.17 by @Leymore in https://github.com/open-compass/opencompass/pull/1171
- add dependency in daily test workflow by @zhulinJulia24 in https://github.com/open-compass/opencompass/pull/1173
- [Sync] Sync with internal codes 2024.05.21.1 by @Leymore in https://github.com/open-compass/opencompass/pull/1175
- Update MathBench by @liushz in https://github.com/open-compass/opencompass/pull/1176
- [Fix] fix template by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1178
- Fix a bug in drop_gen.py by @kleinzcy in https://github.com/open-compass/opencompass/pull/1191
- [Fix] temporary files using tempfile by @yaoyingyy in https://github.com/open-compass/opencompass/pull/1186
- [Fix] add support for lmdeploy api judge by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1193
- [Fix] fix length by @bittersweet1999 in https://github.com/open-compass/opencompass/pull/1180
- support CHARM (https://github.com/opendatalab/CHARM) reasoning tasks by @jxd0712 in https://github.com/open-compass/opencompass/pull/1190
- [Feat] Update charm summary by @Leymore in https://github.com/open-compass/opencompass/pull/1194
- Update accelerator by @liushz in https://github.com/open-compass/opencompass/pull/1195
- [Sync] Sync with internal codes 2024.05.28 by @Leymore in https://github.com/open-compass/opencompass/pull/1204
- Fix VLLM argument error by @xu-song in https://github.com/open-compass/opencompass/pull/1207
- [Docs] Remove --no-batch-padding and Use --hf-num-gpus by @Leymore in https://github.com/open-compass/opencompass/pull/1205
- [Fix] Rollback opt model configs by @Leymore in https://github.com/open-compass/opencompass/pull/1213
- Update running command readme by @Leymore in https://github.com/open-compass/opencompass/pull/1206
- [Sync] Sync with internal code 2024.05.30 by @Leymore in https://github.com/open-compass/opencompass/pull/1214
For a detailed overview of all changes, check out our Full Changelog.