v0.6.3
Release date: 2024-06-24 08:58:37
Latest release of tatsu-lab/alpaca_eval: v0.6.5 (2024-08-18 07:39:20)
What's Changed
- Add the evaluation result for our latest model by @hendrydong in https://github.com/tatsu-lab/alpaca_eval/pull/286
- Add Ghost 7B Alpha to AlpacaEval by @lh0x00 in https://github.com/tatsu-lab/alpaca_eval/pull/288
- Add link for FsfairX-Zephyr-Chat-v0.1 by @hendrydong in https://github.com/tatsu-lab/alpaca_eval/pull/289
- add Qwen1.5-110B-Chat self-report results by @Lukeming-tsinghua in https://github.com/tatsu-lab/alpaca_eval/pull/291
- [ENH] verifying all the qwens by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/292
- Enable analyzing evaluators/annotators on data without multiple generator models by @rdnfn in https://github.com/tatsu-lab/alpaca_eval/pull/293
- Add Storm-7B to AlpacaEval by @yifan123 in https://github.com/tatsu-lab/alpaca_eval/pull/294
- Use verified by default by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/297
- Add SPPO-Mistral7B-PairRM to AlpacaEval by @Edward-Sun in https://github.com/tatsu-lab/alpaca_eval/pull/298
- Add ExPO results to AlpacaEval by @chujiezheng in https://github.com/tatsu-lab/alpaca_eval/pull/299
- Fix typo in README.md by @tongyx361 in https://github.com/tatsu-lab/alpaca_eval/pull/302
- Add Yi-Large Preview to AlpacaEval by @HyperdriveHustle in https://github.com/tatsu-lab/alpaca_eval/pull/304
- "Add Mistral-7B+RAHF-DUAL+LoRA to AlpacaEval" by @LiuAmber in https://github.com/tatsu-lab/alpaca_eval/pull/307
- [verified] Yi-large by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/309
- [ADD] GPT4-o by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/311
- [ENH] add LC SEM by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/317
- llama3 evaluator by @zhuang-li in https://github.com/tatsu-lab/alpaca_eval/pull/314
- Update README.md by @zhuang-li in https://github.com/tatsu-lab/alpaca_eval/pull/315
- [CLEAN] move evaluators lb llama3 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/318
- [ENH] vicuna 1.5 by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/319
- Add Llama-3-Instruct-8B-SimPO to AlpacaEval by @xiamengzhou in https://github.com/tatsu-lab/alpaca_eval/pull/320
- [ENH] Use multi threading instead of processing by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/321
- Add Aligner 2B+GPT-4 Turbo (04/09) Results by @AlignInc in https://github.com/tatsu-lab/alpaca_eval/pull/324
- Add REBEL-Llama-3-8B-Instruct to AlpacaEval by @ZhaolinGao in https://github.com/tatsu-lab/alpaca_eval/pull/326
- [ENH&BUG] improve VLLM by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/330
- Add ExPO + Llama-3-Instruct-8B-SimPO results by @chujiezheng in https://github.com/tatsu-lab/alpaca_eval/pull/331
- fix model link by @chujiezheng in https://github.com/tatsu-lab/alpaca_eval/pull/332
- Add merlinite-7B-AOT to AlpacaEval by @imelnyk in https://github.com/tatsu-lab/alpaca_eval/pull/334
- [BUG] fix bs in VLLM and add chatml by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/338
- Add Together-MoA, Together-MoA-Lite to AlpacaEval by @IsThatYou in https://github.com/tatsu-lab/alpaca_eval/pull/342
- Add Nanbeige2-16B-Chat to AlpacaEval by @yuani114 in https://github.com/tatsu-lab/alpaca_eval/pull/345
- Add claude-3-5-sonnet-20240620 to AlpacaEval by @MarjovanLier in https://github.com/tatsu-lab/alpaca_eval/pull/348
- [BUG] trust repo alpaca_eval by @YannDubs in https://github.com/tatsu-lab/alpaca_eval/pull/349
- Add OpenPipe Mixture of Agents model to Alpaca Eval by @saum7800 in https://github.com/tatsu-lab/alpaca_eval/pull/347
- Add Storm-7B, Storm-7B (best-of-64) to AlpacaEval by @yifan123 in https://github.com/tatsu-lab/alpaca_eval/pull/344
- Add Infinity-Instruct-3M-0613-Mistral-7B to AlpacaEval by @cszhengyh in https://github.com/tatsu-lab/alpaca_eval/pull/351
New Contributors
- @hendrydong made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/286
- @lh0x00 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/288
- @yifan123 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/294
- @Edward-Sun made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/298
- @chujiezheng made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/299
- @tongyx361 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/302
- @LiuAmber made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/307
- @zhuang-li made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/314
- @xiamengzhou made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/320
- @ZhaolinGao made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/326
- @imelnyk made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/334
- @IsThatYou made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/342
- @MarjovanLier made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/348
- @saum7800 made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/347
- @cszhengyh made their first contribution in https://github.com/tatsu-lab/alpaca_eval/pull/351
Full Changelog: https://github.com/tatsu-lab/alpaca_eval/compare/v0.6.2...v0.6.3