v2.0.1

bigscience-workshop/petals

Released: 2023-07-23 22:54:09

Latest release of bigscience-workshop/petals: v2.2.0 (2023-09-07 01:29:56)

Highlights

🛣️ Inference of longer sequences. We extended the max sequence length to 8192 tokens for Llama 2 and added chunking to avoid server out-of-memory errors (which happened when processing long prefixes). This became possible thanks to multi-query attention used in Llama 2, which uses 8x less GPU memory for attention caches. Now you can process longer sequences using a Petals client and have dialogues of up to 8192 tokens at https://chat.petals.dev.
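The 8x figure follows from back-of-the-envelope KV-cache arithmetic: the cache grows with the number of key/value heads, and Llama 2 70B shares each key/value head across 8 query heads. A minimal sketch, with assumed (not source-confirmed) dimensions for a Llama-2-70B-like model in fp16:

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    # 2 tensors (keys and values) per layer, each of shape
    # [seq_len, n_kv_heads, head_dim], stored in fp16 (2 bytes/element).
    return 2 * n_layers * seq_len * n_kv_heads * head_dim * bytes_per_elem

SEQ_LEN, LAYERS, HEAD_DIM = 8192, 80, 128  # assumed 70B-like config

# Classic multi-head attention: one KV head per query head (64 total).
mha = kv_cache_bytes(SEQ_LEN, LAYERS, n_kv_heads=64, head_dim=HEAD_DIM)
# Grouped/multi-query attention: 8 shared KV heads.
gqa = kv_cache_bytes(SEQ_LEN, LAYERS, n_kv_heads=8, head_dim=HEAD_DIM)

print(f"multi-head cache:    {mha / 2**30:.1f} GiB")  # 20.0 GiB
print(f"grouped-query cache: {gqa / 2**30:.1f} GiB")  # 2.5 GiB
print(f"reduction: {mha // gqa}x")                    # 8x
```

With shared KV heads an 8192-token cache fits in a few GiB per server instead of tens of GiB, which is what makes the longer context practical.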

🐍 Python 3.11 support. Petals clients and servers now work on Python 3.11.

🐞 Bug fixes. We fixed the server's --token argument (used to provide your 🤗 Model Hub access token for loading Llama 2), possible deadlocks in the server, issues with fine-tuning speed (servers reachable only via relays are now deprioritized), and other minor load-balancing issues.

🪟 Running server on Windows. We made a better guide for running a server in WSL (Windows Subsystem for Linux).

📦 Running server on Runpod. We added a guide for using a Petals template on Runpod.

What's Changed

Full Changelog: https://github.com/bigscience-workshop/petals/compare/v2.0.0.post1...v2.0.1
