GPTFast-0.2.0
Release date: 2024-04-02 12:16:41
Latest release of MDK8888/GPTFast: GPTFast-0.3.1 (2024-08-22 12:11:36)
- Inference is now 6-8.5x faster
- Static key-value caching is now enabled for all Hugging Face models
- Support for generic sampling functions in addition to argmax
- Fixed bugs in speculative decoding
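
To illustrate what "generic sampling functions in addition to argmax" can mean in practice, here is a minimal sketch of a decoding loop that accepts any sampling callable. It is not GPTFast's API: every name in it (`SampleFn`, `argmax_sample`, `make_temperature_sampler`, `generate`, `toy_next_logits`) is hypothetical, and the toy model stands in for a real network so that the example runs with the standard library alone.

```python
import math
import random
from typing import Callable, List, Sequence

# Hypothetical names throughout; this is an illustration, not GPTFast's API.
SampleFn = Callable[[Sequence[float]], int]


def argmax_sample(logits: Sequence[float]) -> int:
    """Greedy choice: return the index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


def make_temperature_sampler(temperature: float, seed: int = 0) -> SampleFn:
    """Build a sampler that draws from softmax(logits / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = random.Random(seed)  # private RNG so results are reproducible

    def sample(logits: Sequence[float]) -> int:
        scaled = [x / temperature for x in logits]
        peak = max(scaled)  # subtract the max for numerical stability
        weights = [math.exp(x - peak) for x in scaled]
        return rng.choices(range(len(logits)), weights=weights, k=1)[0]

    return sample


def generate(
    next_logits: Callable[[List[int]], Sequence[float]],
    prompt: Sequence[int],
    max_new_tokens: int,
    sample_fn: SampleFn = argmax_sample,
) -> List[int]:
    """Autoregressive loop; the token-selection rule is a plug-in callable."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(sample_fn(next_logits(tokens)))
    return tokens


def toy_next_logits(tokens: List[int]) -> List[float]:
    """Toy stand-in for a model: favors (last token + 1) mod 4."""
    target = (tokens[-1] + 1) % 4
    return [5.0 if i == target else 0.0 for i in range(4)]


if __name__ == "__main__":
    print(generate(toy_next_logits, [0], 3))
    print(generate(toy_next_logits, [0], 3, make_temperature_sampler(1.0)))
```

Because the loop only requires a callable that maps logits to a token index, greedy decoding, temperature sampling, or any other rule (top-k, nucleus) can be swapped in without touching the generation code.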