GPTFast-0.3.0
Release date: 2024-06-21 09:51:43
Latest release of MDK8888/GPTFast: GPTFast-0.3.1 (2024-08-22 12:11:36)
- GPTQ INT4 quantization available for all HF models
- Speeds up inference by 7.6x-9x
- Integrates optimized INT4 matrix multiplication kernels from the PyTorch team for all HF models
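For readers unfamiliar with INT4 weight storage, the following is a minimal sketch of the general idea: weights are mapped to 16 levels per group and two 4-bit codes are packed into each byte. It uses plain round-to-nearest quantization, not the GPTQ algorithm, and it is not GPTFast's API or the PyTorch kernels. All function names and the `group_size` default here are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def quantize_int4(
    weights: List[float], group_size: int = 4
) -> Tuple[List[int], List[float], List[float]]:
    """Round-to-nearest asymmetric 4-bit quantization (illustrative, not GPTQ).

    Returns one code in [0, 15] per weight, plus one (scale, minimum) pair per group.
    """
    codes: List[int] = []
    scales: List[float] = []
    mins: List[float] = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        # 16 levels means 15 steps; fall back to 1.0 when the group is constant.
        scale = ((hi - lo) / 15) or 1.0
        scales.append(scale)
        mins.append(lo)
        for w in group:
            code = round((w - lo) / scale)
            codes.append(min(15, max(0, code)))  # clamp against float error
    return codes, scales, mins


def pack_nibbles(codes: List[int]) -> bytes:
    """Pack two 4-bit codes per byte (low nibble first)."""
    padded = codes + [0] if len(codes) % 2 else codes
    return bytes(
        (padded[i] & 0xF) | ((padded[i + 1] & 0xF) << 4)
        for i in range(0, len(padded), 2)
    )


def unpack_nibbles(packed: bytes, count: int) -> List[int]:
    """Inverse of pack_nibbles; `count` drops any padding nibble."""
    out: List[int] = []
    for byte in packed:
        out.append(byte & 0xF)
        out.append(byte >> 4)
    return out[:count]


def dequantize_int4(
    codes: List[int], scales: List[float], mins: List[float], group_size: int = 4
) -> List[float]:
    """Reconstruct approximate weights from codes and per-group parameters."""
    return [
        code * scales[i // group_size] + mins[i // group_size]
        for i, code in enumerate(codes)
    ]
```

In this sketch, eight weights occupy four bytes of codes plus the per-group scale and minimum, and each reconstructed weight differs from the original by at most half a quantization step. Production kernels use their own packing layouts and larger group sizes, so treat this only as a picture of the storage scheme.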