v1.4.1
Release date: 2024-04-21 16:38:30
Latest release of intel/intel-extension-for-transformers: v1.4.2 (2024-05-24 20:23:38)
Highlights
- Support Weight-only Quantization on MTL iGPU
- Upgrade lm-eval to 0.4.2
- Support Llama3
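
For readers new to the term, weight-only quantization stores model weights as low-bit integers with a floating-point scale per group, while activations stay in floating point. The following is a minimal conceptual sketch in plain Python, assuming symmetric 4-bit quantization and a group size of 4; the function names are hypothetical and this is not the library's implementation or API.

```python
# Conceptual sketch of weight-only quantization (WOQ).
# Assumptions: symmetric signed 4-bit integers, one float scale per group.
# All names here are illustrative, not part of intel-extension-for-transformers.

def quantize_group(weights, bits=4):
    """Quantize one group of float weights to signed integers plus a scale."""
    qmax = 2 ** (bits - 1) - 1  # 7 for signed 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the representable range.
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def quantize_row(row, group_size=4, bits=4):
    """Split a weight row into groups and quantize each group separately."""
    return [
        quantize_group(row[start:start + group_size], bits)
        for start in range(0, len(row), group_size)
    ]

def woq_dot(groups, activations, group_size=4):
    """Dot product of a quantized weight row with float activations."""
    total = 0.0
    for i, (q, scale) in enumerate(groups):
        acts = activations[i * group_size:(i + 1) * group_size]
        # Integer weights are rescaled once per group; activations stay float.
        total += scale * sum(qv * a for qv, a in zip(q, acts))
    return total

row = [0.7, -0.35, 0.1, 0.0, 1.4, -1.4, 0.2, 0.6]
x = [1.0, 2.0, -1.0, 0.5, 0.25, 0.25, 1.0, 1.0]

groups = quantize_row(row)
exact = sum(w * a for w, a in zip(row, x))
approx = woq_dot(groups, x)
print(f"exact={exact:.4f} approx={approx:.4f}")
```

Smaller groups track the weight distribution more closely at the cost of storing more scales; production kernels also pack two 4-bit values per byte, which this sketch omits.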
Improvements
- Support TPP for Xeon Tensor Parallel (5f0430f)
- Refine Model from_pretrained when use_neural_speed (39ecf38e)
Examples
- Add vision front-end demo (1c6550)
- Add example for table extraction and enable the multi-page table handling pipeline (db9e6fb)
- Adapt the textual inversion distillation for quantization example to the latest transformers and diffusers packages (0ec83b1)
- Update NeuralChat Notebooks (83bb65a, 629b9d4)
Bug Fixing
- Fix QBits actshuf buffer overflow under large batch sizes (a6f3ab3)
- Fix TPP support for single socket (a690072)
- Fix retrieval dependency (281b0a3)
- Fix loading issue of WOQ model with parameters (37f9db25)
Validated Configurations
- Python 3.10
- Ubuntu 22.04
- PyTorch 2.2.0+cpu
- Intel® Extension for PyTorch 2.2.0+cpu
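
To compare a local environment against the validated configuration, a small standard-library check along these lines may help. It is a sketch only: the pip distribution names `torch` and `intel-extension-for-pytorch` are assumptions about how the packages were installed, the helper names are hypothetical, and the operating system is not checked.

```python
import sys
from importlib import metadata

# Validated versions from the list above.
# Assumption: packages were installed under these pip distribution names.
VALIDATED = {
    "torch": "2.2.0+cpu",
    "intel-extension-for-pytorch": "2.2.0+cpu",
}
VALIDATED_PYTHON = "3.10"

def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None

def compare(found, expected):
    """Classify a found version against the validated one."""
    if found is None:
        return "missing"
    return "match" if found == expected else "differs"

def report():
    """Collect (name, found, expected, status) rows for Python and each package."""
    py = f"{sys.version_info.major}.{sys.version_info.minor}"
    rows = [("python", py, VALIDATED_PYTHON, compare(py, VALIDATED_PYTHON))]
    for name, expected in VALIDATED.items():
        found = installed_version(name)
        rows.append((name, found, expected, compare(found, expected)))
    return rows

for name, found, expected, status in report():
    print(f"{name}: found={found} validated={expected} -> {status}")
```

A "differs" result does not necessarily mean the release will not work; it only indicates the environment is outside what was validated for v1.4.1.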