v1.12

intel/neural-compressor

Release date: 2022-05-27 22:40:00

Latest intel/neural-compressor release: v2.6 (2024-06-14 21:55:11)

Features

Quantization
- Support accuracy-aware AMP (INT8/BF16/FP32) on PyTorch
- Improve post-training quantization (static & dynamic) on PyTorch
- Improve post-training quantization on TensorFlow
- Improve QLinear and QDQ quantization modes on ONNX Runtime
- Improve accuracy-aware AMP (INT8/FP32) on ONNX Runtime
Pruning
- Improve pruning-once-for-all for NLP models
Sparsity
- Support experimental sparse kernel for reference examples
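
To make the idea behind the accuracy-aware AMP items above concrete, here is a minimal sketch of a greedy per-op precision search. It is an illustration only, not the tuning strategy Neural Compressor implements: the function `tune_precisions`, the op names, and the accuracy numbers in `toy_evaluate` are all hypothetical and exist only for this example.

```python
from typing import Callable, Dict, List

Config = Dict[str, str]


def tune_precisions(
    ops: List[str],
    evaluate: Callable[[Config], float],
    tolerance: float,
) -> Config:
    """Greedy sketch: lower each op's precision while accuracy stays in budget.

    Every op starts at fp32. For each op in turn, int8 is tried first, then
    bf16; the first precision that keeps accuracy within `tolerance` of the
    fp32 baseline is kept, otherwise the op stays at fp32.
    """
    config: Config = {op: "fp32" for op in ops}
    baseline = evaluate(config)
    for op in ops:
        for precision in ("int8", "bf16"):
            trial = {**config, op: precision}
            if evaluate(trial) >= baseline - tolerance:
                config = trial
                break
    return config


def toy_evaluate(config: Config) -> float:
    """Hypothetical evaluator: fixed accuracy drop per (op, precision) pair."""
    # Made-up numbers for illustration; not measurements from any real model.
    drops = {
        ("conv1", "int8"): 0.001,
        ("attention", "int8"): 0.020,
        ("attention", "bf16"): 0.002,
        ("classifier", "int8"): 0.003,
        ("softmax", "int8"): 0.050,
        ("softmax", "bf16"): 0.030,
    }
    return 0.90 - sum(drops.get(item, 0.0) for item in config.items())


result = tune_precisions(
    ["conv1", "attention", "classifier", "softmax"],
    toy_evaluate,
    tolerance=0.01,
)
```

With these made-up numbers, `conv1` and `classifier` end up at `int8`, `attention` falls back to `bf16`, and `softmax` stays at `fp32` because neither lower precision fits the 0.01 accuracy budget.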

Productivity

- Support model deployment by loading INT8 models directly from HuggingFace model hub
- Improve GUI with optimized model downloading, performance profiling, etc.

Ecosystem

- Highlight simple quantization usage with a few clicks on ONNX Model Zoo
- Upstream INC quantized models (ResNet101, Tiny YOLOv3) to ONNX Model Zoo

Examples

- Add BERT-mini distillation + quantization notebook example
- Add DLRM & SSD-ResNet34 quantization examples on IPEX
- Improve BERT structured sparsity training example

Validated Configurations

- Python 3.8, 3.9, 3.10
- CentOS 8.3 & Ubuntu 18.04 & Windows 10
- TensorFlow 2.6.2, 2.7, 2.8
- Intel TensorFlow 1.15.0 UP3, 2.7, 2.8
- PyTorch 1.8.0+cpu, 1.9.0+cpu, 1.10.0+cpu
- IPEX 1.8.0, 1.9.0, 1.10.0
- MXNet 1.6.0, 1.7.0, 1.8.0
- ONNX Runtime 1.8.0, 1.9.0, 1.10.0
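
The validated versions above can be turned into a small lookup for checking an environment before upgrading. This is a sketch, not part of Neural Compressor: the table `VALIDATED` copies the versions listed in these notes, while the helper `is_validated` and its prefix-matching rule (a listed `2.7` also accepts `2.7.1`) are assumptions made for this example.

```python
# Versions copied from the "Validated Configurations" list in these notes.
# The keys are display names used by this sketch, not pip package names.
VALIDATED = {
    "Python": ["3.8", "3.9", "3.10"],
    "TensorFlow": ["2.6.2", "2.7", "2.8"],
    "Intel TensorFlow": ["1.15.0 UP3", "2.7", "2.8"],
    "PyTorch": ["1.8.0+cpu", "1.9.0+cpu", "1.10.0+cpu"],
    "IPEX": ["1.8.0", "1.9.0", "1.10.0"],
    "MXNet": ["1.6.0", "1.7.0", "1.8.0"],
    "ONNX Runtime": ["1.8.0", "1.9.0", "1.10.0"],
}


def is_validated(component: str, version: str) -> bool:
    """Return True if `version` matches a validated entry for `component`.

    Assumption of this sketch: a listed entry acts as a prefix on a dot
    boundary, so "3.10" accepts "3.10.4" but rejects "3.1" and "3.100".
    Unknown components are never considered validated.
    """
    for entry in VALIDATED.get(component, []):
        if version == entry or version.startswith(entry + "."):
            return True
    return False
```

For example, `is_validated("Python", platform.python_version())` (with `import platform`) reports whether the running interpreter falls inside the validated range for this release.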
