v1.5.0
Released: 2023-06-07 13:18:20
Latest neuralmagic/deepsparse release: v1.8.0 (2024-07-20 02:18:07)
New Features:
- ONNX evaluation pipeline for OpenPifPaf (#915)
- YOLOv8 segmentation pipelines and validation (#924)
- deepsparse.benchmark_sweep CLI to enable sweeps of benchmarks across different settings such as cores and batch sizes (#860)
- Engine.generate_random_inputs() API (#966)
- Example data logging configurations for pipelines/server (#867)
- Expanded built-in functions for NLP and CV pipeline logging to enable better monitoring (#865) (#862)
- Product usage analytics tracking in DeepSparse Community edition (documentation)
Performance Improvements:
- Inference latency for unstructured sparse-quantized CNNs has been improved by up to 2x.
- Inference throughput and latency for dense CNNs have been improved by up to 20%.
- Inference throughput and latency for dense transformers have been improved by up to 30%.
- The following operators are now supported for performance:
  - Neg, Unsqueeze with non-constant inputs
  - MatMulInteger with two non-constant inputs
  - GEMM with constant weights and 4D or 5D inputs
Changes:
- Transformers and YOLOv5 integrations migrated from auto install to install from PyPI packages. Going forward, pip install deepsparse[transformers] and pip install deepsparse[yolov5] will need to be used.
- DeepSparse now uses hwloc to determine CPU topology. This fixes a bug where DeepSparse could not be used performantly inside of a Kubernetes cluster with a static CPU manager policy.
- When users pass in a num_streams parameter that is smaller than the number of cores, multi-stream and elastic scheduler behaviors have been improved. Previously, DeepSparse would divide the system into num_streams chunks and fill each chunk until it ran out of threads. Now, each stream will use a number of threads equal to num_cores divided by num_streams, with the remainder distributed in a round-robin fashion.
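The new num_streams behavior can be illustrated with a small sketch. This is not DeepSparse's scheduler code: the function name threads_per_stream is hypothetical, and the assumption that the leftover threads go to the lowest-numbered streams first is only one reading of "round-robin".

```python
from typing import List


def threads_per_stream(num_cores: int, num_streams: int) -> List[int]:
    """Hypothetical helper: split num_cores threads across num_streams streams.

    Each stream gets num_cores // num_streams threads, and the remainder is
    handed out one thread at a time, starting from stream 0 (an assumption).
    """
    if num_cores <= 0 or num_streams <= 0:
        raise ValueError("num_cores and num_streams must be positive")
    if num_streams > num_cores:
        # The release note only describes num_streams smaller than the core count.
        raise ValueError("num_streams must not exceed num_cores")

    base, remainder = divmod(num_cores, num_streams)
    counts = [base] * num_streams
    # Distribute the leftover threads round-robin, one per stream.
    for i in range(remainder):
        counts[i % num_streams] += 1
    return counts


print(threads_per_stream(10, 4))  # [3, 3, 2, 2]
```

With 10 cores and 4 streams, every stream gets 2 threads and the 2 leftover threads go to the first two streams, so no core is left idle and no stream is starved.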
Resolved Issues:
- In networks with a Clip operator where min isn't equal to zero, performance bugs no longer occur.
- Crashing eliminated:
  - Pipeline conll eval using ignore_labels. (#903)
  - YOLOv8 pipelines handling models with dynamic inputs. (#967)
  - QA pipelines with sequence lengths equal to or less than 128. (#889)
  - Image classification pipelines handling PNG images. (#870)
  - ONNX overriding of shapes if a list was not passed in; this now automatically wraps in a list. (#914)
- Assertion errors/failures removed:
  - Networks with both Convolutions and GEMM operations.
  - YOLOv8 model compilation.
  - Slice and Unsqueeze operators with a negative axis.
  - OPT models involving a constant tensor that is broadcast in two different ways.
Known Issues:
- None
1. deepsparse-1.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl 39.95MB
2. deepsparse-1.5.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl 39.96MB
3. deepsparse-1.5.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl 39.95MB
4. deepsparse-1.5.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl 39.95MB
5. deepsparse-1.5.0.tar.gz 39.59MB
6. deepsparse-ent-1.5.0.tar.gz 41.15MB
7. deepsparse-ent_api_demo.tar.gz 72.92MB
8. deepsparse_api_demo.tar.gz 71.34MB
9. deepsparse_ent-1.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl 41.52MB
10. deepsparse_ent-1.5.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl 41.53MB
11. deepsparse_ent-1.5.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl 41.52MB
12. deepsparse_ent-1.5.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl 41.52MB