8.2.1
Release date: 2021-11-25 02:19:00
Latest NVIDIA/TensorRT release: v10.3.0 (2024-08-09 07:23:49)
TensorRT OSS release corresponding to TensorRT 8.2.1.8 GA release.
Updates since TensorRT 8.2.0 EA release.
Please refer to the TensorRT 8.2.1 GA release notes for more information.
ONNX parser v8.2.1
- Removed duplicate constant layer checks that caused some performance regressions
- Fixed expand dynamic shape calculations
- Added parser-side checks for Scatter layer support
Sample updates
- Added TensorFlow Object Detection API converter samples, including Single Shot Detector, Faster R-CNN and Mask R-CNN models
- Multiple enhancements in HuggingFace transformer demos
  - Added multi-batch support
  - Fixed resultant performance regression in batchsize=1
  - Fixed T5 large/T5-3B accuracy issues
  - Added notebooks for T5 and GPT-2
  - Added CPU benchmarking option
- Deprecated kSTRICT_TYPES (strict type constraints). Equivalent behaviour is now achieved by setting PREFER_PRECISION_CONSTRAINTS, DIRECT_IO, and REJECT_EMPTY_ALGORITHMS
- Removed sampleMovieLens
- Renamed sampleReformatFreeIO to sampleIOFormats
- Add idleTime option for samples to control qps
- Specify default value for precisionConstraints
- Fixed reporting of TensorRT build version in trtexec
- Fixed combineDescriptions typo in trtexec/tracer.py
- Fixed usages of kDIRECT_IO
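As a rough illustration of the kSTRICT_TYPES deprecation noted in the list above, the sketch below sets the three replacement flags on a builder config through the TensorRT Python API. The helper names set_strict_types_equivalent and make_config are hypothetical and not part of TensorRT. The flag names come from the note, and the usage function assumes a TensorRT 8.2 or later Python package in which trt.BuilderFlag exposes them.

```python
# Replacement flags named in the release note for the deprecated kSTRICT_TYPES.
REPLACEMENT_FLAGS = (
    "PREFER_PRECISION_CONSTRAINTS",
    "DIRECT_IO",
    "REJECT_EMPTY_ALGORITHMS",
)


def set_strict_types_equivalent(config, builder_flag_enum):
    """Hypothetical helper: set each replacement flag on a builder config.

    `config` is expected to behave like tensorrt.IBuilderConfig and
    `builder_flag_enum` like tensorrt.BuilderFlag.
    """
    applied = []
    for name in REPLACEMENT_FLAGS:
        config.set_flag(getattr(builder_flag_enum, name))
        applied.append(name)
    return applied


def make_config():
    """Usage sketch; assumes the tensorrt Python package (8.2+) is installed."""
    import tensorrt as trt  # imported lazily so the helper has no hard dependency

    builder = trt.Builder(trt.Logger(trt.Logger.WARNING))
    config = builder.create_builder_config()
    set_strict_types_equivalent(config, trt.BuilderFlag)
    return config
```

Passing the enum in as an argument keeps the helper importable on machines without TensorRT; only make_config touches the real library.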
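The idleTime entry in the sample updates can be made concrete with some back-of-the-envelope arithmetic. Assuming a single stream of back-to-back iterations, each taking a fixed latency and followed by the configured idle time (both in milliseconds), throughput is bounded by 1000 / (latency + idle). The function names and the fixed-latency model below are illustrative assumptions, not code from the samples.

```python
def approx_qps(latency_ms, idle_ms=0.0):
    """Approximate queries per second for serial iterations with an idle gap."""
    period_ms = latency_ms + idle_ms
    if period_ms <= 0:
        raise ValueError("latency_ms + idle_ms must be positive")
    return 1000.0 / period_ms


def idle_for_target_qps(latency_ms, target_qps):
    """Idle time (ms) needed to throttle down to target_qps under the same model.

    Returns 0.0 when the target is faster than the latency alone allows.
    """
    if target_qps <= 0:
        raise ValueError("target_qps must be positive")
    idle_ms = 1000.0 / target_qps - latency_ms
    return max(idle_ms, 0.0)
```

Under these assumptions, a 5 ms inference followed by 15 ms of idle time works out to about 50 qps, and throttling that same 5 ms inference to 50 qps would call for roughly 15 ms of idle time.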
Plugin updates
- EfficientNMS plugin support extended to TF-TRT, and for clang builds.
- Sanitize header definitions for BERT fused MHA plugin
- Separate C++ and cu files in splitPlugin to avoid PTX generation (required for CUDA enhanced compatibility support)
- Enable C++14 build for plugins
ONNX tooling updates
- onnx-graphsurgeon upgraded to v0.3.14
- Polygraphy upgraded to v0.33.2
- pytorch-quantization toolkit upgraded to v2.1.2
Build and container fixes
- Add SM86 target to default GPU_ARCHS for platforms with cuda-11.1+
- Remove deprecated SM_35 and add SM_60 to default GPU_ARCHS
- Skip CUB builds for cuda 11.0+ #1455
- Fixed cuda-10.2 container build failures in Ubuntu 20.04
- Add native ARM server build container
- Install devtoolset-8 for updated g++ version in CentOS7
- Added a note on supporting C++14 builds for CentOS7
- Fixed docker build for large UIDs #1373
- Updated README instructions for JetPack builds
Demo enhancements
- Updated Tacotron2 instructions and added CPU benchmarking
- Fixed issues in demoBERT Python notebook
Documentation updates
- Updated Python documentation for add_reduce, add_top_k, and ISoftMaxLayer
- Renamed default GitHub branch to main and updated hyperlinks