Release date: 2021-12-08 08:22:39
Latest microsoft/onnxruntime release: v1.19.2 (2024-09-05 03:33:14)
- As noted in the deprecation notice in ORT 1.9, InferenceSession now requires the providers parameter to be set when enabling Execution Providers other than the default CPUExecutionProvider, e.g. InferenceSession('model.onnx', providers=['CUDAExecutionProvider'])
- Python 3.6 support removed for Mac builds. Since Python 3.6 reaches end-of-life in December 2021, it will no longer be supported from the next release (ORT 1.11) onwards
- Removed dependency on optional-lite
- Removed experimental Featurizers code
- Support for plug-in custom thread creation and join functions to enable usage of external threads
- Optional type support from op set 15
- Introduced an indirect convolution method for QLinearConv with a symmetrically quantized filter, i.e., the filter type is int8 and the filter zero point is 0. The method uses an indirect buffer instead of memcpy'ing the original data, and does not need to compute the sum for each pixel of the output image for quantized Conv.
- X64: new kernels, including AVX2, AVX-VNNI, AVX512 and AVX512-VNNI, for general and depthwise quantized Conv.
- ARM64: new kernels for depthwise quantized Conv.
- Tensor shape optimization to avoid allocating heap memory in most cases - #9542
- Added transpose optimizer to push and cancel transpose ops, significantly improving perf for models requiring layout transformation
- Python
- Following through on the deprecation notice in ORT 1.9, InferenceSession now requires the providers parameter to be set when enabling Execution Providers other than the default CPUExecutionProvider, e.g. InferenceSession('model.onnx', providers=['CUDAExecutionProvider'])
- C/C++
- New API to query CUDA stream to launch a custom kernel for scenarios where custom ops compiled into shared libraries need implicit synchronization with ORT CUDA kernels - #9141
- Renamed Invalid to OrtInvalidAllocator
- Updated every item in OrtCudnnConvAlgoSearch to a safer global name
- WinML
- New APIs to create OrtValues from Windows platform specific ID3D12Resources by exposing DirectML Execution Provider specific APIs. These APIs allow DML to extend the C-API and provide EP specific extensions.
- OrtSessionOptionsAppendExecutionProviderEx_DML
- DmlCreateGPUAllocationFromD3DResource
- DmlFreeGPUAllocation
- DmlGetD3D12ResourceFromAllocation
- Bug fix: LearningModel::LoadFromFilePath in UWP apps
- Added Mac M1 Universal2 build support for a single binary that runs natively on both Apple silicon and Intel-based Macs. These binaries are included in the official NuGet packages. (build instructions)
- Windows C API symbols are now uploaded to the Microsoft symbol server
- NuGet package now supports ARM64 Linux C#
- Python GPU package now includes both TensorRT and CUDA EPs. Note: EPs need to be explicitly registered to ensure the correct provider is used. e.g. InferenceSession('model.onnx', providers=['TensorrtExecutionProvider', 'CUDAExecutionProvider']). Please also ensure you have appropriate TensorRT dependencies and CUDA dependencies installed.
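Since providers must now be registered explicitly, one way to build the providers list is to filter a preference order against what the installed package reports. The following is a minimal sketch, not an official recipe: the helper name pick_providers, the PREFERRED ordering and the MODEL_PATH placeholder are assumptions made for illustration, while get_available_providers, InferenceSession and get_providers are the onnxruntime Python calls used.

```python
import os

# Illustrative preference order (an assumption, not an ORT default):
# try TensorRT first, then CUDA, then fall back to CPU.
PREFERRED = (
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
)

# Placeholder model path, matching the example used in the release notes.
MODEL_PATH = "model.onnx"


def pick_providers(available, preferred=PREFERRED):
    """Return the preferred providers that are available, in priority order."""
    return [name for name in preferred if name in available]


if __name__ == "__main__":
    try:
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed")
    else:
        providers = pick_providers(ort.get_available_providers())
        print("providers to request:", providers)
        if os.path.exists(MODEL_PATH):
            # Pass the list explicitly, as required from ORT 1.10 onwards.
            session = ort.InferenceSession(MODEL_PATH, providers=providers)
            print("providers in use:", session.get_providers())
```

Checking session.get_providers() after creating the session is a simple way to confirm which providers were actually registered, for example when TensorRT or CUDA dependencies are missing.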
Execution Providers
- TensorRT EP
- Python GPU release packages now include support for TensorRT 8.0. Enable TensorrtExecutionProvider by explicitly setting providers parameter when creating an InferenceSession. e.g. InferenceSession('model.onnx', providers=['TensorrtExecutionProvider', 'CUDAExecutionProvider'])
- Published quantized BERT model example
- Add support for OpenVINO 2021.4.x
- Auto Plugin support
- IO Buffer/Copy Avoidance Optimizations for GPU plugin
- Misc fixes
- Add Softmaxgrad op
- Add Transpose, Reshape, Pow and LeakyRelu ops
- Add DynamicQuantizeLinear op
- Add squeeze/unsqueeze ops
- DirectML EP
- Added Xamarin support to the ORT C# NuGet packages
- Updated target frameworks in native package
- iOS and Android binaries now included in native package
- ORT format models now have backwards compatibility guarantee
- Support WebAssembly SIMD for qgemm kernel to accelerate the performance of quantized models
- Upgraded existing WebGL kernels to the latest opset
- Optimized bundle size to support various production scenarios, such as WebAssembly only or WebGL only
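The QLinearConv item in the performance notes says that a symmetrically quantized filter (int8, zero point 0) removes the need for a per-pixel sum. The arithmetic behind that kind of saving can be sketched in plain Python. This is an illustration of the general zero-point algebra only, not ORT's kernel code; the function names and sample values are made up for the example.

```python
def dot_reference(x, w, x_zero, w_zero):
    """Dot product of dequantized integers: sum((x - x_zero) * (w - w_zero))."""
    return sum((xi - x_zero) * (wi - w_zero) for xi, wi in zip(x, w))


def dot_symmetric(x, w, x_zero, w_sum):
    """Same result when the filter zero point is 0.

    Expanding the reference form gives
        sum(x*w) - w_zero*sum(x) - x_zero*sum(w) + K*x_zero*w_zero.
    With w_zero == 0, the terms that depend on sum(x), which changes for
    every output pixel, vanish. Only x_zero*sum(w) remains, and sum(w) can
    be precomputed once per filter.
    """
    raw = sum(xi * wi for xi, wi in zip(x, w))
    return raw - x_zero * w_sum


# Made-up sample data: uint8 activations and int8 weights.
x = [12, 200, 37, 0]
w = [-3, 5, 127, -128]
x_zero = 128
w_sum = sum(w)  # computed once per filter, reused for every output pixel

print(dot_reference(x, w, x_zero, 0) == dot_symmetric(x, w, x_zero, w_sum))
```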
Contributors to ONNX Runtime include members across teams at Microsoft, along with our community members: snnn, gineshidalgo99, fs-eire, gwang-msft, edgchen1, hariharans29, skottmckay, jeffdaily, baijumeswani, fdwr, smk2007, suffiank, souptc, RyanUnderhill, iK1D, yuslepukhin, chilo-ms, satyajandhyala, hanbitmyths, thiagocrepaldi, wschin, tianleiwu, pengwa, xadupre, zhanghuanrong, SherlockNoMad, wangyems, RandySheriffH, ashbhandare, tiagoshibata, yufenglee, mindest, sumitsays, MaajidKhan, gramalingam, tracysh, georgen117, jywu-msft, sfatimar, martinb35, nkreeger, ytaous, ashari4, stevenlix, chandru-r, jingyanwangms, mosdav, raviskolli, faxu, liqunfu, kit1980, weixingzhang, pranavsharma, jcwchen, chenfucn, BowenBao, jeffbloo
1. Microsoft.AI.MachineLearning.1.10.0.symbols.zip (181.42 MB)
2. Microsoft.AI.MachineLearning.1.10.0.zip (43.16 MB)
3. Microsoft.ML.OnnxRuntime.DirectML.1.10.0.zip (148.31 MB)
4. onnxruntime-linux-aarch64-1.10.0.tgz (4.69 MB)
5. onnxruntime-linux-x64-1.10.0.tgz (5.17 MB)
6. onnxruntime-linux-x64-gpu-1.10.0.tgz (99.41 MB)
7. onnxruntime-osx-arm64-1.10.0.tgz (4.52 MB)
8. onnxruntime-osx-universal2-1.10.0.tgz (9.56 MB)
9. onnxruntime-osx-x86_64-1.10.0.tgz (5.14 MB)
10. onnxruntime-win-arm-1.10.0.zip (30.44 MB)
11. onnxruntime-win-arm64-1.10.0.zip (32.59 MB)
12. onnxruntime-win-x64-1.10.0.zip (32.9 MB)
13. onnxruntime-win-x64-gpu-1.10.0.zip (134.15 MB)
14. onnxruntime-win-x86-1.10.0.zip (31.98 MB)