v1.17.3
Release date: 2024-04-18 23:46:48
Latest microsoft/onnxruntime release: v1.19.2 (2024-09-05 03:33:14)
What's new?
General:
- Update copying of API header files so the Linux logic is consistent with Windows (#19736) - @mszhanyi
- Pin ONNX version to fix DML and Python packaging pipeline exceptions (#20073) - @mszhanyi
Build System & Packages:
- Fix a bug in the minimal build with training APIs enabled that affected the Apple framework (#19858) - @edgchen1
Core:
- Fix a bug in the SplitToSequence op with string tensors (#19942) - @Craigacp
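The fixed path can be exercised with a one-node model that runs SplitToSequence over a string tensor. A minimal sketch, assuming the standard `onnx` and `onnxruntime` Python packages; the model below is a hypothetical illustration of the op, not code from the fix itself:

```python
import numpy as np
import onnx
from onnx import TensorProto, helper
import onnxruntime as ort

# Build a one-node model: SplitToSequence over a 1-D string tensor.
node = helper.make_node("SplitToSequence", ["x"], ["seq"], axis=0, keepdims=0)
graph = helper.make_graph(
    [node],
    "split_strings",
    [helper.make_tensor_value_info("x", TensorProto.STRING, [4])],
    [helper.make_sequence_value_info("seq", TensorProto.STRING, None)],
)
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 17)])
model.ir_version = 8  # keep the IR version within what ORT 1.17.x accepts

sess = ort.InferenceSession(model.SerializeToString(), providers=["CPUExecutionProvider"])
# String tensors are passed to ORT as numpy object arrays.
outputs = sess.run(None, {"x": np.array(["a", "b", "c", "d"], dtype=object)})
print(outputs[0])  # the resulting sequence comes back as a Python list of string tensors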
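```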
CUDA EP:
- Fix onnxruntime_test_all build break with CUDA (#19673) - @gedoensmax
- Fix broken pooling CUDA NHWC ops and ensure NCHW / NHWC parity (#19889) - @mtavenrath
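For reference, NHWC kernels on the CUDA EP are selected through a provider option. A minimal sketch, assuming a CUDA-enabled `onnxruntime-gpu` build; `prefer_nhwc` is assumed here to be the option name controlling the NHWC kernel variants and should be verified against your version:

```python
import onnxruntime as ort

# Select the CUDA EP and (optionally) prefer the NHWC kernel variants.
# "prefer_nhwc" is assumed to be the CUDA EP option for NHWC ops; confirm
# the exact option name for your onnxruntime-gpu version. "model.onnx" is a placeholder.
providers = [
    ("CUDAExecutionProvider", {"device_id": 0, "prefer_nhwc": "1"}),
    "CPUExecutionProvider",
]
sess = ort.InferenceSession("model.onnx", providers=providers)
```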
TensorRT EP:
- Fix TensorRT build break caused by image update (#19880) - @jywu-msft
- Fix TensorRT custom op list concurrency bug (#20093) - @chilo-ms
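The concurrency fix concerns the custom op list the TensorRT EP builds from plugin libraries. A minimal sketch of registering such libraries from Python, assuming `trt_extra_plugin_lib_paths` is the relevant provider option (verify against the TensorRT EP documentation for your version); the model and plugin paths are placeholders:

```python
import onnxruntime as ort

# Point the TensorRT EP at external plugin libraries so their custom ops are
# registered alongside the built-in ones. The option name and paths below are
# illustrative assumptions, not values taken from the fix.
providers = [
    ("TensorrtExecutionProvider", {"trt_extra_plugin_lib_paths": "/path/to/libcustom_plugins.so"}),
    ("CUDAExecutionProvider", {}),
    "CPUExecutionProvider",
]
sess = ort.InferenceSession("model_with_trt_plugins.onnx", providers=providers)
```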
Web:
- Add hardSigmoid op support and hardSigmoid activation for fusedConv (#19215, #19233) - @qjia7
- Add support for WebNN async API with Asyncify (#19415) - @Honry
- Add uniform support for conv, conv transpose, conv grouped, and fp16 (#18753, #19098) - @axinging
- Add capture and replay support for JS EP (#18989) - @fs-eire
- Add LeakyRelu activation for fusedConv (#19369) - @qjia7
- Add FastGelu custom op support (#19392) - @fs-eire
- Allow uint8 tensors for WebGPU (#19545) - @satyajandhyala
- Add and optimize MatMulNBits (#19852) - @satyajandhyala
- Enable ort-web with any Float16Array polyfill (#19305) - @fs-eire
- Allow multiple EPs to be specified in backend resolve logic (#19735) - @fs-eire
- Various bug fixes: (#19258) - @gyagp, (#19201, #19554) - @hujiajie, (#19262, #19981) - @guschmue, (#19581, #19596, #19387) - @axinging, (#19613) - @satyajandhyala
- Various improvements for performance and usability: (#19202) - @qjia7, (#18900, #19281, #18883) - @axinging, (#18788, #19737) - @satyajandhyala, (#19610) - @segevfiner, (#19614, #19702, #19677, #19857, #19940) - @fs-eire, (#19791) - @gyagp, (#19868) - @guschmue, (#19433) - @martholomew, (#19932) - @ibelem
Windows:
- Fix Windows memory mapping bug affecting some larger models (#19623) - @yufenglee
Kernel Optimizations:
- Fix GQA and Rotary Embedding bugs affecting some models (#19801, #19874) - @aciddelgado
- Update replacement of MultiHeadAttention (MHA) and GroupQueryAttention (GQA) (#19882) - @kunal-vaishnavi
- Add support for packed QKV input and Rotary Embedding with sm<80 using Memory Efficient Attention kernel (#20012) - @aciddelgado
Models:
- Add support for benchmarking LLaMA model end-to-end performance (#19985, #20033, #20149) - @kunal-vaishnavi
- Add example to demonstrate export of the OpenAI Whisper implementation with batched prompts (#19854) - @shubhambhokare1
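A sketch of driving the Whisper export helper that ships with the onnxruntime Python package. The module path (`onnxruntime.transformers.models.whisper.convert_to_onnx`) exists in the package, but the flags shown are assumptions; run the module with `--help` to confirm the current options, and note that the batched-prompts example added here lives in the repository rather than in this snippet:

```python
import subprocess
import sys

# Invoke the Whisper export script as a module. The -m (Hugging Face model id)
# and --output (target directory) flags are assumed; check --help before use.
subprocess.run(
    [
        sys.executable, "-m",
        "onnxruntime.transformers.models.whisper.convert_to_onnx",
        "-m", "openai/whisper-tiny",
        "--output", "whisper_onnx",
    ],
    check=True,
)
```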
This patch release also includes additional fixes by @spampana95 and @enximi. Big thank you to all our contributors!
Assets:
1. Microsoft.AI.MachineLearning.1.17.3.nupkg 29.65MB
2. Microsoft.AI.MachineLearning.1.17.3.snupkg 226.76MB
3. Microsoft.ML.OnnxRuntime.DirectML.1.17.3.zip 5.18MB
4. Microsoft.ML.OnnxRuntime.Managed.1.17.3.nupkg 634.61KB
5. onnxruntime-linux-aarch64-1.17.3.tgz 4.7MB
6. onnxruntime-linux-x64-1.17.3.tgz 5.53MB
7. onnxruntime-linux-x64-gpu-1.17.3.tgz 162.91MB
8. onnxruntime-linux-x64-gpu-cuda12-1.17.3.tgz 163.61MB
9. onnxruntime-linux-x64-rocm-1.17.3.tgz 104.79MB
10. onnxruntime-osx-arm64-1.17.3.tgz 6.92MB
11. onnxruntime-osx-universal2-1.17.3.tgz 14.68MB
12. onnxruntime-osx-x86_64-1.17.3.tgz 7.89MB
13. onnxruntime-training-linux-aarch64-1.17.3.tgz 5.05MB
14. onnxruntime-training-linux-x64-1.17.3.tgz 5.95MB
15. onnxruntime-training-win-arm-1.17.3.zip 59.38MB
16. onnxruntime-training-win-arm64-1.17.3.zip 62.84MB
17. onnxruntime-training-win-x64-1.17.3.zip 63.29MB
18. onnxruntime-training-win-x86-1.17.3.zip 62.42MB
19. onnxruntime-win-arm64-1.17.3.zip 57.87MB
20. onnxruntime-win-x64-1.17.3.zip 59.16MB
21. onnxruntime-win-x64-gpu-1.17.3.zip 191.83MB
22. onnxruntime-win-x64-gpu-cuda12-1.17.3.zip 192.46MB