ONNX Runtime v1.19.2
Announcements
- ORT 1.19.2 is a small patch release that repairs some broken workflows and includes several bug fixes.
Build System & Packages
- Fixed the signing of native DLLs.
- Disabled absl symbolize in Windows Release build to avoid dependency on dbghelp.dll.
Training
- Restored support for CUDA compute capability 7.0 and 7.5 with CUDA 12, and 6.0 and 6.1 with CUDA 11.
- Several fixes for training CI pipelines.
Mobile
- Fixed a nullptr Node access in ArgMaxOpBuilder::AddToModelBuilderImpl() for the CoreML EP.
Generative AI
- Added CUDA kernel for Phi3 MoE.
- Added smooth softmax support in CUDA and CPU kernels for the GroupQueryAttention operator.
- Fixed the number-of-splits calculation in the GroupQueryAttention CUDA operator.
- Enabled causal support in the MultiHeadAttention CUDA operator.
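
For readers unfamiliar with the term, the sketch below illustrates one common reading of "smooth softmax": a softmax whose denominator includes one extra implicit logit fixed at zero, so the outputs can sum to less than 1. This is an assumption about the term, not a description of the actual CUDA or CPU kernels; the function names `smooth_softmax` and `softmax` are hypothetical and are not part of the ONNX Runtime API.

```python
import math


def softmax(logits):
    """Standard softmax over a list of floats (for comparison)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def smooth_softmax(logits):
    """Softmax with an implicit extra zero logit in the denominator.

    Assumed definition: exp(x_i) / (1 + sum_j exp(x_j)).
    """
    # Shift by max(0, max(logits)) for numerical stability; the 0 stands
    # for the implicit extra logit, whose shifted term is exp(-m).
    m = max(0.0, max(logits))
    exps = [math.exp(x - m) for x in logits]
    denom = math.exp(-m) + sum(exps)
    return [e / denom for e in exps]


if __name__ == "__main__":
    scores = [1.0, 2.0, 3.0]
    print("softmax sum:       ", sum(softmax(scores)))
    print("smooth softmax sum:", sum(smooth_softmax(scores)))
```

Under this definition, the result equals a standard softmax over the logits with a zero appended, with the last entry dropped. That lets an attention row assign less than full weight to its keys when all scores are strongly negative.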
Contributors
@prathikr, @mszhanyi, @edgchen1, @tianleiwu, @wangyems, @aciddelgado, @mindest, @snnn, @baijumeswani, @MaanavD
Thanks to everyone who helped ship this release smoothly!
Full Changelog: v1.19.0...v1.19.2