ONNX Runtime v1.15.0
Announcements
Starting from the next release (ONNX Runtime 1.16.0), we will drop support at the operating system level for:
- iOS 11 and below. iOS 12 will be the minimum supported version.
- CentOS 7, Ubuntu 18.04, and any Linux distro whose glibc version is older than 2.28.
At the compiler level, we will drop support for:
- GCC version <= 9
- Visual Studio 2019
We will also remove the onnxruntime_DISABLE_ABSEIL build option, since the upgraded protobuf version requires abseil.
General
- Added support for ONNX Optional type in C# API
- Added collectives to support multi-GPU inferencing
- Updated the macOS build machines to macOS 12, which comes with Xcode 14.2; Xcode 12.4 is no longer used.
- Added Python 3.11 support (deprecated 3.7; now supporting 3.8-3.11) in packages for onnxruntime CPU, onnxruntime-gpu, onnxruntime-directml, and onnxruntime-training.
- Updated to CUDA 11.8. ONNX Runtime source code is still compatible with CUDA 11.4 and 12.x.
- Dropped the support for Windows 8.1 and below
- Removed eager mode code and the onnxruntime_ENABLE_EAGER_MODE CMake option.
- Upgraded Mimalloc version from 2.0.3 to 2.1.1
- Upgraded protobuf version from 3.18.3 to 21.12
- New dependency: cutlass, which is only used in CUDA/TensorRT packages.
- Upgraded DNNL from 2.7.1 to 3.0
Build System
- On POSIX systems, building the code as the "root" user is now disallowed by default. If needed, append "--allow_running_as_root" to your build command to bypass the check.
- Added support for building the source natively on Windows ARM64 with Visual Studio 2022.
- Added a Gradle wrapper and updated the Gradle version from 6.8.3 to 8.0.1. (Gradle is the tool for building the ORT Java package.)
- When cross-compiling, the build scripts now try to download a prebuilt protoc from GitHub instead of building it from source, because protobuf now has many dependencies and setting up a build environment for it is not easy.
Performance
- Improved string marshalling and reduced GC pressure
- Added a build option to allow using a lock-free queue in the threadpool for improved CPU utilization
- Fixed a CPU memory leak caused by external weights
- Added a fused decoder multi-head attention kernel to improve GPT and decoder models (e.g., T5, Whisper)
- Added packing mode to improve performance of encoder models whose inputs have a large padding ratio
- Improved generation algorithms (BeamSearch, TopSampling, GreedySearch); a usage sketch follows this list
- Improved performance for Stable Diffusion, ViT, GPT, and Whisper models
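As an illustrative sketch (not from the release notes), a GPT-2 model exported with the BeamSearch contrib op, e.g. via the onnxruntime.transformers.convert_generation tool, can be run as shown below; the model path and exact input names are assumptions that depend on how the model was exported:

```python
import numpy as np
import onnxruntime as ort

# Hypothetical model with the BeamSearch contrib op baked into the graph.
session = ort.InferenceSession(
    "gpt2_beam_search.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

# Generation parameters are passed as graph inputs to the BeamSearch op.
outputs = session.run(None, {
    "input_ids": np.array([[464, 3290, 318]], dtype=np.int32),  # tokenized prompt
    "max_length": np.array([32], dtype=np.int32),
    "min_length": np.array([1], dtype=np.int32),
    "num_beams": np.array([4], dtype=np.int32),
    "num_return_sequences": np.array([1], dtype=np.int32),
    "length_penalty": np.array([1.0], dtype=np.float32),
    "repetition_penalty": np.array([1.0], dtype=np.float32),
})
sequences = outputs[0]  # shape: (batch_size, num_return_sequences, max_length)
```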
Execution Providers
Two new execution providers: JS EP and QNN EP.
TensorRT EP
- Official support for TensorRT 8.6
- Explicit shape profile overrides
- Support for TensorRT plugins via ORT custom op
- Improved support for TensorRT options (heuristics, sparsity, optimization level, auxiliary streams, tactic source selection, etc.); a usage sketch follows this list
- Support for TensorRT timing cache
- Improvements to test coverage, specifically for opset 16-17 models and package pipeline unit tests.
- Other misc bugfixes and improvements.
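As a rough sketch of how these options surface in the Python API (the model path and shape strings below are placeholders; check the TensorRT EP documentation for the full option list):

```python
import onnxruntime as ort

trt_options = {
    "trt_timing_cache_enable": True,       # reuse kernel timing info across engine builds
    "trt_build_heuristics_enable": True,   # faster engine builds via heuristics
    "trt_sparsity_enable": True,           # allow sparse tactics where weights permit
    "trt_builder_optimization_level": 3,   # trade engine build time for runtime speed
    "trt_auxiliary_streams": 2,            # extra CUDA streams for potential parallelism
    # Explicit shape profile overrides, one "name:dims" entry per input.
    "trt_profile_min_shapes": "input_ids:1x1",
    "trt_profile_opt_shapes": "input_ids:1x128",
    "trt_profile_max_shapes": "input_ids:8x512",
}

session = ort.InferenceSession(
    "model.onnx",
    providers=[("TensorrtExecutionProvider", trt_options), "CUDAExecutionProvider"],
)
```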
OpenVINO EP
- Support for OpenVINO 2023.0
- Dynamic shapes support for iGPU (see the sketch after this list)
- Changes to OpenVINO backend to improve first inference latency
- Deprecation of HDDL-VADM and Myriad VPU support
- Misc bug fixes.
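A minimal sketch of targeting an iGPU through the Python API; the device_type value shown is an assumption, so pick the device and precision that match your hardware:

```python
import onnxruntime as ort

# "GPU_FP16" runs on the integrated GPU in half precision; with dynamic
# shapes support, inputs no longer need a fixed shape ahead of time.
session = ort.InferenceSession(
    "model.onnx",
    providers=[("OpenVINOExecutionProvider", {"device_type": "GPU_FP16"})],
)
```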
QNN EP
DirectML EP
- Updated to DirectML 1.12; a usage sketch follows this list
- Opset 16-17 support
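A minimal sketch of selecting the DirectML EP from Python; the device_id shown is an assumption and selects the DirectX adapter:

```python
import onnxruntime as ort

# Requires the onnxruntime-directml package; device_id 0 is the default adapter.
session = ort.InferenceSession(
    "model.onnx",
    providers=[("DmlExecutionProvider", {"device_id": 0})],
)
```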
Azure EP
- Added support for OpenAI whisper model
- Available in a NuGet package in addition to Python
Mobile
New packages
- Swift Package Manager for onnxruntime
- NuGet package for onnxruntime-extensions (supports Android/iOS for MAUI/Xamarin)
- React Native package for onnxruntime can optionally include onnxruntime-extensions
Pre/Post processing
- Added support for built-in pre and post processing for NLP scenarios: classification, question-answering, text-prediction
- Added support for built-in pre and post processing for speech recognition (Whisper); see the sketch after this list
- Added support for built-in post processing for object detection (YOLO): non-max suppression, drawing bounding boxes
- Additional CoreML and NNAPI kernels to support customer scenarios
  - NNAPI: BatchNormalization, LRN
  - CoreML: Div, Flatten, LeakyRelu, LRN, Mul, Pad, Pow, Sub
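The pre/post processing operators live in onnxruntime-extensions, so the extensions ops library must be registered with the session before loading a model that embeds them. A minimal sketch, assuming a Whisper model that already has the processing steps baked in (the model file name is a placeholder):

```python
import onnxruntime as ort
from onnxruntime_extensions import get_library_path

# Register the custom ops from onnxruntime-extensions so the session can
# resolve the embedded pre/post processing nodes.
so = ort.SessionOptions()
so.register_custom_ops_library(get_library_path())

session = ort.InferenceSession("whisper_with_pre_post_processing.onnx", sess_options=so)
```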
Web
- [preview] WebGPU support
- Support building the source code with "MinGW make" on Windows.
ORT Training
On-device training:
- Official package for On-Device Training now available. On-device training extends ORT Inference solutions to enable training on edge devices.
- APIs and language bindings supported for C, C++, Python, C#, and Java; a minimal Python sketch follows this list.
- Packages available for Desktop and Android.
- For custom builds, refer to the build instructions.
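A minimal sketch of the Python on-device training loop, assuming the training artifacts (training/eval/optimizer models plus a checkpoint) were generated offline; all file names below are placeholders:

```python
from onnxruntime.training.api import CheckpointState, Module, Optimizer

# Load the offline-generated training artifacts.
state = CheckpointState.load_checkpoint("checkpoint")
module = Module("training_model.onnx", state, "eval_model.onnx")
optimizer = Optimizer("optimizer_model.onnx", module)

module.train()                      # switch to training mode
for inputs, labels in data_loader:  # user-supplied iterable of numpy arrays
    loss = module(inputs, labels)   # forward + backward
    optimizer.step()                # apply gradients
    module.lazy_reset_grad()        # reset gradients before the next step

CheckpointState.save_checkpoint(state, "checkpoint")  # persist updated weights
```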
Others
- Added graph optimizations that leverage the sparsity in the label data to improve performance. With these optimizations, we see performance gains ranging from 4% to 15% for popular HF models over baseline ORT.
- Vision transformer models like ViT, BEIT, and SwinV2 see up to a 44% speedup with ORT Training + DeepSpeed over PyTorch eager mode on AzureML.
- Added optimizations for SOTA models like Dolly and Whisper. ORT Training + DeepSpeed now gives a ~17% speedup for Whisper and a ~4% speedup for Dolly over PyTorch eager mode. Dolly optimizations on the main branch show a ~40% speedup over eager mode.
Known Issues
- The onnxruntime-training 1.15.0 packages published to pypi.org were built in Debug mode instead of Release mode. You can get the correct builds from https://download.onnxruntime.ai/. We will fix the issue in the next patch release.
- The XNNPack EP does not work on x86 CPUs without AVX-512 instructions, because the wrong alignment was used when allocating buffers for XNNPack.
- The CUDA EP source code has a build error when CUDA version <11.6. See #16000.
- The onnxruntime-training builds are missing the training header files.
Contributions
Contributors to ONNX Runtime include members across teams at Microsoft, along with our community members:
snnn, fs-eire, edgchen1, wejoncy, mszhanyi, PeixuanZuo, pengwa, jchen351, cloudhan, tianleiwu, PatriceVignola, wangyems, adrianlizarraga, chenfucn, HectorSVC, baijumeswani, justinchuby, skottmckay, yuslepukhin, RandyShuai, RandySheriffH, natke, YUNQIUGUO, smk2007, jslhcl, chilo-ms, yufenglee, RyanUnderhill, hariharans29, zhanghuanrong, askhade, wschin, jywu-msft, mindest, zhijxu-MS, dependabot[bot], xadupre, liqunfu, nums11, gramalingam, Craigacp, fdwr, shalvamist, jstoecker, yihonglyu, sumitsays, stevenlix, iK1D, pranavsharma, georgen117, sfatimar, MaajidKhan, satyajandhyala, faxu, jcwchen, hanbitmyths, jeffbloo, souptc, ytaous, kunal-vaishnavi
snnn, fs-eire, edgchen1, wejoncy, mszhanyi, PeixuanZuo, pengwa, jchen351, cloudhan, tianleiwu, PatriceVignola, wangyems, adrianlizarraga, chenfucn, HectorSVC, baijumeswani, justinchuby, skottmckay, yuslepukhin, RandyShuai, RandySheriffH, natke, YUNQIUGUO, smk2007, jslhcl, chilo-ms, yufenglee, RyanUnderhill, hariharans29, zhanghuanrong, askhade, wschin, jywu-msft, mindest, zhijxu-MS, dependabot[bot], xadupre, liqunfu, nums11, gramalingam, Craigacp, fdwr, shalvamist, jstoecker, yihonglyu, sumitsays, stevenlix, iK1D, pranavsharma, georgen117, sfatimar, MaajidKhan, satyajandhyala, faxu, jcwchen, hanbitmyths, jeffbloo, souptc, ytaous kunal-vaishnavi