Upcoming Release Roadmap
Faith Xu edited this page Nov 21, 2022 · 8 revisions
Target Release: Early February 2023
- ONNX 1.13 / Opset 18 support
- Performance improvements to the GRU and Slice operators for real-time audio scenarios
- Make the thread pool NUMA-aware
- Multi-stream EP refactoring
- Convert git submodules to CMake FetchContent
  - Building from source will require CMake version >=3.24 instead of >=3.18
- Improved source code release on the GitHub release page, including git submodules
- XNNPACK in Android/iOS mobile packages
- onnxruntime-extensions packages for mobile and web
- ORT Training NuGet packages: CPU & GPU
- Add support for quantization on machines with AMX (e.g., Sapphire Rapids)
- Enhanced performance of transformer models (including Stable Diffusion) on GPU, including improvements to the encoder and decoder and to generation methods: beam search, greedy search, and top-p sampling
- [new] Azure EP (Preview) - supports AzureML hosted models using Triton
- [new] Proxy EP (Preview) - enables custom code to be used as an EP using ORT custom op APIs
- TensorRT 8.5
- FasterTransformer library integration
- ROCm EP GA (5.4)
- Pre/Post processing
  - Support for updating MobileNet and super-resolution models, including use of custom ops for conversion to/from JPG/PNG
  - onnxruntime-extensions packages for Android and iOS with the required custom ops
  - Updated samples repo to demonstrate end-to-end usage with the onnxruntime-extensions package
- XNNPACK
  - More common operators
  - iOS build support
  - Support for using ORT allocator in XNNPACK kernels
  - Support for usage in minimal build
- onnxruntime-extensions included in default build with NLP ops
- XNNPACK GEMM
- Improved exception handling
- Utility functions (e.g., image2tensor, tensor2image)
- On-Device Training support for CustomNLP, available in a new NuGet package
- Optimizations and bug fixes for Hugging Face models
- Stable Diffusion optimizations
- Expose FP16 optimizer in torch-ort
To build a general understanding of ORT, please use the learning roadmap on the wiki home page.