v0.6.0

@baijumeswani baijumeswani released this 14 Feb 18:07
97d44f6

Release Notes

We are excited to announce the release of onnxruntime-genai version 0.6.0. Below are the key updates included in this release:

  1. Support for contextual (continuous) decoding, which lets users carry out multi-turn, conversation-style generation.
  2. Support for new models such as DeepSeek R1, AMD OLMo, IBM Granite, and others.
  3. Python 3.13 wheels are now available.
  4. Support for generation with models sourced from Qualcomm's AI Hub. This work also includes publishing a NuGet package, Microsoft.ML.OnnxRuntimeGenAI.QNN, for the QNN EP.
  5. Support for the WebGPU EP.

This release also includes performance improvements that reduce memory usage and increase speed, along with several bug fixes for issues reported by users.
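
As a rough illustration of the contextual (continuous) decoding listed in item 1, the sketch below reuses one generator across several turns, appending each new prompt to the tokens it already holds. The helper names `run_turn` and `chat` and the `model_dir` argument are hypothetical and not part of the release. The sketch assumes the prompts are already formatted with the model's chat template and that a compatible model folder is available locally.

```python
def run_turn(generator, tokenizer, stream, prompt):
    """Append one prompt to an existing generator and return the generated text."""
    # New tokens are added on top of whatever the generator already holds,
    # so earlier turns stay in context.
    generator.append_tokens(tokenizer.encode(prompt))
    pieces = []
    while not generator.is_done():
        generator.generate_next_token()
        token = generator.get_next_tokens()[0]
        pieces.append(stream.decode(token))
    return "".join(pieces)


def chat(model_dir, prompts, max_length=2048):
    """Hypothetical driver: run several turns against a single generator."""
    # Imported lazily so run_turn can be exercised without the package installed.
    import onnxruntime_genai as og

    model = og.Model(model_dir)
    tokenizer = og.Tokenizer(model)
    stream = tokenizer.create_stream()
    params = og.GeneratorParams(model)
    params.set_search_options(max_length=max_length)
    # The generator is created once and shared by every turn.
    generator = og.Generator(model, params)
    return [run_turn(generator, tokenizer, stream, p) for p in prompts]
```

Calling `chat("path/to/model", [first_prompt, second_prompt])` would return one reply per prompt, with the second reply conditioned on the first exchange. Note that `max_length` bounds the total sequence, so long conversations need a larger value.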