
Commit 0ec2dc7

Added load_config feature details and updated OV version
1 parent ef32705 commit 0ec2dc7


2 files changed (+47, −5 lines)


docs/build/eps.md

Lines changed: 2 additions & 2 deletions
````diff
@@ -277,11 +277,11 @@ See more information on the OpenVINO™ Execution Provider [here](../execution-p
 
 1. Install the OpenVINO™ offline/online installer from the Intel<sup>®</sup> Distribution of OpenVINO™ Toolkit **Release 2024.3** for the appropriate OS and target hardware:
    * [Windows - CPU, GPU, NPU](https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/download.html?PACKAGE=OPENVINO_BASE&VERSION=v_2024_3_0&OP_SYSTEM=WINDOWS&DISTRIBUTION=ARCHIVE).
-   * [Linux - CPU, GPU](https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/download.html?PACKAGE=OPENVINO_BASE&VERSION=v_2024_3_0&OP_SYSTEM=LINUX&DISTRIBUTION=ARCHIVE)
+   * [Linux - CPU, GPU, NPU](https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/download.html?PACKAGE=OPENVINO_BASE&VERSION=v_2024_3_0&OP_SYSTEM=LINUX&DISTRIBUTION=ARCHIVE)
 
    Follow the [documentation](https://docs.openvino.ai/2024/home.html) for detailed instructions.
 
-   *2024.3 is the current recommended OpenVINO™ version. [OpenVINO™ 2023.3](https://docs.openvino.ai/2023.3/home.html) is the minimal OpenVINO™ version requirement.*
+   *2024.5 is the current recommended OpenVINO™ version. [OpenVINO™ 2024.5](https://docs.openvino.ai/2024/index.html) is the minimal required OpenVINO™ version.*
 
 2. Configure the target hardware with the specific follow-on instructions:
    * To configure Intel<sup>®</sup> Processor Graphics (GPU), follow these instructions: [Windows](https://docs.openvino.ai/2024/get-started/configurations/configurations-intel-gpu.html#windows), [Linux](https://docs.openvino.ai/2024/get-started/configurations/configurations-intel-gpu.html#linux)
````

docs/execution-providers/OpenVINO-ExecutionProvider.md

Lines changed: 45 additions & 3 deletions
````diff
@@ -20,7 +20,7 @@ Accelerate ONNX models on Intel CPUs, GPUs, NPU with Intel OpenVINO™ Execution
 ## Install
 
 Pre-built packages and Docker images are published for OpenVINO™ Execution Provider for ONNX Runtime by Intel for each release.
-* OpenVINO™ Execution Provider for ONNX Runtime Release page: [Latest v5.4 Release](https://github.com/intel/onnxruntime/releases)
+* OpenVINO™ Execution Provider for ONNX Runtime Release page: [Latest v5.6 Release](https://github.com/intel/onnxruntime/releases)
 * Python wheels Ubuntu/Windows: [onnxruntime-openvino](https://pypi.org/project/onnxruntime-openvino/)
 * Docker image: [openvino/onnxruntime_ep_ubuntu20](https://hub.docker.com/r/openvino/onnxruntime_ep_ubuntu20)
````

````diff
@@ -230,6 +230,46 @@ Refer to [Session Options](https://github.com/microsoft/onnxruntime/blob/main/in
 Optimizes ORT quantized models for the NPU device to only keep QDQs for supported ops and optimize for performance and accuracy. Generally this feature gives better performance/accuracy with ORT optimizations disabled.
 Refer to [Configuration Options](#configuration-options) for more information about using these runtime options.
 
+### Loading Custom JSON OV Config During Runtime
+This feature allows OVEP parameters to be loaded from a single JSON configuration file.
+The JSON input must follow this schema:
+```json
+{
+   "DEVICE_KEY": {"PROPERTY": "PROPERTY_VALUE"}
+}
+```
+where "DEVICE_KEY" can be CPU, NPU, or GPU, "PROPERTY" must be a valid property defined in OpenVINO's properties.hpp headers, and "PROPERTY_VALUE" must be passed as a string. Passing any other type, such as an int or bool, produces an ORT error like the following:
+
+Exception during initialization: [json.exception.type_error.302] type must be string, but is a number.
+
+Integer and boolean values can still be set by expressing them as strings, e.g. "NPU_TILES": "2", which is valid (refer to the example below).
+Unrecognized keys are skipped with a warning, while an invalid value assigned to a valid key raises an exception from the OpenVINO framework.
+
+Valid properties are of two types: MUTABLE (read/write) and IMMUTABLE (read-only). This distinction is enforced when properties are set; an attempt to set an IMMUTABLE property is skipped with a similar warning.
+
+Example:
+
+This functionality can be exercised with the onnxruntime_perf_test application as follows:
+
+```shell
+onnxruntime_perf_test.exe -e openvino -m times -r 1 -i "device_type|NPU load_config|npu_config.json" model.onnx
+```
+where the npu_config.json file is defined as follows:
+
+```json
+{
+   "NPU": {
+      "PERFORMANCE_HINT": "THROUGHPUT",
+      "WORKLOAD_TYPE": "Efficient",
+      "NPU_TILES": "2",
+      "LOG_LEVEL": "LOG_DEBUG",
+      "NPU_COMPILATION_MODE_PARAMS": "enable-weights-swizzling=false enable-activation-swizzling=false enable-grouped-matmul=false"
+   }
+}
+
+```
+To enable logging explicitly, set "LOG_LEVEL": "LOG_DEBUG" in the JSON device configuration property. The log verifies that the correct device parameters and properties are set and populated during runtime with OVEP.
+
 ### OpenVINO Execution Provider Supports EP-Weight Sharing across sessions
 The OpenVINO Execution Provider (OVEP) in ONNX Runtime supports EP-Weight Sharing, enabling models to efficiently share weights across multiple inference sessions. This feature enhances the execution of Large Language Models (LLMs) with prefill and KV cache, reducing memory consumption and improving performance when running multiple inferences.
 
````
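The schema rules described in the added section can be sketched as a small validator. This is purely illustrative: `validate_load_config` and its warning text are hypothetical stand-ins for checks OVEP performs internally in C++ (string-only values, a fixed set of device keys, unknown keys skipped with a warning).

```python
import json

# Device keys the documented schema accepts.
VALID_DEVICES = {"CPU", "GPU", "NPU"}

def validate_load_config(text: str) -> dict:
    """Illustrative check mirroring the documented load_config schema:
    {"DEVICE_KEY": {"PROPERTY": "PROPERTY_VALUE"}} with string values only."""
    config = json.loads(text)
    validated = {}
    for device, props in config.items():
        if device not in VALID_DEVICES:
            # OVEP is described as skipping unrecognized keys with a warning.
            print(f"warning: skipping unknown device key {device!r}")
            continue
        for name, value in props.items():
            if not isinstance(value, str):
                # Non-string values trigger json.exception.type_error.302 in ORT.
                raise TypeError(f"{device}.{name}: type must be string")
        validated[device] = dict(props)
    return validated

npu_json = '{"NPU": {"PERFORMANCE_HINT": "THROUGHPUT", "NPU_TILES": "2"}}'
print(validate_load_config(npu_json))
```

Note how `"NPU_TILES"` carries the numeric value as the string `"2"`, matching the workaround the section describes for int/bool properties.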
````diff
@@ -238,7 +278,7 @@ With EP-Weight Sharing, prefill and KV cache models can now reuse the same set o
 These changes enable weight sharing between two models using the session context option: ep.share_ep_contexts.
 Refer to [Session Options](https://github.com/microsoft/onnxruntime/blob/5068ab9b190c549b546241aa7ffbe5007868f595/include/onnxruntime/core/session/onnxruntime_session_options_config_keys.h#L319) for more details on configuring this runtime option.
 
-### OVEP suppports CreateSessionFromArray API
+### OVEP supports CreateSessionFromArray API
 The OpenVINO Execution Provider (OVEP) in ONNX Runtime supports creating sessions from memory using the CreateSessionFromArray API. This allows loading models directly from memory buffers instead of file paths. CreateSessionFromArray loads the model into memory, then creates a session from the in-memory byte array.
 
 Note:
````
````diff
@@ -268,11 +308,12 @@ options[num_streams] = "8";
 options[cache_dir] = "";
 options[context] = "0x123456ff";
 options[enable_qdq_optimizer] = "True";
+options[load_config] = "config_path.json";
 session_options.AppendExecutionProvider_OpenVINO_V2(options);
 ```
 
 ### C/C++ Legacy API
-Note: This api is no longer officially supported. Users are requested to move to V2 API.
+Note: This API is no longer officially supported. Users are requested to move to the V2 API.
 
 The session configuration options are passed to the SessionOptionsAppendExecutionProvider_OpenVINO() API as shown in the example below for the GPU device type:
````
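A Python-side sketch of the same option set may help readers using the onnxruntime-openvino wheel: the provider option keys mirror the C++ V2 map in the hunk above. Which keys a given wheel accepts (including `load_config`) depends on the installed version, so treat the commented session call as an assumption rather than a verified API for this release.

```python
# Sketch: assembling OpenVINO EP options in Python, mirroring the C++ V2
# example. Key support (e.g. "load_config") varies by onnxruntime-openvino
# version -- this dict is illustrative, not an exhaustive or verified list.
ov_options = {
    "device_type": "NPU",
    "num_streams": "8",
    "cache_dir": "",
    "enable_qdq_optimizer": "True",
    "load_config": "npu_config.json",  # path to a JSON file in the schema above
}

# With onnxruntime-openvino installed, the options would be passed like:
# import onnxruntime as ort
# session = ort.InferenceSession(
#     "model.onnx",
#     providers=["OpenVINOExecutionProvider"],
#     provider_options=[ov_options],
# )
print(ov_options)
```

All values are strings, consistent with the string-only rule the load_config section documents.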
````diff
@@ -315,6 +356,7 @@ The following table lists all the available configuration options for API 2.0 an
 | context | string | OpenCL Context | void* | This option is only available when OpenVINO EP is built with OpenCL flags enabled. It takes in the remote context i.e the cl_context address as a void pointer.|
 | enable_opencl_throttling | string | True/False | boolean | This option enables OpenCL queue throttling for GPU devices (reduces CPU utilization when using GPU). |
 | enable_qdq_optimizer | string | True/False | boolean | This option enables QDQ Optimization to improve model performance and accuracy on NPU. |
+| load_config | string | Any custom JSON path | string | This option enables loading a custom JSON OV config during runtime, which sets OV parameters. |
 
 
 Valid Hetero or Multi or Auto Device combinations:
````
