Example code of run_async use for submitting outstanding inference requests #22126
Unanswered · hylandk-movidius asked this question in Performance Q&A
Hi folks,
Does anyone have any sample code (in Python or C) that shows how to use the run_async command successfully on a single ONNX Runtime session?
My attempts to use it have not resulted in multiple outstanding requests being sent to the accelerator. The only way I can get more than one outstanding inference request is by creating multiple ONNX sessions and submitting inference requests to each one.
The Python script I am using is below. Does anyone know how to get this to work on a single ONNX session?
Kevin
import numpy as np
import os
import threading
import time
import onnxruntime as ort
model_path = 'mobilenetv2_035_96.onnx'
# Create session options
session_options = ort.SessionOptions()
OUTSTANDING_REQ = 16
REQ = 32000
SECONDS_OFFSET = 86400
print("------------------------------------------------------------------------")
print(" Creating ONNX Inference sessions for each outstanding inference")
sessions = [ort.InferenceSession(model_path, providers=[('OpenVINOExecutionProvider', {'device_type': 'NPU'})], sess_options=session_options) for _ in range(OUTSTANDING_REQ)]
print("------------------------------------------------------------------------")
class run_async_inf:
    def __init__(self, int_id, target):
        self.__event = threading.Event()
        self.__outputs = None
        self.__err = ''
        self.__id = int_id
        self.__count = 0
        self.__target = target

    def fill_outputs(self, outputs, err):
        # Record one completion; release the waiter after `target` completions.
        self.__outputs = outputs
        self.__err = err
        self.__count += 1
        if self.__count >= self.__target:
            self.__event.set()

    def wait(self):
        self.__event.wait()

class run_async_inf_callback:
    def __init__(self, int_id):
        self.__id = int_id

    @staticmethod
    def callback(outputs, state: run_async_inf, err: str) -> None:
        # Invoked by onnxruntime on a worker thread when a request completes.
        state.fill_outputs(outputs, err)
def run_async_inference(label, session_array, run_opts):
    print("------------------------------------------------------------------------")
    # Create an inference request state object for each outstanding request.
    infer_requests = [run_async_inf(_, REQ // OUTSTANDING_REQ) for _ in range(OUTSTANDING_REQ)]
    # Get model input information
    input_name = session_array[0].get_inputs()[0].name
    input_shape = session_array[0].get_inputs()[0].shape
    input_type = session_array[0].get_inputs()[0].type
    output_names = [o.name for o in session_array[0].get_outputs()]
    # Replace this with real input data matching the model's input shape and type
    dummy_input = np.random.randn(*input_shape).astype(np.float32)
    # Reference - https://onnxruntime.ai/docs/execution-providers/Azure-ExecutionProvider.html
    print("------------------------------------------------------------------------")
    print("onnxruntime version:", ort.__version__)
    print("providers:", session_array[0].get_providers())
    print("------------------------------------------------------------------------")
    # Submit all requests round-robin across the slots; run_async should
    # return immediately, with results delivered through the callback.
    for i in range(REQ):
        slot = i % OUTSTANDING_REQ
        session_array[slot].run_async(output_names, {input_name: dummy_input},
                                      run_async_inf_callback.callback,
                                      infer_requests[slot], run_opts)
    # Block until every slot has seen its share of completions.
    for req in infer_requests:
        req.wait()

# Create RunOptions
run_options = ort.RunOptions()
run_options.log_verbosity_level = 3
run_async_inference('DEFAULT', sessions, run_options)
exit(0)
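For comparison, below is a minimal single-session sketch of the run_async API as documented (InferenceSession.run_async was added in onnxruntime 1.16). The model path, provider, and input shape are placeholders carried over from the script above, not verified values:

```python
import threading
import numpy as np
import onnxruntime as ort

# Minimal single-session run_async sketch. Assumes onnxruntime >= 1.16;
# model path, provider, and input shape are placeholders.
session = ort.InferenceSession('mobilenetv2_035_96.onnx',
                               providers=['CPUExecutionProvider'])
input_name = session.get_inputs()[0].name
output_names = [o.name for o in session.get_outputs()]
x = np.random.randn(1, 3, 96, 96).astype(np.float32)  # placeholder shape

done = threading.Event()

def on_complete(outputs, user_data, err):
    # Runs on an onnxruntime worker thread once the request finishes.
    if err:
        print("request", user_data, "failed:", err)
    else:
        print("request", user_data, "done; output shape:", outputs[0].shape)
    done.set()

# run_async returns immediately; the result arrives via on_complete.
session.run_async(output_names, {input_name: x}, on_complete, 0)
done.wait()
```

Note that even when run_async returns immediately, whether several in-flight requests actually overlap on the device depends on the execution provider and its threading configuration, which may be why a single session appears to serialize requests.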