model concurrency between different providers #11527
Unanswered
victorehailo asked this question in General
Replies: 0 comments
Hello,
I am trying to understand how ONNX Runtime executes a model graph whose nodes are assigned to multiple execution providers (EPs).
From a performance standpoint, I would expect the runtime to split the supplied batch and execute nodes that belong to different providers in parallel, so that the backend accelerators can be fully utilized.
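To make that expectation concrete, here is a toy sketch in plain Python. It is not ONNX Runtime code and says nothing about how the runtime actually schedules work. The names `stage_a`, `stage_b`, `split_batch` and `run_pipelined` are hypothetical: each stage stands in for a subgraph assigned to a different provider, and the batch is split into chunks so that one stage can work on chunk N+1 while the other is still processing chunk N.

```python
import queue
import threading


def stage_a(chunk):
    # Hypothetical stand-in for the subgraph assigned to provider A.
    return [x * 2 for x in chunk]


def stage_b(chunk):
    # Hypothetical stand-in for the subgraph assigned to provider B.
    return [x + 1 for x in chunk]


def split_batch(batch, chunk_size):
    # Break the supplied batch into smaller chunks.
    return [batch[i:i + chunk_size] for i in range(0, len(batch), chunk_size)]


def run_pipelined(batch, chunk_size):
    # A FIFO queue hands chunks from stage A to stage B, so output order
    # matches input order.
    handoff = queue.Queue()
    results = []

    def worker_a():
        for chunk in split_batch(batch, chunk_size):
            handoff.put(stage_a(chunk))
        handoff.put(None)  # sentinel: no more chunks

    def worker_b():
        while True:
            chunk = handoff.get()
            if chunk is None:
                break
            results.extend(stage_b(chunk))

    threads = [threading.Thread(target=worker_a),
               threading.Thread(target=worker_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipelined(list(range(8)), chunk_size=2))
```

With real accelerators, the two workers would each be blocked on their own device most of the time, which is where the overlap would pay off. In this toy version the stages are trivial, so it only illustrates the scheduling pattern.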
While looking into this, the only option I found was to run the session itself multiple times. With that approach, the runtime cannot guarantee that the backend accelerators are used in parallel. Moreover, it can create congestion on a single EP if the session executions are synchronized.
Is this actually the case?