[WIP] Out-Tree EP feature #21450
@@ -0,0 +1,35 @@
#pragma once
Check warning
Code scanning / lintrunner
CLANGFORMAT/format Warning
Run lintrunner -a to apply this patch.
@@ -0,0 +1,14 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
#ifdef __cplusplus
extern "C" {
#endif
OrtExecutionProviderFactory* RegisterCustomEp() {
Return a Status instead. #Resolved
Do we have to do this? This function creates a factory object with `new` by invoking its constructor, which has no return type.
… EP as graph API is not exported by ORT. Need to put these graph API into ortapi structure
…roviderAdapter::Compile()
} OrtMetaDef;

typedef struct OrtIndexedSubGraph {
  OrtMetaDef* meta_def; // TODO(leca): how to define a nested structure pointer?
Does this have to be a pointer to an OrtMetaDef? It may be simpler if this meta_def is contained by value instead. #Resolved
It looks like we check whether the pointer is null to distinguish between single-node mode and fused-node mode (see the base class IExecutionProvider::GetCapability(), which does not set this pointer, and TryAssignSingleNode(), which checks it).
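To make the null-pointer convention above concrete, here is a minimal sketch. The `Demo*` types and functions are hypothetical stand-ins, not the PR's actual definitions; they only illustrate how a null `meta_def` could signal single-node mode.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-ins for OrtMetaDef / OrtIndexedSubGraph.
struct DemoMetaDef {
  const char* name;
};

struct DemoIndexedSubGraph {
  DemoMetaDef* meta_def;  // nullptr => single-node mode, non-null => fused-node mode
  size_t node_count;
};

// True when the subgraph carries a meta definition, i.e. it should be fused.
inline bool DemoIsFusedNodeMode(const DemoIndexedSubGraph& sg) {
  return sg.meta_def != nullptr;
}

// Returns the fused node's name; only valid in fused-node mode.
inline const char* DemoFusedName(const DemoIndexedSubGraph& sg) {
  assert(DemoIsFusedNodeMode(sg));
  return sg.meta_def->name;
}
```

If `meta_def` were held by value instead, some other marker (for example an explicit flag) would presumably be needed to tell the two modes apart; that is an assumption about the design, not something the PR states.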

OutTreeEp::OutTreeEp(const char* ep_type, const OutTreeEpInfo& ep_info) : info(ep_info) {
  type = ep_type;
  OrtExecutionProvider::GetCapability = [](const OrtExecutionProvider* this_, const OrtGraphViewer* graph, size_t* cnt, OrtIndexedSubGraph*** indexed_sub_graph) {
If I'm understanding correctly, the type of the `OrtIndexedSubGraph*** indexed_sub_graph` parameter is essentially asking the EP to fill out an array of pointers to `OrtIndexedSubGraph` objects.
Would it be simpler to change this to `OrtIndexedSubGraph** indexed_sub_graph` so that the EP fills out an array of `OrtIndexedSubGraph` objects directly? Each `OrtIndexedSubGraph` struct is a simple POD that can be created on the stack and copied around. It seems like it would result in less pointer tracking. #Resolved
Also, who is responsible for freeing this memory, and when? If the EP allocates an array, then the EP should free it. The current example leaks the allocations.
Edit: one possibility is to have onnxruntime call a new EP function (e.g., ReleaseOrtIndexedSubGraph()) so that the EP can free the memory. onnxruntime would call this once it is done using the indexed_sub_graph.
The problem is that we don't know how many OrtIndexedSubGraph objects there will be before we call GetCapability(). I will fix the leak in upcoming commits.
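One way the ownership question in this thread could be resolved is sketched below. It assumes the EP allocates a single contiguous array (since only the EP knows the count) and exposes a release function that the runtime calls when it is done. All `Demo*` names are hypothetical and simplified; this is not the PR's actual API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical, simplified stand-in for OrtIndexedSubGraph (a plain POD).
struct DemoIndexedSubGraph {
  size_t node_index;
};

// EP side: decides the count while inspecting the graph, so it owns the allocation.
inline void DemoGetCapability(size_t* cnt, DemoIndexedSubGraph** sub_graphs) {
  assert(cnt != nullptr && sub_graphs != nullptr);
  const size_t n = 3;  // placeholder for "number of subgraphs the EP can take"
  auto* arr = static_cast<DemoIndexedSubGraph*>(
      std::malloc(n * sizeof(DemoIndexedSubGraph)));
  if (arr == nullptr) {  // allocation failed: report an empty result
    *cnt = 0;
    *sub_graphs = nullptr;
    return;
  }
  for (size_t i = 0; i < n; ++i) arr[i].node_index = i * 10;
  *cnt = n;
  *sub_graphs = arr;
}

// EP side: frees exactly what DemoGetCapability allocated.
inline void DemoReleaseIndexedSubGraphs(DemoIndexedSubGraph* sub_graphs) {
  std::free(sub_graphs);
}

// Runtime side: consume the array, then hand it back to the EP for release.
inline size_t DemoSumNodeIndices() {
  size_t cnt = 0;
  DemoIndexedSubGraph* sub_graphs = nullptr;
  DemoGetCapability(&cnt, &sub_graphs);
  size_t sum = 0;
  for (size_t i = 0; i < cnt; ++i) sum += sub_graphs[i].node_index;
  DemoReleaseIndexedSubGraphs(sub_graphs);
  return sum;
}
```

Keeping allocation and release on the same side of the boundary avoids freeing memory with a different allocator than the one that produced it, which matters when the EP lives in a separately built shared library.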
@@ -0,0 +1,21 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
@@ -0,0 +1,89 @@
#include "kernel_ep.h"
@@ -0,0 +1,36 @@
#pragma once
@@ -0,0 +1,15 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
@@ -0,0 +1,14 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
@@ -0,0 +1,627 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
@@ -0,0 +1,285 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
@@ -0,0 +1,266 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
@@ -0,0 +1,147 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
@@ -0,0 +1,557 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
@@ -0,0 +1,110 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
@@ -0,0 +1,54 @@
#include "qnn_execution_provider.h"
@@ -0,0 +1,33 @@
#pragma once
Add some utility files for the plugin EP to include and compile:
- provider option map -> `provider_option.h`
- provider option parser -> `provider_option_utils.h`
- some macro definitions, classes, and functions from include/onnxruntime/core/common
…untime into leca/outOfTreeEP
Hook the TRT EP plugin up to run the existing unit test in CI:
- Migrate from `onnxruntime/test/providers/tensorrt/tensorrt_basic_test.cc`
- Replace internal APIs with the new EP APIs
- Add the unit test to `onnxruntime_shared_lib_test` (which links against the onnxruntime DLL)
- Build ORT with `--test_tensorrt_ep_plugin` to run `onnxruntime_shared_lib_test`

Note: The unit test doesn't cover all cases because the current TRT EP plugin doesn't implement all features yet; it will be updated later.
…e attribute might contain null character (#22769)
When running an EP Context model, the EP might call `OrtGraphApis::OrtNode_GetAttributeStr` to get the string content of an attribute. However, the API returns the string's c_str(), and the cache context may contain a null character, so the string can be cut off and the caller ends up with the wrong string. Add a new `OrtGraphApis::OrtNode_GetAttributeStrWithSize` that returns a const char* pointer together with the string size.
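The truncation described in this commit can be reproduced with plain `std::string`. The sketch below uses hypothetical `Demo*` helpers that only mimic the shape of the two accessors; they are not the actual `OrtGraphApis` signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Mimics an accessor that hands back only a C string (no length).
inline const char* DemoGetAttributeStr(const std::string& attr) {
  return attr.c_str();
}

// Mimics an accessor that hands back the pointer together with the byte count.
inline const char* DemoGetAttributeStrWithSize(const std::string& attr, size_t* size) {
  assert(size != nullptr);
  *size = attr.size();
  return attr.data();
}

// Builds a 5-byte payload with an embedded null, like binary cache content.
inline std::string DemoMakeBinaryAttribute() {
  return std::string("AB\0CD", 5);
}

// Caller that treats the result as a C string: stops at the first null byte.
inline std::string DemoReadTruncated(const std::string& attr) {
  return std::string(DemoGetAttributeStr(attr));
}

// Caller that uses the explicit size: recovers every byte.
inline std::string DemoReadFull(const std::string& attr) {
  size_t n = 0;
  const char* p = DemoGetAttributeStrWithSize(attr, &n);
  return std::string(p, n);
}
```

With the payload from `DemoMakeBinaryAttribute()`, `DemoReadTruncated` yields only the two bytes before the null, while `DemoReadFull` returns all five.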
This PR supports several features:
- Add new graph APIs to create and update the EP Context graph and to dump the EP Context model:
  1. OrtGraph_CreateOrUpdateEpCtxGraph
  2. OrtGraph_DumpOnnxModel
  3. OrtGraph_ReleaseGraph
- Add a new graph API to dump the ONNX model
- The APIs provided by this PR can dump the EP Context model when the whole model can be run by one EP; they also aim to support the case where the model is partitioned into multiple EPs' subgraphs. (Note: I haven't fully tested the partitioning case, please help review it.)
- Modify the TRT EP plugin to use those APIs.
TRT 8 doesn't support the INT64 and DOUBLE data types. TRT 10 doesn't support the DOUBLE data type. Therefore, the TRT EP internally needs to convert INT64 to INT32 and DOUBLE to FLOAT, which requires the cuda::Impl_Cast function. The implementation is copied from the CUDA EP.
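For illustration only, the element-wise conversion this commit relies on can be sketched on the host in plain C++. The real implementation is a CUDA kernel (cuda::Impl_Cast); `DemoCast` below is a hypothetical CPU analogue that shows the same narrowing, not the copied code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Element-wise narrowing cast, e.g. int64_t -> int32_t or double -> float.
// Values outside the target type's range are not checked here; that is a
// simplification of this sketch.
template <typename In, typename Out>
std::vector<Out> DemoCast(const std::vector<In>& in) {
  std::vector<Out> out;
  out.reserve(in.size());
  for (const In& v : in) {
    out.push_back(static_cast<Out>(v));
  }
  assert(out.size() == in.size());
  return out;
}
```

A caller would convert INT64 inputs with `DemoCast<std::int64_t, std::int32_t>(values)` before handing them to the engine, and DOUBLE inputs with `DemoCast<double, float>(values)`.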
Description
Out-Tree EP feature.
Motivation and Context