Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Out-Tree EP feature #21450

Draft
wants to merge 78 commits into
base: main
Choose a base branch
from
Draft

[WIP] Out-Tree EP feature #21450

wants to merge 78 commits into from

Conversation

jslhcl
Copy link
Contributor

@jslhcl jslhcl commented Jul 23, 2024

Description

Out-Tree EP feature.

Motivation and Context

  1. Decouple ExecutionProvider from ONNXRuntime and make it a standalone class which will benefit 3rd party EP authors to write their own EP.
  2. Binary compatibility is required for this feature.

samples/outTreeEp/out_tree_ep.cc Fixed Show fixed Hide fixed
samples/c_test/test.cpp Fixed Show fixed Hide fixed
@@ -0,0 +1,35 @@
#pragma once

Check warning

Code scanning / lintrunner

CLANGFORMAT/format Warning

See https://clang.llvm.org/docs/ClangFormat.html.
Run lintrunner -a to apply this patch.
@@ -0,0 +1,14 @@
// Copyright (c) Microsoft Corporation. All rights reserved.

Check warning

Code scanning / lintrunner

CLANGFORMAT/format Warning

See https://clang.llvm.org/docs/ClangFormat.html.
Run lintrunner -a to apply this patch.
samples/c_test/test.cpp Fixed Show fixed Hide fixed
#ifdef __cplusplus
extern "C" {
#endif
OrtExecutionProviderFactory* RegisterCustomEp() {
Copy link
Contributor Author

@jslhcl jslhcl Jul 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return Status instead #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to do this? This function will new a factory object by invoking its constructor which has no return type

… EP as graph API is not exported by ORT. Need to put these graph API into ortapi structure
samples/outTreeEp/out_tree_ep.cc Fixed Show fixed Hide fixed
samples/c_test/test.cpp Fixed Show fixed Hide fixed
} OrtMetaDef;

typedef struct OrtIndexedSubGraph {
OrtMetaDef* meta_def; // TODO(leca): how to define a nested structure pointer?
Copy link
Contributor

@adrianlizarraga adrianlizarraga Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this have to be a pointer to an OrtMetaDef? It may be simpler if this meta_def is contained by value instead. #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks we will check the pointer is null or not to distinguish between single node mode and fused node mode (See base class IExecutionProvider::GetCapability() which does not set this pointer and TryAssignSingleNode() which will check this pointer)


OutTreeEp::OutTreeEp(const char* ep_type, const OutTreeEpInfo& ep_info) : info(ep_info) {
type = ep_type;
OrtExecutionProvider::GetCapability = [](const OrtExecutionProvider* this_, const OrtGraphViewer* graph, size_t* cnt, OrtIndexedSubGraph*** indexed_sub_graph) {
Copy link
Contributor

@adrianlizarraga adrianlizarraga Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm understanding correctly, the type of the OrtIndexedSubGraph*** indexed_sub_graph parameter is essentially asking the EP to fill out an array of pointers to OrtIndexedSubGraph objects.

Would it be simpler to change this to OrtIndexedSubgraph** indexed_sub_graph so that the EP fills out an array of OrtIndexedSubGraph objects directly? Each OrtIndexedSubgraph struct is a simple POD that can be created on the stack and copied around. It seems like it would result in less pointer tracking. #Resolved

Copy link
Contributor

@adrianlizarraga adrianlizarraga Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, who is responsible for freeing this memory and when? If the EP allocates an array, then the EP should free it. The currently example leaks the allocations.

Edit: one possibility is to have onnxruntime call a new EP function (e.g., ReleaseOrtIndexedSubGraph()) so the the EP can free the memory. onnxruntime would call this once it is done using the indexed_sub_graph.

Copy link
Contributor Author

@jslhcl jslhcl Jul 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that we don't know how many OrtIndexedSubGraph would be before we call GetCapability() function. I will fix the leak issue in the coming commits

@@ -0,0 +1,21 @@
// Copyright (c) Microsoft Corporation. All rights reserved.

Check warning

Code scanning / lintrunner

CLANGFORMAT/format Warning

See https://clang.llvm.org/docs/ClangFormat.html.
Run lintrunner -a to apply this patch.
@@ -0,0 +1,89 @@
#include "kernel_ep.h"

Check warning

Code scanning / lintrunner

CLANGFORMAT/format Warning

See https://clang.llvm.org/docs/ClangFormat.html.
Run lintrunner -a to apply this patch.
@@ -0,0 +1,36 @@
#pragma once

Check warning

Code scanning / lintrunner

CLANGFORMAT/format Warning

See https://clang.llvm.org/docs/ClangFormat.html.
Run lintrunner -a to apply this patch.
@@ -0,0 +1,15 @@
// Copyright (c) Microsoft Corporation. All rights reserved.

Check warning

Code scanning / lintrunner

CLANGFORMAT/format Warning

See https://clang.llvm.org/docs/ClangFormat.html.
Run lintrunner -a to apply this patch.
@@ -0,0 +1,14 @@
// Copyright (c) Microsoft Corporation. All rights reserved.

Check warning

Code scanning / lintrunner

CLANGFORMAT/format Warning

See https://clang.llvm.org/docs/ClangFormat.html.
Run lintrunner -a to apply this patch.
@@ -0,0 +1,627 @@
// Copyright (c) Microsoft Corporation. All rights reserved.

Check warning

Code scanning / lintrunner

CLANGFORMAT/format Warning

See https://clang.llvm.org/docs/ClangFormat.html.
Run lintrunner -a to apply this patch.
@@ -0,0 +1,285 @@
// Copyright (c) Microsoft Corporation. All rights reserved.

Check warning

Code scanning / lintrunner

CLANGFORMAT/format Warning

See https://clang.llvm.org/docs/ClangFormat.html.
Run lintrunner -a to apply this patch.
@@ -0,0 +1,266 @@
// Copyright (c) Microsoft Corporation. All rights reserved.

Check warning

Code scanning / lintrunner

CLANGFORMAT/format Warning

See https://clang.llvm.org/docs/ClangFormat.html.
Run lintrunner -a to apply this patch.
@@ -0,0 +1,147 @@
// Copyright (c) Microsoft Corporation. All rights reserved.

Check warning

Code scanning / lintrunner

CLANGFORMAT/format Warning

See https://clang.llvm.org/docs/ClangFormat.html.
Run lintrunner -a to apply this patch.
@@ -0,0 +1,557 @@
// Copyright (c) Microsoft Corporation. All rights reserved.

Check warning

Code scanning / lintrunner

CLANGFORMAT/format Warning

See https://clang.llvm.org/docs/ClangFormat.html.
Run lintrunner -a to apply this patch.
@@ -0,0 +1,110 @@
// Copyright (c) Microsoft Corporation. All rights reserved.

Check warning

Code scanning / lintrunner

CLANGFORMAT/format Warning

See https://clang.llvm.org/docs/ClangFormat.html.
Run lintrunner -a to apply this patch.
@@ -0,0 +1,54 @@
#include "qnn_execution_provider.h"

Check warning

Code scanning / lintrunner

CLANGFORMAT/format Warning

See https://clang.llvm.org/docs/ClangFormat.html.
Run lintrunner -a to apply this patch.
@@ -0,0 +1,33 @@
#pragma once

Check warning

Code scanning / lintrunner

CLANGFORMAT/format Warning

See https://clang.llvm.org/docs/ClangFormat.html.
Run lintrunner -a to apply this patch.
jslhcl and others added 30 commits October 14, 2024 22:15
Add some utility files for plugin ep to include and compile.

- provider option map -> `provider_option`.h
- provider option parser -> `provider_option_utils`.h
- some macro define, classes and functions from
include/onnxruntime/core/common
Hook TRT EP plugin to run the existing unit test in CI

- Migrate from
`onnxruntime/test/providers/tensorrt/tensorrt_basic_test.cc`
- Replace internal APIs with new EP APIs
- Add unit test in `onnxruntime_shared_lib_test` (which links against
onnxruntime dll)
- Build ORT with `--test_tensorrt_ep_plugin` to run
`onnxruntime_shared_lib_test`

Note: The unit test doesn't cover all the cases since current TRT EP
plugin hasn't added all the features yet, will update later.
…e attribute might contain null character (#22769)

When running EP Context model, EP might call
`OrtGraphApis::OrtNode_GetAttributeStr` to get the string-based content
of the attribute.
However, the API returns the c_str() of the string, and it's possible
that the cache context contains null character, so the string might be
cut off and caller ends up getting the wrong string.

Add a new OrtGraphApis::OrtNode_GetAttributeStrWithSize to return const
char* pointer and string size.
This PR support several features:

- Add new graph API to create and update EP Context graph, and dump EP
Context model.

1.  OrtGraph_CreateOrUpdateEpCtxGraph
2. OrtGraph_DumpOnnxModel
3. OrtGraph_ReleaseGraph 

- Add new graph API to dump onnx model
- The APIs provided by this PR can dump EP Context model when the whole
model can be run by one EP, the APIs also aim to support the case where
the whole model is partitioned into multiple EP's subgraphs. (Note: i
haven't fully tested the partitioning case, please help review it)
- Modify TRT EP plugin to use those APIs.
TRT 8 doesn't support INT64 and DOUBLE data type.
TRT 10 doesn't support DOUBLE data type.

Therefore, TRT EP internally needs to convert INT64 to INT32, and DOUBLE
to FLOAT, which needs the cuda::Impl_Cast function.
The implementation is copied from CUDA EP.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants