
[FEATURE] Support ONNX and TensorRT export #4

Draft · wants to merge 7 commits into `main`

Conversation

@IamShubhamGupto (Contributor) commented May 3, 2024

Hello @guipotje and team,

In this PR, I am adding support for exporting ONNX and TensorRT models. Running the TensorRT engine shows an improvement in FPS. Since the evaluation code is currently unavailable, I cannot quantify accuracy, but it appears comparable to the provided xfeat.pt performance. The following is currently supported or changed:

  • Export onnx with opset version preference (default 17)
  • Export onnx with dynamic axis preference (enabled by default)
  • Export TensorRT with image size preference (default 480x640)
  • Export TensorRT with fp32 preference
  • Export TensorRT with fp16 preference (enabled by default)
  • Export TensorRT with workspace size preference (default 4096 MiB)
  • Demo realtime_demo.py with engine
  • Make realtime_demo.py backwards compatible with cv2 version 4.5.0
  • Update README.md for TensorRT export
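
The preferences above suggest a small argparse front-end. The following is a hedged sketch of such a CLI, not the PR's actual `export.py`; every flag name except `--onnx_opset` (which appears later in this thread) is an assumption:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """CLI mirroring the export preferences listed above (flag names are illustrative)."""
    p = argparse.ArgumentParser(description="Export XFeat to ONNX / TensorRT")
    p.add_argument("--onnx_opset", type=int, default=17,
                   help="ONNX opset version (default: 17)")
    p.add_argument("--dynamic", action="store_true", default=True,
                   help="export ONNX with dynamic axes (enabled by default)")
    p.add_argument("--height", type=int, default=480,
                   help="TensorRT input image height (default: 480)")
    p.add_argument("--width", type=int, default=640,
                   help="TensorRT input image width (default: 640)")
    p.add_argument("--fp16", action="store_true", default=True,
                   help="build the TensorRT engine in fp16 (enabled by default)")
    p.add_argument("--workspace", type=int, default=4096,
                   help="TensorRT builder workspace size in MiB (default: 4096)")
    return p

# Usage: args = build_parser().parse_args(); the workspace value would then be
# converted to bytes (args.workspace * (1 << 20)) for the TensorRT builder config.
```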

We could further improve FPS by simplifying the exported graph with onnx-simplifier, but that can be an iterative PR.

I will attach logs and performance observations in the comments below. Let me know if there are any changes required. Thank you again for the amazing work!
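
For the cv2 4.5.0 backwards-compatibility item above, one common pattern is a version gate. A minimal sketch, assuming a plain `MAJOR.MINOR.PATCH` version string; the exact calls gated in `realtime_demo.py` are an assumption:

```python
def version_tuple(v: str) -> tuple:
    """Parse a plain 'MAJOR.MINOR.PATCH' string into a comparable tuple, e.g. '4.5.0' -> (4, 5, 0)."""
    return tuple(int(p) for p in v.split(".")[:3])

# Typical use (cv2 import shown as a comment so the sketch stays dependency-free):
#   import cv2
#   LEGACY_CV2 = version_tuple(cv2.__version__) < (4, 5, 4)
#   # ...then pick the legacy or current drawing/matching call based on LEGACY_CV2
```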

@IamShubhamGupto (Contributor, author):

TensorRT export logs and statistics - trt.log

| mode | observed FPS |
| --- | --- |
| XFeat (cuda) | 21.4 |
| XFeat (tensorrt) | 24.7 |

XFeat (cuda): [screenshot from 2024-05-03 03-17-08]

XFeat (tensorrt): [screenshot from 2024-05-03 03-25-10]

TensorRT matches are sparser for nearly identical poses, but overall performance seems the same, if not better.

Device: Nvidia Jetson AGX Xavier - Jetpack 5.1
Docker image: dustynv/l4t-ml:r35.4.1

@IamShubhamGupto IamShubhamGupto marked this pull request as ready for review May 3, 2024 07:29
@guipotje (Collaborator) commented May 7, 2024

Hello @IamShubhamGupto and @acai66, thank you so much for your contributions!

I notice that both of you are providing an ONNX export, and it probably doesn't make sense to merge both since they serve the same purpose. Do you have any suggestions on how to proceed? In my opinion, it makes sense to choose the one that supports the earliest operator set, as it would maximize the model's compatibility with the greatest number of devices and libraries.

@IamShubhamGupto (Contributor, author):

> Hello @IamShubhamGupto and @acai66, thank you so much for your contributions!
>
> I notice that both of you are providing an ONNX export, and it probably doesn't make sense to merge both since they serve the same purpose. Do you have any suggestions on how to proceed? In my opinion, it makes sense to choose the one that supports the earliest operator set, as it would maximize the model's compatibility with the greatest number of devices and libraries

Sure, maybe @acai66 and I can collaborate on merging our work. In summary, this PR does not pin a specific operator set; it is up to the user to provide one. By default it is the latest supported, which is 17. Users can choose an older opset such as 13 via a command-line argument:

`python3 export.py --onnx_opset 13`

Another key difference in our PRs is that @acai66 quantizes both the XFeatModel and XFeat modules, whereas I chose to quantize only XFeatModel. That feature can be merged into my branch, as the code to then export the TensorRT engine is already available here.

Let me know what you guys think

@IamShubhamGupto (Contributor, author):

Hey, just bumping this conversation up again @guipotje @acai66

@acai66 commented May 9, 2024

> > Hello @IamShubhamGupto and @acai66, thank you so much for your contributions!
> > I notice that both of you are providing an ONNX export, and it probably doesn't make sense to merge both since they serve the same purpose. Do you have any suggestions on how to proceed? In my opinion, it makes sense to choose the one that supports the earliest operator set, as it would maximize the model's compatibility with the greatest number of devices and libraries
>
> sure, maybe me and @acai66 can work on collaborating our work. In summary, there is no specific operator set defined in this PR and is up to the user to provide one, by default it is the latest supported and that is 17. Users can choose to use an older op set such as 13 by providing command line arguments.
>
> `python3 export.py --onnx_opset 13`
>
> Another key difference in our PRs is @acai66 can quantize the XFeatModel and XFeat modules where as I choose to only quantize XFeatModel. This feature can be merged into my branch as the code to then export TensorRT engine is already available here.
>
> Let me know what you guys think

Thank you for your suggestion. I concur with your viewpoint and believe that merging our work would indeed be a beneficial course of action.

@acai66 commented May 9, 2024

My PR is aimed at improving the convenience and compatibility of deployment, especially in environments without PyTorch. I've uploaded an example of using the ONNX Runtime Python API, which can be easily reimplemented using the ONNX Runtime C++ API.
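
An ONNX Runtime call of the kind described here usually only needs a small preprocessing step on the input frame. A minimal sketch, assuming the model takes a `1xCxHxW` float32 tensor in `[0, 1]`; the actual input name and layout of the exported xfeat.onnx are assumptions:

```python
import numpy as np

def preprocess(img: np.ndarray) -> np.ndarray:
    """HWC uint8 image -> 1xCxHxW float32 batch in [0, 1] for an ONNX Runtime session."""
    x = img.astype(np.float32) / 255.0
    x = np.transpose(x, (2, 0, 1))   # HWC -> CHW
    return x[np.newaxis, ...]        # add batch dimension

# Typical ONNX Runtime usage (requires onnxruntime to be installed):
#   import onnxruntime as ort
#   sess = ort.InferenceSession("xfeat.onnx", providers=["CPUExecutionProvider"])
#   out = sess.run(None, {sess.get_inputs()[0].name: preprocess(frame)})
```

The same structure maps almost one-to-one onto the ONNX Runtime C++ API (`Ort::Session::Run`), which is why a Python example is a good starting point for the C++ reimplementation mentioned above.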

@IamShubhamGupto (Contributor, author):

> My PR is aimed at improving the convenience and compatibility of deployment, especially in environments without PyTorch. I've uploaded an example of using the ONNX Runtime Python API, which can be easily reimplemented using the ONNX Runtime C++ API.

That's amazing; we can find some time this week to start merging the examples. I'll give you editor access to my fork.

@guipotje (Collaborator) commented May 9, 2024

> > My PR is aimed at improving the convenience and compatibility of deployment, especially in environments without PyTorch. I've uploaded an example of using the ONNX Runtime Python API, which can be easily reimplemented using the ONNX Runtime C++ API.
>
> that's amazing, we can then find some time this week to start merging the examples. I'll give you editor access to my fork.

Fantastic, @IamShubhamGupto and @acai66! While I'm not experienced in ONNX, I'm here to help with anything related to the original code — just let me know.

@acai66 commented May 10, 2024

> > My PR is aimed at improving the convenience and compatibility of deployment, especially in environments without PyTorch. I've uploaded an example of using the ONNX Runtime Python API, which can be easily reimplemented using the ONNX Runtime C++ API.
>
> that's amazing, we can then find some time this week to start merging the examples. I'll give you editor access to my fork.

Thank you very much for the invitation. I am concerned that I have little experience with TensorRT deployment. I have invited you to be a collaborator on the forked repository.

@acai66 commented May 10, 2024

> > > My PR is aimed at improving the convenience and compatibility of deployment, especially in environments without PyTorch. I've uploaded an example of using the ONNX Runtime Python API, which can be easily reimplemented using the ONNX Runtime C++ API.
> >
> > that's amazing, we can then find some time this week to start merging the examples. I'll give you editor access to my fork.
>
> Fantastic, @IamShubhamGupto and @acai66! While I'm not experienced in ONNX, I'm here to help with anything related to the original code — just let me know.

I apologize for my poor English; I am using translation software to communicate.
I have made some modifications to the source code to improve the compatibility and convenience of exporting ONNX. However, I am unsure if this will affect the original PyTorch functionality and performance. I am unable to conduct sufficient related tests.

@IamShubhamGupto IamShubhamGupto marked this pull request as draft May 10, 2024 12:34
@IamShubhamGupto (Contributor, author):

@acai66, for now I am pausing development on my branch. We will merge the ONNX export from your PR, and then I will continue TensorRT development here.

I will review and commit changes on your PR in the next couple of hours.
