docsrc/tutorials/serving_torch_tensorrt_with_triton.rst

Step 1: Optimize your model with Torch-TensorRT
------------------------------------------------

Most Torch-TensorRT users will be familiar with this step. For the purpose of
this demonstration, we will be using a ResNet50 model from Torchhub.
We will be working in the ``//examples/triton`` directory, which contains the scripts used in this tutorial.
First pull the `NGC PyTorch Docker container <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch>`__. You may need to create
an account and get the API key from `here <https://ngc.nvidia.com/setup/>`__.
Sign up and log in with your key (follow the instructions
`here <https://ngc.nvidia.com/setup/api-key>`__ after signing up).
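
If Docker is not yet authenticated against NGC, log in to ``nvcr.io`` first.
As a sketch of that step: the username is the literal string ``$oauthtoken``
and the password is your NGC API key.

::

   docker login nvcr.io
   # Username: $oauthtoken
   # Password: <your NGC API key>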

Then launch the container, mounting your working directory:

::

   # YY.MM is the year and month of the publishing tag for NVIDIA's PyTorch
   # container; e.g. 24.08
   # NOTE: Use the same publishing tag for both the PyTorch and Triton containers

   docker run -it --gpus all -v ${PWD}:/scratch_space nvcr.io/nvidia/pytorch:YY.MM-py3
   cd /scratch_space
Inside the container, we can export the model into the correct directory in our Triton model repository. The export script uses the **Dynamo** frontend of Torch-TensorRT to compile the PyTorch model to TensorRT, then saves the model in **TorchScript**, a serialization format supported by Triton.
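
The exact script lives in ``//examples/triton``; the following is a minimal
sketch of what such an export looks like. The repository path
``model_repository/resnet50/1/model.pt`` and the example input shape are
assumptions for illustration.

::

   import torch
   import torch_tensorrt
   import torchvision.models as models

   # Load a pretrained ResNet50 and move it to the GPU in eval mode
   model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval().cuda()

   # Compile with the Dynamo frontend of Torch-TensorRT
   inputs = [torch_tensorrt.Input((1, 3, 224, 224), dtype=torch.float32)]
   trt_model = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)

   # Save in TorchScript format, which Triton's PyTorch backend can load,
   # into the versioned directory of the Triton model repository
   torch_tensorrt.save(
       trt_model,
       "model_repository/resnet50/1/model.pt",
       output_format="torchscript",
       inputs=[torch.randn(1, 3, 224, 224).cuda()],
   )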
Secondly, we specify the names of the input and output layer(s) of our model. These names can be obtained during export and should already be specified in your ``config.pbtxt``.
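
For illustration, a minimal ``config.pbtxt`` for such a model could look like
the following. The tensor names ``x`` and ``output0`` and the dimensions here
are placeholders; use the names and shapes reported for your exported model.

::

   name: "resnet50"
   backend: "pytorch"
   max_batch_size: 0
   input [
     {
       name: "x"
       data_type: TYPE_FP32
       dims: [ 1, 3, 224, 224 ]
     }
   ]
   output [
     {
       name: "output0"
       data_type: TYPE_FP32
       dims: [ 1, 1000 ]
     }
   ]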