Hi, I have a Triton server with 2 parts: a Python model (which handles the HTTP-facing logic) and a TensorRT model.
The Python code basically does this:
1- Receive an HTTP request
2- Preprocess the input and batch the data
3- Send an internal request to the TensorRT model (sketched below)
4- Postprocess the output
5- Send the answer back over HTTP
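Step 3 goes through the Python backend's BLS API. Here's a simplified sketch of the execute() function (the model name "trt_model" and the tensor names "INPUT0"/"OUTPUT0" are placeholders, not my real ones):

```python
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            # Step 2: preprocessing / batching happens here (simplified).
            batch = pb_utils.get_input_tensor_by_name(request, "INPUT0").as_numpy()

            # Step 3: internal (BLS) request to the TensorRT model.
            infer_request = pb_utils.InferenceRequest(
                model_name="trt_model",  # placeholder name
                requested_output_names=["OUTPUT0"],
                inputs=[pb_utils.Tensor("INPUT0", batch.astype(np.float32))],
            )
            infer_response = infer_request.exec()
            if infer_response.has_error():
                raise pb_utils.TritonModelException(infer_response.error().message())

            # Step 4: postprocessing would go here.
            out = pb_utils.get_output_tensor_by_name(infer_response, "OUTPUT0")
            responses.append(
                pb_utils.InferenceResponse(
                    output_tensors=[pb_utils.Tensor("OUTPUT0", out.as_numpy())]
                )
            )
        return responses
```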
The Python code already takes care of batching, and I don't want Triton's dynamic batching feature to batch multiple requests together on top of that. So can I disable dynamic batching for the TRT model while keeping a flexible batch dimension that the Python code can still use?
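For reference, here is roughly the kind of config.pbtxt I have in mind for the TRT model (the name, shapes, and dtypes are placeholders):

```
name: "trt_model"
platform: "tensorrt_plan"
max_batch_size: 32
input [
  {
    name: "INPUT0"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "OUTPUT0"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
# No dynamic_batching block here -- is leaving it out enough to keep
# Triton from merging requests, while max_batch_size > 0 still lets
# the Python model send batches of variable size?
```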