Hi, I have a Triton server with 2 parts: a Python model (which handles the HTTP-facing logic) and a TensorRT model.
The Python code basically does this:
1- Receive an HTTP request
2- Preprocess the input and batch the data
3- Send an internal request to the TensorRT model (sketched below)
4- Postprocess the output
5- Send the answer back over HTTP
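Step 3 goes through the Python backend's BLS API. Here's a simplified sketch of the execute() function (the model name "trt_model" and the tensor names "INPUT0"/"OUTPUT0" are placeholders, not my real ones):

```python
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            # Step 2: preprocessing / batching happens here (simplified).
            batch = pb_utils.get_input_tensor_by_name(request, "INPUT0").as_numpy()

            # Step 3: internal (BLS) request to the TensorRT model.
            infer_request = pb_utils.InferenceRequest(
                model_name="trt_model",  # placeholder name
                requested_output_names=["OUTPUT0"],
                inputs=[pb_utils.Tensor("INPUT0", batch.astype(np.float32))],
            )
            infer_response = infer_request.exec()
            if infer_response.has_error():
                raise pb_utils.TritonModelException(infer_response.error().message())

            # Step 4: postprocessing would go here.
            out = pb_utils.get_output_tensor_by_name(infer_response, "OUTPUT0")
            responses.append(
                pb_utils.InferenceResponse(
                    output_tensors=[pb_utils.Tensor("OUTPUT0", out.as_numpy())]
                )
            )
        return responses
```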
The Python code already takes care of batching, and I don't want Triton's dynamic batching feature to batch multiple requests together on top of that. So can I disable dynamic batching for the TRT model while keeping a flexible batch dimension that the Python code can still use?
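For reference, here is roughly the kind of config.pbtxt I have in mind for the TRT model (the name, shapes, and dtypes are placeholders):

```
name: "trt_model"
platform: "tensorrt_plan"
max_batch_size: 32
input [
  {
    name: "INPUT0"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "OUTPUT0"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
# No dynamic_batching block here -- is leaving it out enough to keep
# Triton from merging requests, while max_batch_size > 0 still lets
# the Python model send batches of variable size?
```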