[FEA]: Remove/Deprecate the model_max_batch_size config option #421
Labels
feature request
Is this a new feature, an improvement, or a change to existing functionality?
Improvement
How would you describe the priority of this feature request?
Medium
Please provide a clear description of the problem this feature solves
As mentioned in issue #420, the two batch size options, `model_max_batch_size` and `pipeline_max_batch_size`, can be confusing to users, and it is not clear how they interact or how they impact performance. The `model_max_batch_size` config option is a legacy value from one of the first iterations of Morpheus, where multiple stages needed it to coordinate message sizes. Since it is now only used by a single stage (`InferenceStage`), it no longer makes sense as a global config option.
Describe your ideal solution
The `model_max_batch_size` option should be removed or deprecated. Where it is still needed (in the `InferenceStage` implementations), the max batch size can be determined automatically from either the model or the inference service. For example, `TritonInferenceStage` can determine `model_max_batch_size` during its initialization step. To preserve backward compatibility, we could add a `model_max_batch_size` property to `InferenceStage` itself that overrides any automatically determined value.
Describe any alternatives you have considered
No response
Additional context
No response