Is this a new feature, an improvement, or a change to existing functionality?
Improvement
How would you describe the priority of this feature request
Medium
Please provide a clear description of problem this feature solves
When running a pipeline with `pipeline_max_batch_size` < `model_max_batch_size`, it's not clear to the user that inference requests will be clamped to the smaller of the two values. The effective inference batch size is `min(pipeline_max_batch_size, model_max_batch_size)`, which can subtly reduce performance if the user is not aware of the difference between the two options.
Describe your ideal solution
In this scenario, we should generate a warning stating that the value of `model_max_batch_size` will be reduced to `pipeline_max_batch_size`. This would at least indicate what batch size the inference is truly running at.
Describe any alternatives you have considered
The `model_max_batch_size` option could potentially be removed and determined automatically. Will file a secondary bug to address that issue.
Additional context
No response
Code of Conduct
I agree to follow this project's Code of Conduct
I have searched the open feature requests and have found no duplicates for this feature request
Two possible approaches:

1. Change `pipeline_batch_size` and `model_max_batch_size` to properties and run validation each time either value is set, issuing a warning when `pipeline_batch_size` < `model_max_batch_size`.
2. Refactor `Config` so that all values must be constructor-injected and all settings are read-only properties. Validation would then only need to run in the constructor, instead of each time a property is set.

The quickest/easiest option is the first, but if the validation logic grows more complicated in the future, running validation in each setter could get messy. To avoid a breaking change, I'm going to go with the first option, but encapsulate the validation logic in a private `_validate` method that is called from both setters, so we don't need to maintain multiple code paths for config validation.
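A minimal sketch of what the first option could look like, assuming a plain-Python `Config` class. The attribute names follow the discussion above, but the defaults, the `logging`-based warning, and the exact shape of the `_validate` helper are illustrative, not the project's actual implementation:

```python
import logging


class Config:
    """Sketch of option 1: both batch sizes are properties, and a shared
    private validator runs on construction and on every assignment."""

    def __init__(self, pipeline_batch_size: int = 256, model_max_batch_size: int = 256):
        self._pipeline_batch_size = pipeline_batch_size
        self._model_max_batch_size = model_max_batch_size
        self._validate()

    def _validate(self) -> None:
        # Single code path for config validation, called from the
        # constructor and from both setters below.
        if self._pipeline_batch_size < self._model_max_batch_size:
            logging.warning(
                "pipeline_batch_size (%d) is less than model_max_batch_size (%d); "
                "inference batches will be clamped to %d.",
                self._pipeline_batch_size,
                self._model_max_batch_size,
                min(self._pipeline_batch_size, self._model_max_batch_size),
            )

    @property
    def pipeline_batch_size(self) -> int:
        return self._pipeline_batch_size

    @pipeline_batch_size.setter
    def pipeline_batch_size(self, value: int) -> None:
        self._pipeline_batch_size = value
        self._validate()

    @property
    def model_max_batch_size(self) -> int:
        return self._model_max_batch_size

    @model_max_batch_size.setter
    def model_max_batch_size(self, value: int) -> None:
        self._model_max_batch_size = value
        self._validate()


# The warning fires on construction and again whenever either property
# is reassigned into a clamped configuration.
cfg = Config(pipeline_batch_size=128, model_max_batch_size=256)  # warns
cfg.model_max_batch_size = 64  # re-validates; no warning this time
```

Routing the check through one `_validate` helper keeps the setters trivial, and if the logic later moves to a constructor-injected, read-only `Config` (option 2), the same helper can simply be called once from `__init__`.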