Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA]: Show warning when pipeline_max_batch_size < model_max_batch_size #420

Closed
2 tasks done
mdemoret-nv opened this issue Oct 27, 2022 · 1 comment · Fixed by #1858
Closed
2 tasks done

[FEA]: Show warning when pipeline_max_batch_size < model_max_batch_size #420

mdemoret-nv opened this issue Oct 27, 2022 · 1 comment · Fixed by #1858
Assignees
Labels
feature request New feature or request

Comments

@mdemoret-nv
Copy link
Contributor

Is this a new feature, an improvement, or a change to existing functionality?

Improvement

How would you describe the priority of this feature request

Medium

Please provide a clear description of problem this feature solves

When running a pipeline with pipeline_max_batch_size < model_max_batch_size, its not clear to the user that inference requests will be clamped to the minimum values. So the inference batch size effectively is min(pipeline_max_batch_size, model_max_batch_size) which can subtly reduce performance if the user is not aware of the differences between the two options.

Describe your ideal solution

In this scenario, we should generate a warning stating that the value of model_max_batch_size will be reduced to pipeline_max_batch_size. This would at least help indicate what batch size the inference is truly running at.

Describe any alternatives you have considered

The model_max_batch_size could potentially be remove and automatically determined. Will file a secondary bug to address that issue.

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
  • I have searched the open feature requests and have found no duplicates for this feature request
@cwharris
Copy link
Contributor

It sounds like the value's we're considering are:

Config.pipeline_batch_size
Config.model_max_batch_size

These are currently values, not properties.

There are a couple of ways to address this issue:

  • Change pipeline_batch_size and model_max_batch_size to properties and run some validation each time either value is set, issuing a warning when pipeline_batch_size < model_max_batch_size.

  • Refactor Config so all values must be constructor-injected, and all settings are read-only properties. Validation will only need to be run in the constructor, instead of each time a property is set.

The quickest/easiest thing to do is the first option, but if the validation logic gets more complicated in the future, running validation in each setter could get messy.

To avoid a breaking change, I'm going to go with the first option, but encapsulate the validation logic in a private _validate method which can be called from both setters so we do not need to maintain multiple paths for config validation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants