Use query in linear flags - seq as fallback option #396
base: main
Signed-off-by: Agata Dobrzyniewicz <[email protected]>
🚧 CI Blocked: The main CI workflow was not started for the following reason:
Signed-off-by: Agata Dobrzyniewicz <[email protected]>
Please look at the proposed changes. Also, change 'seq' to 'query' on line 88 of vllm_gaudi/extension/bucketing/linear.py.
for p, e, d in zip(params, env_vars, default_values):
    val = os.environ.get(e)

    if val is None and dim == 'query':
As far as I know, we don't have VLLM_DECODE_SEQ_BUCKET_{p}. Is there a need for this code to handle such a case? This code looks like it would need to
No, we don't set this dim for decode, neither query nor seq.
OK, so shouldn't we also add "and phase == 'prompt'" to the condition and then set
fallback_env = f'VLLM_PROMPT_SEQ_BUCKET_{p}'.upper()
on line 102?
LGTM, can we port that to 0.11.0?
Signed-off-by: Agata Dobrzyniewicz <[email protected]>
No flag -> default
Query -> query value
Seq -> seq value + "will be deprecated" warning
Query & seq -> query value
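
The four cases above could be sketched roughly as follows. This is only an illustration, not the code from the PR: the helper name resolve_query_bucket_flag, the VLLM_{phase}_QUERY_BUCKET_{p} variable name, the integer conversion, and the warning text are assumptions. Only the VLLM_PROMPT_SEQ_BUCKET_{p} fallback name and the "prompt only" condition come from the review thread.

```python
import os
import warnings


def resolve_query_bucket_flag(param, default, environ=None, phase="prompt"):
    """Hypothetical helper: resolve one linear bucketing parameter for the query dim.

    Lookup order: the query flag first, then the legacy seq flag (prompt phase
    only, since no seq/query flags are set for decode), then the default.
    """
    env = os.environ if environ is None else environ

    # Assumed naming for the new flag; the seq name is quoted in the review.
    query_env = f"VLLM_{phase}_QUERY_BUCKET_{param}".upper()
    fallback_env = f"VLLM_{phase}_SEQ_BUCKET_{param}".upper()

    val = env.get(query_env)
    if val is None and phase == "prompt":
        val = env.get(fallback_env)
        if val is not None:
            # Seq only: use the seq value, but tell the user it is going away.
            warnings.warn(
                f"{fallback_env} will be deprecated; use {query_env} instead.",
                DeprecationWarning,
                stacklevel=2,
            )

    # No flag at all: fall back to the default value.
    return default if val is None else int(val)
```

Passing a plain dict as environ (for example {"VLLM_PROMPT_SEQ_BUCKET_MIN": "128"} with param "min") makes each of the four cases easy to check without touching the real environment.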