
[CI/Build][REDO] Add is_quant_method_supported to control quantization test configurations #5466

Merged

Conversation

@mgoin mgoin commented Jun 12, 2024

There were many separate places that checked the CUDA compute capability before running quantization tests, so it is best practice to have a single function serve as the source of truth for checking support.

Instead of each quantization test having duplicated code such as:

import torch

from vllm.model_executor.layers.quantization import QUANTIZATION_METHODS

aqlm_not_supported = True

if torch.cuda.is_available():
    # get_device_capability() returns (major, minor), e.g. (8, 0) -> 80
    capability = torch.cuda.get_device_capability()
    capability = capability[0] * 10 + capability[1]
    aqlm_not_supported = (capability <
                          QUANTIZATION_METHODS["aqlm"].get_min_capability())

It can be replaced with:

from tests.quantization.utils import is_quant_method_supported

aqlm_not_supported = not is_quant_method_supported("aqlm")
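To illustrate the logic the helper centralizes, here is a minimal, torch-free sketch of the capability comparison. The function and parameter names (`capability_to_int`, `device_capability`, `min_capabilities`) are assumptions for illustration; the real helper in tests/quantization/utils.py queries torch and the QUANTIZATION_METHODS registry directly.

```python
def capability_to_int(capability):
    """Convert a (major, minor) compute capability tuple to an int,
    e.g. (8, 0) -> 80, matching the capability[0] * 10 + capability[1]
    pattern used in the duplicated test code."""
    return capability[0] * 10 + capability[1]


def is_quant_method_supported(method, device_capability, min_capabilities):
    """Return True if the device meets the method's minimum capability.

    device_capability: (major, minor) tuple, or None if no CUDA device.
    min_capabilities: mapping of method name -> minimum capability int
                      (stands in for QUANTIZATION_METHODS[...].get_min_capability()).
    """
    if device_capability is None:
        # No CUDA device available, so no GPU quantization method is supported.
        return False
    return capability_to_int(device_capability) >= min_capabilities[method]
```

For example, AQLM requiring capability 70 would be supported on an A100 (capability (8, 0) -> 80) but not on a P100 (capability (6, 0) -> 60).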

@simon-mo simon-mo enabled auto-merge (squash) June 12, 2024 21:15
@simon-mo simon-mo merged commit 23ec72f into vllm-project:main Jun 13, 2024
117 of 119 checks passed
robertgshaw2-neuralmagic pushed a commit to neuralmagic/nm-vllm that referenced this pull request Jun 16, 2024
joerunde pushed a commit to joerunde/vllm that referenced this pull request Jun 17, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Jun 27, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Jul 8, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Jul 24, 2024