[Bug]: vllm v0.5.0 internal assert failed #5450
Comments
You only have 2 GPUs, so why use tensor parallel size = 4?
@youkaichao
If you use a tensor parallel size different from the number of GPUs you have, then this is indeed a known issue. #5473 should solve it.
No, I actually run vLLM on Kubernetes. Every time I change the tensor parallel size, I adjust the number of GPUs requested to match. The environment description shows only 2 GPUs because I copied it from another issue I had raised previously, where I hit a similar problem on the same computing cluster, so I reused the environment description.
You can take a look at #6056.
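The practice described above, scaling the pod's GPU request together with the tensor parallel size, can be guarded with an early sanity check. A minimal sketch (the helper name and error message are illustrative, not part of vLLM's API):

```python
def check_tensor_parallel(tp_size: int, visible_gpus: int) -> None:
    # Fail fast when the pod was granted fewer GPUs than the
    # tensor parallel size requires (illustrative helper, not vLLM code).
    if tp_size > visible_gpus:
        raise ValueError(
            f"tensor_parallel_size={tp_size} exceeds the {visible_gpus} "
            "visible GPU(s); raise resources.limits['nvidia.com/gpu'] "
            "in the pod spec to match"
        )

# With 4 GPUs requested in the pod spec, tp_size=4 passes silently.
check_tensor_parallel(4, 4)
```

Running such a check at startup surfaces a plain configuration error instead of an opaque internal assert deep inside the engine.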
Your current environment
🐛 Describe the bug
I use vllm/vllm-openai:v0.5.0 on k8s to deploy Qwen2-72B-Instruct with tensor parallel size = 4; the args look like:
Then I got the following error:
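The reporter's actual args were not preserved in this copy of the issue. For context only, a hypothetical invocation with tensor parallel size 4 (model name and flags are assumptions, not the reporter's exact configuration):

```shell
# Hypothetical sketch; standard vLLM OpenAI-server flags as of v0.5.0.
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen2-72B-Instruct \
    --tensor-parallel-size 4
```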
This same config works normally with vllm/vllm-openai:v0.4.3.
I also tried setting tensor parallel size = 8, which produced a batch of exceptions like those in #5439; launching took a very long time, and I did not wait to see whether it started successfully.