
fix: old version ray placement will set the available GPUs to 0 for vllm #112

Merged
vanwaals merged 1 commit into main from fix/wrong_ray_placement_usage
May 8, 2026


@linqinluli
Collaborator

  1. Bug fix for TP > 1 in the sampling backend. When tensor_parallel_size > 1, vLLM spawns an EngineCore in a background process that inherits CUDA_VISIBLE_DEVICES from the parent Ray actor. Previously the actor was created with num_gpus=0, which caused Ray to set CUDA_VISIBLE_DEVICES="", so the child process saw zero GPUs and vLLM crashed with KeyError: 'bundles'. The actor now requests num_gpus=tensor_parallel_size, so Ray populates CUDA_VISIBLE_DEVICES correctly and vLLM manages its own internal placement group.
  2. Miscellaneous updates to the pre-commit configuration and .gitignore.
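
To illustrate the inheritance issue in item 1, here is a minimal sketch using only the standard library. It does not reproduce the actual change in this PR: `visible_gpus_in_child` and `sampler_actor_options` are hypothetical names introduced for illustration, and the subprocess stands in for vLLM's EngineCore process.

```python
import os
import subprocess
import sys


def visible_gpus_in_child(cuda_visible_devices: str) -> list[str]:
    """Spawn a child process and return the GPU ids it inherits.

    The child stands in for a background engine process: it only sees
    whatever CUDA_VISIBLE_DEVICES the parent passes down to it.
    """
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=cuda_visible_devices)
    child_code = "import os; print(os.environ.get('CUDA_VISIBLE_DEVICES', ''))"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    # An empty string means the child has no visible GPUs at all.
    return [dev for dev in result.stdout.strip().split(",") if dev]


def sampler_actor_options(tensor_parallel_size: int) -> dict:
    """Hypothetical helper: resource options for the Ray actor hosting vLLM.

    Requesting one GPU per tensor-parallel rank (instead of num_gpus=0)
    lets Ray populate CUDA_VISIBLE_DEVICES for the actor and its children.
    """
    return {"num_gpus": tensor_parallel_size}


if __name__ == "__main__":
    print(visible_gpus_in_child(""))     # what the child saw with num_gpus=0
    print(visible_gpus_in_child("0,1"))  # what it sees once two GPUs are assigned
    print(sampler_actor_options(2))
```

Assuming the sampling backend wraps vLLM in a Ray actor class, the options could then be applied with something like `ray.remote(**sampler_actor_options(tp_size))(SamplerActor)`, where `SamplerActor` and `tp_size` are placeholders rather than names from this repository.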

@linqinluli linqinluli requested a review from vanwaals May 8, 2026 05:58
@vanwaals vanwaals merged commit 969a74f into main May 8, 2026
3 checks passed