fix: old version ray placement will set the available GPUs to 0 for vllm by linqinluli · Pull Request #112 · agentscope-ai/TuFT

linqinluli · 2026-05-08T05:58:37Z

Bug fix for TP>1 in the sampling backend. When tensor_parallel_size > 1, vLLM spawns an EngineCore in a background process that inherits CUDA_VISIBLE_DEVICES from the parent Ray actor. Previously, the actor requested num_gpus=0, which caused Ray to set CUDA_VISIBLE_DEVICES="", so the child process saw zero GPUs and vLLM crashed with KeyError: 'bundles'. With num_gpus=tensor_parallel_size, Ray populates CUDA_VISIBLE_DEVICES correctly, and vLLM handles its own internal placement group.
Also includes minor pre-commit and .gitignore changes.
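
For reviewers who want to see the failure mode in isolation, here is a minimal stdlib-only sketch of the environment inheritance described above. It does not use Ray or vLLM: `simulated_visible_devices` is a hypothetical stand-in for what Ray does when it assigns GPUs to an actor, and `gpus_seen_by_child` stands in for the EngineCore child process. Both names and the GPU id list are assumptions made for illustration and are not taken from the TuFT code.

```python
import os
import subprocess
import sys


def simulated_visible_devices(num_gpus, free_gpu_ids):
    """Hypothetical stand-in for Ray's behavior: expose only the GPUs
    assigned to the actor. With num_gpus=0 the result is an empty string."""
    if num_gpus > len(free_gpu_ids):
        raise ValueError("not enough free GPUs for this request")
    return ",".join(str(i) for i in free_gpu_ids[:num_gpus])


def gpus_seen_by_child(cuda_visible_devices):
    """Spawn a child process (standing in for the EngineCore process) and
    count the device ids it inherits through CUDA_VISIBLE_DEVICES."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=cuda_visible_devices)
    child_code = (
        "import os\n"
        "value = os.environ.get('CUDA_VISIBLE_DEVICES', '')\n"
        "print(len([d for d in value.split(',') if d]))\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    tensor_parallel_size = 2
    free = [0, 1, 2, 3]  # assumed GPU ids on the node

    # Old behavior: num_gpus=0 leaves the child with no visible devices.
    print(gpus_seen_by_child(simulated_visible_devices(0, free)))

    # Fixed behavior: num_gpus=tensor_parallel_size exposes one GPU per rank.
    print(gpus_seen_by_child(simulated_visible_devices(tensor_parallel_size, free)))
```

The sketch only models the environment variable. The KeyError: 'bundles' itself is raised inside vLLM and is not reproduced here.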

fix: old version ray placement will set the available GPUs to 0 for v…

661a4e9

linqinluli requested a review from vanwaals May 8, 2026 05:58

vanwaals approved these changes May 8, 2026

vanwaals merged commit 969a74f into main May 8, 2026
3 checks passed