
[feat] Add NUMA-aware CPU core split for vllm worker and store threads #854

Merged
Infinite666 merged 7 commits into ModelEngine-Group:develop from wangwenxin0312:dev_fuse_op
Mar 24, 2026


@wangwenxin0312
Contributor

Purpose

Improve CPU affinity by assigning worker and store cores based on NUMA locality for CUDA and Ascend devices.

Modifications

  • Detect NUMA node from CUDA PCI info
  • Support Ascend visible-device mapping
  • Split NUMA cpulist into worker/store groups
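
The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function names (`parse_cpulist`, `numa_node_of_pci_device`, `cpus_of_numa_node`, `split_cores`) and the "reserve the last N cores for store threads" policy are assumptions made for the example. It only assumes the standard Linux sysfs files `/sys/bus/pci/devices/<bus-id>/numa_node` and `/sys/devices/system/node/node<N>/cpulist`, and a PCI bus id already in sysfs form (for example `0000:3b:00.0`).

```python
from pathlib import Path


def parse_cpulist(text: str) -> list[int]:
    """Expand a sysfs cpulist string such as "0-3,8-11" into CPU ids."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_node_of_pci_device(bus_id: str) -> int:
    """Read the NUMA node of a PCI device from sysfs; -1 means unknown."""
    path = Path("/sys/bus/pci/devices") / bus_id.lower() / "numa_node"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return -1


def cpus_of_numa_node(node: int) -> list[int]:
    """Return the CPU ids that belong to one NUMA node."""
    path = Path(f"/sys/devices/system/node/node{node}/cpulist")
    return parse_cpulist(path.read_text())


def split_cores(cpus: list[int], store_count: int) -> tuple[list[int], list[int]]:
    """Split cores into (worker, store) groups.

    Hypothetical policy: the last `store_count` cores go to store threads,
    and at least one core is always left for the worker.
    """
    store_count = max(0, min(store_count, len(cpus) - 1))
    split = len(cpus) - store_count
    return cpus[:split], cpus[split:]
```

For example, `split_cores(parse_cpulist("0-3,8-11"), 2)` yields worker cores `[0, 1, 2, 3, 8, 9]` and store cores `[10, 11]`. A caller would typically fall back to no pinning when `numa_node_of_pci_device` returns `-1`.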

Test

Enable the feature by setting the environment variable VLLM_CPU_AFFINITY (for example, VLLM_CPU_AFFINITY=1).

@wangwenxin0312 force-pushed the dev_fuse_op branch 3 times, most recently from 958e621 to 64be838 (March 24, 2026 02:51)

Copilot AI left a comment


Pull request overview

Adds optional NUMA-aware CPU affinity logic to the vLLM UCM connector so worker and (intended) store operations can be pinned to CPU cores local to the device (CUDA / Ascend), enabled via VLLM_CPU_AFFINITY=1.

Changes:

  • Add NUMA detection and core splitting logic (CUDA PCI → NUMA; Ascend visible-device mapping → NUMA fallback).
  • Attempt to split cores into worker vs. store groups and pass store cores through the connector config.
  • Bind the worker process/thread to the computed worker cores when enabled.


@wangwenxin0312 force-pushed the dev_fuse_op branch 4 times, most recently from c07b986 to 2c3a7ed (March 24, 2026 12:20)
@Infinite666 Infinite666 merged commit 415511d into ModelEngine-Group:develop Mar 24, 2026
8 checks passed
