[feat] Add NUMA-aware CPU core split for vllm worker and store threads by wangwenxin0312 · Pull Request #854 · ModelEngine-Group/unified-cache-management

wangwenxin0312 · 2026-03-23T08:10:42Z

Purpose

Improve CPU affinity by assigning worker and store cores based on NUMA locality for CUDA and Ascend devices.

Modifications

Detect NUMA node from CUDA PCI info
Support Ascend visible-device mapping
Split NUMA cpulist into worker/store groups
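
The cpulist split listed above can be sketched as follows. This is only an illustration: the function names `parse_cpulist` and `split_cores`, and the policy of reserving the last few cores for store threads, are assumptions and are not taken from the PR's code. The parser handles the plain comma-and-range form that Linux exposes (for example `0-3,8-11`).

```python
from typing import List, Tuple


def parse_cpulist(cpulist: str) -> List[int]:
    """Expand a Linux cpulist string such as '0-3,8-11' into sorted core IDs."""
    cores = set()
    for part in cpulist.strip().split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty input or a trailing comma
        if "-" in part:
            lo, hi = part.split("-", 1)
            cores.update(range(int(lo), int(hi) + 1))  # ranges are inclusive
        else:
            cores.add(int(part))
    return sorted(cores)


def split_cores(cores: List[int], store_count: int) -> Tuple[List[int], List[int]]:
    """Return (worker_cores, store_cores).

    Hypothetical policy: reserve the last `store_count` cores for store
    threads and always leave at least one core for the worker.
    """
    if store_count <= 0 or len(cores) < 2:
        return list(cores), []
    store_count = min(store_count, len(cores) - 1)
    return cores[:-store_count], cores[-store_count:]
```

For instance, `split_cores(parse_cpulist("0-3,8-11"), 2)` gives the worker six cores and the store threads the remaining two.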

Test

Enable by setting the environment variable VLLM_CPU_AFFINITY=1.
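
A minimal sketch of how such an opt-in switch might be checked. The review overview in this thread gives the enabling value as `VLLM_CPU_AFFINITY=1`; the helper name `cpu_affinity_enabled` and the choice to treat every other value as disabled are assumptions.

```python
import os
from typing import Mapping, Optional


def cpu_affinity_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when VLLM_CPU_AFFINITY is set to '1'.

    Assumption: unset, empty, or any other value leaves the feature off.
    """
    env = os.environ if env is None else env
    return env.get("VLLM_CPU_AFFINITY", "").strip() == "1"
```

Passing the environment as a parameter keeps the check easy to exercise without mutating `os.environ`.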

ucm/integration/vllm/ucm_connector.py

Copilot

Pull request overview

Adds optional NUMA-aware CPU affinity logic to the vLLM UCM connector so worker and (intended) store operations can be pinned to CPU cores local to the device (CUDA / Ascend), enabled via VLLM_CPU_AFFINITY=1.

Changes:

Add NUMA detection and core splitting logic (CUDA PCI → NUMA; Ascend visible-device mapping → NUMA fallback).
Attempt to split cores into worker vs. store groups and pass store cores through the connector config.
Bind the worker process/thread to the computed worker cores when enabled.
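
The detection and binding steps above could look roughly like the sketch below on Linux. It relies on two standard sysfs files (`/sys/bus/pci/devices/<id>/numa_node` and `/sys/devices/system/node/node<N>/cpulist`) and on `os.sched_setaffinity`. All function names, the `sysfs_root` parameter, and the bus-ID normalization step are assumptions for illustration; the PR's actual implementation is not shown in this thread.

```python
import os
from pathlib import Path
from typing import Optional, Set


def normalize_pci_bus_id(bus_id: str) -> str:
    """Convert an ID such as '00000000:3B:00.0' to the sysfs form '0000:3b:00.0'."""
    domain, rest = bus_id.strip().lower().split(":", 1)
    return f"{domain[-4:].zfill(4)}:{rest}"


def numa_node_of_pci_device(bus_id: str, sysfs_root: str = "/sys") -> Optional[int]:
    """Read the NUMA node of a PCI device; None if unknown or unreadable."""
    try:
        dev = normalize_pci_bus_id(bus_id)
        path = Path(sysfs_root) / "bus/pci/devices" / dev / "numa_node"
        node = int(path.read_text().strip())
    except (OSError, ValueError):
        return None
    return node if node >= 0 else None  # the kernel reports -1 when unknown


def cpulist_of_numa_node(node: int, sysfs_root: str = "/sys") -> Optional[str]:
    """Return the node's cpulist string (e.g. '16-31'), or None if unreadable."""
    path = Path(sysfs_root) / "devices/system/node" / f"node{node}" / "cpulist"
    try:
        return path.read_text().strip()
    except OSError:
        return None


def bind_current_process(cores: Set[int]) -> bool:
    """Pin the calling process/thread to `cores`; False if skipped.

    os.sched_setaffinity is only available on some platforms (e.g. Linux)
    and raises OSError if none of the requested cores is usable.
    """
    if not cores or not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, cores)  # pid 0 means the calling thread
    return True
```

Returning `None` instead of raising lets a caller fall back to the default affinity when the NUMA node cannot be determined.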


ucm/integration/vllm/ucm_connector.py

wangwenxin0312 force-pushed the dev_fuse_op branch from f5324a0 to 8001c07 March 23, 2026 08:23

wangwenxin0312 requested review from Infinite666, harrisonyhq, mag1c-h, qyh111, wuhuxiao and ygwpz as code owners March 23, 2026 08:23

wangwenxin0312 force-pushed the dev_fuse_op branch from 8001c07 to c7ef0ba March 23, 2026 08:33

mag1c-h reviewed Mar 23, 2026


ucm/integration/vllm/ucm_connector.py (resolved)

wangwenxin0312 force-pushed the dev_fuse_op branch 3 times, most recently from 958e621 to 64be838 March 24, 2026 02:51

Infinite666 reviewed Mar 24, 2026


ucm/integration/vllm/ucm_connector.py (outdated, resolved)

ucm/integration/vllm/ucm_connector.py (outdated, resolved)

wangwenxin0312 force-pushed the dev_fuse_op branch from 64be838 to 040addf March 24, 2026 04:39

yuanzhg078 requested a review from Copilot March 24, 2026 06:27

Copilot started reviewing on behalf of yuanzhg078 March 24, 2026 06:27

Copilot AI reviewed Mar 24, 2026


wangwenxin0312 force-pushed the dev_fuse_op branch from 24daf60 to e2ce546 March 24, 2026 07:22

qyh111 reviewed Mar 24, 2026


ucm/integration/vllm/ucm_connector.py (outdated, resolved)

ucm/integration/vllm/ucm_connector.py (outdated, resolved)

wangwenxin0312 force-pushed the dev_fuse_op branch 4 times, most recently from c07b986 to 2c3a7ed March 24, 2026 12:20

wangwenxin0312 added 7 commits March 24, 2026 20:36

CPU core split for vllm worker and store

ef6c26d

npu path fix

8ae97b5

bugfix

63a10d2

delete is_cuda

98d6fde

refactor

7a97880

add VLLM_CPU_AFFINITY

881dc69

refactor v2

caf8823

wangwenxin0312 force-pushed the dev_fuse_op branch from e9ea313 to caf8823 March 24, 2026 12:36

Infinite666 approved these changes Mar 24, 2026


Infinite666 merged commit 415511d into ModelEngine-Group:develop Mar 24, 2026
8 checks passed

Infinite666 merged 7 commits into ModelEngine-Group:develop from wangwenxin0312:dev_fuse_op

5 participants
