[Core] Enhance Ray node support multi-accelerator labels #58278

noemotiovon · 2025-10-29T08:30:10Z

Description

- _get_current_node_accelerators now detects and returns all accelerator types per node, not just a single (AcceleratorManager, count) tuple.
- Ray node labels now support multiple accelerator managers.
- Default labels from all accelerators are merged.
- Conflicts between accelerator default labels are logged.
- User-specified and autoscaler labels are merged with default accelerator labels, with warnings on overrides.

This improves Ray’s handling of heterogeneous accelerator nodes and enables more flexible scheduling.
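
The merge behavior described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code in this PR: the function names `merge_accelerator_labels` and `resolve_node_labels`, the label keys and values, the "first value wins" conflict rule, and the precedence order (autoscaler labels applied first, then user labels) are all assumptions made for the example.

```python
import logging
from typing import Dict, List

logger = logging.getLogger(__name__)


def merge_accelerator_labels(per_accelerator: List[Dict[str, str]]) -> Dict[str, str]:
    """Merge default labels from several accelerators on one node.

    Assumption for this sketch: on a conflicting key the first value seen
    is kept and the conflict is logged.
    """
    merged: Dict[str, str] = {}
    for labels in per_accelerator:
        for key, value in labels.items():
            if key in merged and merged[key] != value:
                logger.warning(
                    "Conflicting default label %r: keeping %r, ignoring %r",
                    key, merged[key], value,
                )
                continue
            merged[key] = value
    return merged


def resolve_node_labels(
    default_labels: Dict[str, str],
    autoscaler_labels: Dict[str, str],
    user_labels: Dict[str, str],
) -> Dict[str, str]:
    """Layer autoscaler and user labels over the accelerator defaults.

    Assumption for this sketch: later sources override earlier ones,
    and every override of an existing value is logged as a warning.
    """
    resolved = dict(default_labels)
    for source, labels in (("autoscaler", autoscaler_labels), ("user", user_labels)):
        for key, value in labels.items():
            if key in resolved and resolved[key] != value:
                logger.warning(
                    "%s label %r=%r overrides existing value %r",
                    source, key, value, resolved[key],
                )
            resolved[key] = value
    return resolved


if __name__ == "__main__":
    # Hypothetical per-accelerator default labels on a heterogeneous node.
    defaults = merge_accelerator_labels(
        [
            {"ray.io/accelerator-type": "A100"},
            {"ray.io/accelerator-type": "TPU-V4", "example/topology": "2x2"},
        ]
    )
    print(resolve_node_labels(defaults, {"example/topology": "4x4"}, {}))
```

Keeping the two steps separate means a conflict between accelerator defaults can be told apart from an explicit override in the logs.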

Related issues

Related to #58206


noemotiovon · 2025-10-29T08:32:47Z

@ryanaoleary, could you please help me check whether this PR is reasonable? It’s an enhancement to PR #53360.

ryanaoleary · 2025-10-29T10:14:16Z

> @ryanaoleary, could you please help me check whether this PR is reasonable? It’s an enhancement to PR #53360.

Is it possible for a Ray node to have multiple accelerator managers? My understanding was that we would only ever detect one accelerator type (and therefore one AcceleratorManager) per node. This PR seems reasonable to me if there's a use case for it; I'm just not sure I understand when it would occur.

noemotiovon · 2025-10-30T01:28:57Z

Yeah, as far as I know, a Ray node typically only has one type of accelerator, so there should only be a single AcceleratorManager per node. However, I’ve seen some projects use Ray in a way where they describe other resources using the GPU resource count — which is technically an incorrect usage. That said, if we allow a node to register multiple accelerator tags, it would incidentally support their use case as well.
