dstrodtman
Contributor

Why are these changes needed?

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@dstrodtman dstrodtman requested a review from a team as a code owner July 1, 2025 20:28

This pull request has been automatically marked as stale because it has not had
any activity for 14 days. It will be closed in another 14 days if no further activity occurs.
Thank you for your contributions.

You can always ask for help on our discussion forum or Ray's public slack channel.

If you'd like to keep this open, just leave any comment, and the stale label will be removed.

@github-actions github-actions bot added the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Jul 16, 2025

This pull request has been automatically closed because there has been no more activity in the 14 days
since being marked stale.

Please feel free to reopen or open a new pull request if you'd still like this to be addressed.

Again, you can always ask for help on our discussion forum or Ray's public slack channel.

Thanks again for your contribution!

@github-actions github-actions bot closed this Jul 30, 2025
@MengjinYan
Contributor

Not stale

@MengjinYan MengjinYan reopened this Aug 14, 2025
@MengjinYan MengjinYan requested a review from a team as a code owner August 14, 2025 18:30
@github-actions github-actions bot added unstale A PR that has been marked unstale. It will not get marked stale again if this label is on it. and removed stale The issue is stale. It will be closed within 7 days unless there are further conversation labels Aug 15, 2025
@ray-gardener ray-gardener bot added the core Issues that should be addressed in Ray Core label Aug 15, 2025

- NodeAffinitySchedulingStrategy when `soft=false`. Use the default `ray.io/node-id` label instead.
- The `accelerator_type` option for tasks and actors. Use the default `ray.io/accelerator-type` label instead.
- Custom resources such as the `special_hardware` pattern. Use custom labels instead.
Contributor

Nit: Just to be more specific, the pattern that we want to replace is mainly using custom resources as labels, mentioned in the section here.

Pointing this out mainly because the custom resource can still be useful when the resources are used as resources with a quantity attached.

Contributor Author

Updating this to call out this behavior AND change the existing resources docs to point to labels in this instance.
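
The replacements listed in this thread can be sketched as a small translation helper. This is an illustrative sketch only: `to_label_selector` and the `special-hardware` label are hypothetical names, and the sketch assumes the resulting dict is later passed to a task or actor as a label selector. Custom resources that carry a real quantity are left out on purpose, since those remain valid as resources.

```python
from typing import Dict, Optional


def to_label_selector(
    node_id: Optional[str] = None,
    accelerator_type: Optional[str] = None,
    custom_labels: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Build a label selector dict from legacy-style scheduling options.

    Hypothetical helper for illustration; it is not part of Ray.
    """
    selector: Dict[str, str] = {}
    if node_id is not None:
        # Stands in for NodeAffinitySchedulingStrategy with soft=false.
        selector["ray.io/node-id"] = node_id
    if accelerator_type is not None:
        # Stands in for the accelerator_type option.
        selector["ray.io/accelerator-type"] = accelerator_type
    if custom_labels:
        # Stands in for custom resources that were only used as markers.
        selector.update(custom_labels)
    return selector


if __name__ == "__main__":
    print(to_label_selector(
        accelerator_type="A100",
        custom_labels={"special-hardware": "true"},  # hypothetical label
    ))
```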

Contributor

@MengjinYan MengjinYan left a comment

Added one more update regarding the autoscaler support.



You can add custom labels to your nodes using the `--labels` or `--labels-file` parameter when running `ray start`. See the following examples:

<!-- INSERT EXAMPLES -->
Contributor Author

@janetli19 @MengjinYan Still haven't really seen much here beyond what you can accomplish with defaults.

Maybe an example where we label something like CPU size so users can route tasks to specific VM types in a heterogeneous CPU environment?

Could use help coding something up as I'm pretty novice at Ray.

Contributor Author

Just noticed that's exactly the example we give below. So here we just need to show how to define the config for that value?

Contributor

I think one example we can give is to set a CPU family label to indicate that the node has an AMD CPU.

# Start a head node with label name "cpu-family" & value "amd" 
ray start --head --labels="cpu-family=amd"
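
To make the `key=value` shape of the `--labels` value concrete, here is a minimal parsing sketch. It is not Ray's parser: `parse_labels` is a hypothetical helper, and the comma-separated form for multiple labels is an assumption that this thread does not confirm.

```python
from typing import Dict


def parse_labels(labels: str) -> Dict[str, str]:
    """Parse a string such as "cpu-family=amd" into a dict.

    Assumes multiple labels are comma-separated (an assumption,
    not confirmed by this thread).
    """
    result: Dict[str, str] = {}
    for pair in labels.split(","):
        pair = pair.strip()
        if not pair:
            continue  # Skip empty segments such as a trailing comma.
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"Expected key=value, got {pair!r}")
        result[key.strip()] = value.strip()
    return result


if __name__ == "__main__":
    print(parse_labels("cpu-family=amd"))
```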

@dstrodtman dstrodtman added go add ONLY when ready to merge, run all tests and removed go add ONLY when ready to merge, run all tests labels Sep 25, 2025
cursor[bot]

This comment was marked as outdated.

@dstrodtman dstrodtman added the go add ONLY when ready to merge, run all tests label Sep 25, 2025
Contributor

@MengjinYan MengjinYan left a comment

Just one last nit comment and a missed code example.


(defaults)=
## Default node labels

Contributor

Per the offline discussion, we might want to add a disclaimer similar to the following:

Ray reserves all labels in the ray.io and anyscale.io namespaces.

Collaborator

anyscale.com?

Collaborator

should be anyscale.com not io

why would Ray reserve anyscale.com labels though?

Contributor

Actually, that's a good point. From Ray's perspective, we should only reserve ray.io namespace.

Collaborator

this reminds me we should add validation that users can't set ray.io/ labels if we haven't already @ryanaoleary

Contributor

@edoakes We previously had that for NodeLabelSchedulingPolic here, but we loosened this restriction for labels passed in with the new API (--labels and --labels-file) since I think we discussed passing labels from KubeRay with the ray.io/ prefix in the rayStartParams for autoscaling. For ray-project/kuberay#4106, should we require that labels passed to the top-level Labels field can't begin with ray.io/?
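
The validation discussed in this thread could look roughly like the following sketch. It is not Ray's or KubeRay's implementation: `validate_user_labels` and its `allow_reserved` flag are hypothetical, with the flag standing in for the loosened path where trusted callers may pass `ray.io/` labels.

```python
from typing import Dict

RESERVED_PREFIX = "ray.io/"


def validate_user_labels(
    labels: Dict[str, str], allow_reserved: bool = False
) -> Dict[str, str]:
    """Reject user labels in the reserved namespace unless explicitly allowed.

    Hypothetical helper for illustration only.
    """
    reserved = sorted(key for key in labels if key.startswith(RESERVED_PREFIX))
    if reserved and not allow_reserved:
        raise ValueError(
            f"Labels with the {RESERVED_PREFIX!r} prefix are reserved: {reserved}"
        )
    return labels


if __name__ == "__main__":
    print(validate_user_labels({"cpu-family": "amd"}))
```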

Contributor Author

@MengjinYan LMK if this needs to happen for this release or you want to add later.

Contributor

@MengjinYan MengjinYan Oct 3, 2025

@dstrodtman I think for this release we should still say that ray.io is reserved, but we probably don't need to mention that users shouldn't set ray.io labels if we haven't set them already.

Autoscaler V2 supports label-based scheduling. To enable the autoscaler to scale up nodes to fulfill label requirements, create a worker group for each combination of label requirements and specify all the corresponding labels in the `rayStartParams` field in the Ray cluster configuration. For example:

```{python}
rayStartParams: {
```
Contributor

@ryanaoleary ryanaoleary Oct 3, 2025

When ray-project/kuberay#4106 is merged, we can direct users on KubeRay v1.5+ to set their desired labels in the top-level Labels field under the worker or head group. For now, rayStartParams is the only option.
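
The idea of one worker group per label combination can be sketched in a few lines. This is only a conceptual model, not the autoscaler's algorithm: the group names, the `intel` value, and `groups_satisfying` are hypothetical, and matching is simplified to exact equality on every key in the selector.

```python
from typing import Dict, List

# Hypothetical worker groups, each mapped to the labels its nodes start with.
WORKER_GROUPS: Dict[str, Dict[str, str]] = {
    "amd-cpu-group": {"cpu-family": "amd"},
    "intel-cpu-group": {"cpu-family": "intel"},
}


def groups_satisfying(
    selector: Dict[str, str], groups: Dict[str, Dict[str, str]]
) -> List[str]:
    """Return names of groups whose labels match every key in the selector."""
    return [
        name
        for name, labels in groups.items()
        if all(labels.get(key) == value for key, value in selector.items())
    ]


if __name__ == "__main__":
    print(groups_satisfying({"cpu-family": "amd"}, WORKER_GROUPS))
```

A selector that no group satisfies returns an empty list, which is why each label combination that workloads request needs its own group.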

Contributor Author

@MengjinYan LMK if this needs to happen for this release or you want to add later.

Contributor

@dstrodtman I think we can add it later.

Contributor

@ryanaoleary ryanaoleary left a comment

Read through and everything LGTM!

@dstrodtman
Contributor Author

This is passing in my local build, and I think it's ready to merge.

Note there's an open PR to resolve some docs issues that might fail some tests here: #57163

@aslonnie aslonnie removed request for a team and aslonnie October 6, 2025 20:37
Contributor

@MengjinYan MengjinYan left a comment

One last suggestion on the namespace reservation. Thanks for fixing the build error!

@MengjinYan
Contributor

@jjyao @edoakes @angelinalg ping to merge

@jjyao jjyao merged commit 44a9732 into master Oct 8, 2025
6 checks passed
@jjyao jjyao deleted the doc-127-ray-labels branch October 8, 2025 04:56
MengjinYan pushed a commit to MengjinYan/ray that referenced this pull request Oct 8, 2025
aslonnie pushed a commit that referenced this pull request Oct 9, 2025
(cherry picked from commit 44a9732)

Co-authored-by: Douglas Strodtman <[email protected]>
liulehui pushed a commit to liulehui/ray that referenced this pull request Oct 9, 2025
joshkodi pushed a commit to joshkodi/ray that referenced this pull request Oct 13, 2025
Labels

core Issues that should be addressed in Ray Core go add ONLY when ready to merge, run all tests unstale A PR that has been marked unstale. It will not get marked stale again if this label is on it.
