RayJob Volcano Integration #3972
Conversation
)

const (
	PodGroupName = "podgroups.scheduling.volcano.sh"
This variable is unused and can be removed, I think.
I am wondering if we should separate this PR into two: one for the interface migration and one for the RayJob Volcano integration. This would make both review and rollback easier IMO.
Got it! I will change this PR to focus on the RayJob Volcano integration, and will create a separate PR for the interface migration.
Created an issue for the interface migration.
}

// handleRayJob calculates the PodGroup MinMember and MinResources for a RayJob.
// The submitter pod is intentionally excluded from MinMember calculation.
Do you mind elaborating on why we want to exclude the submitter pod? Also, if we exclude the submitter pod from resource calculation, what is the main goal of this RayJob integration?
The submitter pod needs to wait until all RayCluster pods are ready before it is created. However, this leads to a minMember mismatch in the PodGroup, causing the RayCluster to remain in a Pending state.
As a temporary workaround, I’ve configured the RayJob to use the same PodGroup settings as the RayCluster.
So, currently gang scheduling does not consider the submitter pod? (If I am right, it doesn't consider the submitter pod's resources, according to the PodGroup spec below.)
$ k describe podgroup ray-rayjob-sample-1-pg
Name:         ray-rayjob-sample-1-pg
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  scheduling.volcano.sh/v1beta1
Kind:         PodGroup
Spec:
  Min Member:  3
  Min Resources:
    Cpu:     3
    Memory:  4Gi
  Queue:  kuberay-test-queue
I think even though minMember mismatches, we could still calculate the correct minResources so that gang scheduling can take the submitter's resource requirement into account.
In other words, could minMember = worker num + 1 (head) and minResources = worker resources + head resources + submitter resources solve the problem?
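For illustration, here is a minimal sketch of that calculation, written as if it lived next to util.go and reusing CalculatePodResource and sumResourceList from there. The helper name is hypothetical, the RayJobSpec fields used (RayClusterSpec, SubmitterPodTemplate) are assumptions about the API, and the actual change in a96d3a4 may differ:

package utils

import (
	corev1 "k8s.io/api/core/v1"

	rayv1 "github.com/ray-project/kuberay/ray-operator/apis/ray/v1"
)

// calculateRayJobPodGroupParams is a hypothetical helper illustrating the
// suggestion above: MinMember counts only the Ray pods (head + workers),
// because the submitter pod is created only after the RayCluster is ready and
// counting it would leave the gang permanently one member short, while
// MinResources still reserves room for the submitter pod.
func calculateRayJobPodGroupParams(rayJob *rayv1.RayJob) (int32, corev1.ResourceList) {
	clusterSpec := rayJob.Spec.RayClusterSpec

	minMember := int32(1) // head pod
	resources := []corev1.ResourceList{CalculatePodResource(clusterSpec.HeadGroupSpec.Template.Spec)}

	// Multi-host worker groups (NumOfHosts) are ignored here for brevity.
	for _, nodeGroup := range clusterSpec.WorkerGroupSpecs {
		if nodeGroup.Replicas == nil {
			continue
		}
		podResource := CalculatePodResource(nodeGroup.Template.Spec)
		minMember += *nodeGroup.Replicas
		for i := int32(0); i < *nodeGroup.Replicas; i++ {
			resources = append(resources, podResource)
		}
	}

	// Submitter pod: excluded from MinMember, included in MinResources.
	if rayJob.Spec.SubmitterPodTemplate != nil {
		resources = append(resources, CalculatePodResource(rayJob.Spec.SubmitterPodTemplate.Spec))
	}
	return minMember, sumResourceList(resources)
}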
Sure, thanks for your suggestion! Updated in a96d3a4.
$ k get podgroup ray-rayjob-sample-0-pg -o yaml
apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
metadata:
  creationTimestamp: "2025-09-25T15:16:14Z"
  generation: 3
  name: ray-rayjob-sample-0-pg
  namespace: default
  ownerReferences:
  - apiVersion: ray.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: RayJob
    name: rayjob-sample-0
    uid: e7652cc7-7593-4bd1-8ab1-bc043e62d7e5
  resourceVersion: "8779"
  uid: 84247ace-fcb5-4bce-9e18-b33e3769b941
spec:
  minMember: 3
  minResources:
    cpu: 3500m
    memory: 4296Mi
  queue: kuberay-test-queue
status:
  conditions:
  - lastTransitionTime: "2025-09-25T15:16:15Z"
    reason: tasks in gang are ready to be scheduled
    status: "True"
    transitionID: 6ccaf1db-e4f6-4cfa-ad71-f3abf039e03c
    type: Scheduled
  phase: Running
  running: 1
}

	return v.syncPodGroup(ctx, app, minMember, totalResource)
	totalResourceList := []corev1.ResourceList{{}}
Out of curiosity, why is there an empty ResourceList?
It is passed to sumResourceList(list []corev1.ResourceList) corev1.ResourceList to calculate the total required resources.
However, there is no name or quantity in an empty ResourceList. It just skips the inner loop and moves on to the next element in the slice.
Oh, you are right! I was just following the code from util.go (kuberay/ray-operator/controllers/ray/utils/util.go, lines 429 to 458 in eb66a26):
func CalculateDesiredResources(cluster *rayv1.RayCluster) corev1.ResourceList {
	desiredResourcesList := []corev1.ResourceList{{}}
	headPodResource := CalculatePodResource(cluster.Spec.HeadGroupSpec.Template.Spec)
	desiredResourcesList = append(desiredResourcesList, headPodResource)
	for _, nodeGroup := range cluster.Spec.WorkerGroupSpecs {
		if nodeGroup.Suspend != nil && *nodeGroup.Suspend {
			continue
		}
		podResource := CalculatePodResource(nodeGroup.Template.Spec)
		calculateReplicaResource(&podResource, nodeGroup.NumOfHosts)
		for i := int32(0); i < *nodeGroup.Replicas; i++ {
			desiredResourcesList = append(desiredResourcesList, podResource)
		}
	}
	return sumResourceList(desiredResourcesList)
}

func CalculateMinResources(cluster *rayv1.RayCluster) corev1.ResourceList {
	minResourcesList := []corev1.ResourceList{{}}
	headPodResource := CalculatePodResource(cluster.Spec.HeadGroupSpec.Template.Spec)
	minResourcesList = append(minResourcesList, headPodResource)
	for _, nodeGroup := range cluster.Spec.WorkerGroupSpecs {
		podResource := CalculatePodResource(nodeGroup.Template.Spec)
		calculateReplicaResource(&podResource, nodeGroup.NumOfHosts)
		for i := int32(0); i < *nodeGroup.MinReplicas; i++ {
			minResourcesList = append(minResourcesList, podResource)
		}
	}
	return sumResourceList(minResourcesList)
}
I think we can open a follow-up PR to clean up these redundant empty ResourceList initializations.
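For reference, a self-contained sketch of how such a sum behaves (the real sumResourceList in util.go may differ in details): an empty ResourceList is a map with no entries, so it contributes zero iterations of the inner loop, and seeding the slice with []corev1.ResourceList{{}} is a no-op compared to starting from an empty slice.

package utils

import corev1 "k8s.io/api/core/v1"

// sumResourceListSketch adds up every quantity in the given ResourceLists.
// The empty ResourceList {} has no keys, so ranging over it does nothing;
// it neither adds resources nor affects the result.
func sumResourceListSketch(list []corev1.ResourceList) corev1.ResourceList {
	total := corev1.ResourceList{}
	for _, rl := range list {
		for name, quantity := range rl {
			if existing, ok := total[name]; ok {
				existing.Add(quantity)
				total[name] = existing
			} else {
				total[name] = quantity.DeepCopy()
			}
		}
	}
	return total
}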
Why are these changes needed?
- RayJob Volcano support: Adds Volcano scheduler support for the RayJob CRD (a usage sketch follows after this list).
- Gang scheduling: Ensures Ray pods and the submitter pod are scheduled together as a unit, preventing partial scheduling issues.
- E2E tests (volcano): PodGroup, Queue, Testing RayJob HTTPMode.
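For illustration, a minimal sketch of how a RayJob might opt into Volcano scheduling. It assumes this PR reuses the same labels as the existing RayCluster Volcano integration (ray.io/scheduler-name and volcano.sh/queue-name); the PR's actual user-facing configuration may differ.

package main

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	rayv1 "github.com/ray-project/kuberay/ray-operator/apis/ray/v1"
)

// newVolcanoRayJob builds a RayJob labeled for Volcano. The label keys are
// assumed to mirror the RayCluster Volcano integration and are not confirmed
// by this PR.
func newVolcanoRayJob() *rayv1.RayJob {
	return &rayv1.RayJob{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "rayjob-sample",
			Namespace: "default",
			Labels: map[string]string{
				"ray.io/scheduler-name": "volcano",            // assumed label key
				"volcano.sh/queue-name": "kuberay-test-queue", // queue used in the example above
			},
		},
		// Spec (entrypoint, RayClusterSpec, submitter template, ...) omitted.
	}
}

func main() {
	_ = newVolcanoRayJob()
}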
Related issue number
Closes #1580