What you would like to be added?
When handling creation events for large-scale jobs, such as PyTorch jobs, use multiple goroutines to call createPod and createSvc concurrently. This can reduce the time a job takes to create its Kubernetes (k8s) resources.
Why is this needed?
If a large-scale job has many pods, creating its pods and services (svc) one after another takes a long time, which seriously delays the startup of a large-scale training job.
Love this feature?
Give it a 👍 We prioritize the features with most 👍
lishangyuzi changed the title from "Optimize the time for creating pods and services when creating new Jobs." to "Optimize the time for creating pods and services when creating new Job." on Dec 23, 2024