Releases: kubernetes-sigs/kueue

Kueue v0.10.0

16 Dec 11:42
v0.10.0
5ff057f

Changes since v0.9.0:

Urgent Upgrade Notes

(No, really, you MUST read this before you upgrade)

  • PodSets for RayJobs now account for the submitter Job when spec.submissionMode=k8sJob is used.

    If you used the RayJob integration, you may need to revisit your quota settings,
    because Kueue now accounts for the resources required by the KubeRay submitter Job
    when spec.submissionMode=k8sJob is used (by default 500m CPU and 200Mi memory);
    see the quota sketch after these notes. (#3729, @andrewsykim)

  • Removed the v1alpha1 Visibility API.

    The v1alpha1 Visibility API is deprecated. Please use v1beta1 instead. (#3499, @mbobrovskyi)

  • The InactiveWorkload reason for the Evicted condition is renamed to Deactivated.
    Also, the reasons for more detailed situations are renamed:
    • InactiveWorkloadAdmissionCheck -> DeactivatedDueToAdmissionCheck
    • InactiveWorkloadRequeuingLimitExceeded -> DeactivatedDueToRequeuingLimitExceeded

If you were watching for the "InactiveWorkload" reason in the "Evicted" condition, you need
to start watching for the "Deactivated" reason; a query sketch follows these notes. (#3593, @mbobrovskyi)
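
The RayJob note above implies a fixed per-workload overhead for the submitter Job. As a rough,
hypothetical sketch (queue and flavor names are made up), a ClusterQueue that should admit up
to 10 such RayJobs concurrently needs roughly an extra 5 CPU (10 x 500m) and 2000Mi of memory:

kubectl apply -f - <<EOF
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: ray-queue               # hypothetical ClusterQueue
spec:
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: default-flavor      # hypothetical ResourceFlavor
      resources:
      - name: cpu
        nominalQuota: 45        # was 40; +10 x 500m for submitter Jobs
      - name: memory
        nominalQuota: 82Gi      # was 80Gi; +10 x 200Mi, rounded up
EOF

For the renamed eviction reason, a minimal query for the new "Deactivated" reason with kubectl
and jq (assuming jq is available) could be:

kubectl get workloads -A -o json \
  | jq '.items[] | select(any(.status.conditions[]?; .type=="Evicted" and .reason=="Deactivated")) | .metadata.name'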

Changes by Kind

Feature

  • Adds a managedJobsNamespaceSelector to the Kueue configuration that enables namespace-based control of whether Jobs submitted without a kueue.x-k8s.io/queue-name label are managed by Kueue, for all supported Job kinds; a configuration sketch follows this list. (#3712, @dgrove-oss)
  • Allow mutating the queue-name label for non-running Deployments. (#3528, @mbobrovskyi)
  • Allowed StatefulSets to scale down to zero and scale up from zero. (#3487, @mbobrovskyi)
  • Extend the GenericJob interface to allow implementations of custom Job CRDs to use
    Topology-Aware Scheduling with rank-based ordering. (#3704, @PBundyra)
  • Introduce an alpha feature, behind the LocalQueueMetrics feature gate, which exposes the following Prometheus metrics for LocalQueues:
    local_queue_pending_workloads
    local_queue_quota_reserved_workloads_total
    local_queue_quota_reserved_wait_time_seconds
    local_queue_admitted_workloads_total
    local_queue_admission_wait_time_seconds
    local_queue_admission_checks_wait_time_seconds
    local_queue_evicted_workloads_total
    local_queue_reserving_active_workloads
    local_queue_admitted_active_workloads
    local_queue_status
    local_queue_resource_reservation
    local_queue_resource_usage (#3673, @KPostOffice)
  • Introduce LocalQueue defaulting, enabled by the LocalQueueDefaulting feature gate.
    When a new workload is created without the "queue-name" label, and a LocalQueue
    named "default" exists in the workload's namespace, the "queue-name" label is
    defaulted to "default"; see the manifest sketch after this list. (#3610, @yaroslava-serdiuk)
  • Kueue-viz: a dashboard for Kueue (#3727, @akram)
  • Optimize the size of the Workload object when Topology-Aware Scheduling is used and
    kubernetes.io/hostname is defined as the lowest topology level. In that case, the TopologyAssignment
    in the Workload's status contains a value only for this label, rather than for all defined levels. (#3677, @PBundyra)
  • Promote MultiplePreemptions feature gate to stable, and drop the legacy preemption logic. (#3602, @gabesaba)
  • Promoted ConfigurableResourceTransformations and WorkloadResourceRequestsSummary to Beta and enabled by default. (#3616, @dgrove-oss)
  • A ResourceFlavor spec that defines topologyName is no longer immutable. (#3738, @PBundyra)
  • Respect node taints in Topology-Aware Scheduling when the lowest topology level is kubernetes.io/hostname. (#3678, @mimowo)
  • Support the .featureGates field in the configuration API to enable and disable Kueue features; see the configuration sketch after this list. (#3805, @kannon92)
  • Support rank-based ordering of Pods with Topology-Aware Scheduling.
    The Pod indexes are determined based on the "kueue.x-k8s.io/pod-group-index" label which
    can be set by an external controller managing the group. (#3649, @PBundyra)
  • TAS: Support rank-based ordering for StatefulSet. (#3751, @mbobrovskyi)
  • TAS: A ClusterQueue referencing a Topology is deactivated if the Topology does not exist. (#3770, @mimowo)
  • TAS: support rank-based ordering for JobSet (#3591, @mimowo)
  • TAS: support rank-based ordering for Kubeflow (#3604, @mbobrovskyi)
  • TAS: support rank-ordering of Pods for the Kubernetes batch Job. (#3539, @mimowo)
  • TAS: validate that kubernetes.io/hostname can only be at the lowest level (#3714, @mbobrovskyi)
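
A minimal sketch of the two configuration knobs above (managedJobsNamespaceSelector and
.featureGates), as they might appear in the Kueue Configuration (for example in the manager's
config file); the selector values are made up, and the exact shape of featureGates is an
assumption for illustration:

apiVersion: config.kueue.x-k8s.io/v1beta1
kind: Configuration
manageJobsWithoutQueueName: true   # manage unlabeled Jobs...
managedJobsNamespaceSelector:      # ...but only in namespaces matching this selector
  matchLabels:
    kueue-managed: "true"          # hypothetical namespace label
featureGates:                      # assumed map of gate name to bool
  LocalQueueMetrics: true
  LocalQueueDefaulting: true

And a sketch of LocalQueue defaulting: with the LocalQueueDefaulting gate enabled, creating a
LocalQueue named "default" makes unlabeled workloads in its namespace use it (namespace and
ClusterQueue names below are hypothetical):

kubectl apply -f - <<EOF
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: default           # the special name picked up by LocalQueueDefaulting
  namespace: team-a       # hypothetical namespace
spec:
  clusterQueue: cluster-queue
EOF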

Bug or Regression

  • Added validation for the Deployment queue-name label to fail fast. (#3555, @mbobrovskyi)
  • Added validation for the StatefulSet queue-name label to fail fast. (#3575, @mbobrovskyi)
  • Change, and in some scenarios fix, the status message displayed to the user when a workload doesn't fit in the available capacity. (#3536, @gabesaba)
  • Determine borrowing more accurately, allowing preempting workloads that fit within nominal quota to be scheduled faster. (#3547, @gabesaba)
  • Fix Kueue crashing when the node for an admitted workload is deleted. (#3715, @mimowo)
  • Fix a bug which occasionally prevented updates to the PodTemplate of the Job on the management cluster
    when starting a Job (e.g. updating nodeSelectors), when the MultiKueueBatchJobWithManagedBy feature gate is enabled. (#3685, @IrvingMg)
  • Fix accounting for usage coming from TAS workloads using multiple resources. The usage was multiplied
    by the number of resources requested by a workload, which could result in under-utilization of the cluster.
    It also manifested itself in the message in the workload status which could contain negative numbers. (#3490, @mimowo)
  • Fix computing the topology assignment for workloads using multiple PodSets requesting the same
    topology. In particular, it was possible for the set of topology domains in the assignment to be empty,
    and as a consequence the pods would remain gated forever as the TopologyUngater would not have
    topology assignment information. (#3514, @mimowo)
  • Fix dropping of reconcile requests for non-leading replica, which was resulting in workloads
    getting stuck pending after the rolling restart of Kueue. (#3612, @mimowo)
  • Fix memory leak due to workload entries left in MultiKueue cache. The leak affects the 0.9.0 and 0.9.1
    releases which enable MultiKueue by default, even if MultiKueue is not explicitly used on the cluster. (#3835, @mimowo)
  • Fix misleading log messages from workload_controller indicating a non-existing LocalQueue or
    ClusterQueue. For example, "LocalQueue for workload didn't exist or not active; ignored for now"
    could also be logged when the ClusterQueue does not exist. (#3605, @7h3-3mp7y-m4n)
  • Fix preemption when using Hierarchical Cohorts by considering as preemption candidates workloads
    from ClusterQueues located further in the hierarchy tree than direct siblings. (#3691, @gabesaba)
  • Fix running Jobs when parallelism < completions; before the fix, the replacement Pods for the successfully
    completed Pods were not ungated. (#3559, @mimowo)
  • Fix scheduling in TAS by considering tolerations specified in the ResourceFlavor. (#3723, @mimowo)
  • Fix scheduling of workload which does not include the toleration for the taint in ResourceFlavor's spec.nodeTaints,
    if the toleration is specified on the ResourceFlavor itself. (#3722, @PBundyra)
  • Fix the bug which prevented the use of MultiKueue when a CRD is not installed and its
    integration is removed from the list of enabled integrations. (#3603, @mszadkow)
  • Fix the flow of deactivation for workloads due to rejected AdmissionChecks.
    Now, all AdmissionChecks are reset back to the Pending state on eviction (and deactivation in particular),
    and so an admin can easily re-activate such a workload manually without tweaking the checks. (#3350, @KPostOffice)
  • Fixed rolling updates for the StatefulSet integration. (#3684, @mbobrovskyi)
  • Make topology levels immutable to prevent issues with inconsistent state of the TAS cache. (#3641, @mbobrovskyi)
  • TAS: Fixed a bug that prevented updating the cache when a Topology is deleted. (#3615, @mbobrovskyi)

Other (Cleanup or Flake)

  • Eliminate webhook validation when the Pod integration is used on Kubernetes 1.26 or earlier. (#3247, @vladikkuzn)
  • Replace deprecated gcr.io/kubebuilder/kube-rbac-proxy with registry.k8s.io/kubebuilder/kube-rbac-proxy. (#3747, @mbobrovskyi)

v0.9.2

16 Dec 12:28
v0.9.2
174aaa7

Changes since v0.9.1:

Changes by Kind

Bug or Regression

  • Added validation for the Deployment queue-name label to fail fast. (#3580, @mbobrovskyi)
  • Added validation for the StatefulSet queue-name label to fail fast. (#3585, @mbobrovskyi)
  • Fix a bug which occasionally prevented updates to the PodTemplate of the Job on the management cluster
    when starting a Job (e.g. updating nodeSelectors), when the MultiKueueBatchJobWithManagedBy feature gate is enabled. (#3731, @IrvingMg)
  • Fix dropping of reconcile requests for non-leading replica, which was resulting in workloads
    getting stuck pending after the rolling restart of Kueue. (#3613, @mimowo)
  • Fix memory leak due to workload entries left in MultiKueue cache. The leak affects the 0.9.0 and 0.9.1
    releases which enable MultiKueue by default, even if MultiKueue is not explicitly used on the cluster. (#3843, @mimowo)
  • Fix misleading log messages from workload_controller indicating a non-existing LocalQueue or
    ClusterQueue. For example, "LocalQueue for workload didn't exist or not active; ignored for now"
    could also be logged when the ClusterQueue does not exist. (#3832, @PBundyra)
  • Fix preemption when using Hierarchical Cohorts by considering as preemption candidates workloads
    from ClusterQueues located further in the hierarchy tree than direct siblings. (#3705, @gabesaba)
  • Fix scheduling of workload which does not include the toleration for the taint in ResourceFlavor's spec.nodeTaints,
    if the toleration is specified on the ResourceFlavor itself. (#3724, @PBundyra)
  • Fix the bug which prevented the use of MultiKueue when a CRD is not installed and its
    integration is removed from the list of enabled integrations. (#3631, @mszadkow)
  • TAS: Fixed a bug that prevented updating the cache when a Topology is deleted. (#3655, @mbobrovskyi)
  • TAS: A ClusterQueue referencing a Topology is deactivated if the Topology does not exist. (#3819, @mimowo)

Other (Cleanup or Flake)

  • Replace deprecated gcr.io/kubebuilder/kube-rbac-proxy with registry.k8s.io/kubebuilder/kube-rbac-proxy. (#3749, @mbobrovskyi)

v0.10.0-rc.4

11 Dec 19:28
v0.10.0-rc.4
e544dc8
Pre-release

Changes since v0.9.0:

Urgent Upgrade Notes

(No, really, you MUST read this before you upgrade)

  • Removed the v1alpha1 Visibility API.

    The v1alpha1 Visibility API is deprecated. Please use v1beta1 instead. (#3499, @mbobrovskyi)

  • The InactiveWorkload reason for the Evicted condition is renamed to Deactivated.
    Also, the reasons for more detailed situations are renamed:

    • InactiveWorkloadAdmissionCheck -> DeactivatedDueToAdmissionCheck
    • InactiveWorkloadRequeuingLimitExceeded -> DeactivatedDueToRequeuingLimitExceeded

If you were watching for the "InactiveWorkload" reason in the "Evicted" condition, you need
to start watching for the "Deactivated" reason. (#3593, @mbobrovskyi)

Changes by Kind

API Change

  • Addition of configuration that allows users to get Prometheus metrics about LocalQueues, including the LocalQueue status and the status of pending workloads.

    metrics added:
    local_queue_pending_workloads
    local_queue_quota_reserved_workloads_total
    local_queue_quota_reserved_wait_time_seconds
    local_queue_admitted_workloads_total
    local_queue_admission_wait_time_seconds
    local_queue_admission_checks_wait_time_seconds
    local_queue_evicted_workloads_total
    local_queue_reserving_active_workloads
    local_queue_admitted_active_workloads
    local_queue_status
    local_queue_resource_reservation
    local_queue_resource_usage (#3673, @KPostOffice)

Feature

  • Adds a managedJobsNamespaceSelector to the Kueue configuration that enables namespace-based control of whether Jobs submitted without a kueue.x-k8s.io/queue-name label are managed by Kueue, for all supported Job kinds. (#3712, @dgrove-oss)
  • Allow mutating the queue-name label for non-running Deployments. (#3528, @mbobrovskyi)
  • Allowed StatefulSets to scale down to zero and scale up from zero. (#3487, @mbobrovskyi)
  • Extend the GenericJob interface to allow implementations of custom Job CRDs to use
    Topology-Aware Scheduling with rank-based ordering. (#3704, @PBundyra)
  • If the LocalQueueDefaulting feature gate is enabled, a LocalQueue named "default" is used as the default LocalQueue in its namespace. (#3610, @yaroslava-serdiuk)
  • Kueue-viz: a dashboard for Kueue (#3727, @akram)
  • Optimize the size of the Workload object when Topology-Aware Scheduling is used and
    kubernetes.io/hostname is defined as the lowest topology level. In that case, the TopologyAssignment
    in the Workload's status contains a value only for this label, rather than for all defined levels. (#3677, @PBundyra)
  • Promote MultiplePreemptions feature gate to stable, and drop the legacy preemption logic. (#3602, @gabesaba)
  • Promoted ConfigurableResourceTransformations and WorkloadResourceRequestsSummary to Beta and enabled by default. (#3616, @dgrove-oss)
  • A ResourceFlavor spec that defines topologyName is no longer immutable. (#3738, @PBundyra)
  • Respect node taints in Topology-Aware Scheduling when the lowest topology level is kubernetes.io/hostname. (#3678, @mimowo)
  • Support the .featureGates field in the configuration API to enable and disable Kueue features. (#3805, @kannon92)
  • Support rank-based ordering of Pods with Topology-Aware Scheduling.
    The Pod indexes are determined based on the "kueue.x-k8s.io/pod-group-index" label which
    can be set by an external controller managing the group. (#3649, @PBundyra)
  • TAS: Support rank-based ordering for StatefulSet. (#3751, @mbobrovskyi)
  • TAS: A ClusterQueue referencing a Topology is deactivated if the Topology does not exist. (#3770, @mimowo)
  • TAS: support rank-based ordering for JobSet (#3591, @mimowo)
  • TAS: support rank-based ordering for Kubeflow (#3604, @mbobrovskyi)
  • TAS: support rank-ordering of Pods for the Kubernetes batch Job. (#3539, @mimowo)
  • TAS: validate that kubernetes.io/hostname can only be at the lowest level (#3714, @mbobrovskyi)

Bug or Regression

  • Added validation for the Deployment queue-name label to fail fast. (#3555, @mbobrovskyi)
  • Added validation for the StatefulSet queue-name label to fail fast. (#3575, @mbobrovskyi)
  • Change, and in some scenarios fix, the status message displayed to the user when a workload doesn't fit in the available capacity. (#3536, @gabesaba)
  • Determine borrowing more accurately, allowing preempting workloads that fit within nominal quota to be scheduled faster. (#3547, @gabesaba)
  • Fix Kueue crashing when the node for an admitted workload is deleted. (#3715, @mimowo)
  • Fix a bug which occasionally prevented updates to the PodTemplate of the Job on the management cluster
    when starting a Job (e.g. updating nodeSelectors), when the MultiKueueBatchJobWithManagedBy feature gate is enabled. (#3685, @IrvingMg)
  • Fix accounting for usage coming from TAS workloads using multiple resources. The usage was multiplied
    by the number of resources requested by a workload, which could result in under-utilization of the cluster.
    It also manifested itself in the message in the workload status which could contain negative numbers. (#3490, @mimowo)
  • Fix computing the topology assignment for workloads using multiple PodSets requesting the same
    topology. In particular, it was possible for the set of topology domains in the assignment to be empty,
    and as a consequence the pods would remain gated forever as the TopologyUngater would not have
    topology assignment information. (#3514, @mimowo)
  • Fix dropping of reconcile requests for non-leading replica, which was resulting in workloads
    getting stuck pending after the rolling restart of Kueue. (#3612, @mimowo)
  • Fix preemption when using Hierarchical Cohorts by considering as preemption candidates workloads
    from ClusterQueues located further in the hierarchy tree than direct siblings. (#3691, @gabesaba)
  • Fix running Jobs when parallelism < completions; before the fix, the replacement Pods for the successfully
    completed Pods were not ungated. (#3559, @mimowo)
  • Fix scheduling in TAS by considering tolerations specified in the ResourceFlavor. (#3723, @mimowo)
  • Fix scheduling of workload which does not include the toleration for the taint in ResourceFlavor's spec.nodeTaints,
    if the toleration is specified on the ResourceFlavor itself. (#3722, @PBundyra)
  • Fix the bug which prevented the use of MultiKueue when a CRD is not installed and its
    integration is removed from the list of enabled integrations. (#3603, @mszadkow)
  • Fix the flow of deactivation for workloads due to rejected AdmissionChecks.
    Now, all AdmissionChecks are reset back to the Pending state on eviction (and deactivation in particular),
    and so an admin can easily re-activate such a workload manually without tweaking the checks. (#3350, @KPostOffice)
  • Fixed rolling updates for the StatefulSet integration. (#3684, @mbobrovskyi)
  • Make topology levels immutable to prevent issues with inconsistent state of the TAS cache. (#3641, @mbobrovskyi)
  • TAS: Fixed a bug that prevented updating the cache when a Topology is deleted. (#3615, @mbobrovskyi)

Other (Cleanup or Flake)

  • Eliminate webhook validation when the Pod integration is used on Kubernetes 1.26 or earlier. (#3247, @vladikkuzn)
  • Replace deprecated gcr.io/kubebuilder/kube-rbac-proxy with registry.k8s.io/kubebuilder/kube-rbac-proxy. (#3747, @mbobrovskyi)

v0.10.0-rc.3

03 Dec 16:34
v0.10.0-rc.3
298fb37
Pre-release

Changes since v0.9.0:

Urgent Upgrade Notes

(No, really, you MUST read this before you upgrade)

  • Removed the v1alpha1 Visibility API.

    The v1alpha1 Visibility API is deprecated. Please use v1beta1 instead. (#3499, @mbobrovskyi)

  • The InactiveWorkload reason for the Evicted condition is renamed to Deactivated.
    Also, the reasons for more detailed situations are renamed:

    • InactiveWorkloadAdmissionCheck -> DeactivatedDueToAdmissionCheck
    • InactiveWorkloadRequeuingLimitExceeded -> DeactivatedDueToRequeuingLimitExceeded

If you were watching for the "InactiveWorkload" reason in the "Evicted" condition, you need
to start watching for the "Deactivated" reason. (#3593, @mbobrovskyi)

Changes by Kind

Feature

  • Allow mutating the queue-name label for non-running Deployments. (#3528, @mbobrovskyi)
  • Allowed StatefulSets to scale down to zero and scale up from zero. (#3487, @mbobrovskyi)
  • Extend the GenericJob interface to allow implementations of custom Job CRDs to use
    Topology-Aware Scheduling with rank-based ordering. (#3704, @PBundyra)
  • Optimize the size of the Workload object when Topology-Aware Scheduling is used and
    kubernetes.io/hostname is defined as the lowest topology level. In that case, the TopologyAssignment
    in the Workload's status contains a value only for this label, rather than for all defined levels. (#3677, @PBundyra)
  • Promote MultiplePreemptions feature gate to stable, and drop the legacy preemption logic. (#3602, @gabesaba)
  • Promoted ConfigurableResourceTransformations and WorkloadResourceRequestsSummary to Beta and enabled by default. (#3616, @dgrove-oss)
  • Respect node taints in Topology-Aware Scheduling when the lowest topology level is kubernetes.io/hostname. (#3678, @mimowo)
  • Support rank-based ordering of Pods with Topology-Aware Scheduling.
    The Pod indexes are determined based on the "kueue.x-k8s.io/pod-group-index" label which
    can be set by an external controller managing the group. (#3649, @PBundyra)
  • TAS: support rank-based ordering for JobSet (#3591, @mimowo)
  • TAS: support rank-based ordering for Kubeflow (#3604, @mbobrovskyi)
  • TAS: support rank-ordering of Pods for the Kubernetes batch Job. (#3539, @mimowo)
  • TAS: validate that kubernetes.io/hostname can only be at the lowest level (#3714, @mbobrovskyi)

Bug or Regression

  • Added validation for the Deployment queue-name label to fail fast. (#3555, @mbobrovskyi)
  • Added validation for the StatefulSet queue-name label to fail fast. (#3575, @mbobrovskyi)
  • Change, and in some scenarios fix, the status message displayed to the user when a workload doesn't fit in the available capacity. (#3536, @gabesaba)
  • Determine borrowing more accurately, allowing preempting workloads that fit within nominal quota to be scheduled faster. (#3547, @gabesaba)
  • Fix Kueue crashing when the node for an admitted workload is deleted. (#3715, @mimowo)
  • Fix accounting for usage coming from TAS workloads using multiple resources. The usage was multiplied
    by the number of resources requested by a workload, which could result in under-utilization of the cluster.
    It also manifested itself in the message in the workload status which could contain negative numbers. (#3490, @mimowo)
  • Fix computing the topology assignment for workloads using multiple PodSets requesting the same
    topology. In particular, it was possible for the set of topology domains in the assignment to be empty,
    and as a consequence the pods would remain gated forever as the TopologyUngater would not have
    topology assignment information. (#3514, @mimowo)
  • Fix dropping of reconcile requests for non-leading replica, which was resulting in workloads
    getting stuck pending after the rolling restart of Kueue. (#3612, @mimowo)
  • Fix preemption when using Hierarchical Cohorts by considering as preemption candidates workloads
    from ClusterQueues located further in the hierarchy tree than direct siblings. (#3691, @gabesaba)
  • Fix running Jobs when parallelism < completions; before the fix, the replacement Pods for the successfully
    completed Pods were not ungated. (#3559, @mimowo)
  • Fix scheduling in TAS by considering tolerations specified in the ResourceFlavor. (#3723, @mimowo)
  • Fix scheduling of workload which does not include the toleration for the taint in ResourceFlavor's spec.nodeTaints,
    if the toleration is specified on the ResourceFlavor itself. (#3722, @PBundyra)
  • Fix the bug which prevented the use of MultiKueue when a CRD is not installed and its
    integration is removed from the list of enabled integrations. (#3603, @mszadkow)
  • Fix the flow of deactivation for workloads due to rejected AdmissionChecks.
    Now, all AdmissionChecks are reset back to the Pending state on eviction (and deactivation in particular),
    and so an admin can easily re-activate such a workload manually without tweaking the checks. (#3350, @KPostOffice)
  • Make topology levels immutable to prevent issues with inconsistent state of the TAS cache. (#3641, @mbobrovskyi)
  • TAS: Fixed a bug that prevented updating the cache when a Topology is deleted. (#3615, @mbobrovskyi)

Other (Cleanup or Flake)

  • Eliminate webhook validation when the Pod integration is used on Kubernetes 1.26 or earlier. (#3247, @vladikkuzn)

Kueue v0.10.0-rc.2

29 Nov 12:23
v0.10.0-rc.2
c4e82fd
Pre-release

Changes since v0.9.0:

Urgent Upgrade Notes

(No, really, you MUST read this before you upgrade)

  • Removed the v1alpha1 Visibility API.

    The v1alpha1 Visibility API is deprecated. Please use v1beta1 instead. (#3499, @mbobrovskyi)

  • The InactiveWorkload reason for the Evicted condition is renamed to Deactivated.
    Also, the reasons for more detailed situations are renamed:

    • InactiveWorkloadAdmissionCheck -> DeactivatedDueToAdmissionCheck
    • InactiveWorkloadRequeuingLimitExceeded -> DeactivatedDueToRequeuingLimitExceeded

If you were watching for the "InactiveWorkload" reason in the "Evicted" condition, you need
to start watching for the "Deactivated" reason. (#3593, @mbobrovskyi)

Changes by Kind

Feature

  • Allow mutating the queue-name label for non-running Deployments. (#3528, @mbobrovskyi)
  • Allowed StatefulSets to scale down to zero and scale up from zero. (#3487, @mbobrovskyi)
  • Optimize the size of the Workload object when Topology-Aware Scheduling is used and
    kubernetes.io/hostname is defined as the lowest topology level. In that case, the TopologyAssignment
    in the Workload's status contains a value only for this label, rather than for all defined levels. (#3677, @PBundyra)
  • Promote MultiplePreemptions feature gate to stable, and drop the legacy preemption logic. (#3602, @gabesaba)
  • Promoted ConfigurableResourceTransformations and WorkloadResourceRequestsSummary to Beta and enabled by default. (#3616, @dgrove-oss)
  • Respect node taints in Topology-Aware Scheduling when the lowest topology level is kubernetes.io/hostname. (#3678, @mimowo)
  • Support rank-based ordering of Pods with Topology-Aware Scheduling.
    The Pod indexes are determined based on the "kueue.x-k8s.io/pod-group-index" label which
    can be set by an external controller managing the group. (#3649, @PBundyra)
  • TAS: support rank-based ordering for JobSet (#3591, @mimowo)
  • TAS: support rank-based ordering for Kubeflow (#3604, @mbobrovskyi)
  • TAS: support rank-ordering of Pods for the Kubernetes batch Job. (#3539, @mimowo)

Bug or Regression

  • Added validation for the Deployment queue-name label to fail fast. (#3555, @mbobrovskyi)
  • Added validation for the StatefulSet queue-name label to fail fast. (#3575, @mbobrovskyi)
  • Change, and in some scenarios fix, the status message displayed to the user when a workload doesn't fit in the available capacity. (#3536, @gabesaba)
  • Determine borrowing more accurately, allowing preempting workloads that fit within nominal quota to be scheduled faster. (#3547, @gabesaba)
  • Fix accounting for usage coming from TAS workloads using multiple resources. The usage was multiplied
    by the number of resources requested by a workload, which could result in under-utilization of the cluster.
    It also manifested itself in the message in the workload status which could contain negative numbers. (#3490, @mimowo)
  • Fix computing the topology assignment for workloads using multiple PodSets requesting the same
    topology. In particular, it was possible for the set of topology domains in the assignment to be empty,
    and as a consequence the pods would remain gated forever as the TopologyUngater would not have
    topology assignment information. (#3514, @mimowo)
  • Fix dropping of reconcile requests for non-leading replica, which was resulting in workloads
    getting stuck pending after the rolling restart of Kueue. (#3612, @mimowo)
  • Fix running Jobs when parallelism < completions; before the fix, the replacement Pods for the successfully
    completed Pods were not ungated. (#3559, @mimowo)
  • Fix the bug which prevented the use of MultiKueue when a CRD is not installed and its
    integration is removed from the list of enabled integrations. (#3603, @mszadkow)
  • Fix the flow of deactivation for workloads due to rejected AdmissionChecks.
    Now, all AdmissionChecks are reset back to the Pending state on eviction (and deactivation in particular),
    and so an admin can easily re-activate such a workload manually without tweaking the checks. (#3350, @KPostOffice)
  • Make topology levels immutable to prevent issues with inconsistent state of the TAS cache. (#3641, @mbobrovskyi)
  • TAS: Fixed a bug that prevented updating the cache when a Topology is deleted. (#3615, @mbobrovskyi)

Other (Cleanup or Flake)

  • Eliminate webhook validation when the Pod integration is used on Kubernetes 1.26 or earlier. (#3247, @vladikkuzn)

Kueue v0.10.0-rc.1

26 Nov 09:55
v0.10.0-rc.1
f486009
Pre-release

Changes since v0.9.0:

Urgent Upgrade Notes

(No, really, you MUST read this before you upgrade)

  • Removed the v1alpha1 Visibility API.

    The v1alpha1 Visibility API is deprecated. Please use v1beta1 instead. (#3499, @mbobrovskyi)

  • The InactiveWorkload reason for the Evicted condition is renamed to Deactivated.
    Also, the reasons for more detailed situations are renamed:

    • InactiveWorkloadAdmissionCheck -> DeactivatedDueToAdmissionCheck
    • InactiveWorkloadRequeuingLimitExceeded -> DeactivatedDueToRequeuingLimitExceeded

If you were watching for the "InactiveWorkload" reason in the "Evicted" condition, you need
to start watching for the "Deactivated" reason. (#3593, @mbobrovskyi)

Changes by Kind

Feature

  • Allow mutating the queue-name label for non-running Deployments. (#3528, @mbobrovskyi)
  • Allowed StatefulSets to scale down to zero and scale up from zero. (#3487, @mbobrovskyi)
  • TAS: support rank-based ordering for JobSet (#3591, @mimowo)
  • TAS: support rank-based ordering for Kubeflow (#3604, @mbobrovskyi)
  • TAS: support rank-ordering of Pods for the Kubernetes batch Job. (#3539, @mimowo)

Bug or Regression

  • Added validation for the Deployment queue-name label to fail fast. (#3555, @mbobrovskyi)
  • Added validation for the StatefulSet queue-name label to fail fast. (#3575, @mbobrovskyi)
  • Change, and in some scenarios fix, the status message displayed to the user when a workload doesn't fit in the available capacity. (#3536, @gabesaba)
  • Determine borrowing more accurately, allowing preempting workloads that fit within nominal quota to be scheduled faster. (#3547, @gabesaba)
  • Fix accounting for usage coming from TAS workloads using multiple resources. The usage was multiplied
    by the number of resources requested by a workload, which could result in under-utilization of the cluster.
    It also manifested itself in the message in the workload status which could contain negative numbers. (#3490, @mimowo)
  • Fix computing the topology assignment for workloads using multiple PodSets requesting the same
    topology. In particular, it was possible for the set of topology domains in the assignment to be empty,
    and as a consequence the pods would remain gated forever as the TopologyUngater would not have
    topology assignment information. (#3514, @mimowo)
  • Fix dropping of reconcile requests for non-leading replica, which was resulting in workloads
    getting stuck pending after the rolling restart of Kueue. (#3612, @mimowo)
  • Fix running Jobs when parallelism < completions; before the fix, the replacement Pods for the successfully
    completed Pods were not ungated. (#3559, @mimowo)
  • Fix the bug which prevented the use of MultiKueue when a CRD is not installed and its
    integration is removed from the list of enabled integrations. (#3603, @mszadkow)
  • Fix the flow of deactivation for workloads due to rejected AdmissionChecks.
    Now, all AdmissionChecks are reset back to the Pending state on eviction (and deactivation in particular),
    and so an admin can easily re-activate such a workload manually without tweaking the checks. (#3350, @KPostOffice)
  • Make topology levels immutable to prevent issues with inconsistent state of the TAS cache. (#3641, @mbobrovskyi)

Other (Cleanup or Flake)

  • Eliminate webhook validation when the Pod integration is used on Kubernetes 1.26 or earlier. (#3247, @vladikkuzn)

v0.9.1

18 Nov 10:40
v0.9.1
c6c50ba

Changes since v0.9.0:

Note

The previously anticipated feature for Topology Aware Scheduling (TAS) Rank Ordering is not part of this
patch release. This functionality has been deferred and will be included in an upcoming release.

Changes by Kind

Bug or Regression

  • Change, and in some scenarios fix, the status message displayed to the user when a workload doesn't fit in the available capacity. (#3549, @gabesaba)
  • Determine borrowing more accurately, allowing preempting workloads that fit within nominal quota to be scheduled faster. (#3550, @gabesaba)
  • Fix accounting for usage coming from TAS workloads using multiple resources. The usage was multiplied
    by the number of resources requested by a workload, which could result in under-utilization of the cluster.
    It also manifested itself in the message in the workload status which could contain negative numbers. (#3513, @mimowo)
  • Fix computing the topology assignment for workloads using multiple PodSets requesting the same
    topology. In particular, it was possible for the set of topology domains in the assignment to be empty,
    and as a consequence the pods would remain gated forever as the TopologyUngater would not have
    topology assignment information. (#3524, @mimowo)
  • Fix running Jobs when parallelism < completions; before the fix, the replacement Pods for the successfully
    completed Pods were not ungated. (#3561, @mimowo)
  • Fix the flow of deactivation for workloads due to rejected AdmissionChecks.
    Now, all AdmissionChecks are reset back to the Pending state on eviction (and deactivation in particular),
    and so an admin can easily re-activate such a workload manually without tweaking the checks. (#3518, @KPostOffice)

v0.8.4

18 Nov 10:29
v0.8.4
2c3ad2f

Changes since v0.8.3:

Changes by Kind

Bug or Regression

  • Change, and in some scenarios fix, the status message displayed to the user when a workload doesn't fit in the available capacity. (#3551, @gabesaba)
  • Determine borrowing more accurately, allowing preempting workloads that fit within nominal quota to be scheduled faster. (#3551, @gabesaba)

Kueue v0.9.0

05 Nov 16:07
v0.9.0
d3b8af0

Changes since v0.8.0:

Urgent Upgrade Notes

(No, really, you MUST read this before you upgrade)

  • Changed the type of Pending events, emitted when a Workload can't be admitted, from Normal to Warning.

    Update tools that process this event if they depend on the event type. (#3264, @kebe7jun)

  • Deprecated the SingleInstanceInClusterQueue and FlavorIndependent status conditions.

The AdmissionCheck status conditions "FlavorIndependent" and "SingleInstanceInClusterQueue" are no longer supported by default.
If you were using any of these conditions for your external AdmissionCheck, you need to enable the AdmissionCheckValidationRules feature gate;
a flag sketch follows these notes. In future releases you will need to provide validation via an external controller. (#3254, @mszadkow)

  • Promote MultiKueue API and feature gate to Beta. The MultiKueue feature gate is now beta and enabled by default.

The MultiKueue-specific types are now part of Kueue's v1beta1 API. The v1alpha1 types are no longer supported. (#3230, @trasc)

  • Promoted VisibilityOnDemand to Beta and enabled by default.

The v1alpha1 Visibility API is deprecated and will be removed in the next release. Please use v1beta1 instead. (#3008, @mbobrovskyi)

  • Provides more details on the reasons for ClusterQueues being inactive.
    If you were watching for the reason CheckNotFoundOrInactive in the ClusterQueue condition, watch AdmissionCheckNotFound and AdmissionCheckInactive instead. (#3127, @trasc)
  • The QueueVisibility feature and its corresponding API were deprecated.

The QueueVisibility feature and its corresponding API were deprecated and will be removed in v1beta2. Please use VisibilityOnDemand (https://kueue.sigs.k8s.io/docs/tasks/manage/monitor_pending_workloads/pending_workloads_on_demand/) instead. (#3110, @mbobrovskyi)
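
To enable the AdmissionCheckValidationRules feature gate mentioned above, one option is to add
the flag to the manager's arguments; the deployment and namespace names below are the usual
defaults and may differ in your installation:

kubectl -n kueue-system patch deployment kueue-controller-manager --type=json \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--feature-gates=AdmissionCheckValidationRules=true"}]'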

Upgrading steps

1. Backup MultiKueue Resources (skip if you are not using MultiKueue):

kubectl get multikueueclusters.kueue.x-k8s.io,multikueueconfigs.kueue.x-k8s.io -A -o yaml > mk.yaml

2. Update apiVersion in Backup File (skip if you are not using MultiKueue):

Replace v1alpha1 with v1beta1 in mk.yaml for all resources:

sed -i -e 's/v1alpha1/v1beta1/g' mk.yaml

3. Delete old CRDs:

kubectl delete crd multikueueclusters.kueue.x-k8s.io
kubectl delete crd multikueueconfigs.kueue.x-k8s.io

4. Install Kueue v0.9.x:

Follow the installation instructions (https://kueue.sigs.k8s.io/docs/installation/) to install.

5. Restore MultiKueue Resources (skip if you are not using MultiKueue):

kubectl apply -f mk.yaml
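
As a sanity check after the restore, list the resources again (mirroring the backup command in step 1):

kubectl get multikueueclusters.kueue.x-k8s.io,multikueueconfigs.kueue.x-k8s.io -A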

Changes by Kind

Feature

  • Add gauge metric admission_cycle_preemption_skips that reports the number of Workloads in a ClusterQueue
    that got preemption candidates, but had to be skipped in the last cycle. (#2919, @alculquicondor)

  • Add integration for Deployment, where each Pod is treated as a separate Workload. (#2813, @vladikkuzn)

  • Add integration for StatefulSet where Pods are managed by the pod-group integration. (#3001, @vladikkuzn)

  • Added FlowSchema and PriorityLevelConfiguration for Visibility API. (#3043, @mbobrovskyi)

  • Added a new optional resource.transformations section to the Configuration API that enables limited customization
    of how the resource requirements of a Workload are computed from the resource requests and limits of a Job. (#3026, @dgrove-oss)

  • Added a way to specify dependencies between job integrations. (#2768, @trasc)

  • Best-effort support for scenarios where the Job is created at the same time as its prebuilt Workload, or momentarily before it. In that case an error is logged to indicate that creating a Job before the prebuilt Workload is outside of the intended use. (#3255, @mbobrovskyi)

  • CLI: Added EXEC TIME column on kueuectl list workload command. (#2977, @mbobrovskyi)

  • CLI: Added list pods for a job command. (#2280, @Kavinraja-G)

  • CLI: Use protobuf encoding for core K8s APIs in kueuectl. (#3077, @tosi3k)

  • Calculate AllocatableResourceGeneration more accurately. This fixes a bug where the Flavors assigned to a workload in a previous scheduling cycle might not be invalidated when the resources in the Cohort had changed. This bug could occur when other ClusterQueues were deleted from the Cohort. (#2984, @gabesaba)

  • Detect and enable support for job CRDs installed after Kueue starts. (#2574, @ChristianZaccaria)

  • Exposed available ResourceFlavors from the ClusterQueue in the LocalQueue status. (#3143, @mbobrovskyi)

  • Graduated LendingLimit to Beta and enabled by default. (#2909, @macsko)

  • Graduated MultiplePreemptions to Beta and enabled by default. (#2864, @macsko)

  • Helm: Support the topologySpreadConstraints and PodDisruptionBudget (#3295, @woehrl01)

  • Hierarchical Cohorts, introduced with the v1alpha1 Cohorts API, allow users to group resources in an arbitrary tree structure. Additionally, quotas and limits can now be defined directly at the Cohort level. See #79 for more details. (#2693, @gabesaba)

  • Included visibility-api.yaml as a part of main.yaml (#3084, @mbobrovskyi)

  • Introduce the "kueue.x-k8s.io/pod-group-fast-admission" annotation to Plain Pod integration.

    If the PlainPod has the annotation and is part of the Plain PodGroup, the Kueue will admit the Plain Pod regardless of whether all PodGroup Pods are created. (#3189, @vladikkuzn)

  • Introduce the new PodTemplate annotation kueue.x-k8s.io/workload, and label kueue.x-k8s.io/podset.
    The annotation and label are alpha-level and gated by the new TopologyAwareScheduling feature gate. (#3228, @mimowo)

  • The label kueue.x-k8s.io/managed is now added to PodTemplates created by Kueue via ProvisioningRequest (#2877, @PBundyra)

  • MultiKueue: Add support for MPIJob spec.runPolicy.managedBy field (#3289, @mszadkow)

  • MultiKueue: Support for the Kubeflow MPIJob (#2880, @mszadkow)

  • MultiKueue: Support for the Kubeflow PaddleJob (#2744, @mszadkow)

  • MultiKueue: Support for the Kubeflow PyTorchJob (#2735, @mszadkow)

  • MultiKueue: Support for the Kubeflow TFJob (#2626, @mszadkow)

  • MultiKueue: Support for the Kubeflow XGBoostJob (#2746, @mszadkow)

  • ProvisioningRequest: Record the ProvisioningRequest creation errors to event and ProvisioningRequest status. (#3056, @IrvingMg)

  • The ProvisioningRequestConfig API now has a RetryStrategy field that allows users to configure retries per ProvisioningRequest class. By default, a retry releases the allocated quota in Kueue. (#3375, @PBundyra)

  • Publish images via artifact registry (#2476, @IrvingMg)

  • Support Topology-Aware Scheduling (TAS) in Kueue in the Alpha version, along with the new Topology API
    to specify the ordered list of node labels corresponding to the different levels of hierarchy in data centers
    (like racks or blocks).

    Additionally, we introduce the pair of Job-level annotations kueue.x-k8s.io/podset-required-topology
    and kueue.x-k8s.io/podset-preferred-topology, which users can use to indicate their preference for
    Jobs to run all their Pods within a topology domain at the indicated level; see the sketch after
    this list. (#3235, @mimowo)

  • Support for JobSet 0.6 (#3034, @kannon92)

  • Support for Kubernetes 1.31 (#2402, @mbobrovskyi)

  • Support the Job-level API label kueue.x-k8s.io/max-exec-time-seconds, which users
    can use to enforce a maximum execution time for their job. The execution time is only
    accumulated while the Job is running (the corresponding Workload is admitted).
    The corresponding Workload is deactivated after the time is exceeded; see the sketch after this list. (#3191, @trasc)
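
A combined, hypothetical sketch of the two Job-level knobs above on a plain batch Job: the
max-exec-time-seconds label on the Job, and the podset-preferred-topology annotation placed on
the Pod template (queue name, topology level, image, and values are made up for illustration):

kubectl apply -f - <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: tas-sample                                  # hypothetical name
  labels:
    kueue.x-k8s.io/queue-name: user-queue           # hypothetical LocalQueue
    kueue.x-k8s.io/max-exec-time-seconds: "3600"    # deactivate the Workload after 1h of running
spec:
  parallelism: 4
  completions: 4
  suspend: true
  template:
    metadata:
      annotations:
        # prefer running all Pods within one topology domain at this (hypothetical) level
        kueue.x-k8s.io/podset-preferred-topology: cloud.provider.com/topology-block
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: busybox            # placeholder image
        command: ["sleep", "60"]
        resources:
          requests:
            cpu: "1"
EOF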

Bug or Regression

  • CLI: Delete the corresponding Job when deleting a Workload. (#2992, @mbobrovskyi)
  • CLI: Support - and . in the resource flavor name on create cq (#2703, @trasc)
  • Fix a bug that could delay the election of a new leader when Kueue runs with multiple replicas. (#3093, @tenzen-y)
  • Fix over-admission after deleting resources from borrowing ClusterQueue. (#2873, @mbobrovskyi)
  • Fix resource consumption computation for partially admitted workloads. (#3118, @trasc)
  • Fix restoring parallelism on eviction for partially admitted batch/Jobs. (#3153, @trasc)
  • Fix some scenarios for partial admission which are affected by wrong calculation of resources
    used by the incoming workload which is partially admitted and preempting. (#2826, @trasc)
  • Fix support for kuberay 1.2.x (#2960, @mbobrovskyi)
  • Fix webhook validation for batch/Job to allow partial admission of a Job to use all available resources.
    It also fixes a scenario of partial re-admission when some of the Pods are already reclaimed. (#3152, @trasc)
  • Helm: Fix a bug for "unclosed action error". (#2683, @mbobrovskyi)
  • Prevent infinite preemption loop when PrioritySortingWithinCohort=false
    is used together with borrowWithinCohort. (#2807, @mimowo)
  • Prevent job webhooks from dropping fields for newer API fields when Kueue libraries are behind the latest released CRDs. (#3132, @alculquicondor)
  • RayJob's implementation of Finished() now inspects the JobDeploymentStatus (#3120, @andrewsykim)
  • Support for helm charts in the us-central1-docker.pkg.dev/k8s-staging-images/charts repository (#2680, @IrvingMg)
  • Update Flavor selection logic to prefer Flavors which allow reclamation of lent nominal quota, over Flavors which require preempting workloads within the ClusterQueue. This matches the behavior in the single Flavor case. (#2811, @gabesaba)
  • Workload is requeued with all AdmissionChecks set to Pending if there was an AdmissionCheck in Retry state. (#3323, @PBundyra)
  • Account for NumOfHosts when calculating PodSet assignments for RayJob and RayCluster (#3384, @andrewsykim)


v0.8.3

05 Nov 08:47
v0.8.3
982a9f3

Changes since v0.8.2:

Changes by Kind

Bug or Regression

  • Workload is requeued with all AdmissionChecks set to Pending if there was an AdmissionCheck in Retry state. (#3323, @PBundyra)
  • Account for NumOfHosts when calculating PodSet assignments for RayJob and RayCluster (#3384, @andrewsykim)