`karpenter.azure.com/zone` requirement causes continuous drift #205
Labels: area/availability-zones, area/drift, area/nodeclaim, kind/bug, triage/accepted
Version
Karpenter Version: 99d1bb0 (current main)
Kubernetes Version: v1.27.9
Expected Behavior
One should be able to specify requirements with a `karpenter.azure.com/zone` constraint, for example to provision nodes only in a specific zone, without adverse effects.
Actual Behavior
Specifying any kind of `karpenter.azure.com/zone` constraint in a NodePool currently triggers continuous drift.

Here is what I think is going on. Right now, we cannot (and do not) record this requirement/constraint as a label on the NodeClaim. This is because Karpenter would try to apply all of these as labels to the Node object, and `topology.kubernetes.io/zone` is a protected label in AKS. (It is applied to a new Node correctly, but by a different component.) So for now, as a workaround, we use the alternative label `karpenter.azure.com/zone`. I suspect that it is this discrepancy that causes Karpenter to detect Requirements Drift: based on the NodePool, the NodeClaim is expected to have the zone label, and it does not => out of spec, to be replaced. I also suspect that, while we do have E2E tests in this area, they likely only check that the node gets provisioned and don't notice the subsequent drift.
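To make the suspected discrepancy concrete, here is a sketch of the labels involved; the zone value (`westus2-1`) is hypothetical, and the exact label sets are assumptions based on the description above:

```yaml
# Node, as observed in the cluster: the standard zone label is applied by an
# AKS component (it is protected, so Karpenter cannot set it), and Karpenter
# uses the alternative label instead.
apiVersion: v1
kind: Node
metadata:
  labels:
    topology.kubernetes.io/zone: westus2-1  # set by AKS, not Karpenter
    karpenter.azure.com/zone: westus2-1     # alternative label
---
# NodeClaim: the zone requirement from the NodePool is not recorded as a
# label here, so requirements-drift detection concludes the NodeClaim is out
# of spec and replaces it, over and over.
apiVersion: karpenter.sh/v1beta1
kind: NodeClaim
metadata:
  labels: {}  # karpenter.azure.com/zone is missing
```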
Steps to Reproduce the Problem
Use a NodePool with any kind of `karpenter.azure.com/zone` requirement; a minimal sketch follows under Resource Specs and Logs.
Resource Specs and Logs
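A minimal NodePool sketch for reproducing this, assuming the `karpenter.sh/v1beta1` API; the AKSNodeClass name `default` and the zone value are hypothetical:

```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: zonal
spec:
  template:
    spec:
      nodeClassRef:
        name: default  # hypothetical AKSNodeClass
      requirements:
        # Any zone requirement, e.g. pinning to a single zone, triggers the drift.
        - key: karpenter.azure.com/zone
          operator: In
          values: ["westus2-1"]
```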
Continuous drift observed.
Workaround
Specify zone-based constraints (including `topologySpreadConstraints`, if needed) via the workload rather than the NodePool; a sketch follows.
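A sketch of the workaround, expressing the zone constraint on the workload instead; the Deployment, app label, and zone value are hypothetical:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      # To pin to a single zone instead, a nodeSelector on
      # topology.kubernetes.io/zone can replace the spread constraint below.
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: myapp
      containers:
        - name: app
          image: registry.k8s.io/pause:3.9
```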