Webhook validation for Topology NodeDeletionTimeout and NodeDrainTimeout #7104
Comments
/area topology |
What would be the valid range for those fields? |
We don't have these defined right now in the machine webhook (and I don't know if there's any need to), but defining a min/max is an optional part of this. I think the main part is to ensure that we do enough validation to catch errors like #7047 on object creation, instead of during the reconcile. |
Yup. The problem is that If it would also use But given the recent trend we would instead of the marker implement it in the webhook. (the format godoc sounds like we should use |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
/remove-lifecycle stale |
/triage accepted |
/priority important-soon |
For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/ /remove-triage accepted |
/triage accepted |
/assign |
What happens to the existing users who have persisted bad values when we update the validation here? Has it been considered to use ratcheting validation at all? |
I think it was not considered |
Ratcheting validation exists directly within the API server from Kube 1.30, but since we need to support older versions, ratcheting can be implemented either in a webhook or with a couple of well-crafted CEL transition rules (though these aren't perfect, as they don't cover the create case). Without ratcheting, this does have the potential to break users on upgrade: they wouldn't be able to write anything to the object until the values of these broken fields were fixed. |
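To make the webhook option concrete, here is a minimal sketch of ratcheting using only the Go standard library. It is not Cluster API code: `validateDuration` and `ratchet` are hypothetical names, the timeout is modeled as a plain string, and the "non-negative Go duration" rule is an assumed stand-in for whatever check would actually be added.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// validateDuration is a hypothetical stand-in for the new check:
// the value must parse as a Go duration and must not be negative.
func validateDuration(raw string) error {
	d, err := time.ParseDuration(raw)
	if err != nil {
		return err
	}
	if d < 0 {
		return errors.New("must not be negative")
	}
	return nil
}

// ratchet validates newVal on create (oldVal == nil) and on updates that
// change the value. An unchanged value is accepted even if it is invalid,
// so objects persisted before the check existed can still be written.
func ratchet(oldVal *string, newVal string) error {
	if oldVal != nil && *oldVal == newVal {
		return nil
	}
	if err := validateDuration(newVal); err != nil {
		return fmt.Errorf("invalid timeout %q: %w", newVal, err)
	}
	return nil
}

func main() {
	persisted := "soon" // a bad value that was stored before validation existed

	fmt.Println("create with bad value:  ", ratchet(nil, persisted))
	fmt.Println("update, value unchanged:", ratchet(&persisted, persisted))
	fmt.Println("update to other bad:    ", ratchet(&persisted, "later"))
	fmt.Println("update to good value:   ", ratchet(&persisted, "30s"))
}
```

With this shape, an existing object that carries a bad value stays writable as long as that field is left alone, while any new or changed value has to pass the check.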
If it's enabled by default, it could be okay to just wait until 1.30 is the minimum supported version (Cluster API v1.10; basically we could then merge in December) |
NodeDeletionTimeout and NodeDrainTimeout were added to Topology managed clusters in #7098 and #6278. Currently the values of these fields are not validated on creation, and validation is instead done when the templates are turned into objects.
This lack of up-front validation led to the unexpected failure in #7047. We could do some basic validation in the webhook on object creation to ensure these values are correctly formatted and in a given range before creation.
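As a rough illustration of the kind of check meant here, the sketch below validates a timeout given as a string, using only the Go standard library. The helper `validateTimeout` is hypothetical and not an existing Cluster API function. The bounds passed to it are placeholders too, since no valid range has been agreed for these fields.

```go
package main

import (
	"fmt"
	"time"
)

// validateTimeout is a hypothetical helper, not part of Cluster API.
// It checks that raw parses as a Go duration and lies within [lo, hi].
func validateTimeout(field, raw string, lo, hi time.Duration) error {
	d, err := time.ParseDuration(raw)
	if err != nil {
		return fmt.Errorf("%s: invalid duration %q: %w", field, raw, err)
	}
	if d < lo || d > hi {
		return fmt.Errorf("%s: %s is outside the allowed range [%s, %s]", field, d, lo, hi)
	}
	return nil
}

func main() {
	// Illustrative bounds only; they are assumptions made for this example.
	const lo, hi = 0 * time.Second, 24 * time.Hour

	for _, raw := range []string{"10m", "ten minutes", "-5s", "48h"} {
		if err := validateTimeout("nodeDrainTimeout", raw, lo, hi); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", raw)
	}
}
```

Running the same check for both fields at admission time would surface a malformed or out-of-range value when the object is created rather than later, during reconcile.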
/kind feature