Add ScyllaDBCluster webhook validation #2271

Open
wants to merge 2 commits into
base: master

Collaborator

@zimnx zimnx commented Dec 16, 2024

Description of your changes:

Adds validation rules for v1alpha1.ScyllaDBCluster to existing validating webhook.

Which issue is resolved by this Pull Request:
Resolves #2280

@zimnx zimnx added kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Dec 16, 2024
Contributor

@zimnx: GitHub didn't allow me to request PR reviews from the following users: zimnx.

Note that only scylladb members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

Description of your changes:

Adds validation rules for v1alpha1.ScyllaDBCluster to existing validating webhook.

Which issue is resolved by this Pull Request:
Resolves #

/cc

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@scylla-operator-bot scylla-operator-bot bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 16, 2024
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: zimnx

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@scylla-operator-bot scylla-operator-bot bot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Dec 16, 2024
@zimnx zimnx force-pushed the scylladbcluster-webhook branch 3 times, most recently from 9a65306 to 6ae7f48 Compare December 16, 2024 19:57
@zimnx zimnx force-pushed the scylladbcluster-webhook branch from 6ae7f48 to 38ef9a7 Compare December 17, 2024 16:15
@zimnx zimnx changed the title [WIP] Add ScyllaDBCluster webhook validation Add ScyllaDBCluster webhook validation Dec 18, 2024
@scylla-operator-bot scylla-operator-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 18, 2024
@zimnx
Collaborator Author

zimnx commented Dec 18, 2024

/cc rzetelskik
/cc tnozicka

Ready to review

Member

@rzetelskik rzetelskik left a comment

first batch

if spec.MinReadySeconds != nil && *spec.MinReadySeconds < 0 {
allErrs = append(allErrs, apimachineryvalidation.ValidateNonnegativeField(int64(*spec.MinReadySeconds), fldPath.Child("minReadySeconds"))...)
}

Member


func ValidateScyllaDBClusterDatacenterTemplate(dcTemplate *scyllav1alpha1.ScyllaDBClusterDatacenterTemplate, fldPath *field.Path) field.ErrorList {
allErrs := field.ErrorList{}

Member

metadata validation is missing (labels, annotations, especially pod-specific)


func ValidateScyllaDBClusterDatacenterTemplate(dcTemplate *scyllav1alpha1.ScyllaDBClusterDatacenterTemplate, fldPath *field.Path) field.ErrorList {
allErrs := field.ErrorList{}

Member

seems like placement should be validated as well?

Comment on lines +109 to +132
if dcTemplate.ScyllaDB.Storage != nil {
if dcTemplate.ScyllaDB.Storage.Metadata != nil {
allErrs = append(allErrs, apimachinerymetav1validation.ValidateLabels(dcTemplate.ScyllaDB.Storage.Metadata.Labels, fldPath.Child("scyllaDB", "storage", "metadata", "labels"))...)
allErrs = append(allErrs, apimachineryvalidation.ValidateAnnotations(dcTemplate.ScyllaDB.Storage.Metadata.Annotations, fldPath.Child("scyllaDB", "storage", "metadata", "annotations"))...)
}

storageCapacity, err := resource.ParseQuantity(dcTemplate.ScyllaDB.Storage.Capacity)
if err != nil {
allErrs = append(allErrs, field.Invalid(fldPath.Child("scyllaDB", "storage", "capacity"), dcTemplate.ScyllaDB.Storage.Capacity, fmt.Sprintf("unable to parse capacity: %v", err)))
} else if storageCapacity.CmpInt64(0) <= 0 {
allErrs = append(allErrs, field.Invalid(fldPath.Child("scyllaDB", "storage", "capacity"), dcTemplate.ScyllaDB.Storage.Capacity, "must be greater than zero"))
}

if dcTemplate.ScyllaDB.Storage.StorageClassName != nil {
for _, msg := range apimachineryvalidation.NameIsDNSSubdomain(*dcTemplate.ScyllaDB.Storage.StorageClassName, false) {
allErrs = append(allErrs, field.Invalid(fldPath.Child("scyllaDB", "storage", "storageClassName"), *dcTemplate.ScyllaDB.Storage.StorageClassName, msg))
}
}
}

if dcTemplate.ScyllaDB.CustomConfigMapRef != nil {
for _, msg := range apimachineryvalidation.NameIsDNSSubdomain(*dcTemplate.ScyllaDB.CustomConfigMapRef, false) {
allErrs = append(allErrs, field.Invalid(fldPath.Child("scyllaDB", "customConfigMapRef"), *dcTemplate.ScyllaDB.CustomConfigMapRef, msg))
}
Member

@rzetelskik rzetelskik Dec 27, 2024

This seems to duplicate what's in types_scylladbdatacenter.go. Also, why isn't this split into functions (nit)?

Comment on lines +137 to +141
if dcTemplate.ScyllaDBManagerAgent.CustomConfigSecretRef != nil {
for _, msg := range apimachineryvalidation.NameIsDNSSubdomain(*dcTemplate.ScyllaDBManagerAgent.CustomConfigSecretRef, false) {
allErrs = append(allErrs, field.Invalid(fldPath.Child("scyllaDBManagerAgent", "customConfigSecretRef"), *dcTemplate.ScyllaDBManagerAgent.CustomConfigSecretRef, msg))
}
}
Member

ditto
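
To illustrate the "split into functions" nit: the two optional-reference checks above (customConfigMapRef and customConfigSecretRef) repeat the same pattern, so they could share one helper. The following is only a stdlib-only sketch, not code from this PR. The helper name validateOptionalNameRef, the plain-string field paths, and the error wording are hypothetical; the real code would keep using apimachineryvalidation.NameIsDNSSubdomain and field.Invalid. The regular expression is assumed to match the DNS-1123 subdomain format.

```go
package main

import (
	"fmt"
	"regexp"
)

// dns1123SubdomainRe approximates the DNS-1123 subdomain format: lowercase
// alphanumerics, '-' and '.', with each label starting and ending with an
// alphanumeric character.
var dns1123SubdomainRe = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$`)

// dns1123SubdomainMaxLength is the maximum length of a DNS-1123 subdomain.
const dns1123SubdomainMaxLength = 253

// validateOptionalNameRef is a hypothetical helper. A nil reference is valid
// (the field is optional); a set reference must be a DNS-1123 subdomain.
// It returns one message per violated rule, prefixed with the field path.
func validateOptionalNameRef(ref *string, path string) []string {
	if ref == nil {
		return nil
	}

	var errs []string
	if len(*ref) > dns1123SubdomainMaxLength {
		errs = append(errs, fmt.Sprintf("%s: Invalid value: %q: must be no more than %d characters", path, *ref, dns1123SubdomainMaxLength))
	}
	if !dns1123SubdomainRe.MatchString(*ref) {
		errs = append(errs, fmt.Sprintf("%s: Invalid value: %q: must be a valid DNS-1123 subdomain", path, *ref))
	}
	return errs
}

func main() {
	configMapRef := "scylla-config"
	secretRef := "Not_A_Valid_Name"

	// Both call sites collapse to a single line each.
	var allErrs []string
	allErrs = append(allErrs, validateOptionalNameRef(&configMapRef, "scyllaDB.customConfigMapRef")...)
	allErrs = append(allErrs, validateOptionalNameRef(&secretRef, "scyllaDBManagerAgent.customConfigSecretRef")...)
	allErrs = append(allErrs, validateOptionalNameRef(nil, "scyllaDB.storage.storageClassName")...)

	for _, e := range allErrs {
		fmt.Println(e)
	}
}
```

With a helper like this, each optional reference becomes a one-line call, which also makes it easier to see which checks overlap with the ScyllaDBDatacenter validation.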

Comment on lines +164 to +167
var supportedServiceTypes = []scyllav1alpha1.NodeServiceType{
scyllav1alpha1.NodeServiceTypeHeadless,
scyllav1alpha1.NodeServiceTypeLoadBalancer,
}
Member

Comment on lines +195 to +203
var allowedNodeServiceTypesByBroadcastAddressType = map[scyllav1alpha1.BroadcastAddressType][]scyllav1alpha1.NodeServiceType{
scyllav1alpha1.BroadcastAddressTypePodIP: {
scyllav1alpha1.NodeServiceTypeHeadless,
scyllav1alpha1.NodeServiceTypeLoadBalancer,
},
scyllav1alpha1.BroadcastAddressTypeServiceLoadBalancerIngress: {
scyllav1alpha1.NodeServiceTypeLoadBalancer,
},
}
Member

ditto
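
For readers following along, here is a minimal sketch of how a table like allowedNodeServiceTypesByBroadcastAddressType can drive the compatibility check. It is not the PR's implementation: the types are redeclared locally so the snippet compiles on its own, the string values of the constants are assumptions, and validateNodeServiceType is a hypothetical name.

```go
package main

import "fmt"

// Local stand-ins for the scyllav1alpha1 types; the string values are assumed.
type BroadcastAddressType string
type NodeServiceType string

const (
	BroadcastAddressTypePodIP                      BroadcastAddressType = "PodIP"
	BroadcastAddressTypeServiceLoadBalancerIngress BroadcastAddressType = "ServiceLoadBalancerIngress"

	NodeServiceTypeHeadless     NodeServiceType = "Headless"
	NodeServiceTypeLoadBalancer NodeServiceType = "LoadBalancer"
)

// Same shape as the map in the diff: for every broadcast address type,
// the node service types that can provide such an address.
var allowedNodeServiceTypesByBroadcastAddressType = map[BroadcastAddressType][]NodeServiceType{
	BroadcastAddressTypePodIP: {
		NodeServiceTypeHeadless,
		NodeServiceTypeLoadBalancer,
	},
	BroadcastAddressTypeServiceLoadBalancerIngress: {
		NodeServiceTypeLoadBalancer,
	},
}

// validateNodeServiceType (hypothetical) rejects unknown broadcast address
// types and node service types that are not listed for the given type.
func validateNodeServiceType(bat BroadcastAddressType, nst NodeServiceType) error {
	allowed, ok := allowedNodeServiceTypesByBroadcastAddressType[bat]
	if !ok {
		return fmt.Errorf("unsupported broadcast address type %q", bat)
	}
	for _, a := range allowed {
		if a == nst {
			return nil
		}
	}
	return fmt.Errorf("node service type %q can't be used with broadcast address type %q, allowed types: %v", nst, bat, allowed)
}

func main() {
	// A Headless service has no load balancer ingress to broadcast,
	// so this combination is rejected.
	err := validateNodeServiceType(BroadcastAddressTypeServiceLoadBalancerIngress, NodeServiceTypeHeadless)
	fmt.Println(err)

	// PodIP works with either service type.
	fmt.Println(validateNodeServiceType(BroadcastAddressTypePodIP, NodeServiceTypeHeadless))
}
```

Keeping the rule in a map means a new broadcast address type only needs a new entry, not a new branch in the validation function.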

}

if !isDatacenterStatusUpToDate(old, oldDCStatus) {
allErrs = append(allErrs, field.InternalError(fldPath.Child("datacenters").Index(i), fmt.Errorf("datacenters %q can't be removed because its status, that's used to determine node count, is not yet up to date with the generation of this resource; please retry later", removedDCName)))
Member

Suggested change
allErrs = append(allErrs, field.InternalError(fldPath.Child("datacenters").Index(i), fmt.Errorf("datacenters %q can't be removed because its status, that's used to determine node count, is not yet up to date with the generation of this resource; please retry later", removedDCName)))
allErrs = append(allErrs, field.InternalError(fldPath.Child("datacenters").Index(i), fmt.Errorf("datacenter %q can't be removed because its status, that's used to determine node count, is not yet up to date with the generation of this resource; please retry later", removedDCName)))

return dcStatus.Name == removedDCName
})
if !ok {
continue
Member

shouldn't this also result in an error? Isn't no status equivalent to out-of-date status?
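
One possible reading of this question, as a sketch only: treat a missing status the same way as a stale one and ask the user to retry, instead of skipping the check with continue. The DatacenterStatus type, the checkDatacenterRemoval function, its statusUpToDate parameter, and the ordering of the checks are assumptions made for illustration; this is not what the PR currently does.

```go
package main

import "fmt"

// DatacenterStatus is a simplified local stand-in for the per-datacenter
// status of the old object.
type DatacenterStatus struct {
	Name  string
	Nodes *int32
}

// checkDatacenterRemoval (hypothetical) decides whether a datacenter that is
// present in the old spec but absent from the new one may be removed.
func checkDatacenterRemoval(removedDCName string, oldStatuses []DatacenterStatus, statusUpToDate bool) error {
	// A stale status can't be trusted to report the current node count.
	if !statusUpToDate {
		return fmt.Errorf("datacenter %q can't be removed because its status is not yet up to date; please retry later", removedDCName)
	}

	var oldDCStatus *DatacenterStatus
	for i := range oldStatuses {
		if oldStatuses[i].Name == removedDCName {
			oldDCStatus = &oldStatuses[i]
			break
		}
	}

	// Assumption under discussion: no status means the node count is unknown,
	// so it is handled like an out-of-date status rather than skipped.
	if oldDCStatus == nil {
		return fmt.Errorf("datacenter %q can't be removed because its status is missing; please retry later", removedDCName)
	}

	// Same condition as in the diff: a non-zero node count blocks removal.
	if oldDCStatus.Nodes != nil && *oldDCStatus.Nodes != 0 {
		return fmt.Errorf("datacenter %q can't be removed because its nodes are being scaled down", removedDCName)
	}
	return nil
}

func main() {
	zero, three := int32(0), int32(3)
	statuses := []DatacenterStatus{
		{Name: "dc1", Nodes: &three},
		{Name: "dc2", Nodes: &zero},
	}

	for _, name := range []string{"dc1", "dc2", "dc3"} {
		fmt.Printf("%s: %v\n", name, checkDatacenterRemoval(name, statuses, true))
	}
}
```

Whether the missing-status case should be an error is exactly the open question here; the sketch only shows that the stricter variant is a small change.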

}

if oldDCStatus.Nodes != nil && *oldDCStatus.Nodes != 0 {
allErrs = append(allErrs, field.Forbidden(fldPath.Child("datacenters").Index(i), fmt.Sprintf("datacenter %q can't be removed because the nodes are being scaled down", removedDCName)))
Member

nit: its nodes

@zimnx zimnx removed the request for review from tnozicka January 7, 2025 12:46
Contributor

@zimnx: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
ci/prow/e2e-gke-multi-datacenter-parallel | 38ef9a7 | link | true | /test e2e-gke-multi-datacenter-parallel


@zimnx zimnx requested a review from mflendrich January 13, 2025 12:27
Collaborator

@mflendrich mflendrich left a comment

This looks good to me at a high level, I really like how the validating webhooks are structured in the codebase 👍.
Verifying for idempotency, nothing stands out as problematic to me.

@rzetelskik is it reasonable to defer the new checks you've requested (and possibly other wishlist checks) to a separate issue? I don't have enough context to have an opinion here. But having a PR stuck on scope expansion lacking bandwidth feels like a choice inferior to merging a "good enough" webhook.
lgtm
/assign rzetelskik

@rzetelskik
Member

rzetelskik commented Jan 14, 2025

But having a PR stuck on scope expansion

Which comments exactly are asking for a scope expansion? Unless I'm missing something, they either refer to missing (or incorrect) validation, or are nits, which can be ignored.

@mflendrich
Collaborator

@rzetelskik I meant specifically ReadinessGates validation, metadata validation, placement validation - looks like they ask that new checks be implemented by the author of the PR.

I'm just floating an idea of merging a "good enough" webhook and deferring those to a separate issue. Note that I do not know if a webhook without those is "good enough".

@zimnx
Collaborator Author

zimnx commented Jan 15, 2025

The thing with validation is that we cannot make it stricter between releases. Splitting it into follow-up issues could lead to forgetting about one. I'll add those validations as part of this effort.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Add webhook validation to v1alpha1.ScyllaDBCluster
3 participants