
Validation of ClusterRole with system: Prefix Fails #115

Open
ricoberger opened this issue Apr 17, 2024 · 8 comments

Labels
kind/bug Categorizes issue or PR as related to a bug.

@ricoberger

What happened?

Validation of ClusterRoles whose name uses the system: prefix, as the Vertical Pod Autoscaler does, fails:

kubectl validate vpa-actor.yaml

vpa-actor.yaml...ERROR
ClusterRole.rbac.authorization.k8s.io "system:vpa-actor" is invalid: metadata.name: Invalid value: "system:vpa-actor": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
Error: validation failed

What did you expect to happen?

The validation for ClusterRoles with the system: prefix in the name shouldn't fail.

How can we reproduce it (as minimally and precisely as possible)?

Save the following YAML as a file named vpa-actor.yaml and validate it with kubectl validate vpa-actor.yaml:

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-actor
rules:
  - apiGroups:
      - ""
    resources:
      - pods
      - nodes
      - limitranges
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - get
      - list
      - watch
      - create
  - apiGroups:
      - "poc.autoscaling.k8s.io"
    resources:
      - verticalpodautoscalers
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "autoscaling.k8s.io"
    resources:
      - verticalpodautoscalers
    verbs:
      - get
      - list
      - watch

Anything else we need to know?

No response

Kubernetes version

Client Version: v1.29.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.5
@ricoberger ricoberger added the kind/bug Categorizes issue or PR as related to a bug. label Apr 17, 2024
@alexzielenski
Contributor

alexzielenski commented Apr 17, 2024

This is a bug. Unfortunately, the Kubernetes native type schemas do not include information about how the resource name should be validated.

We can work around this for now, until the schemas are populated with that information, by hardcoding it for the embedded schemas, since they only change when new resources are added.
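
A minimal sketch of what that hardcoding could look like: a lookup from GroupKind to the name validator the API server actually uses. The helper names and the map contents below are illustrative only, not the final implementation:

// Sketch: per-kind name validation for built-in types whose names are not
// plain DNS subdomains. Assumes the apimachinery validators shown below.
package main

import (
    "fmt"

    apivalidation "k8s.io/apimachinery/pkg/api/validation"
    "k8s.io/apimachinery/pkg/api/validation/path"
    "k8s.io/apimachinery/pkg/runtime/schema"
)

// nameValidator matches the shape of apimachinery's ValidateNameFunc.
type nameValidator func(name string, prefix bool) []string

// builtinNameValidators lists kinds that are exempt from the DNS-subdomain
// rule (assumed set, RBAC only, for illustration).
var builtinNameValidators = map[schema.GroupKind]nameValidator{
    {Group: "rbac.authorization.k8s.io", Kind: "ClusterRole"}:        path.ValidatePathSegmentName,
    {Group: "rbac.authorization.k8s.io", Kind: "ClusterRoleBinding"}: path.ValidatePathSegmentName,
    {Group: "rbac.authorization.k8s.io", Kind: "Role"}:               path.ValidatePathSegmentName,
    {Group: "rbac.authorization.k8s.io", Kind: "RoleBinding"}:        path.ValidatePathSegmentName,
}

// nameValidatorFor falls back to the generic DNS-subdomain rule used today.
func nameValidatorFor(gk schema.GroupKind) nameValidator {
    if v, ok := builtinNameValidators[gk]; ok {
        return v
    }
    return apivalidation.NameIsDNSSubdomain
}

func main() {
    gk := schema.GroupKind{Group: "rbac.authorization.k8s.io", Kind: "ClusterRole"}
    fmt.Println(nameValidatorFor(gk)("system:vpa-actor", false)) // prints [] (accepted)
}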

/assign

@nootr

nootr commented May 3, 2024

Hi @alexzielenski,

I'm also running into this issue and I've been trying to create a workaround, but can't seem to make it work.

My idea was to write a schema patch:

{
  "components": {
    "schemas": {
      "io.k8s.api.rbac.v1.ClusterRole": {
        "properties": {
          "metadata": {
            "allOf": [
              {
                "$ref": "#/components/schemas/CustomObjectMeta"
              }
            ]
          }
        }
      },
      "CustomObjectMeta": {
        "properties": {
          "name": {
            "type": "string"
          }
        },
        "x-kubernetes-validations": [
            {
                "rule": "1 == 2"
            }
        ]
      }
    }
  }
}

When I run kubectl validate with this patch, I see that the new validation rule is appended, but it does not replace the existing validation of metadata; the "lowercase RFC 1123 subdomain" check is still applied, even though #/components/schemas/CustomObjectMeta is a new schema.

Is it possible for me to write a temporary workaround or should this be fixed in kubectl-validate instead (in which case, I would be happy to help)?

Thanks in advance!

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 1, 2024
@monotek
Member

monotek commented Aug 1, 2024

not stale

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 31, 2024
@jplitza

jplitza commented Sep 13, 2024

Looking at the schema files in pkg/openapiclient/builtins, particularly pkg/openapiclient/builtins/1.29/api/v1.json, I don't see where this error even comes from, since "name" does not have any requirements specified there. So this doesn't seem to be solvable with schema overrides.

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Sep 13, 2024
@max-frank

The same seems to apply to ClusterRoleBinding, Role, and RoleBinding, which also use : to prefix their names.

I specifically ran into this with cert-manager, which uses cert-manager: prefixes and similar extensively (e.g., webhook-rbac.yaml).

@max-frank

@jplitza So to my understanding this bug happens because of the following:

Validate uses customresource.NewStrategy.
This in turn uses customResourceValidator to do the actual validation.
As part of that validation it registers validation.NameIsDNSSubdomain as one of the validation steps.
That is where the bug comes from: while the RFC 1123 limitation applies to many resource types, it seems it does not apply to Roles and RoleBindings (at least not on the server side).
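
A quick standalone way to see this (a sketch against the apimachinery helper, not kubectl-validate code):

package main

import (
    "fmt"

    apivalidation "k8s.io/apimachinery/pkg/api/validation"
)

func main() {
    // NameIsDNSSubdomain is the generic name check mentioned above; it rejects
    // any name containing ":" with the same RFC 1123 message seen in the report.
    errs := apivalidation.NameIsDNSSubdomain("system:vpa-actor", false)
    fmt.Println(errs)
}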

@alexzielenski I am willing to work on a workaround for this, but I am not sure what the best approach is here.
The simplest approach I can see would be wrapping the strategy and filtering out any metadata.name-related RFC 1123 errors based on the Group/Version/Kind, which is available to the Validate function via the runtime.Object; see the sketch below.

I am not versed enough in the Kubernetes libraries to know whether there is a way that doesn't rely on a custom exception list, but instead reuses the logic already present on the server side. 🤔
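
Roughly something like this (only a sketch: the Validate signature is assumed to match the strategy method described above, and the string match on the error detail stands in for a proper exception list):

// Sketch of a wrapper that drops RFC 1123 metadata.name errors for RBAC kinds,
// whose names the real API server does not subject to DNS-subdomain validation.
package workaround

import (
    "context"
    "strings"

    "k8s.io/apimachinery/pkg/runtime"
    "k8s.io/apimachinery/pkg/util/validation/field"
)

// validateFn mirrors the shape of the strategy's Validate method.
type validateFn func(ctx context.Context, obj runtime.Object) field.ErrorList

// rbacNameExempt reports whether the object's GroupVersionKind belongs to the
// RBAC group, whose names legitimately contain ":" prefixes such as "system:".
func rbacNameExempt(obj runtime.Object) bool {
    return obj.GetObjectKind().GroupVersionKind().Group == "rbac.authorization.k8s.io"
}

// filterRBACNameErrors wraps validate and strips metadata.name errors that
// complain about the RFC 1123 subdomain format for exempt kinds.
func filterRBACNameErrors(validate validateFn) validateFn {
    return func(ctx context.Context, obj runtime.Object) field.ErrorList {
        errs := validate(ctx, obj)
        if !rbacNameExempt(obj) {
            return errs
        }
        filtered := errs[:0]
        for _, e := range errs {
            if e.Field == "metadata.name" && strings.Contains(e.Detail, "RFC 1123") {
                continue
            }
            filtered = append(filtered, e)
        }
        return filtered
    }
}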
