Skip to content

Commit

Permalink
fix(manifests): fix full CRDs and update tests to use them
Browse files Browse the repository at this point in the history
The full CRDs at `manifests/base/crds/full` are only intended for usage
in editors
(#11266 (comment))
and are unusable otherwise. Trying to install them will cause spurious
validation errors with nearly every workflow.

Having the full CRDs would enable many useful features, such as `kubectl
explain` output (#8190),
[validating admission
policies](https://kubernetes.io/docs/reference/access-authn-authz/validating-admission-policy/)
(#13503), and
integration with tools like cdk8s
(#8532).

As explained at
#13754 (comment),
there's two issues that prevent us from using the full CRDs everywhere:
1. [Kubernetes size
   limits](kubernetes/kubernetes#82292) when
   using client-side apply. This is not a problem when using
   .
2. Wrong validation information due to kubebuilder limitations.

For the first issue, I verified that this is no longer an issue when
using [server-side
apply](https://kubernetes.io/docs/reference/using-api/server-side-apply/),
so I updated the `make install` target to use it. Since `kubectl apply --server-side`
isn't compatible with `kubectl apply --prune`, I had to switch to use
[apply
sets](https://kubernetes.io/docs/tasks/manage-kubernetes-objects/declarative-config/#alternative-kubectl-apply-f-directory-prune),
which is intended to replace `--prune`, and seems to work well.

For the second issue, I went through and added workarounds to
`hack/manifests/crds.go` for all the kubebuilder issues I could find.
Since the E2E test suite now uses the full CRDs, it should tell us if
there's anything I missed.

I didn't update the release manifests to use the full CRDs, since that's
a big change and we probably should wait awhile to make sure there
aren't any unexpected issues. Users can easily opt into the full
CRDs if they want.

Signed-off-by: Mason Malone <[email protected]>
  • Loading branch information
MasonM committed Jan 2, 2025
1 parent 81993b5 commit 57ba14a
Show file tree
Hide file tree
Showing 70 changed files with 38,994 additions and 22,696 deletions.
5 changes: 4 additions & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -480,7 +480,10 @@ install: githooks
kubectl get ns $(KUBE_NAMESPACE) || kubectl create ns $(KUBE_NAMESPACE)
kubectl config set-context --current --namespace=$(KUBE_NAMESPACE)
@echo "installing PROFILE=$(PROFILE)"
kubectl kustomize --load-restrictor=LoadRestrictionsNone test/e2e/manifests/$(PROFILE) | sed 's|quay.io/argoproj/|$(IMAGE_NAMESPACE)/|' | sed 's/namespace: argo/namespace: $(KUBE_NAMESPACE)/' | kubectl -n $(KUBE_NAMESPACE) apply --prune -l app.kubernetes.io/part-of=argo -f -
kubectl kustomize --load-restrictor=LoadRestrictionsNone test/e2e/manifests/$(PROFILE) \
| sed 's|quay.io/argoproj/|$(IMAGE_NAMESPACE)/|' \
| sed 's/namespace: argo/namespace: $(KUBE_NAMESPACE)/' \
| KUBECTL_APPLYSET=true kubectl -n $(KUBE_NAMESPACE) apply --applyset=configmaps/install --server-side --prune -f -
ifeq ($(PROFILE),stress)
kubectl -n $(KUBE_NAMESPACE) apply -f test/stress/massive-workflow.yaml
endif
Expand Down
8 changes: 0 additions & 8 deletions api/jsonschema/schema.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 0 additions & 8 deletions api/openapi-spec/swagger.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 2 additions & 6 deletions docs/executor_swagger.md
Original file line number Diff line number Diff line change
Expand Up @@ -983,7 +983,7 @@ ConfigMap volumes support ownership management and SELinux relabeling.
| template | string| `string` | | | Name of template to execute | |
| templateRef | [TemplateRef](#template-ref)| `TemplateRef` | | | | |
| when | string| `string` | | | When is an expression in which the task should conditionally execute | |
| withItems | [][Item](#item)| `[]Item` | | | WithItems expands a task into multiple parallel tasks from the items in the list | |
| withItems | [][Item](#item)| `[]Item` | | | WithItems expands a task into multiple parallel tasks from the items in the list</br>+kubebuilder:validation:Schemaless</br>+kubebuilder:pruning:PreserveUnknownFields | |
| withParam | string| `string` | | | WithParam expands a task into multiple parallel tasks from the value in the parameter,</br>which is expected to be a JSON list. | |
| withSequence | [Sequence](#sequence)| `Sequence` | | | | |

Expand Down Expand Up @@ -1856,7 +1856,6 @@ ISCSI volumes support ownership management and SELinux relabeling.


> +protobuf.options.(gogoproto.goproto_stringer)=false
+kubebuilder:validation:Type=object


Expand Down Expand Up @@ -2511,11 +2510,8 @@ be cluster-scoped, so there is no namespace field.
### <span id="parallel-steps"></span> ParallelSteps


> +kubebuilder:validation:Type=array




[interface{}](#interface)

### <span id="parameter"></span> Parameter
Expand Down Expand Up @@ -3391,7 +3387,7 @@ cause implementors to also use a fixed point implementation.
| resources | [ResourceRequirements](#resource-requirements)| `ResourceRequirements` | | | | |
| restartPolicy | [ContainerRestartPolicy](#container-restart-policy)| `ContainerRestartPolicy` | | | | |
| securityContext | [SecurityContext](#security-context)| `SecurityContext` | | | | |
| source | string| `string` | | | Source contains the source code of the script to execute | |
| source | string| `string` | | | Source contains the source code of the script to execute</br>+optional | |
| startupProbe | [Probe](#probe)| `Probe` | | | | |
| stdin | boolean| `bool` | | | Whether this container should allocate a buffer for stdin in the container runtime. If this</br>is not set, reads from stdin in the container will always result in EOF.</br>Default is false.</br>+optional | |
| stdinOnce | boolean| `bool` | | | Whether the container runtime should close the stdin channel after it has been opened by</br>a single attach. When stdin is true the stdin stream will remain open across multiple attach</br>sessions. If stdinOnce is set to true, stdin is opened on container start, is empty until the</br>first client attaches to stdin, and then remains open and accepts data until the client disconnects,</br>at which time stdin is closed and remains closed until the container is restarted. If this</br>flag is false, a container processes that reads from stdin will never receive an EOF.</br>Default is false</br>+optional | |
Expand Down
10 changes: 10 additions & 0 deletions docs/installation.md
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,16 @@ You can use Kustomize to patch your preferred [configurations](managed-namespace

You can install Argo Workflows using the community maintained [Helm charts](https://github.com/argoproj/argo-helm).

### Full CRDs

The official release manifests come with stripped-down CRDs that omit validation information.
This is a workaround for [Kubernetes size limitations](https://github.com/kubernetes/kubernetes/issues/82292) when using client-side apply.
As of version 3.7, the full CRDs can be installed using [server-side apply](https://kubernetes.io/docs/reference/using-api/server-side-apply/) via the following command:

```bash
kubectl apply --server-side --kustomize https://github.com/argoproj/argo-workflows/manifests/base/crds/full?ref=v3.7.0
```

## Installation options

Determine your base installation option.
Expand Down
2 changes: 1 addition & 1 deletion hack/manifests/crdgen.sh
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ add_header() {
mv tmp "$1"
}

controller-gen crd:maxDescLen=0 paths=./pkg/apis/... output:dir=manifests/base/crds/full
controller-gen crd:maxDescLen=0,generateEmbeddedObjectMeta=true paths=./pkg/apis/... output:dir=manifests/base/crds/full

find manifests/base/crds/full -name 'argoproj.io*.yaml' | while read -r file; do
# remove junk fields
Expand Down
148 changes: 62 additions & 86 deletions hack/manifests/crds.go
Original file line number Diff line number Diff line change
@@ -1,108 +1,84 @@
package main

import (
"fmt"
"os"
"path/filepath"

"sigs.k8s.io/yaml"
)
import "fmt"

// cleanCRD does post-processing of the CRDs generated by kubebuilder to workaround limitations
// (see https://github.com/kubernetes-sigs/controller-tools/issues/461)
func cleanCRD(filename string) {
data, err := os.ReadFile(filepath.Clean(filename))
if err != nil {
panic(err)
}
crd := make(obj)
err = yaml.Unmarshal(data, &crd)
if err != nil {
panic(err)
}
delete(crd, "status")
metadata := crd["metadata"].(obj)
delete(metadata, "annotations")
delete(metadata, "creationTimestamp")
spec := crd["spec"].(obj)
versions := spec["versions"].([]interface{})
version := versions[0].(obj)
schema := version["schema"].(obj)["openAPIV3Schema"].(obj)
name := crd["metadata"].(obj)["name"].(string)
switch name {
crd := ParseYaml(Read(filename))
crd.RemoveNestedField("status")
crd.RemoveNestedField("metadata", "annotations")
crd.RemoveNestedField("metadata", "creationTimestamp")
schema := crd.OpenAPIV3Schema()
switch crd.Name() {
case "cronworkflows.argoproj.io":
properties := schema["properties"].(obj)["spec"].(obj)["properties"].(obj)["workflowSpec"].(obj)["properties"].(obj)["templates"].(obj)["items"].(obj)["properties"]
properties.(obj)["container"].(obj)["required"] = []string{"image"}
properties.(obj)["script"].(obj)["required"] = []string{"image", "source"}
patchWorkflowSpecTemplateFields(&schema, "spec", "properties", "workflowSpec", "properties")
case "clusterworkflowtemplates.argoproj.io", "workflows.argoproj.io", "workflowtemplates.argoproj.io":
properties := schema["properties"].(obj)["spec"].(obj)["properties"].(obj)["templates"].(obj)["items"].(obj)["properties"]
properties.(obj)["container"].(obj)["required"] = []string{"image"}
properties.(obj)["script"].(obj)["required"] = []string{"image", "source"}
patchWorkflowSpecTemplateFields(&schema, "spec", "properties")
}
data, err = yaml.Marshal(crd)
if err != nil {
panic(err)
}
err = os.WriteFile(filename, data, 0o600)
if err != nil {
panic(err)
if crd.Name() == "workflows.argoproj.io" {
patchTemplateFields(&schema, "status", "properties", "storedTemplates", "additionalProperties")
patchWorkflowSpecTemplateFields(&schema, "status", "properties", "storedWorkflowTemplateSpec", "properties")
}
crd.WriteYaml(filename)
}

// minimizeCRD is a workaround for two separate issues:
// 1. "Request entity too large: limit is 3145728" errors due to https://github.com/kubernetes/kubernetes/issues/82292
// 2. "spec.validation.openAPIV3Schema.properties[spec].properties[tasks].additionalProperties.properties[steps].items.items: Required value: must be specified" due to kubebuilder issues (https://github.com/argoproj/argo-workflows/pull/3809#discussion_r472383090)
func minimizeCRD(filename string) {
data, err := os.ReadFile(filepath.Clean(filename))
if err != nil {
panic(err)
}
func patchWorkflowSpecTemplateFields(schema *obj, baseFields ...string) {
patchTemplateFields(schema, append(baseFields, "templateDefaults")...)
patchTemplateFields(schema, append(baseFields, "templates", "items")...)
}

shouldMinimize := false
if len(data) > 1024*1024 {
fmt.Printf("Minimizing %s due to CRD size (%d) exceeding 1MB\n", filename, len(data))
shouldMinimize = true
}
func patchTemplateFields(schema *obj, baseFields ...string) {
// "container" and "script" templates embed the k8s.io/api/core/v1/Container
// struct, which requires the "name" field to be specified, but it's not actually required,
// and there's no easy way to override validating rules for embedded structs in kubebuilder.
schema.RemoveNestedField(append(baseFields, "properties", "container", "required")...)
schema.RemoveNestedField(append(baseFields, "properties", "script", "required")...)

crd := make(obj)
err = yaml.Unmarshal(data, &crd)
if err != nil {
panic(err)
}
// Hack to transform "steps" from an array of objects, e.g.
// steps:
// - steps:
// - name: foo
// template: bar
// to an array of arrays, e.g.:
// steps:
// - - name: foo
// template: bar
//
// The reason it's represented as an array of objects in "workflow_types.go"
// is because that's the only way to get kubebuilder to generate the spec
// without breaking API compatibility.
stepFields := append(baseFields, "properties", "steps", "items")
schema.CopyNestedField(append(stepFields, "properties", "steps"), stepFields)
}

if !shouldMinimize {
name := crd["metadata"].(obj)["name"].(string)
switch name {
case "cronworkflows.argoproj.io", "clusterworkflowtemplates.argoproj.io", "workflows.argoproj.io", "workflowtemplates.argoproj.io", "workflowtasksets.argoproj.io":
fmt.Printf("Minimizing %s due to kubebuilder issues\n", filename)
shouldMinimize = true
}
// minimizeCRD generates a stripped-down CRD as a workaround for "Request entity too large: limit is 3145728" errors due to https://github.com/kubernetes/kubernetes/issues/82292.
// This isn't an issue if you use server-side apply, but not everyone has that available.
func minimizeCRD(filename string) {
data := Read(filename)
shouldMinimize := false
// TODO: The 512KB limit is just to avoid changing too much in https://github.com/argoproj/argo-workflows/pull/14044
// Increase this back to 1MB after that's merged.
if len(data) > 512*1024 {
fmt.Printf("Minimizing %s due to CRD size (%d) exceeding 512KB\n", filename, len(data))
shouldMinimize = true
}

if !shouldMinimize {
return
}

crd = stripSpecAndStatusFields(crd)

data, err = yaml.Marshal(crd)
if err != nil {
panic(err)
}
err = os.WriteFile(filename, data, 0o600)
if err != nil {
panic(err)
}
crd := ParseYaml(data)
stripSpecAndStatusFields(crd)
crd.WriteYaml(filename)
}

// stripSpecAndStatusFields strips the "spec" and "status" fields from the CRD, as those are usually the largest.
func stripSpecAndStatusFields(crd obj) obj {
spec := crd["spec"].(obj)
versions := spec["versions"].([]interface{})
version := versions[0].(obj)
properties := version["schema"].(obj)["openAPIV3Schema"].(obj)["properties"].(obj)
for k := range properties {
if k == "spec" || k == "status" {
properties[k] = obj{"type": "object", "x-kubernetes-preserve-unknown-fields": true, "x-kubernetes-map-type": "atomic"}
}
func stripSpecAndStatusFields(crd *obj) {
schema := crd.OpenAPIV3Schema()
preserveMarker := obj{"type": "object", "x-kubernetes-preserve-unknown-fields": true, "x-kubernetes-map-type": "atomic"}
if _, ok := schema["spec"]; ok {
schema["spec"] = preserveMarker
}
if _, ok := schema["status"]; ok {
schema["status"] = preserveMarker
}
return crd
}
71 changes: 71 additions & 0 deletions hack/manifests/helpers.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
package main

import (
"fmt"
"os"
"path/filepath"

"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
"sigs.k8s.io/yaml"
)

type obj map[string]interface{}

func (o *obj) RemoveNestedField(fields ...string) {
unstructured.RemoveNestedField(*o, fields...)
}

func (o *obj) CopyNestedField(sourceFields []string, targetFields []string) {
value := nestedFieldNoCopy[any](o, sourceFields...)
parentField := nestedFieldNoCopy[map[string]interface{}](o, targetFields[:len(targetFields)-1]...)
parentField[targetFields[len(targetFields)-1]] = value
}

func (o *obj) Name() string {
return nestedFieldNoCopy[string](o, "metadata", "name")
}

func (o *obj) OpenAPIV3Schema() obj {
versions := nestedFieldNoCopy[[]interface{}](o, "spec", "versions")
version := obj(versions[0].(map[string]interface{}))
return nestedFieldNoCopy[map[string]interface{}](&version, "schema", "openAPIV3Schema", "properties")
}

func nestedFieldNoCopy[T any](o *obj, fields ...string) T {
value, found, err := unstructured.NestedFieldNoCopy(*o, fields...)
if !found {
panic(fmt.Sprintf("failed to find field %v", fields))
}
if err != nil {
panic(err.Error())
}
return value.(T)
}

func (o *obj) WriteYaml(filename string) {
data, err := yaml.Marshal(o)
if err != nil {
panic(err)
}
err = os.WriteFile(filename, data, 0o600)
if err != nil {
panic(err)
}
}

func Read(filename string) []byte {
data, err := os.ReadFile(filepath.Clean(filename))
if err != nil {
panic(err)
}
return data
}

func ParseYaml(data []byte) *obj {
crd := make(obj)
err := yaml.Unmarshal(data, &crd)
if err != nil {
panic(err)
}
return &crd
}
3 changes: 0 additions & 3 deletions hack/manifests/types.go

This file was deleted.

4 changes: 2 additions & 2 deletions manifests/base/crds/full/README.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Full CRDs

These CRDs have the full schema.
As a result, they are large and probably not suitable to be used in your cluster due to [kubernetes/kubernetes#82292](https://github.com/kubernetes/kubernetes/issues/82292).
These CRDs have the full schema and must be applied using [server-side apply](https://kubernetes.io/docs/reference/using-api/server-side-apply/) to avoid hitting size limits.
See [Full CRDs](https://argo-workflows.readthedocs.io/en/latest/installation/#full-crds) for more details.
Loading

0 comments on commit 57ba14a

Please sign in to comment.