PEP: Replace kubectl calls in E2E tests with controller-runtime client #5155

@camilamacedo86

Currently, our E2E tests (see testdata/project-v4/test/e2e) use Go utils that shell out to kubectl for cluster operations.

We could instead use the controller-runtime client to perform the same actions directly in Go.

Goal:
We should discuss the best approach. While the current approach works well for newcomers, using the controller-runtime client might be more idiomatic for SDK development and controller authors, and better for reuse. If we decide to move to option B (described below), we must ensure that the change remains backward compatible.

Options:

  • A. Keep kubectl calls (as it is today): easier for beginners, mirrors real CLI use, but less robust and harder to maintain.
  • B. Use controller-runtime client: more idiomatic Go and SDK-aligned, richer object handling, but steeper learning curve.

Two candidate approaches

Approach A. Use kubectl via Go exec / helper utilities

Description: Current approach. Tests shell out to kubectl, capture output, parse results, etc.

Pros:

  • Familiar to many users who know kubectl usage.
  • Mirrors how many users do manual operations.
  • Possibly simpler to write when verifying end state via CLI.
  • Lower barrier to entry for contributors who know kubectl.

Cons:

  • Less idiomatic Go and Kubernetes SDK usage.
  • Shelling out and parsing output is brittle (format changes, internationalisation, errors).
  • Harder to compose complex scenarios at object level (e.g., handling watch events, retry logic).
  • Does not scale as well for large tests or rich assertions.

Approach B. Use controller-runtime/client (or client-go) directly in Go

Description: Upgrade tests to use Kubernetes API client calls (CRUD on objects, List/Watch, etc.) instead of shelling out to kubectl.

Pros:

  • More idiomatic for SDK development and controller authors.
  • Direct object access enables richer assertions (fields, conditions, statuses).
  • Better reuse of SDK and shared mechanisms.
  • Less reliance on CLI format; easier to run headless.

Cons:

  • Slightly steeper learning curve for contributors unfamiliar with client code.
  • Might require more setup (e.g., scheme registration, context, client configuration).
  • Less of a “what the user would type” perspective, so it may be less relatable for beginners.
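
The brittleness noted for approach A and the richer assertions noted for approach B can be illustrated with a small sketch that uses only the standard library. It is not taken from the scaffold: podView is a hypothetical, trimmed-down stand-in for corev1.Pod, and phaseFromTable and phaseFromObject are hypothetical helpers. The first relies on the column order of tabular CLI output, the second reads a named field from a decoded object.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// podView is a hypothetical, minimal stand-in for corev1.Pod that keeps
// only the fields this sketch needs.
type podView struct {
	Metadata struct {
		Name string `json:"name"`
	} `json:"metadata"`
	Status struct {
		Phase string `json:"phase"`
	} `json:"status"`
}

// phaseFromTable reads the third whitespace-separated column of one line of
// tabular output (NAME READY STATUS ...). It silently returns the wrong
// value if the column layout changes.
func phaseFromTable(line string) (string, error) {
	fields := strings.Fields(line)
	if len(fields) < 3 {
		return "", fmt.Errorf("unexpected column count in %q", line)
	}
	return fields[2], nil
}

// phaseFromObject decodes the object and reads the phase by field name, so
// it does not depend on any display layout.
func phaseFromObject(raw []byte) (string, error) {
	var p podView
	if err := json.Unmarshal(raw, &p); err != nil {
		return "", err
	}
	return p.Status.Phase, nil
}

func main() {
	line := "controller-manager-abc   1/1   Running   0   5m"
	fromText, err := phaseFromTable(line)
	fmt.Println(fromText, err)

	raw := []byte(`{"metadata":{"name":"controller-manager-abc"},"status":{"phase":"Running"}}`)
	fromObj, err := phaseFromObject(raw)
	fmt.Println(fromObj, err)
}
```

If a NAMESPACE column is prepended to the table (as kubectl does with --all-namespaces), phaseFromTable returns the READY value instead of the phase without reporting an error, while the field-based lookup is unaffected.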

Example proposal

The following shows how the scaffolded E2E tests could look with option B:

package e2e

import (
	"fmt"
	"os"
	"os/exec"
	"testing"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"

	admv1 "k8s.io/api/admissionregistration/v1"
	apixv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
	clientgoscheme "k8s.io/client-go/kubernetes/scheme"

	"sigs.k8s.io/kubebuilder/testdata/project-v4/test/utils"
)

var (
	// Optional Environment Variables:
	// - CERT_MANAGER_INSTALL_SKIP=true
	skipCertManagerInstall        = os.Getenv("CERT_MANAGER_INSTALL_SKIP") == "true"
	isCertManagerAlreadyInstalled = false

	projectImage = "example.com/project-v4:v0.0.1"
)

func TestE2E(t *testing.T) {
	RegisterFailHandler(Fail)
	_, _ = fmt.Fprintf(GinkgoWriter, "Starting project-v4 integration test suite\n")
	RunSpecs(t, "e2e suite")
}

var _ = BeforeSuite(func() {
	// Register all types needed by the tests on client-go's global scheme,
	// which the controller-runtime client uses when no scheme is passed
	// explicitly.
	Expect(clientgoscheme.AddToScheme(clientgoscheme.Scheme)).To(Succeed())
	Expect(admv1.AddToScheme(clientgoscheme.Scheme)).To(Succeed())
	Expect(apixv1.AddToScheme(clientgoscheme.Scheme)).To(Succeed())

	By("building the manager(Operator) image")
	cmd := exec.Command("make", "docker-build", fmt.Sprintf("IMG=%s", projectImage))
	_, err := utils.Run(cmd)
	ExpectWithOffset(1, err).NotTo(HaveOccurred(), "Failed to build the manager(Operator) image")

	By("loading the manager(Operator) image on Kind")
	err = utils.LoadImageToKindClusterWithName(projectImage)
	ExpectWithOffset(1, err).NotTo(HaveOccurred(), "Failed to load the manager(Operator) image into Kind")

	if !skipCertManagerInstall {
		By("checking if cert manager is installed already")
		isCertManagerAlreadyInstalled = utils.IsCertManagerCRDsInstalled()
		if !isCertManagerAlreadyInstalled {
			_, _ = fmt.Fprintf(GinkgoWriter, "Installing CertManager...\n")
			Expect(utils.InstallCertManager()).To(Succeed(), "Failed to install CertManager")
		} else {
			_, _ = fmt.Fprintf(GinkgoWriter, "WARNING: CertManager is already installed. Skipping installation...\n")
		}
	}
})

var _ = AfterSuite(func() {
	if !skipCertManagerInstall && !isCertManagerAlreadyInstalled {
		_, _ = fmt.Fprintf(GinkgoWriter, "Uninstalling CertManager...\n")
		utils.UninstallCertManager()
	}
})

//go:build e2e
// +build e2e

.....

package e2e

import (
	"context"
	"fmt"
	"io"
	"os/exec"
	"time"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"

	admv1 "k8s.io/api/admissionregistration/v1"
	apixv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
	authv1 "k8s.io/api/authentication/v1"
	corev1 "k8s.io/api/core/v1"
	rbacv1 "k8s.io/api/rbac/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	klabels "k8s.io/apimachinery/pkg/labels"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/utils/ptr"

	ctrl "sigs.k8s.io/controller-runtime"
	crclient "sigs.k8s.io/controller-runtime/pkg/client"
)

const (
	namespace              = "project-v4-system"
	serviceAccountName     = "project-v4-controller-manager"
	metricsServiceName     = "project-v4-controller-manager-metrics-service"
	metricsRoleBindingName = "project-v4-metrics-binding"
)

var _ = Describe("Manager (controller-runtime client style)", Ordered, func() {
	var (
		ctx               context.Context
		k8sClient         crclient.Client
		clientset         *kubernetes.Clientset
		controllerPodName string
	)

	BeforeAll(func() {
		cfg := ctrl.GetConfigOrDie()

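		// Leaving Options.Scheme unset makes the client fall back to client-go's
		// global scheme (k8s.io/client-go/kubernetes/scheme). Extra types such as
		// CRDs must be registered on that scheme in the suite setup.
		cl, err := crclient.New(cfg, crclient.Options{})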
		Expect(err).NotTo(HaveOccurred())
		k8sClient = cl

		cs, err := kubernetes.NewForConfig(cfg)
		Expect(err).NotTo(HaveOccurred())
		clientset = cs

		ctx = context.Background()

		By("creating/ensuring the manager namespace with restricted label")
		ns := &corev1.Namespace{ObjectMeta: metav1.ObjectMeta{Name: namespace}}
		// Tolerate AlreadyExists only; any other error fails the setup.
		Expect(crclient.IgnoreAlreadyExists(k8sClient.Create(ctx, ns))).To(Succeed())
		Eventually(func() error {
			var cur corev1.Namespace
			if err := k8sClient.Get(ctx, types.NamespacedName{Name: namespace}, &cur); err != nil {
				return err
			}
			if cur.Labels == nil {
				cur.Labels = map[string]string{}
			}
			cur.Labels["pod-security.kubernetes.io/enforce"] = "restricted"
			return k8sClient.Update(ctx, &cur)
		}).Should(Succeed())

		By("installing CRDs (kept via make target)")
		Expect(exec.Command("make", "install").Run()).To(Succeed())

		By("deploying the controller-manager")
		Expect(exec.Command("make", "deploy", fmt.Sprintf("IMG=%s", projectImage)).Run()).To(Succeed())
	})

	AfterAll(func() {
		By("best-effort cleanup of curl-metrics pod")
		_ = k8sClient.Delete(ctx, &corev1.Pod{
			ObjectMeta: metav1.ObjectMeta{Name: "curl-metrics", Namespace: namespace},
		})

		By("undeploying the controller-manager")
		_ = exec.Command("make", "undeploy").Run()

		By("uninstalling CRDs")
		_ = exec.Command("make", "uninstall").Run()
	})

	AfterEach(func() {
		if CurrentSpecReport().Failed() && controllerPodName != "" {
			By("Fetching controller manager pod logs")
			req := clientset.CoreV1().Pods(namespace).GetLogs(controllerPodName, &corev1.PodLogOptions{})
			rc, err := req.Stream(ctx)
			if err == nil {
				defer rc.Close()
				b, _ := io.ReadAll(rc)
				_, _ = fmt.Fprintf(GinkgoWriter, "Controller logs:\n%s\n", string(b))
			}
		}
	})

	SetDefaultEventuallyTimeout(2 * time.Minute)
	SetDefaultEventuallyPollingInterval(time.Second)

	Context("Manager", func() {
		It("should run successfully", func() {
			By("waiting for the controller-manager pod to be Running")
			Eventually(func(g Gomega) {
				var pods corev1.PodList
				selector := klabels.SelectorFromSet(map[string]string{"control-plane": "controller-manager"})
				err := k8sClient.List(ctx, &pods, crclient.InNamespace(namespace), crclient.MatchingLabelsSelector{Selector: selector})
				g.Expect(err).NotTo(HaveOccurred())
				g.Expect(pods.Items).To(HaveLen(1), "expected 1 controller pod running")

				p := pods.Items[0]
				controllerPodName = p.Name
				g.Expect(controllerPodName).To(ContainSubstring("controller-manager"))
				g.Expect(p.Status.Phase).To(Equal(corev1.PodRunning))
			}).Should(Succeed())
		})

		It("should ensure the metrics endpoint is serving metrics", func() {
			By("creating ClusterRoleBinding for metrics access")
			crb := &rbacv1.ClusterRoleBinding{
				ObjectMeta: metav1.ObjectMeta{Name: metricsRoleBindingName},
				Subjects: []rbacv1.Subject{{
					Kind:      rbacv1.ServiceAccountKind,
					Name:      serviceAccountName,
					Namespace: namespace,
				}},
				RoleRef: rbacv1.RoleRef{
					APIGroup: "rbac.authorization.k8s.io",
					Kind:     "ClusterRole",
					Name:     "project-v4-metrics-reader",
				},
			}
			// Tolerate AlreadyExists only; any other error fails the spec.
			Expect(crclient.IgnoreAlreadyExists(k8sClient.Create(ctx, crb))).To(Succeed())

			By("verifying the metrics Service exists")
			var svc corev1.Service
			Expect(k8sClient.Get(ctx, types.NamespacedName{Name: metricsServiceName, Namespace: namespace}, &svc)).To(Succeed())

			By("requesting a token for the ServiceAccount (TokenRequest API)")
			tr, err := clientset.CoreV1().ServiceAccounts(namespace).CreateToken(ctx, serviceAccountName, &authv1.TokenRequest{
				Spec: authv1.TokenRequestSpec{},
			}, metav1.CreateOptions{})
			Expect(err).NotTo(HaveOccurred())
			token := tr.Status.Token
			Expect(token).NotTo(BeEmpty())

			By("waiting for Endpoints to expose port 8443")
			Eventually(func(g Gomega) {
				var ep corev1.Endpoints
				err := k8sClient.Get(ctx, types.NamespacedName{Name: metricsServiceName, Namespace: namespace}, &ep)
				g.Expect(err).NotTo(HaveOccurred())
				found := false
				for _, subset := range ep.Subsets {
					for _, port := range subset.Ports {
						if port.Port == 8443 {
							found = true
						}
					}
				}
				g.Expect(found).To(BeTrue(), "metrics endpoint not ready on 8443")
			}).Should(Succeed())

			By("ensuring metrics server started (scan controller logs)")
			Eventually(func(g Gomega) {
				req := clientset.CoreV1().Pods(namespace).GetLogs(controllerPodName, &corev1.PodLogOptions{})
				rc, err := req.Stream(ctx)
				g.Expect(err).NotTo(HaveOccurred())
				defer rc.Close()
				b, _ := io.ReadAll(rc)
				g.Expect(string(b)).To(ContainSubstring("controller-runtime.metrics\tServing metrics server"))
			}).Should(Succeed())

			By("creating curl-metrics Pod to curl /metrics")
			curl := &corev1.Pod{
				ObjectMeta: metav1.ObjectMeta{
					Name:      "curl-metrics",
					Namespace: namespace,
				},
				Spec: corev1.PodSpec{
					RestartPolicy:      corev1.RestartPolicyNever,
					ServiceAccountName: serviceAccountName,
					Containers: []corev1.Container{{
						Name:    "curl",
						Image:   "curlimages/curl:latest",
						Command: []string{"/bin/sh", "-c"},
						Args: []string{
							fmt.Sprintf("curl -v -k -H 'Authorization: Bearer %s' https://%s.%s.svc.cluster.local:8443/metrics", token, metricsServiceName, namespace),
						},
						SecurityContext: &corev1.SecurityContext{
							ReadOnlyRootFilesystem:   ptr.To(true),
							AllowPrivilegeEscalation: ptr.To(false),
							Capabilities: &corev1.Capabilities{
								Drop: []corev1.Capability{"ALL"},
							},
							RunAsNonRoot: ptr.To(true),
							RunAsUser:    ptr.To(int64(1000)),
							SeccompProfile: &corev1.SeccompProfile{
								Type: corev1.SeccompProfileTypeRuntimeDefault,
							},
						},
					}},
				},
			}
			_ = k8sClient.Delete(ctx, curl)
			Expect(k8sClient.Create(ctx, curl)).To(Succeed())

			By("waiting for curl-metrics to Succeed and verifying HTTP 200 in logs")
			Eventually(func(g Gomega) {
				var p corev1.Pod
				err := k8sClient.Get(ctx, types.NamespacedName{Name: "curl-metrics", Namespace: namespace}, &p)
				g.Expect(err).NotTo(HaveOccurred())
				g.Expect(p.Status.Phase).To(Equal(corev1.PodSucceeded))
			}, 5*time.Minute, time.Second).Should(Succeed())

			Eventually(func(g Gomega) {
				req := clientset.CoreV1().Pods(namespace).GetLogs("curl-metrics", &corev1.PodLogOptions{})
				rc, err := req.Stream(ctx)
				g.Expect(err).NotTo(HaveOccurred())
				defer rc.Close()
				b, _ := io.ReadAll(rc)
				g.Expect(string(b)).To(ContainSubstring("< HTTP/1.1 200 OK"))
			}, 2*time.Minute, time.Second).Should(Succeed())
		})

		It("should have cert-manager Secret for webhook", func() {
			Eventually(func(g Gomega) {
				var s corev1.Secret
				err := k8sClient.Get(ctx, types.NamespacedName{Name: "webhook-server-cert", Namespace: namespace}, &s)
				g.Expect(err).NotTo(HaveOccurred())
			}).Should(Succeed())
		})

		It("should have CA injection for mutating and validating webhooks", func() {
			Eventually(func(g Gomega) {
				var mwh admv1.MutatingWebhookConfiguration
				err := k8sClient.Get(ctx, types.NamespacedName{Name: "project-v4-mutating-webhook-configuration"}, &mwh)
				g.Expect(err).NotTo(HaveOccurred())
				found := false
				for _, wh := range mwh.Webhooks {
					if len(wh.ClientConfig.CABundle) > 10 {
						found = true
					}
				}
				g.Expect(found).To(BeTrue(), "mutating webhook CA bundle missing")
			}).Should(Succeed())

			Eventually(func(g Gomega) {
				var vwh admv1.ValidatingWebhookConfiguration
				err := k8sClient.Get(ctx, types.NamespacedName{Name: "project-v4-validating-webhook-configuration"}, &vwh)
				g.Expect(err).NotTo(HaveOccurred())
				found := false
				for _, wh := range vwh.Webhooks {
					if len(wh.ClientConfig.CABundle) > 10 {
						found = true
					}
				}
				g.Expect(found).To(BeTrue(), "validating webhook CA bundle missing")
			}).Should(Succeed())
		})

		It("should have CA injection for FirstMate conversion webhook", func() {
			Eventually(func(g Gomega) {
				var crd apixv1.CustomResourceDefinition
				err := k8sClient.Get(ctx, types.NamespacedName{Name: "firstmates.crew.testproject.org"}, &crd)
				g.Expect(err).NotTo(HaveOccurred())
				b := crd.Spec.Conversion
				g.Expect(b).NotTo(BeNil())
				g.Expect(b.Webhook).NotTo(BeNil())
				g.Expect(b.Webhook.ClientConfig).NotTo(BeNil())
				g.Expect(len(b.Webhook.ClientConfig.CABundle)).To(BeNumerically(">", 10))
			}).Should(Succeed())
		})
	})
})

Acceptance Criteria

  • The change must remain backward compatible. Note that the scaffolded tests contain markers used to inject code for the webhook checks.
  • The new approach should preserve the same E2E test behavior and outcomes, only changing the internal mechanism to use the controller-runtime / client-go client instead of invoking kubectl commands.
  • Update the hack scripts and documentation to reflect the new workflow, including any generator logic or additional setup needed for the client-go–based tests.
  • Ensure that the migration path is clear, and contributors can still run existing projects without modification.
  • (Optional) If feasible, remove the existing utils helper used for kubectl exec calls, simplifying maintenance and reducing duplicated logic.

💡 If someone from the community wants to work on this, that would be great! This proposal will serve as the basis for discussion and potential inclusion in the scaffolds.

Labels: help wanted, kind/design, lifecycle/frozen, testing
