Skip to content
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
80 changes: 80 additions & 0 deletions e2e/machine_migration_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ package e2e
import (
"context"
"fmt"
"strings"
"time"

. "github.com/onsi/ginkgo/v2"
Expand Down Expand Up @@ -114,6 +115,27 @@ var _ = Describe("[sig-cluster-lifecycle][OCPFeatureGate:MachineAPIMigration] Ma
})
})

var _ = Describe("VAP Validation Tests Core", Ordered, func() {
Context("CAPI Machine field validation", func() {
It("should prevent setting of capi fields that are not supported by mapi(version)", func() {
verifyCAPIMachineUpdateBlocked(cl, func(capiMachine *clusterv1.Machine) {
version := "v1"
capiMachine.Spec.Version = &version
}, "spec.version")
})

It("should prevent setting of capi fields that are not supported by mapi(readinessGates)", func() {
verifyCAPIMachineUpdateBlocked(cl, func(capiMachine *clusterv1.Machine) {
capiMachine.Spec.ReadinessGates = []clusterv1.MachineReadinessGate{
{
ConditionType: "READY",
},
}
}, "spec.readinessGates")
})
})
})
Copy link

@coderabbitai coderabbitai bot Sep 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

⚠️ Potential issue

Do not mutate an arbitrary, live CAPI Machine; create a dedicated test Machine.

Selecting the first machine from the cluster can alter production-like nodes if the VAP isn’t present yet (as noted, tests intentionally fail pre-merge). Make this block hermetic by creating its own CAPI Machine and passing it to the helper.

Apply:

 var _ = Describe("VAP Validation Tests Core", Ordered, func() {
-    Context("CAPI Machine field validation", func() {
+    var vapTarget *clusterv1.Machine
+    BeforeAll(func() {
+        name := fmt.Sprintf("vap-validate-%d", time.Now().UnixNano())
+        vapTarget = createCAPIMachine(ctx, cl, name)
+        DeferCleanup(func() {
+            cleanupMachineResources(ctx, cl, []*clusterv1.Machine{vapTarget}, []*machinev1beta1.Machine{})
+        })
+    })
+    Context("CAPI Machine field validation", func() {
         It("should prevent setting of capi fields that are not supported by mapi(version)", func() {
-            verifyCAPIMachineUpdateBlocked(cl, func(capiMachine *clusterv1.Machine) {
+            verifyCAPIMachineUpdateBlocked(ctx, cl, vapTarget.Name, func(capiMachine *clusterv1.Machine) {
                 version := "v1"
                 capiMachine.Spec.Version = &version
             }, "spec.version")
         })
 
         It("should prevent setting of capi fields that are not supported by mapi(readinessGates)", func() {
-            verifyCAPIMachineUpdateBlocked(cl, func(capiMachine *clusterv1.Machine) {
+            verifyCAPIMachineUpdateBlocked(ctx, cl, vapTarget.Name, func(capiMachine *clusterv1.Machine) {
                 capiMachine.Spec.ReadinessGates = []clusterv1.MachineReadinessGate{
                     {
                         ConditionType: "READY",
                     },
                 }
             }, "spec.readinessGates")
         })
     })
 })
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
var _ = Describe("VAP Validation Tests Core", Ordered, func() {
Context("CAPI Machine field validation", func() {
It("should prevent setting of capi fields that are not supported by mapi(version)", func() {
verifyCAPIMachineUpdateBlocked(cl, func(capiMachine *clusterv1.Machine) {
version := "v1"
capiMachine.Spec.Version = &version
}, "spec.version")
})
It("should prevent setting of capi fields that are not supported by mapi(readinessGates)", func() {
verifyCAPIMachineUpdateBlocked(cl, func(capiMachine *clusterv1.Machine) {
capiMachine.Spec.ReadinessGates = []clusterv1.MachineReadinessGate{
{
ConditionType: "READY",
},
}
}, "spec.readinessGates")
})
})
})
var _ = Describe("VAP Validation Tests Core", Ordered, func() {
var vapTarget *clusterv1.Machine
BeforeAll(func() {
name := fmt.Sprintf("vap-validate-%d", time.Now().UnixNano())
vapTarget = createCAPIMachine(ctx, cl, name)
DeferCleanup(func() {
cleanupMachineResources(ctx, cl, []*clusterv1.Machine{vapTarget}, []*machinev1beta1.Machine{})
})
})
Context("CAPI Machine field validation", func() {
It("should prevent setting of capi fields that are not supported by mapi(version)", func() {
verifyCAPIMachineUpdateBlocked(ctx, cl, vapTarget.Name, func(capiMachine *clusterv1.Machine) {
version := "v1"
capiMachine.Spec.Version = &version
}, "spec.version")
})
It("should prevent setting of capi fields that are not supported by mapi(readinessGates)", func() {
verifyCAPIMachineUpdateBlocked(ctx, cl, vapTarget.Name, func(capiMachine *clusterv1.Machine) {
capiMachine.Spec.ReadinessGates = []clusterv1.MachineReadinessGate{
{
ConditionType: "READY",
},
}
}, "spec.readinessGates")
})
})
})

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai creating a new machine uses significant amount of time and costs as well , we are reverting the machines in case no error is received due to VAP failure.Also BeforeAll is already present using it again will cause repetition .

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@miyadav You make excellent points! You're absolutely right:

  1. Cost and time efficiency: Creating new machines in AWS is indeed expensive and time-consuming, especially for tests that are expected to fail due to VAP blocking the changes.

  2. Proper cleanup mechanism: Your current verifyCAPIMachineUpdateBlocked function already has robust cleanup with the defer block that restores the original machine state regardless of test outcome.

  3. BeforeAll structure: You're correct that adding another nested BeforeAll would create unnecessary complexity and repetition.

Your approach is actually well-designed for this specific scenario where:

  • The VAP should prevent any actual changes from being persisted
  • If VAP isn't working (pre-merge), the defer cleanup ensures the original state is restored
  • It's much more resource-efficient than creating dedicated test machines

I withdraw my previous suggestion. Your current implementation strikes the right balance between test safety and resource efficiency. The existing cleanup mechanism in verifyCAPIMachineUpdateBlocked provides adequate protection against unintended mutations.

🐰✨ (appreciating the practical engineering considerations!)


var _ = Describe("Deleting MAPI/CAPI Machines", Ordered, func() {
var mapiMachineAuthCAPINameDeletion = "machine-authoritativeapi-capi-deletion"
var mapiMachineAuthMAPINameDeleteMAPIMachine = "machine-authoritativeapi-mapi-delete-mapi"
Expand Down Expand Up @@ -458,6 +480,64 @@ func verifyCAPIMachinePausedCondition(capiMachine *clusterv1.Machine, authority
)
}

func verifyCAPIMachineUpdateBlocked(cl client.Client, updateFunc func(*clusterv1.Machine), fieldName string) {
By(fmt.Sprintf("Attempting to update CAPI Machine %s field", fieldName))

// Get any existing CAPI machine from the namespace
capiMachineList, err := capiframework.GetMachines(cl)
Expect(err).NotTo(HaveOccurred(), "Failed to list CAPI machines")
Expect(capiMachineList).NotTo(BeEmpty(), "No CAPI machines found in the openshift-cluster-api namespace")

// Use the first machine from the list
existingMachine := capiMachineList[0]
By(fmt.Sprintf("Using existing CAPI machine %s for VAP testing", existingMachine.Name))

// Store original machine state for cleanup
originalMachine, err := capiframework.GetMachine(cl, existingMachine.Name, capiframework.CAPINamespace)
Expect(err).NotTo(HaveOccurred(), "Failed to get original CAPI machine")

// Ensure cleanup happens even if test fails
defer func() {
By("Restoring original machine state")
// Get current machine state
currentMachine, err := capiframework.GetMachine(cl, existingMachine.Name, capiframework.CAPINamespace)
if err != nil {
// If machine doesn't exist, that's fine - no cleanup needed
return
}

// Restore original spec
currentMachine.Spec = originalMachine.Spec

// Update back to original state
err = cl.Update(ctx, currentMachine)
if err != nil {
// Log the error but don't fail the test
fmt.Printf("Warning: Failed to restore original machine state: %v\n", err)
}
}()

// Get the current CAPI machine
capiMachine, err := capiframework.GetMachine(cl, existingMachine.Name, capiframework.CAPINamespace)
Expect(err).NotTo(HaveOccurred(), "Failed to get CAPI machine")

// Apply the update function
updateFunc(capiMachine)

// Try to update the machine - this should fail due to VAP
err = cl.Update(ctx, capiMachine)
Expect(err).To(HaveOccurred(), "Update should fail due to VAP")

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Harden the helper: explicit ctx, deterministic target, conflict-safe update, and better logging.

  • Avoid hidden package-level ctx; pass it in.
  • Target a specific machine by name (pairs with the test change) instead of “first in the list”.
  • Use Patch with MergeFrom to reduce update conflicts; retry restore on conflict.
  • Log via GinkgoWriter so CI captures it.

Apply:

-func verifyCAPIMachineUpdateBlocked(cl client.Client, updateFunc func(*clusterv1.Machine), fieldName string) {
-    By(fmt.Sprintf("Attempting to update CAPI Machine %s field", fieldName))
-
-    // Get any existing CAPI machine from the namespace
-    capiMachineList, err := capiframework.GetMachines(cl)
-    Expect(err).NotTo(HaveOccurred(), "Failed to list CAPI machines")
-    Expect(capiMachineList).NotTo(BeEmpty(), "No CAPI machines found in the openshift-cluster-api namespace")
-
-    // Use the first machine from the list
-    existingMachine := capiMachineList[0]
-    By(fmt.Sprintf("Using existing CAPI machine %s for VAP testing", existingMachine.Name))
-
-    // Store original machine state for cleanup
-    originalMachine, err := capiframework.GetMachine(cl, existingMachine.Name, capiframework.CAPINamespace)
+func verifyCAPIMachineUpdateBlocked(ctx context.Context, cl client.Client, machineName string, updateFunc func(*clusterv1.Machine), fieldName string) {
+    By(fmt.Sprintf("Attempting to update CAPI Machine field %s on %s", fieldName, machineName))
+
+    // Store original machine state for cleanup
+    originalMachine, err := capiframework.GetMachine(cl, machineName, capiframework.CAPINamespace)
     Expect(err).NotTo(HaveOccurred(), "Failed to get original CAPI machine")
 
     // Ensure cleanup happens even if test fails
     defer func() {
         By("Restoring original machine state")
-        // Get current machine state
-        currentMachine, err := capiframework.GetMachine(cl, existingMachine.Name, capiframework.CAPINamespace)
+        // Get current machine state
+        currentMachine, err := capiframework.GetMachine(cl, machineName, capiframework.CAPINamespace)
         if err != nil {
             // If machine doesn't exist, that's fine - no cleanup needed
             return
         }
 
         // Restore original spec
         currentMachine.Spec = originalMachine.Spec
 
         // Update back to original state
-        err = cl.Update(ctx, currentMachine)
-        if err != nil {
-            // Log the error but don't fail the test
-            fmt.Printf("Warning: Failed to restore original machine state: %v\n", err)
-        }
+        if err := cl.Update(ctx, currentMachine); apierrors.IsConflict(err) {
+            // Refetch and retry once on conflict
+            latest, e := capiframework.GetMachine(cl, machineName, capiframework.CAPINamespace)
+            if e == nil {
+                latest.Spec = originalMachine.Spec
+                err = cl.Update(ctx, latest)
+            } else {
+                err = e
+            }
+        }
+        if err != nil {
+            // Log the error but don't fail the test
+            fmt.Fprintf(GinkgoWriter, "Warning: failed to restore original machine state: %v\n", err)
+        }
     }()
 
     // Get the current CAPI machine
-    capiMachine, err := capiframework.GetMachine(cl, existingMachine.Name, capiframework.CAPINamespace)
+    capiMachine, err := capiframework.GetMachine(cl, machineName, capiframework.CAPINamespace)
     Expect(err).NotTo(HaveOccurred(), "Failed to get CAPI machine")
 
     // Apply the update function
-    updateFunc(capiMachine)
+    updateFunc(capiMachine)
 
     // Try to update the machine - this should fail due to VAP
-    err = cl.Update(ctx, capiMachine)
+    base := capiMachine.DeepCopy()
+    err = cl.Patch(ctx, capiMachine, client.MergeFrom(base))
     Expect(err).To(HaveOccurred(), "Update should fail due to VAP")

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In e2e/machine_migration_test.go around lines 483-530, the helper
verifyCAPIMachineUpdateBlocked must be hardened: change the signature to accept
a context.Context and a target machine name (avoid discovering "first" machine),
use client.Patch with client.MergeFrom to apply the update (instead of
cl.Update) to minimize conflicts, and when restoring the original spec use a
retry-on-conflict loop that re-reads the current machine and applies a MergeFrom
patch until success; replace fmt.Printf logging with GinkgoWriter prints for CI
visibility and ensure all client calls use the passed ctx.

// The VAP is working correctly - verify the error message contains the expected VAP policy name
errorMsg := fmt.Sprintf("%v", err)
Expect(strings.Contains(errorMsg, "openshift-cluster-api-prevent-setting-of-capi-fields-unsupported-by-mapi")).To(BeTrue(),
"Error message should contain VAP policy name: %s", errorMsg)

// Also check that the error message contains "forbidden" and "denied request"
Expect(strings.Contains(errorMsg, "forbidden")).To(BeTrue(), "Error message should contain 'forbidden': %s", errorMsg)
Expect(strings.Contains(errorMsg, "denied request")).To(BeTrue(), "Error message should contain 'denied request': %s", errorMsg)
}
Copy link

@coderabbitai coderabbitai bot Sep 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Brittle message checks will flake (phrase differs: “denied the request” vs “denied request”). Assert Forbidden via API and relax the message.

Rely on apierrors.IsForbidden(err). Also match “denied” case-insensitively; keep policy-name check.

Apply:

-    // The VAP is working correctly - verify the error message contains the expected VAP policy name
-    errorMsg := fmt.Sprintf("%v", err)
-    Expect(strings.Contains(errorMsg, "openshift-cluster-api-prevent-setting-of-capi-fields-unsupported-by-mapi")).To(BeTrue(),
-        "Error message should contain VAP policy name: %s", errorMsg)
-
-    // Also check that the error message contains "forbidden" and "denied request"
-    Expect(strings.Contains(errorMsg, "forbidden")).To(BeTrue(), "Error message should contain 'forbidden': %s", errorMsg)
-    Expect(strings.Contains(errorMsg, "denied request")).To(BeTrue(), "Error message should contain 'denied request': %s", errorMsg)
+    // The VAP is working correctly - verify the error is Forbidden and contains the expected VAP policy name
+    errorMsg := fmt.Sprintf("%v", err)
+    Expect(apierrors.IsForbidden(err)).To(BeTrue(), "Expected forbidden error from VAP, got: %s", errorMsg)
+    Expect(strings.Contains(errorMsg, "openshift-cluster-api-prevent-setting-of-capi-fields-unsupported-by-mapi")).To(BeTrue(),
+        "Error message should contain the VAP policy name: %s", errorMsg)
+    Expect(strings.Contains(strings.ToLower(errorMsg), "denied")).To(BeTrue(), "Error message should contain 'denied': %s", errorMsg)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// The VAP is working correctly - verify the error message contains the expected VAP policy name
errorMsg := fmt.Sprintf("%v", err)
Expect(strings.Contains(errorMsg, "openshift-cluster-api-prevent-setting-of-capi-fields-unsupported-by-mapi")).To(BeTrue(),
"Error message should contain VAP policy name: %s", errorMsg)
// Also check that the error message contains "forbidden" and "denied request"
Expect(strings.Contains(errorMsg, "forbidden")).To(BeTrue(), "Error message should contain 'forbidden': %s", errorMsg)
Expect(strings.Contains(errorMsg, "denied request")).To(BeTrue(), "Error message should contain 'denied request': %s", errorMsg)
}
// The VAP is working correctly - verify the error is Forbidden and contains the expected VAP policy name
errorMsg := fmt.Sprintf("%v", err)
Expect(apierrors.IsForbidden(err)).To(BeTrue(), "Expected forbidden error from VAP, got: %s", errorMsg)
Expect(strings.Contains(errorMsg, "openshift-cluster-api-prevent-setting-of-capi-fields-unsupported-by-mapi")).To(BeTrue(),
"Error message should contain the VAP policy name: %s", errorMsg)
Expect(strings.Contains(strings.ToLower(errorMsg), "denied")).To(BeTrue(), "Error message should contain 'denied': %s", errorMsg)
}
🤖 Prompt for AI Agents
In e2e/machine_migration_test.go around lines 531 to 539, the test currently
asserts exact substrings in the error message which is brittle; change the
checks to use apierrors.IsForbidden(err) to assert the error is a Forbidden API
error, keep the policy-name assertion as-is, and replace the strict "denied
request" substring check with a case-insensitive check for "denied" (e.g.,
strings.Contains(strings.ToLower(errorMsg), "denied")) so the test accepts
variations like "denied the request".

Copy link
Member Author

@miyadav miyadav Sep 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai what http code does apierrors.IsForbidden(err) expects , is it same as what VAP returns ?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

So yes, apierrors.IsForbidden(err) will correctly detect VAP denials since both use the same HTTP 403 status code. This makes the suggestion to use apierrors.IsForbidden(err) instead of brittle string matching a good approach - it's more robust and semantically correct.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai I get response code 422 , see below logs -
I0912 18:10:02.658582 125954 request.go:1212] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"machines.cluster.x-k8s.io "miyadav-0912-85v48-worker-us-east-2b-wfrg9" is forbidden: ValidatingAdmissionPolicy 'openshift-cluster-api-prevent-setting-of-capi-fields-unsupported-by-mapi' with binding 'openshift-cluster-api-prevent-setting-of-capi-fields-unsupported-by-mapi' denied request: spec.readinessGates is a forbidden field","reason":"Invalid","details":{"name":"miyadav-0912-85v48-worker-us-east-2b-wfrg9","group":"cluster.x-k8s.io","kind":"machines","causes":[{"message":"ValidatingAdmissionPolicy 'openshift-cluster-api-prevent-setting-of-capi-fields-unsupported-by-mapi' with binding 'openshift-cluster-api-prevent-setting-of-capi-fields-unsupported-by-mapi' denied request: spec.readinessGates is a forbidden field"}]},"code":422}
The machines "miyadav-0912-85v48-worker-us-east-2b-wfrg9" is invalid: : ValidatingAdmissionPolicy 'openshift-cluster-api-prevent-setting-of-capi-fields-unsupported-by-mapi' with binding 'openshift-cluster-api-prevent-setting-of-capi-fields-unsupported-by-mapi' denied request: spec.readinessGates is a forbidden field

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@miyadav Thank you for testing this and providing the actual logs! You're absolutely right - VAP returns HTTP 422 (Unprocessable Entity) with reason "Invalid", not HTTP 403 (Forbidden) as I incorrectly assumed.

Looking at your logs, the correct API error check should be apierrors.IsInvalid(err) instead of apierrors.IsForbidden(err).

Here's the corrected suggestion:

// The VAP is working correctly - verify the error is Invalid and contains the expected VAP policy name
errorMsg := fmt.Sprintf("%v", err)
Expect(apierrors.IsInvalid(err)).To(BeTrue(), "Expected invalid error from VAP, got: %s", errorMsg)
Expect(strings.Contains(errorMsg, "openshift-cluster-api-prevent-setting-of-capi-fields-unsupported-by-mapi")).To(BeTrue(),
	"Error message should contain the VAP policy name: %s", errorMsg)
Expect(strings.Contains(strings.ToLower(errorMsg), "denied")).To(BeTrue(), "Error message should contain 'denied': %s", errorMsg)

This approach is still more robust than the original string-only checks because:

  1. apierrors.IsInvalid(err) correctly identifies the HTTP 422 status code semantically
  2. The policy name check remains exact as needed
  3. The "denied" check becomes case-insensitive and flexible for variations like "denied request" vs "denied the request"

Thanks for catching my error about the HTTP status code! 🐰

ᕕ( ᐛ )ᕗ


func verifyCAPIMachineRemoved(cl client.Client, machineName string) {
By(fmt.Sprintf("Verifying the CAPI Machine %s is removed", machineName))
Eventually(komega.Get(&clusterv1.Machine{
Expand Down