degraded TNF testing - pdb + MCO reboot validation #30510
Signed-off-by: nhamza <[email protected]>
/payload-job periodic-ci-openshift-release-master-nightly-4.21-e2e-metal-ovn-two-node-fencing-recovery-techpreview
@eggfoobar: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/3aa42b50-c656-11f0-9bed-855fb2bc4f66-0
eggfoobar
left a comment
Looking good so far. We will need to rebase once Jeremy's PR merges; then we can use some of those helpers here. Left some comments and questions.
ensureTNFDegradedOrSkip(oc)
})

g.It("PDB minAvailable=1 should allow a single eviction and block the second in TNF degraded mode [apigroup:policy]", func() {
Let's change this wording to:
should allow a single eviction and block the second in TNF degraded mode when PDB minAvailable=1 [apigroup:policy]
Also, is "TNF degraded mode" a specific status, or can we remove the acronym TNF from this test name?
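For context on what the test name describes, here is a minimal sketch of the budget arithmetic behind `minAvailable=1` with two replicas. It is not taken from this PR: `disruptionsAllowed` is a hypothetical helper that only mirrors the "healthy pods above the floor" rule for an absolute `minAvailable`, and it assumes the evicted pod has not yet been replaced by a healthy one when the second eviction is attempted.

```go
package main

import "fmt"

// disruptionsAllowed is a hypothetical helper (not part of the PR or of
// client-go). It mirrors the budget rule for a PDB with an absolute
// minAvailable: only healthy pods above the floor may be disrupted.
func disruptionsAllowed(currentHealthy, minAvailable int32) int32 {
	if allowed := currentHealthy - minAvailable; allowed > 0 {
		return allowed
	}
	return 0
}

func main() {
	const minAvailable = int32(1)
	healthy := int32(2) // two replicas, both healthy

	fmt.Printf("before first eviction: disruptionsAllowed=%d\n",
		disruptionsAllowed(healthy, minAvailable))

	// Assumption: the first eviction succeeds and no replacement pod
	// has become healthy yet.
	healthy--

	fmt.Printf("before second eviction: disruptionsAllowed=%d\n",
		disruptionsAllowed(healthy, minAvailable))
}
```

Under those assumptions the budget drops from 1 to 0 after the first eviction, which matches the `DisruptionsAllowed` expectation of `int32(0)` asserted in the test.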
o.Expect(err).NotTo(o.HaveOccurred())
o.Expect(currentPDB.Status.DisruptionsAllowed).To(o.Equal(int32(0)), "expected disruptionsAllowed=0 after second eviction attempt")
})
g.It("should block a reboot-required MachineConfig rollout on the remaining master in TNF degraded mode [Serial] [apigroup:machineconfiguration.openshift.io]", func() {
Same question regarding "TNF degraded mode". Also, should both of these be Serial tests?
deploy, err := createPauseDeployment(ctx, kubeClient, ns, pdbDeploymentName, 2, labels)
o.Expect(err).NotTo(o.HaveOccurred())

err = waitForDeploymentAvailable(ctx, kubeClient, ns, deploy.Name, 2, 3*time.Minute)
Let's use the util helper here: https://github.com/openshift/origin/blob/main/test/extended/util/deployment.go#L58
return client.PolicyV1().PodDisruptionBudgets(ns).Create(ctx, pdb, metav1.CreateOptions{})
}

func waitForDeploymentAvailable(
Same thing here: using https://github.com/openshift/origin/blob/main/test/extended/util/deployment.go#L58 should let us remove this function.
Namespace: ns,
},
Spec: policyv1.PodDisruptionBudgetSpec{
MinAvailable: intOrStringPtr(intstr.FromInt(minAvailable)),
Let's use the library "k8s.io/utils/ptr" that's already being pulled into origin, so this should just change to ptr.To(intstr.FromInt(minAvailable))
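To illustrate why a type-specific helper like `intOrStringPtr` becomes redundant, here is a stdlib-only sketch of a generic pointer helper with the same shape as `ptr.To`. The local `To` function and the `IntOrString` struct are stand-ins written for this example so that it compiles without Kubernetes dependencies; in the PR itself the real `ptr.To` and `intstr.IntOrString` would be used.

```go
package main

import "fmt"

// IntOrString is a simplified, hypothetical stand-in for
// intstr.IntOrString, used only to keep this sketch dependency-free.
type IntOrString struct {
	IntVal int32
}

// To is a local stand-in with the same shape as ptr.To from
// k8s.io/utils/ptr: it returns a pointer to a copy of any value.
func To[T any](v T) *T {
	return &v
}

func main() {
	// One generic helper covers every type, so per-type wrappers
	// such as intOrStringPtr are no longer needed.
	minAvailable := To(IntOrString{IntVal: 1})
	replicas := To(int32(2))

	fmt.Println(minAvailable.IntVal, *replicas)
}
```

Because the helper is generic, the same call works for `int32` replica counts, booleans, strings, and struct values alike.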
return client.CoreV1().Pods(pod.Namespace).EvictV1(ctx, eviction)
}

func intOrStringPtr(v intstr.IntOrString) *intstr.IntOrString {
This should then be removed once we use the utils package
Labels: labels,
},
Spec: corev1.PodSpec{
Containers: []corev1.Container{
Let's update this to use the shell image ref; this will avoid any issues pulling down the image in the future.
Try something like this, borrowed from cpu_partitioning:
import (
...
"github.com/openshift/origin/test/extended/util/image"
...
)
...
Containers: []corev1.Container{
{
Name: "busy-work",
Image: image.ShellImage(),
Command: []string{
"/bin/bash",
"-c",
`while true; do echo "Busy working, cycling through the ones and zeros"; sleep 5; done`,
},
This PR adds a new origin file dedicated to degraded TNF test cases.
It contains 2 new test cases:
1. PDB behavior
2. MCO rollouts
The test plan is under [Suite:openshift/two-node]. However, our current Prow CI lane for TNF degraded testing runs conformance e2e, so we need to create a new lane that targets only the two-node suite.
New test cases for degraded TNF can be added to this file.