OCPBUGS-14346: Fix when DNS operator reports Degraded #373
base: master
@candita: This pull request references Jira Issue OCPBUGS-14346, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
e077885 to e6cc811
/jira refresh
@candita: This pull request references Jira Issue OCPBUGS-14346, which is invalid:
Comment In response to this:
/jira refresh
@candita: This pull request references Jira Issue OCPBUGS-14346, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
ad5a5f1 to bfd73b8
@candita: This pull request references Jira Issue OCPBUGS-14346, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
/assign @gcs278
bfd73b8 to 950992b
I'm following discussion in #wg-operator-degraded-condition, but wanted to add this one comment initially.
Still trying to wrap my head around the issue
/remove-lifecycle rotten
/jira refresh
@candita: This pull request references Jira Issue OCPBUGS-14346, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/tide refresh
/test e2e-aws-ovn-techpreview
@candita I did pre-merge testing, and here are the results.
I saw different behavior between the latest 4.16 nightly and a cluster built using this PR.
Then I removed the DNS pod.
So, in the cluster using this PR, the DNS pod is not going through the Degraded state, and the wrong message is reported even though we have a DNS pod available.
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle stale
Stale issues rot after 30d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle rotten
/lifecycle frozen
Rotten issues close after 30d of inactivity. Reopen the issue by commenting /close
@openshift-bot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
@candita: This pull request references Jira Issue OCPBUGS-14346. The bug has been updated to no longer refer to the pull request using the external bug tracker. All external bug links have been closed. The bug has been moved to the NEW state. In response to this:
/reopen
@candita: Reopened this PR. In response to this:
@candita: The In response to this:
@candita: This pull request references Jira Issue OCPBUGS-14346, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
/jira refresh
@candita: This pull request references Jira Issue OCPBUGS-14346, which is invalid:
Comment In response to this:
/remove-lifecycle rotten
@candita: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Fix when the DNS operator reports Degraded. Incorporate expected conditions and a grace period so that the operator may be faulty for the tolerated duration (40s) before transitioning to Degraded. Don't allow the cluster operator status to be Progressing while Degraded.
Add packages like those used in the ingress controller to compare expected conditions and to use retryable errors.
Use the same heuristics on node resolver pod count as dns pod count.
Add a unit test for computing the Degraded condition. Fix unit tests that expect Degraded to be true while Progressing is true, and make sure that some tests account for elapsed time by varying the previous conditions.
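To make the grace-period idea above concrete, here is a minimal Go sketch. It is not the code from this PR: `observedCondition`, `computeDegraded`, and `computeProgressing` are hypothetical names invented for illustration, and only the 40s tolerated duration and the "not Progressing while Degraded" rule come from the description. The sketch assumes each condition records when it last changed, so that a faulty condition only counts toward Degraded once it has been faulty for the whole grace period.

```go
package main

import (
	"fmt"
	"time"
)

// gracePeriod is the tolerated duration named in the description (40s).
const gracePeriod = 40 * time.Second

// observedCondition is a hypothetical, simplified stand-in for a status
// condition; it is not the operator's real type.
type observedCondition struct {
	Type           string
	Healthy        bool      // true when the condition has its expected value
	LastTransition time.Time // when the condition last changed state
}

// computeDegraded reports Degraded only when some condition has been
// unhealthy for at least gracePeriod. When nothing has exceeded the grace
// period yet, it also returns how long to wait before the earliest
// unhealthy condition would cross it (0 if every condition is healthy).
func computeDegraded(conds []observedCondition, now time.Time) (bool, time.Duration) {
	var retryAfter time.Duration
	for _, c := range conds {
		if c.Healthy {
			continue
		}
		elapsed := now.Sub(c.LastTransition)
		if elapsed >= gracePeriod {
			return true, 0
		}
		// Still inside the grace period: remember the shortest remaining wait.
		if remaining := gracePeriod - elapsed; retryAfter == 0 || remaining < retryAfter {
			retryAfter = remaining
		}
	}
	return false, retryAfter
}

// computeProgressing suppresses Progressing whenever Degraded is true.
func computeProgressing(rollingOut, degraded bool) bool {
	return rollingOut && !degraded
}

func main() {
	now := time.Now()
	conds := []observedCondition{
		{Type: "Available", Healthy: false, LastTransition: now.Add(-10 * time.Second)},
	}
	degraded, retry := computeDegraded(conds, now)
	fmt.Printf("degraded=%v retryAfter=%v progressing=%v\n",
		degraded, retry, computeProgressing(true, degraded))
}
```

Passing `now` in as a parameter, rather than calling `time.Now()` inside the function, is what lets a unit test vary the previous conditions and the elapsed time deterministically.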