increase machine health check node unready timeout to 15m (Azure#3133)
* increase machine health check node unready timeout to 15m

* update mhc docs

* increase machine health check node startup timeout to 25m
s-amann authored and SrinivasAtmakuri committed Sep 18, 2023
1 parent a71394c commit 1479169
Showing 2 changed files with 5 additions and 5 deletions.
4 changes: 2 additions & 2 deletions pkg/operator/controllers/machinehealthcheck/doc.go
@@ -19,11 +19,11 @@ aro.machinehealthcheck.managed
 - When set to false, the controller will attempt to remove the aro-machinehealthcheck CR and the MHC Remediation alert from the cluster.
 This should effectively disable the MHC we deploy and prevent the automatic reconciliation of nodes.
 - When set to true, the controller will deploy/overwrite the aro-machinehealthcheck CR and the MHC Remediation alert to the cluster.
-This enables the cluster to self heal when at most 1 worker node goes not ready for at least 5 minutes and alert when remediation
+This enables the cluster to self heal when at most 1 worker node goes not ready for at least 15 minutes and alert when remediation
 occurs 2 or more times within an hour.
 The aro-machinehealth check is configured in a way that if 2 worker nodes go not ready it will not take any action.
 More information about how the MHC works can be found here:
-https://docs.openshift.com/container-platform/4.9/machine_management/deploying-machine-health-checks.html
+https://docs.openshift.com/container-platform/4.12/machine_management/deploying-machine-health-checks.html
 */
@@ -15,10 +15,10 @@ spec:
 operator: Exists
 unhealthyConditions:
 - type: "Ready"
-timeout: "300s"
+timeout: "15m"
 status: "False"
 - type: "Ready"
-timeout: "300s"
+timeout: "15m"
 status: "Unknown"
 maxUnhealthy: "1"
-nodeStartupTimeout: "20m"
+nodeStartupTimeout: "25m"
