-
Notifications
You must be signed in to change notification settings - Fork 169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes for retry logic - Min Worker Node Err #3464
base: master
Are you sure you want to change the base?
Conversation
cnd, cndErr := c.f(ctx) | ||
if errors.Is(cndErr, wait.ErrWaitTimeout) { | ||
cnd, retry, cndErr := c.f(ctx) | ||
if errors.Is(cndErr, wait.ErrWaitTimeout) && !retry { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this extra retry
over the existing retry logic this has?
Also this might be helpful with the CI: https://redhat-internal.slack.com/archives/C02ULBRS68M/p1710774666598769 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it lacks the explanation how is the retry
handled.
At first I thought that it indicates whether the error is retryable or not, but the logic doesn't seem so.
If you want to tell the wait.Poll*
function if the condition is retryable, ConditionFunc
can return (false, nil) for retryable error and (false, err) for non-retryable error.
https://pkg.go.dev/k8s.io/apimachinery/pkg/util/wait#PollImmediateUntil
Please rebase pull request. |
What's the current state of this? Can I help? |
Which issue this PR addresses:
https://issues.redhat.com/browse/ARO-4247
Fixes
Report meaningful error for "minimum worker nodes not created" errors
What this PR does / why we need it:
As of now RP is providing only generic error if the cluster install got failed due to insufficient number of worker nodes. If meaningful error or specific error is provided then will be easy to correct it and proceed the cluster installation.
Test plan for issue:
Is there any documentation that needs to be updated for this PR?