-
Notifications
You must be signed in to change notification settings - Fork 67
🐛 (fix): unhandle changes for crd upgrade safety ( OCPBUGS-59518 ) #2179
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
🐛 (fix): unhandle changes for crd upgrade safety ( OCPBUGS-59518 ) #2179
Conversation
✅ Deploy Preview for olmv1 ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
It seems to be the real fix for OCPBUGS-59518. Moreover, IMHO we should perform additional checks and carefully evaluate the errors in those unhandled scenarios. It might be worth verifying if a newer version of the library addresses this, or if there are other improvements we could apply. In short, we should confirm whether there’s truly nothing else we can do here to make the handling more robust. |
7cd2923
to
3e12e5f
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2179 +/- ##
==========================================
+ Coverage 72.05% 72.31% +0.26%
==========================================
Files 85 85
Lines 8405 8506 +101
==========================================
+ Hits 6056 6151 +95
Misses 1948 1948
- Partials 401 407 +6
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
3e12e5f
to
3932b0e
Compare
if result.Name == "unhandled" { | ||
msg = "unhandled changes found" | ||
} | ||
errs = append(errs, fmt.Errorf("%s: %s: %s: %s", version, property, result.Name, msg)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it feels like we might be occluding the error. Is it possible to format it it in a better way? Then maybe at the end we limit the number of errors we print out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if I could understand your suggestion here.
Could you please clarify?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main point is that, the way it is, it's kinda just an HTTP 500 without any explanation. As a user, I don't know what to do with that. Maybe this is an issue with the underlying library we are using and the fix needs to go there. But as a user, I would like to know: what, where, why, and how to fix it.
The second part of the comment can be ignored as a brain fart. I was in the context of the "we're generating too many errors" problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed, it looks like an unexpected error. Maybe we could explore whether the library can handle fewer scenarios as unhandled cases. That said, you raise a great point — I was able to extract some information from the JSON, and I think this is the best we can do for now. What do you think?
when the error occurs, we can’t update the CR with the status. This leaves the user without any visibility into why the upgrade didn’t happen. Here’s the bug for reference: https://issues.redhat.com/browse/OCPBUGS-59518 so this PR should fix it.
- Keep unhandled spec changes as errors; message: "unhandled changes found" Assisted-by: Cursor
3932b0e
to
44cef57
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@camilamacedo86 is it possible to get a
Problem : __________________________ (eg before this PR, crd upgrade safety feature did not do _____ right)
Solution: _________________________
statement in the description?
Helps with reviewing the PR.
@anik120 can you please check the description? |
Problem
We use the crddiff (https://github.com/kubernetes-sigs/crdify) library to check API differences and ensure CRD upgrade safety. See: CRD upgrade safety docs
When crddiff cannot handle the diff, it reports an unhandled scenario.
In this case, a full JSON diff is returned which causes the problem
Too long: may not be more than 32768 bytes</code>
-As a result:
More info: OCPBUGS-59518
Solution
Instead of outputting the full JSON, we:
This allows:
Example (from unit tests):
This way:
Note
As discussed, any further improvement beyond this would need to be done directly in the crddiff library, to reduce the number of unhandled scenarios.
Reviewer Checklist