-
-
Notifications
You must be signed in to change notification settings - Fork 319
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
adds watch-list implementation without breaking changes #1255
Conversation
Hey, thanks a lot for this, and while this is going further than the original help-wanted issue (as the runtime is a little less clear), I appreciate the taking the initiative in an open-minded way with several options. I'll leave some comments on the PR, but some high level comments first: Personally, I am leaning towards option 3 also for the
Note also that in general the aim is not to abstract away the k8s api. Particularly, kube-core and kube-client should just do the k8s api (where copy-pasted doc-strings are often a thing for the same parameters). So i don't think 3 and 4 are necessarily in conflict. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
quick points
e2e/Cargo.toml
Outdated
latest = ["k8s-openapi/v1_26"] | ||
latest = ["k8s-openapi/v1_27"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a just bump-k8s
target in the justfile which will catch everything (and bump the MK8SV in the process) if you manually updated it might be worth checking that to ensure you catch the stragglers (such as .github/workflow pins and README pins)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI You can do "k8s-openapi/latest"
now.
Thanks for the feedback, I'll apply your suggestions manually |
I was thinking that because it's an API change, we could create a watcher2 or streaming_watcher, that has the event contract from option 3. That way we can mark the previous implementation as deprecated, and carry it forward a few versions before removing it. That would give people who depend on this library time to move to the new implementation instead of it being forced on them. |
Initially, I feel it's probably better to keep it as a single entrypoint. The breakage is mostly to a lower level interface (that is mostly relevant to custom store implementors - for whom handling the new event should not be too hard) and not the watcher itself, and it's nice to just be able to flick a switch in the config that's available immediately rather than find a new import. There is potentially another change in the watcher that might make its way in (arc wrapping objects - a remainder from #1080), maybe if both of those need to land we can copy out an old version and do a full deprecation cycle. I'll see where we are there, but in the mean time, don't worry about it in this PR, that's more a release problem. |
I think option 3 makes sense as a longer-term goal, but I'd rather get this PR in using option 1 and then focus on moving the buffering into |
b3fa2d9
to
fd8d6e9
Compare
I've addressed the comments |
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #1255 +/- ##
==========================================
- Coverage 72.66% 72.15% -0.51%
==========================================
Files 75 75
Lines 6186 6249 +63
==========================================
+ Hits 4495 4509 +14
- Misses 1691 1740 +49
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General setup looks good! Some issues with resource versions, missing version gates, and partially bumped k8s versions. It's also currently not working on the 1.27 cluster i tested this on:
2023-07-22T21:10:53.985375Z DEBUG kube_runtime::utils::stream_backoff: Error received, backing off deadline=Instant { tv_sec: 37803, tv_nsec: 770027883 } duration=85.857327ms
Error: watch stream failed: ApiError: ListOptions.meta.k8s.io "" is invalid: resourceVersionMatch: Forbidden: resourceVersionMatch is forbidden for watch unless sendInitialEvents is provided: Invalid (ErrorResponse { status: "Failure", message: "ListOptions.meta.k8s.io \"\" is invalid: resourceVersionMatch: Forbidden: resourceVersionMatch is forbidden for watch unless sendInitialEvents is provided", reason: "Invalid", code: 422 })
Caused by:
0: ApiError: ListOptions.meta.k8s.io "" is invalid: resourceVersionMatch: Forbidden: resourceVersionMatch is forbidden for watch unless sendInitialEvents is provided: Invalid (ErrorResponse { status: "Failure", message: "ListOptions.meta.k8s.io \"\" is invalid: resourceVersionMatch: Forbidden: resourceVersionMatch is forbidden for watch unless sendInitialEvents is provided", reason: "Invalid", code: 422 })
1: ListOptions.meta.k8s.io "" is invalid: resourceVersionMatch: Forbidden: resourceVersionMatch is forbidden for watch unless sendInitialEvents is provided: Invalid
when using
diff --git a/examples/pod_watcher.rs b/examples/pod_watcher.rs
index e4e79fc4..a485892f 100644
--- a/examples/pod_watcher.rs
+++ b/examples/pod_watcher.rs
@@ -13,7 +13,7 @@ async fn main() -> anyhow::Result<()> {
let client = Client::try_default().await?;
let api = Api::<Pod>::default_namespaced(client);
- watcher(api, watcher::Config::default())
+ watcher(api, watcher::Config::default().streaming_lists().any_semantic())
.applied_objects()
.default_backoff()
.try_for_each(|p| async move {
cb41dc3
to
e835c70
Compare
I'll add a namespace example reflector that shows the watch list behavior. Do I feature gate all the doc comments I made? |
I've added an example now and updated the k3d configuration that enables the feature gate |
e61da70
to
145a1cd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does actually work for me now! Thanks for the k3d arg.
Left a few more suggestions on serialization/builders/configuration docs. This is veering a bit more into the whole philosophical discussion of "making invalid states unrepresentable" (which is something we want to do for the abstractions, but we not necessarily something we sensible can completely disallow at the serialization layer because we don't necessarily know what comes next - or if Kubernetes' docs are correct/will stay correct).
Have in general suggested moving our own safety checks and preconfigured "good defaults" into validators + and added some suggestions for watcher::Config
to make the correct choice more clear, and less implicit.
(There may be some follow-up issues that are worth raising here on our end as well; maybe we should not allow disabling bookmarks in watcher, and maybe we shouldn't be hard-enforcing a timeout of 295s in validate.)
@@ -347,6 +384,7 @@ impl Default for WatchParams { | |||
label_selector: None, | |||
field_selector: None, | |||
timeout: None, | |||
send_initial_events: false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a sensible default for now. I think we should maybe add a new constructor that sets correct flags rather than mess with the builders/serialization. Something like:
impl WatchParams
/// Constructor for doing Kubernetes 1.27 Streaming List watches
///
/// Enables [`VersionMatch::NotGreaterThan`] semantics and [`WatchParams::send_initial_events`].
pub fn streaming_lists() -> Self {
Self {
send_initial_events: true,
version_match: Some(VersionMatch::NotGreaterThan), // required
bookmarks: true, // required
..WatchParams::default()
}
}
@casualjim are you still working on this? I see you've gone down the rabbit hole of fiddling with resolver things and making CI throw a fit. There are a few outstanding pr comments to be resolved, so please let me know your intentions. Your PR is ultimately very good and it's heavily appreciated (even though we are stuck here on minutia and CI stuff atm). There's nothing really blocking from the |
Cool. I'll release it this weekend then. |
I would like to get past this hurdle of CI yes. Unfortunately I have to do planning and roadmap work for my project (this library is part of said project) and will need to spend a lot of time in google docs instead of in code this weekend. The failures are about building k8s-openapi so I'm going to wait for a release on that. I think the tests etc passed when I ran them locally so hopefully this will build here too. If not I can take a look again next weekend perhaps. |
8b80567
to
651b526
Compare
I don't know what to do with CI. So it's difficult to get this to work for me atm. This could use some help from somebody who knows the entire toolchain |
I think a lot of the CI failures atm is due to resolver (because it's getting the multiple |
I undid the resolver = 2 thing, hopefully that makes it go further. When I run locally most things seem to succeed but there is one thing that can not find kubectl and it will just hang on that |
0d6d310
to
da4885d
Compare
Signed-off-by: Ivan Porto Carrero <[email protected]>
da4885d
to
a567547
Compare
Had another look over of this and did another test, and this seems fine to me. Resolved a few comments as per strategy in https://github.com/kube-rs/kube/pull/1281/files (which is nice to be able to justify as it does make invalid states unrepresentable). One doc nit remains (config member doc suggestion), and the other two threads are just notes for the future. I am otherwise happy with this. Easy to opt-in, hard to do the wrong thing, works well, CI now green. Clean addition. Thanks a lot for doing this! |
I updated the doc and added that extra constructor method |
Thanks, looks good! But needs the DCO signoff again now. |
Signed-off-by: Ivan Porto Carrero <[email protected]>
1a15dcf
to
85eda87
Compare
Motivation
Implement support for watch-list.
We have 18000 namespaces in our cluster, we store custom resources where some collections have large schemas and the cardinality is also in the 100k's. For example, when a client doesn't use pagination our API server memory usage can balloon up to 2-300 GB under certain conditions
Solution
The most important goal is to reduce memory pressure on the API server. It would be nice to be able to reduce it on the client side too, in that case, we also care about knowing when the initial sync and or resync is in progress or at the very least completed.
Since this is for a new k8s version, I think it might be ok to change the watch event contract slightly. I've come up with a couple of approaches
Option 1: do nothing, still accumulate everything in a vec and emit
Event::Restarted
as the first event, when this streaming model is enabled. The name of the event is a bit weird, and it could maybe be renamed to express intent better although that would introduce breaking changesOption 2: Modify the applied event to have an option or an enum value to indicate if we're still in the initial listing state
Option 3: Introduce 2 new events, one that indicates a relist is starting and one that indicates the relist is completed, this can then remove the need for the Restarted event because these 2 new events give enough signal.
Option 4: stop being helpful and just expose the same events as the plain k8s API
Personally I'm in favor of option 3. Option 3 will allow for keeping client-side memory usage low, and it would allow for flagging a store as in or out of sync.
This PR currently holds option 1
This is a draft, so it's not signed. I'll work through what is required to submit PR's from my work email for the final PR. I just wanted to gather feedback.
closes: #1254