Add backoff to watcher reconnection #577
Comments
Pasting comments here for people off discord:
Do we think the solution here is to try to bake an exponential backoff? (BTW, one way I have gotten around this in the […]

I think it would make sense to just have […]

If that's the case then I think we should consider creating builders for […]

Rewording a bit; the […] I do still think there's room for a […]

To be honest, I don't really see the problem with the import; Rust has been smart enough to do the right thing in all the cases I've tried.
Rustdoc needs explicit disambiguation when linking to either the module or the fn. Generally I just think we should organise it in the way that is less confusing:

-use runtime::{reflector, reflector::Store};
+use runtime::cache::{reflector, Store};

Relatively minor point, though.
That makes sense to me.
moves the imports of kube_runtime reflector around a bit first as per comments in #577, then implements a slightly more ergonomic builder struct `Cache<K>` that wraps `reflector`. This encapsulates the writer and result unpacking within the struct, and creates 3 ways to consume it:

- `Cache::run(self)` - if wanting to run forever
- `Cache::applies(self)` - also exposes flattened applies stream
- `Cache::touches(self)` - also exposes flattened touches stream

Signed-off-by: clux <[email protected]>
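For context, here is a minimal sketch of what consuming `reflector` looks like without such a wrapper, i.e. the writer-plus-result-unpacking boilerplate a `Cache<K>` builder would encapsulate. It is written against the kube 0.x-era API (`ListParams`, `try_flatten_applied` in `kube_runtime::utils`); treat the exact paths as assumptions, since they have moved in later kube-runtime releases.

```rust
use futures::TryStreamExt;
use k8s_openapi::api::core::v1::Pod;
use kube::{api::ListParams, Api, Client};
use kube_runtime::{reflector, reflector::store, utils::try_flatten_applied, watcher};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let client = Client::try_default().await?;
    let pods: Api<Pod> = Api::default_namespaced(client);

    // The writer half feeds the cache; the reader half (`Store`) is what the
    // rest of the application queries. This is the setup a `Cache<K>` builder
    // would hide.
    let writer = store::Writer::<Pod>::default();
    let reader = writer.as_reader();

    // reflector wraps the watch stream and keeps the store up to date as
    // events flow through it.
    let stream = reflector(writer, watcher(pods, ListParams::default()));

    // Flatten watcher events into applied objects and unpack the Results;
    // roughly what a `Cache::applies(self)` helper would expose directly.
    let applies = try_flatten_applied(stream);
    futures::pin_mut!(applies);
    while let Some(pod) = applies.try_next().await? {
        println!(
            "saw {} (cache size: {})",
            pod.metadata.name.unwrap_or_default(),
            reader.state().len()
        );
    }
    Ok(())
}
```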
Coming back to this after thinking more about it in #698. If we put a […]: if RBAC is wrong, then no amount of retrying should occur - we need to bubble this up from the […]. I can't think of any other non-ephemeral issues that can cause a watcher retry loop to happen, but if there are we should also catch them. So, what we should probably try to do: […]
The last point here is because most users think […]
I'm not so sure that we can make a hard distinction here. For example: […]

And so on, and so forth. Ultimately, we need to percolate the problem up to the administrator and let them handle it. And even if it is a human-caused misconfiguration error, it'd still be useful to retry, since we probably won't get notified once the issue is resolved.
True. These configuration errors can happen, and it's nice if we can just keep retrying through them. But it's also important for us to provide that first signal that something has gone wrong upstream (like in the load balancer policy example - if we silently retry for that, then we'll never get correct percolation). I checked […]

So these would be up to users to handle atm. Pre-[…] But this means that the RBAC retry we get from […] If we inject backoffs into […]
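As a point of reference for the user-side handling discussed above, here is a minimal sketch of retrying a failed watch with the `backoff` crate (the crate choice is an assumption; the thread doesn't prescribe one), against the kube 0.x-era `watcher(api, ListParams)` signature. A persistent problem like an RBAC denial then at least surfaces as a logged error instead of a hot retry loop.

```rust
use backoff::{backoff::Backoff, ExponentialBackoff};
use futures::TryStreamExt;
use k8s_openapi::api::core::v1::Pod;
use kube::{api::ListParams, Api, Client};
use kube_runtime::watcher;
use std::time::Duration;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let client = Client::try_default().await?;
    let pods: Api<Pod> = Api::default_namespaced(client);
    let mut backoff = ExponentialBackoff::default();

    loop {
        // Drive the watch stream until it yields an error.
        let watch = watcher(pods.clone(), ListParams::default()).try_for_each(|event| async move {
            println!("event: {:?}", event);
            Ok(())
        });

        match watch.await {
            // The watcher normally runs forever; a clean end means we're done.
            Ok(()) => break,
            Err(err) => {
                // Surface the error (e.g. an RBAC denial) and wait before
                // reconnecting instead of hammering the apiserver.
                // Fall back to a fixed 5 minute delay if the policy gives up.
                let delay = backoff.next_backoff().unwrap_or(Duration::from_secs(300));
                eprintln!("watch failed: {}; retrying in {:?}", err, delay);
                tokio::time::sleep(delay).await;
            }
        }
    }
    Ok(())
}
```

A real implementation would also reset the backoff once the watch has been healthy for a while; that reset behaviour is essentially what the built-in support added in the PR below ends up providing.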
* implement backoff for watcher - for #577 (Signed-off-by: clux <[email protected]>)
* move magic number into strategy (Signed-off-by: clux <[email protected]>)
* expose backoff from watcher and semi-propagate into controller - awkward, will write a comment (Signed-off-by: clux <[email protected]>)
* potential abstraction (Signed-off-by: clux <[email protected]>)
* another builder layer; allow eliding ListParams (Signed-off-by: clux <[email protected]>)
* forgot to add file (Signed-off-by: clux <[email protected]>)
* easy parts of code review (Signed-off-by: clux <[email protected]>)
* rewrite as a helper (take N) - jesus this stuff is hard (Signed-off-by: clux <[email protected]>)
* rename as suggested (Signed-off-by: clux <[email protected]>)
* Reimplement watcher backoff as FSM (#720)
* Fix clippy warnings (Signed-off-by: Teo Klestrup Röijezon <[email protected]>)
* Reimplement watch backoff as FSM (Signed-off-by: Teo Klestrup Röijezon <[email protected]>)
* Remove useless lifetime bounds (Signed-off-by: Teo Klestrup Röijezon <[email protected]>)
* Silence clippy size warning (Signed-off-by: Teo Klestrup Röijezon <[email protected]>)
* Silence clippy properly this time around (Signed-off-by: Teo Klestrup Röijezon <[email protected]>)
* Split StreamBackoff into a separate utils module (Signed-off-by: Teo Klestrup Röijezon <[email protected]>)
* Backoff tests (Signed-off-by: Teo Klestrup Röijezon <[email protected]>)
* Add stream close test (Signed-off-by: Teo Klestrup Röijezon <[email protected]>)
* remove backoff pin, fix docs (Signed-off-by: clux <[email protected]>)
* newline (Signed-off-by: clux <[email protected]>)
* Add `Backoff` wrapper that implements client-go's reset timer behaviour (#729) (Signed-off-by: Teo Klestrup Röijezon <[email protected]>)
* use new reset backoff and replicate client-go reflector values (Signed-off-by: clux <[email protected]>)
* fix node watcher example (Signed-off-by: clux <[email protected]>)
* Use released `backoff` (Signed-off-by: Teo Klestrup Röijezon <[email protected]>)
* Factor out default `Backoff` (Signed-off-by: Teo Klestrup Röijezon <[email protected]>)
* Add note to `watcher` about backoff (Signed-off-by: Teo Klestrup Röijezon <[email protected]>)
* Added backoff to Controller (Signed-off-by: Teo Klestrup Röijezon <[email protected]>)
* Changelog (Signed-off-by: Teo Klestrup Röijezon <[email protected]>)
* Revert `Observer` for now (Signed-off-by: Teo Klestrup Röijezon <[email protected]>)
* The clippyman comes for us all, eventually - and we must all pay our due respects, or pay the price (Signed-off-by: Teo Klestrup Röijezon <[email protected]>)
* Fix build warnings (Signed-off-by: Teo Klestrup Röijezon <[email protected]>)
* remove backoff_watch (Signed-off-by: clux <[email protected]>)
* doc tweaks (Signed-off-by: clux <[email protected]>)
* sentence (Signed-off-by: clux <[email protected]>)
* upgrading backoff is not actually breaking (Signed-off-by: clux <[email protected]>)

Co-authored-by: Teo Klestrup Röijezon <[email protected]>
Co-authored-by: Teo Klestrup Röijezon <[email protected]>
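With these changes in place, consuming the watcher through the new backoff support could look roughly like the sketch below. It assumes the `StreamBackoff` adapter lives in `kube_runtime::utils` as the commits above suggest; exact paths, bounds, and default policies may differ between kube-runtime versions, so treat this purely as an illustration.

```rust
use backoff::ExponentialBackoff;
use futures::StreamExt;
use k8s_openapi::api::core::v1::Pod;
use kube::{api::ListParams, Api, Client};
use kube_runtime::{utils::StreamBackoff, watcher};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let client = Client::try_default().await?;
    let pods: Api<Pod> = Api::default_namespaced(client);

    // Wrap the watch stream so that, after an error, the inner stream is not
    // polled again until the backoff delay has elapsed.
    let stream = StreamBackoff::new(watcher(pods, ListParams::default()), ExponentialBackoff::default());
    futures::pin_mut!(stream);

    while let Some(item) = stream.next().await {
        match item {
            Ok(event) => println!("event: {:?}", event),
            // Errors still surface to the consumer; only the reconnect
            // cadence is throttled.
            Err(err) => eprintln!("watch error: {}", err),
        }
    }
    Ok(())
}
```

The commits above also mention a reset-style `Backoff` wrapper replicating client-go's timer behaviour; with that in place the delay shrinks back down once the watch has been healthy for a while, rather than growing monotonically.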
Currently `watcher` will always keep a reconnection attempt in-flight when polled (and there is no active connection already). That's a bit excessive, and pretty unkind to the apiserver. This isn't too bad for `watcher` itself, since you can add a delay on the consumer side instead, but we should at least add something for `Controller`...