ReconcilerAction should be an enum #317

ctron · 2020-09-02T15:43:53Z

IMHO ReconcilerAction should be an enum, rather than a struct. This would handling the outcome much simpler.

I would imagine something like:

pub enum ReconcilerAction {
  Done,
  Requeue,
  RequeueAfter(Duration),
}

Or maybe:

pub enum ReconcilerAction {
  Done,
  Requeue(Option<Duration>),
}

This would allow you to handle outcomes like this:

match do_reconcile() {
  None => {},
  Requeue => {},
  RequeueAfter(duration) => {},
}

The text was updated successfully, but these errors were encountered:

nightkr · 2020-09-02T16:26:13Z

From the reconciler's PoV all of those cases are already supported. (Done => requeue_after: None, Requeue => requeue_after: Some(Duration::zero()), RequeueAfter => Some(duration)). I'm not sure I'm comfortable introducing a distinction between "requeue soon" and "requeue now", especially considering that the old Soon(now) representation would still work too.

Do you have a specific use-case in mind that would require this?

ctron · 2020-09-02T17:01:58Z

From the reconciler's PoV all of those cases are already supported.

I didn't say that anything isn't support. I just looks awkward to me from a Rust perspective.

Take the Result for example, that is an enum as well and makes it easy to handle the outcome e.g. with a match.

Do you have a specific use-case in mind that would require this?

I think the applier function could benefit from it. Making it easier to understand how things get processed.

The same for the actual implementation returning a result:

fn reconcile() -> Result<ReconcilerAction> {
  Ok(ReconcileResult{requeue_after: None})
}
// or
fn reconcile() -> Result<ReconcilerAction> {
  Ok(ReconcileResult::default())
}

That looks simple awkward to me, versus:

fn reconcile() -> Result<ReconcilerAction> {
  Ok(Done)
}

So again, it is not about "making it work", it is about making a nice API that aligns with Rust concepts.

nightkr · 2020-09-07T13:31:03Z

Take the Result for example, that is an enum as well and makes it easy to handle the outcome e.g. with a match.

Yeah.. but as a user you should rarely be matching on the ReconcileResult anyway.

The same for the actual implementation returning a result:
That looks simple awkward to me, versus:

Disabling periodic reconciliation should (IMO) feel somewhat awkward, since it's rarely something that you want to do. External systems change, and it's easy to forget to watch some subresource. Periodic reconciliation helps mitigate both of those cases (but is of course strictly worse than a proper watch).

There are only a few cases where you are truly done with a resource, and legitimately don't care anymore (for example: a Job that has completed).

nightkr · 2020-09-07T13:34:07Z

I could agree that it would be slightly more clear to have something like:

struct ReconcilerAction {
    requeue: ReconcileDelay,
}
enum ReconcileDelay {
    Never,
    After(Duration),
}

But that seems like a very marginal gain to me. I'm not so sure about making ReconcilerAction itself into an enum, since any future actions that we'd want to enable would likely not be mutually exclusive with periodic reconciliation.

Pscheidl · 2020-11-10T18:26:25Z

I got interested in this issue :) Well, I created a small showcase pullrequest to better imagine how the change would look like. #338

I don't have all the @teozkr has,but with my limited knowledge from user's perspective, the enum is also what I would expect and the struct was a little bit unexpected. Secondly, you mention there will be a lot more actions in the future, mutually exclusive with re-scheduling the reconciliation after some delay. That seems fine to me, as there will one be one enum level with rescheduling. The ReconcilerAction enum could in future look like:

pub enum ReconcilerAction {
    Requeue(Duration),
    None,
    ActionX,
    ActionY,
    ActionZ,
    SomeOtherDelayedTaskThatOccured(Duration),
    ///etc
}

Also, the original proposal had Requeue and RequeueAfter levels, I think this could be solved with just one RequeueAfter(Duration) - just like the struct does it.

Just my two cents.

nightkr · 2020-11-11T12:29:34Z

Secondly, you mention there will be a lot more actions in the future, mutually exclusive with re-scheduling the reconciliation after some delay.

I said they will likely not be mutually exclusive with rescheduling.

Apart from that, I don't want to promote "never requeue" as the common or easy option.. the existing ReconcilerAction API is already way to close to that..

clux · 2020-11-11T12:42:25Z

Thanks a lot @Pscheidl . I agree with the enum being the more expected solution compared to master, and your PR illustrates that.

This does intersect with another case that was in development; if we wanted to add a case like ReconcilerAction::AutomaticRequeue, and use backoff crate internally to retry at increasing intervals, then this pr would clash with @teozkr 's work in #314. I had a revisit of the state of that PR now, which has legs, but it comes with some complexity.

I feel that either we expose backoff as the main setup to deal with rescheduling (#314 - which uses backoff::constant as a new way of doing RequeueAfter), or we don't deal with backoffs at all internally (and perhaps add more modes).

So the question I think is really, what are our use cases for other ReconcilerActions? Some thoughts:

None - should really not be the default, and should come with a warning (as teo says ^)
RequeAfter(static_duration) - 300s default should probably be the main default, as per controller-runtime
Manual - and instead we have a user method to inject a requeue, perhaps in response to an external web event?
ExponentialBackoff(MinimumTime) - using ideas from or all of Add Backoff-based error policy #314

Which of these are the most useful?

Pscheidl · 2020-11-11T19:16:15Z

The PR of mine can be discarded, it was meant just to try it out (talk is cheap, I wanted the code). My apologies to @teozkr for misreading the info above.

How will the exponential backoff work ? Ideally, as far as I understand, the controller should be failure-tolerant. E.g. finalizers stay in the resource metadata unless explicitely removed by the controller, making sure if the controller is down, nothing happens. There seems to be a hashmap in this PR: https://github.com/clux/kube-rs/pull/314/files#diff-e8024b5d2267b4aa47ca3f4fcc5d2338edbee63a282057b55fe3b13ebc28cabaR137

If the controller fails, the time of last reconciliation is lost. That's acceptable probably. Not sure. Just thinking out loud.

Another thought is the RequeueAfter - if we were somehow able to persist the times in the Kubernetes object itself, it would be easy provide users with the option to specify their own function, which would provide time of next reconciliation. This seems like a generic way to solve many problems - instead of having two levels of that enum just for requeing after certain amount of time, there could be one accepting a RequeueStrategy enum, which would provide some default options like exponential or static, and there would always be the Custom option to provide with your own fn.

Anyway, I do think exponential backoff is useful, as working deployments need to be checked periodically, often they can fail (in ours cases, users might cause them to fail) and resources have to be cleaned in case of failure. Such a period can be prolonged after a while, not to stress Kubernetes that much. At least that's how I think about it - I always try for the code to be nice to its surroundings.

clux · 2022-03-17T09:28:54Z

Revisiting this after writing docs. It's clear that this interface is still one of the awkward ones, and it's front and centre, so would like to address this.

In the lifespan of this issue we have NOT had any other reasonable requests for other modes of reconcile actions to be taken. Controller-runtime has exactly mode (requeAfter) to configure - albeit with an awkward bool to set it. The only potential option that makes sense to me would be a backoff based one, but I think we already support this by adding a Backoff struct to the Context and bumping it from error_policy (and resetting it from reconcile).

Ultimately, the two options we have here are either:

requeue with a duration
wait for a (occasionally miss-able due to desyncs) change

So I propose that rather than have the escape hook be a magic requeueAfter: None have a documented enum variant with a longer name, that lets the owner bear the responsibility for retaining eventual consistency. E.g. like having a hook for reconcile_all_on or having a fast moving object.

I.e.:

pub enum Action {
    /// When to next trigger the reconciliation if no external watch triggers hit
    ///
    /// This is the best-practice action that ensures eventual consistency of your controller
    /// even in the case of missed changes (which can happen).
    ///
    /// Watch events are not normally missed, so running this once per hour as a fallback is reasonable.
    Requeue(Duration),

    /// Do nothing until a change is detected
    ///
    /// This stops the controller periodically reconciling this object until a relevant watch event
    /// was **detected**.
    ///
    /// **Warning**: If you have watch desyncs, it is possible to miss changes entirely.
    /// It is therefore not recommended to disable requeuing this way, unless you have
    /// frequent changes to the underlying object, or some other hook to retain eventual consistency.
    AwaitChange,
}

as a benefit we get the much snappier reconcile impls:

-    Ok(ReconcilerAction {
-        requeue_after: Some(Duration::from_secs(300)),
-    })
+    Ok(Action::Requeue(Duration::from_secs(300)))

nightkr · 2022-03-17T09:43:22Z

How about adding associated functions, so it'd be Ok(Action::requeue(Duration::from_secs(300))? That way we would get the added convenience for the simple cases, while still not locking us into having a single action forever.

clux · 2022-03-17T09:48:04Z

Sure, that makes sense to me. Also makes the change in applier easier.

fixes #317 Signed-off-by: clux <[email protected]>

* Change ReconcileAction to Action and add associated ctors fixes #317 Signed-off-by: clux <[email protected]> * clippy Signed-off-by: clux <[email protected]> * clippy + docs Signed-off-by: clux <[email protected]> * fix extraneous doc warnings in entry Signed-off-by: clux <[email protected]> * remove Default for Action Signed-off-by: clux <[email protected]> * fix secret-syncer Signed-off-by: clux <[email protected]> * fix remaining references to reconciler action Signed-off-by: clux <[email protected]>

Pscheidl added a commit to Pscheidl/kube-rs that referenced this issue Nov 10, 2020

kube-rs#317 - ReconcilerAction should be an enum

4d7b316

Pscheidl mentioned this issue Nov 10, 2020

#317 - ReconcilerAction should be an enum #338

Closed

clux added a commit that referenced this issue Mar 17, 2022

Change ReconcileAction to Action and add associated ctors

90dbd17

fixes #317 Signed-off-by: clux <[email protected]>

clux mentioned this issue Mar 17, 2022

Change ReconcileAction to Action and add associated ctors #851

Merged

clux closed this as completed in #851 Mar 17, 2022

clux added runtime controller runtime related ergonomics ergonomics of the public interface labels Mar 17, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ReconcilerAction should be an enum #317

ReconcilerAction should be an enum #317

ctron commented Sep 2, 2020

nightkr commented Sep 2, 2020

ctron commented Sep 2, 2020

nightkr commented Sep 7, 2020

nightkr commented Sep 7, 2020

Pscheidl commented Nov 10, 2020 •

edited

Loading

nightkr commented Nov 11, 2020

clux commented Nov 11, 2020 •

edited

Loading

Pscheidl commented Nov 11, 2020

clux commented Mar 17, 2022

nightkr commented Mar 17, 2022

clux commented Mar 17, 2022

ReconcilerAction should be an enum #317

ReconcilerAction should be an enum #317

Comments

ctron commented Sep 2, 2020

nightkr commented Sep 2, 2020

ctron commented Sep 2, 2020

nightkr commented Sep 7, 2020

nightkr commented Sep 7, 2020

Pscheidl commented Nov 10, 2020 • edited Loading

nightkr commented Nov 11, 2020

clux commented Nov 11, 2020 • edited Loading

Pscheidl commented Nov 11, 2020

clux commented Mar 17, 2022

nightkr commented Mar 17, 2022

clux commented Mar 17, 2022

Pscheidl commented Nov 10, 2020 •

edited

Loading

clux commented Nov 11, 2020 •

edited

Loading