-
Notifications
You must be signed in to change notification settings - Fork 15
fix: feature interval #1238
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
fix: feature interval #1238
Conversation
Co-authored-by: Simon Hornby <[email protected]>
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Scanned FilesNone |
| pub async fn start_refresh_features_background_task( | ||
| refresher: Arc<FeatureRefresher>, | ||
| refresh_state_rx: Receiver<RefreshState>, | ||
| mut rx: Receiver<RefreshState>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very small change but you can just declare your param as mut, it's the same as the mutable shadow that was previously being used here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we still call it refresh_state_rx for readability? To know at a glance what it's receiving.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, good suggestion!
| let failure_count: u32 = min(self.failure_count + 1, 10); | ||
| let now = Utc::now(); | ||
| let next_refresh = calculate_next_refresh(now, *refresh_interval, failure_count as u64); | ||
| let base = self.next_refresh.unwrap_or(now); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would previously update the next check time to be now + poll interval but this is called after we've polled so this was causing the poll interval to drift by the latency to upstream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think our thought earlier was that this was fine, because 100/200ms delay for a refresh was no biggie.
| } | ||
|
|
||
| impl TokenRefreshStatus for TokenRefreshSet { | ||
| fn next_due_refresh(&self) -> Option<Instant> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Find time that the next token needs to be refreshed. If there's a token that's never been refreshed, it should be refreshed now, although I'm pretty sure that can't happen right now
|
|
||
| let due_now = refresher.tokens_to_refresh.get_tokens_due_for_refresh(); | ||
| if !due_now.is_empty() { | ||
| for token in due_now { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still icky. I'd honestly like this to move this to a Future that's composed of a never-ending stream of token batches to refresh but that breaks a whole bunch of the backoff code (which incidentally would probably be easier to handle as a a Future that's composed of a never-ending stream of token batches to refresh). That's too big to do in a bugfix PR imo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I think this is a part of rust/code maturity, what you're describing sounds somewhat similar to what we would've done if we were to write it now.. It feels like a signal/effect system wanting to break out... :P
| // but our modelling is a little wonky here | ||
| .unwrap_or_else(|| Instant::now() + Duration::from_secs(5)); | ||
|
|
||
| tokio::select! { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now we're in a nice place. We either have a sleep until next duration or we're in a sleeping state pending being woken up. If either of these futures completes (we're woken or the timeout expires), we can immediately wake up and do our thang
| mut rx: Receiver<RefreshState>, | ||
| ) { | ||
| let mut rx = refresh_state_rx; | ||
| let mut alive = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably fine, but we're assuming, right? However if we wanted to be super sure, couldn't we do something like let mut alive = *rx.borrow() == RefreshState::Running;?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disregard my suggestion, I see now that this has a different meaning. The refresher will always start alive.
| if rx.changed().await.is_err() { | ||
| alive = false; | ||
| } | ||
| continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're alive and paused, won't this loop right away, indefinitely? Probably okay, but just asking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should block on this rx.changed().await until we get a change notification
nunogois
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| let now = chrono::Utc::now(); | ||
| self.iter() | ||
| .filter_map(|e| e.value().next_refresh) | ||
| .map(|t| (t - now).to_std().unwrap_or(Duration::from_secs(0))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the map to std here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, std implementes Ord
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, but so does Chrono's TimeDelta.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that it matters that much, just curious to learn why you chose to map it.
chriswk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have a 👍 from me as well
Fixes a few interesting and subtle bugs in the feature refresh code. Previously we were always adding 5 seconds onto the poll interval for fetching features, so a poll interval of 10 seconds was actually 15 seconds. We were also not ever accounting for latency to upstream so our polling interval was more like
poll_interval+latency+5 seconds