Add Bifrost trim gap handling support by fast-forwarding to the latest partition snapshot #2456

pcholakov · 2024-12-23T14:05:54Z

Closes: #2247

Primary reviewer: @tillrohrmann (since you have worked on these parts most recently)
Optional: @AhmedSoliman (only if you have spare capacity to take a look)

github-actions · 2024-12-23T14:22:32Z

Test Results

7 files ±0 · 7 suites ±0 · duration 4m 20s (-12s)
47 tests ±0: 46 passed ±0, 1 skipped ±0, 0 failed ±0
182 runs ±0: 179 passed ±0, 3 skipped ±0, 0 failed ±0

Results for commit acd5c37. ± Comparison against base commit 25a89c0.

♻️ This comment has been updated with latest results.

This allows us to signal the PPM about log trim gaps that the PP may encounter at runtime, which require special handling.

pcholakov · 2024-12-23T14:11:31Z

crates/worker/src/partition/mod.rs

-                    unimplemented!("Handling trim gap is currently not supported")
-                };
-                anyhow::Ok((lsn, envelope?))
+                if entry.is_data_record() {


At the moment, LogEntry.record is not public, and neither are bifrost::{MaybeRecord, TrimGap} - would we prefer to make those public and use pattern-matching directly?

We can, but it doesn't sound like you need that. See my comments below.

pcholakov · 2024-12-23T14:18:18Z

crates/worker/src/partition_processor_manager/spawn_processor_task.rs

+    let snapshot = match snapshot_repository {
+        Some(repository) => {
+            debug!("Looking for partition snapshot from which to bootstrap partition store");
+            // todo(pavel): pass target LSN to repository


We can optimize this by not downloading a snapshot that's older than the target LSN; I'll tackle this as a separate follow-up PR.
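
A rough sketch of what that follow-up check could look like. `SnapshotMetadata`, `min_applied_lsn`, and `should_download` are hypothetical names used only for illustration, not the repository's actual API, and the `>=` comparison is an assumption about how the target LSN relates to the snapshot's applied LSN.

```rust
/// Hypothetical stand-in for the real LSN type.
type Lsn = u64;

/// Hypothetical metadata describing the latest snapshot in the repository.
struct SnapshotMetadata {
    /// LSN up to which the snapshot has applied the log (assumed field).
    min_applied_lsn: Lsn,
}

/// Decide whether a snapshot is worth downloading.
/// With no fast-forward target, any snapshot is acceptable.
/// With a target, the snapshot must have reached at least that LSN,
/// otherwise we would still be stuck behind the trim gap.
fn should_download(snapshot: &SnapshotMetadata, target_lsn: Option<Lsn>) -> bool {
    match target_lsn {
        Some(target) => snapshot.min_applied_lsn >= target,
        None => true,
    }
}
```

With this shape, the metadata can be inspected before any snapshot data is fetched, so an unusable snapshot costs one small read instead of a full download.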

pcholakov · 2024-12-23T14:54:33Z

crates/worker/src/partition_processor_manager/spawn_processor_task.rs

+                );
+            }
+
+            // We expect the processor startup attempt will fail, avoid spinning too fast.


This seems reasonable to me. I chose instead to delay and try to start again, just in case something has changed in the log, but at this point we're unlikely to get this processor going again by following the log. What's a good way to post a metric that we're spinning?

AhmedSoliman · 2024-12-23T17:01:00Z

crates/worker/src/partition/mod.rs

+                    Ok(stopped) => {
+                        match stopped {


Tip: You can remove one level of nesting:

Ok(ProcessorStopReason::LogTrimGap { to_lsn }) => .... Ok(_) => warn... Err(err) => warn...

Much better, thank you! <3
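
For reference, the flattened match from the tip can be sketched as below. Only `ProcessorStopReason::LogTrimGap { to_lsn }` comes from the comment above; the `Cancelled` variant, the `String` error type, and the `describe` helper are assumptions made to keep the example self-contained.

```rust
#[derive(Debug)]
enum ProcessorStopReason {
    /// Hypothetical variant for a regular shutdown.
    Cancelled,
    LogTrimGap { to_lsn: u64 },
}

/// Matches on the nested pattern directly instead of
/// `Ok(stopped) => match stopped { ... }`.
fn describe(result: Result<ProcessorStopReason, String>) -> String {
    match result {
        Ok(ProcessorStopReason::LogTrimGap { to_lsn }) => {
            format!("fast-forward to {to_lsn}")
        }
        Ok(other) => format!("stopped: {other:?}"),
        Err(err) => format!("failed: {err}"),
    }
}
```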

AhmedSoliman · 2024-12-23T17:06:04Z

crates/worker/src/partition/mod.rs

-                    unimplemented!("Handling trim gap is currently not supported")
-                };
-                anyhow::Ok((lsn, envelope?))
+                if entry.is_data_record() {


We can, but it doesn't sound like you need that. See my comments below.

AhmedSoliman · 2024-12-23T17:06:43Z

crates/worker/src/partition/mod.rs

+                    anyhow::Ok(Record::TrimGap(
+                        entry
+                            .trim_gap_to_sequence_number()
+                            .expect("trim gap has to-LSN"),
+                    ))


Would it make sense to stop the read stream at the first gap and return Err instead?

I understand this to mean the map_ok function translates a trim-gap into an Err(TrimGap {..}) instead? Maybe! The way we use anyhow::Result pervasively makes this a deeper change than I wanted to tackle right away; but it also makes more sense to treat trim gaps as just another record in the stream, with errors reserved for actual failure conditions.

Zooming out a bit, modeling the Partition Processor overall outcome as Result<Canceled | StoppedAtTrimGap, ProcessingError> seems accurate: the Ok / left path is an expected if rare reason to halt; the Err / right path is an exceptional failure condition.

If you have a few minutes, I'd love to hear more about how you'd solve this? I'm certain I am also missing some subtlety around properly consuming the log stream!

We discussed offline and agreed that it's best to represent this case as an error case.
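
A minimal sketch of that outcome, assuming simplified stand-in types rather than the real Bifrost ones: `LogEntry`, `ReadError`, and `read_until_gap` are hypothetical. Collecting an iterator of `Result`s stops at the first `Err`, which gives the "stop the read stream at the first gap" behavior.

```rust
type Lsn = u64;

/// Simplified stand-in for a log entry; not the real Bifrost type.
enum LogEntry {
    Data { lsn: Lsn, payload: String },
    TrimGap { to_lsn: Lsn },
}

/// Hypothetical error carrying the LSN the gap extends to.
#[derive(Debug, PartialEq)]
enum ReadError {
    TrimGap { to_lsn: Lsn },
}

/// Reads entries until the first trim gap, which is surfaced as an error.
/// `collect` into a `Result` short-circuits on the first `Err`.
fn read_until_gap(
    entries: impl IntoIterator<Item = LogEntry>,
) -> Result<Vec<(Lsn, String)>, ReadError> {
    entries
        .into_iter()
        .map(|entry| match entry {
            LogEntry::Data { lsn, payload } => Ok((lsn, payload)),
            LogEntry::TrimGap { to_lsn } => Err(ReadError::TrimGap { to_lsn }),
        })
        .collect()
}
```

The caller can then match on `ReadError::TrimGap { to_lsn }` to decide whether to fast-forward, while other failures keep their own error variants.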

AhmedSoliman · 2024-12-23T17:09:02Z

crates/core/src/task_center.rs

+    pub fn start_runtime<F, R>(
        self: &Arc<Self>,
        root_task_kind: TaskKind,
        runtime_name: &'static str,
        partition_id: Option<PartitionId>,
        root_future: impl FnOnce() -> F + Send + 'static,
-    ) -> Result<RuntimeTaskHandle<anyhow::Result<()>>, RuntimeError>
+    ) -> Result<RuntimeTaskHandle<R>, RuntimeError>
    where
-        F: Future<Output = anyhow::Result<()>> + 'static,
+        F: Future<Output = R> + 'static,
+        R: Send + 'static,


To me, it seems more that you want to have control over the error type rather than make the runtime behave like an async task with a return value.

In that case, your PartitionProcessorStopReason becomes the error type.

Maybe! We use anyhow::Error quite a bit in the PP now, so it would be difficult to disentangle the errors I care about from other failure conditions. That aside, I still like modeling this as an outcome of either a known stop reason or some other failure condition. I am treating PartitionProcessorStopReason as a normal return since both canceling the PP and encountering a trim gap are expected over a long enough timeline.
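
To make the two positions concrete, here is a simplified sketch of a spawn function that is generic over the root's return type, mirroring the `F`/`R` bounds in the diff. It uses `std::thread` rather than the real task center, and `spawn_root` and `StopReason` are hypothetical names. The root can then return `Result<StopReason, E>`, with expected stops on the `Ok` side and failures on the `Err` side.

```rust
use std::thread::{self, JoinHandle};

/// Hypothetical stop reasons that are expected over a long enough timeline.
#[derive(Debug, PartialEq)]
enum StopReason {
    Cancelled,
    LogTrimGap { to_lsn: u64 },
}

/// Simplified stand-in for a runtime start function: the caller chooses `R`
/// instead of being tied to `anyhow::Result<()>`.
fn spawn_root<F, R>(root: F) -> JoinHandle<R>
where
    F: FnOnce() -> R + Send + 'static,
    R: Send + 'static,
{
    thread::spawn(root)
}
```

The alternative raised in the review would keep the return type fixed and move the stop reason into the error type instead, so the choice is mostly about which side of the `Result` a trim gap belongs on.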

AhmedSoliman · 2024-12-23T17:11:12Z

crates/worker/src/partition_processor_manager/spawn_processor_task.rs

+        .await
+        && fast_forward_lsn.is_none()
+    {
+        return Ok(partition_store_manager


tip: remove return

Without the return statement, I need to pull the rest of the method body into an else arm, and I specifically wanted to keep it this way: I find it easier to read without the extra nesting. Open to changing it back if you feel strongly about using the if expression as the returned value, of course :-)

AhmedSoliman · 2024-12-23T17:13:00Z

crates/worker/src/partition_processor_manager/spawn_processor_task.rs

+            tokio::time::sleep(Duration::from_millis(
+                10_000 + rand::random::<u64>() % 10_000,
+            ))


would RetryPolicy and its internal jitter logic work for you here?

Definitely! I didn't want to plumb a retry count through just yet but maybe even without it, we can leverage the retry policy already.
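
For context, the delay in the diff is a fixed 10 s base plus up to 10 s of random jitter. The sketch below isolates that computation so it can be reasoned about without a random source; `jittered_delay` is a hypothetical helper and says nothing about how `RetryPolicy` computes its jitter internally.

```rust
use std::time::Duration;

/// Returns `base` plus a jitter in `[0, max_jitter)`, derived from a
/// caller-supplied random value (e.g. `rand::random::<u64>()` in real code).
fn jittered_delay(base: Duration, max_jitter: Duration, random: u64) -> Duration {
    // Guard against a zero jitter window to avoid a modulo-by-zero panic.
    let window_ms = (max_jitter.as_millis() as u64).max(1);
    base + Duration::from_millis(random % window_ms)
}
```

With `base` and `max_jitter` both set to 10 seconds, this matches the `10_000 + rand::random::<u64>() % 10_000` expression in the diff.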

AhmedSoliman · 2024-12-23T17:13:57Z

crates/worker/src/partition_processor_manager/spawn_processor_task.rs

+                warn!(
+                    partition_id = %partition_id,
+                    ?snapshot_path,
+                    "Failed to remove local snapshot directory, continuing with startup: {:?}",


let's try and avoid using Debug values in log messages higher than debug!

Very insightful rule of thumb to keep in mind, thank you!
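
A small illustration of the difference, using only the standard library. `log_line` is a hypothetical helper that formats the message from the diff with `Display` (`{}`) instead of `Debug` (`{:?}`). In `tracing` macros the same distinction applies to fields via the `%` (Display) and `?` (Debug) sigils.

```rust
use std::io;

/// Formats the warning with the error's `Display` output, which is meant
/// for humans, rather than its `Debug` output, which exposes internals.
fn log_line(err: &io::Error) -> String {
    format!("Failed to remove local snapshot directory, continuing with startup: {err}")
}
```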

pcholakov added 4 commits December 23, 2024 16:29

Add stopped-reason to ProcessorManager root future

35837be

This allows us to signal the PPM about log trim gaps that the PP may encounter at runtime, which require special handling.

Add trim-gap handling by fast-forwarding the partition state on startup

f1ecd67

Simplify open partition store logic

11754c6

Self-review

140e7bf

pcholakov force-pushed the feat/trim-gap-handling branch from 8a82f9e to 140e7bf on December 23, 2024 14:29

pcholakov added 2 commits December 23, 2024 16:44

Stop reading commands from the log if a TrimGap is encountered

da44a6e

Delay startup when we have a fast-forward target we can't reach

39401bf

pcholakov requested review from tillrohrmann and AhmedSoliman December 23, 2024 14:58

pcholakov marked this pull request as ready for review December 23, 2024 14:58

PR feedback misc

acd5c37
