[Pebble Storage] Refactor indexing new blocks #6360

zhangchiqing · 2024-08-15T23:32:11Z

This PR makes it concurrent safe to index new blocks.

Running the concurrency test 100 times and passed:

go test --failfast --tags=relic  -run=TestIndexConcurrent -count=100
PASS
ok      github.com/onflow/flow-go/storage/pebble/procedure      6.850s

jordanschalm

I don't think the approach here provides any concurrency safety. We need to hold the lock until the write batch is committed.

To address these cases in general, I think we need to define and acquire the lock at the level where the write batch is instantiated and committed. For example, one lock protecting access to the entire block insertion procedure.

jordanschalm · 2024-08-16T15:47:38Z

storage/pebble/procedure/children.go

 // IndexNewBlock will add parent-child index for the new block.
 //   - Each block has a parent, we use this parent-child relationship to build a reverse index
 //   - for looking up children blocks for a given block. This is useful for forks recovery
 //     where we want to find all the pending children blocks for the lastest finalized block.
 //
+// # It's concurrent safe to call this function by multiple goroutines, as it will acquire a lock


Suppose we have 2 threads A and B concurrently writing blocks with a common parent (that initially has no children), the sequence diagram would look like:

sequenceDiagram par A ->> DB: starts write batch and B ->> DB: starts write batch end A -> DB: acquire lock activate A A -> DB: read Children (nil) A -> DB: write Chidren [c1] A -> DB: release lock deactivate A B -> DB: acquire lock activate B B -> DB: read Children (nil) B -> DB: write Chidren [c2] B -> DB: release lock deactivate B par A ->> DB: commit write batch and B ->> DB: commit write batch end

Loading

We should end up with children [c1,c2], but instead we will end up with either [c1] or [c2].

The lock added here does not provide any guarantee that this function will operate correctly when called in parallel, because the lock is released before the write is committed.

Suppose we have some database operation P consisting of a set of reads R and a set of writes W, where the writes are dependent on the values read in R. In general, to safely synchronize P, I think we need to:

acquire Lock_P, the lock protecting access to P

perform all reads in R, and memorize the values

instantiate the write batch

perform all writes in W

commit the write batch

release Lock_P

Nice diagram. And thanks for the feedback.

It's addressed here

I used the callback to release the lock after the batch update is committed.

Also added a test case to reproduce and verify the fix.

zhangchiqing · 2024-09-13T18:09:45Z

Consolidate and merge all changes to leo/v0.33-pebble-storage-to-review. The changes in this PR can still be a reference.

zhangchiqing added 2 commits August 15, 2024 17:33

refactor indexing new block

789d2bb

refactor new follower state

148c877

zhangchiqing force-pushed the leo/v0.33-pebble-storage-block-indexer branch from b541fbe to 148c877 Compare August 16, 2024 00:42

zhangchiqing changed the base branch from v0.33 to leo/v0.33-pebble-storage-to-review August 16, 2024 01:30

zhangchiqing marked this pull request as ready for review August 16, 2024 01:30

zhangchiqing requested review from AlexHentschel, durkmurder and jordanschalm as code owners August 16, 2024 01:30

zhangchiqing requested review from peterargue and removed request for durkmurder August 16, 2024 01:30

jordanschalm requested changes Aug 16, 2024

View reviewed changes

not releasing the lock until the pebblw batch update is committed

77b64c0

zhangchiqing requested a review from jordanschalm August 16, 2024 19:04

zhangchiqing merged commit ce2775e into leo/v0.33-pebble-storage-to-review Sep 13, 2024

zhangchiqing deleted the leo/v0.33-pebble-storage-block-indexer branch September 13, 2024 18:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Pebble Storage] Refactor indexing new blocks #6360

[Pebble Storage] Refactor indexing new blocks #6360

zhangchiqing commented Aug 15, 2024 •

edited

Loading

jordanschalm left a comment

jordanschalm Aug 16, 2024 •

edited

Loading

zhangchiqing Aug 16, 2024

zhangchiqing commented Sep 13, 2024

[Pebble Storage] Refactor indexing new blocks #6360

[Pebble Storage] Refactor indexing new blocks #6360

Conversation

zhangchiqing commented Aug 15, 2024 • edited Loading

jordanschalm left a comment

Choose a reason for hiding this comment

jordanschalm Aug 16, 2024 • edited Loading

Choose a reason for hiding this comment

zhangchiqing Aug 16, 2024

Choose a reason for hiding this comment

zhangchiqing commented Sep 13, 2024

zhangchiqing commented Aug 15, 2024 •

edited

Loading

jordanschalm Aug 16, 2024 •

edited

Loading