fix(server): Process empty buckets in snapshot #7185
dranikpg merged 1 commit into dragonflydb:main
🤖 Augment PR Summary

Summary: Ensures snapshot traversal also processes empty physical buckets so they get version-stamped and aren't repeatedly treated as "unprocessed" (fix for #7060).

Technical Notes: Empty-bucket version stamping prevents repeated skip/rehit behavior during snapshotting and change-listener updates.
Pull request overview
This PR updates snapshot bucket traversal/versioning so that empty buckets are also processed and stamped with the snapshot version, enabling correctness needed for journal-omit behavior (#7060).
Changes:
- Extend `DashTable::TraverseBuckets` with an option to visit empty buckets.
- Change DashTable iterators to retain the owning table pointer even when "done", via a `done_` flag.
- Update snapshot serialization flow to traverse empty buckets and stamp snapshot versions on them.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/server/snapshot.cc | Traverses buckets with the option to include empty buckets enabled, so empty buckets are processed during snapshotting. |
| src/server/serializer_base.cc | Stamps snapshot versions onto empty buckets and ensures delayed tiered entries are flushed when appropriate. |
| src/core/dash.h | Adds visit_empty option to TraverseBuckets and switches iterator “done” tracking from null-owner to an explicit done_ flag. |
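
To make the two `dash.h` changes in the table above concrete, here is a minimal sketch of a traversal that can optionally visit empty buckets and returns an end cursor that still knows its owning table. Only the names `TraverseBuckets`, `visit_empty` and `done_` come from the review; `MiniTable`, `Cursor`, the signature and everything else are hypothetical simplifications, not the actual DashTable code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical miniature table. It only models the idea described in the
// review; it is not Dragonfly's DashTable and its signatures are assumed.
class MiniTable {
 public:
  using Bucket = std::vector<int>;  // an empty vector models an empty bucket

  // Cursor that keeps the owner pointer and tracks "done" with an explicit
  // flag instead of encoding it as a null owner.
  class Cursor {
   public:
    Cursor(const MiniTable* owner, std::size_t idx, bool done)
        : owner_(owner), idx_(idx), done_(done) {}
    bool is_done() const { return done_; }
    const MiniTable* owner() const { return owner_; }
    std::size_t index() const { return idx_; }

   private:
    const MiniTable* owner_;
    std::size_t idx_;
    bool done_;
  };

  explicit MiniTable(std::size_t n) : buckets_(n) {}
  Bucket& bucket(std::size_t i) { return buckets_[i]; }

  // Calls cb for each bucket. Empty buckets are skipped unless visit_empty
  // is set. Returns a "done" cursor that still points at this table.
  template <typename Cb>
  Cursor TraverseBuckets(Cb&& cb, bool visit_empty = false) const {
    for (std::size_t i = 0; i < buckets_.size(); ++i) {
      if (buckets_[i].empty() && !visit_empty) continue;
      cb(Cursor{this, i, false});
    }
    return Cursor{this, buckets_.size(), true};
  }

 private:
  std::vector<Bucket> buckets_;
};
```

With `visit_empty` left at its default, a callback never sees an empty bucket, so it has no chance to stamp one; passing `true` gives the snapshot code a visit for every physical bucket.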
I intentionally haven't included migrations. I will first fully implement omits for snapshots and then transfer this knowledge to migrations.
For the sake of discussion: if the bucket is empty and serialization has reached it, we must emit a journal change. So in either case, if we emit a journal change when encountering an empty bucket in
Yes, we may emit a journal change. It might look harmless, because it makes no difference when this single value gets serialized. But we actually don't want to do it, because it increases the "attack" surface: any subsequent write to that bucket now has to be journaled. Compare that with omitting this bucket: no writes will be journaled.
Needed for #7060
Currently, the comparison `bucket_version < snapshot_version` determines whether the snapshot serialization process has reached this specific bucket. This fact can later be used to make decisions and perform various optimizations (for example journal omits, see #7060).

However, empty buckets are currently fully ignored by all actors: the dashtable iteration functions, the iterator itself and the snapshot. During serialization, their versions are not updated, because there is nothing to serialize. As a result, just by looking at an empty bucket and its non-updated version, we can't tell whether the snapshot loop has already reached it or will only do so in the future.

This PR allows empty buckets in all of these places, so they act as regular buckets and their versions are updated.
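
The ambiguity described above can be illustrated with a small sketch. It assumes a simplified bucket with a single version field; `Bucket`, `NotYetReached`, `SnapshotPass` and the `stamp_empty` switch are hypothetical names for illustration only and do not correspond to the real snapshot code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified bucket; not Dragonfly's actual bucket layout.
struct Bucket {
  std::uint64_t version = 0;
  std::vector<int> items;  // an empty vector models an empty physical bucket
};

// The comparison from the description: true if the snapshot loop has not
// yet reached (stamped) this bucket.
inline bool NotYetReached(const Bucket& b, std::uint64_t snapshot_version) {
  return b.version < snapshot_version;
}

// One pass over the table. With stamp_empty == false, empty buckets are
// skipped and keep their old version (the behavior before this PR, as
// described above). With stamp_empty == true they are stamped like any
// other bucket. Returns the number of items "serialized".
inline std::size_t SnapshotPass(std::vector<Bucket>& table,
                                std::uint64_t snapshot_version,
                                bool stamp_empty) {
  std::size_t serialized = 0;
  for (Bucket& b : table) {
    if (b.items.empty() && !stamp_empty) continue;      // ignored entirely
    if (!NotYetReached(b, snapshot_version)) continue;  // already processed
    serialized += b.items.size();  // nothing to serialize if empty
    b.version = snapshot_version;  // stamp: the loop has reached this bucket
  }
  return serialized;
}
```

In this sketch, after a pass with `stamp_empty == false` an empty bucket still satisfies `NotYetReached`, even though the loop has already moved past it. After a pass with `stamp_empty == true` the comparison gives a reliable answer for every bucket.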