-
Notifications
You must be signed in to change notification settings - Fork 6.6k
Add in-memory data structures and (de)serialization support for subcompaction progress #13928
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Irrelevant gcc_no_test_run/clang-analyze failure since they show up in other simple change like https://github.com/facebook/rocksdb/actions/runs/17566358798/job/49894009961?pr=13924 and https://github.com/facebook/rocksdb/actions/runs/17566358798/job/49894009939?pr=13924 |
86fa190
to
a0cbebd
Compare
a0cbebd
to
a5b0550
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change makes sense to me overall. Mostly nitpick comments.
db/version_edit.h
Outdated
|
||
// Structure to hold subcompaction progress information for resumable | ||
// compaction | ||
struct ResumableSubcompactionProgress { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless we have subcompaction progress that is not resumable, I think we can remove "Resumable" part out of all codebase. I would just call this SubcompactionProgress
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
db/version_edit.h
Outdated
: last_persisted_output_files_count_; | ||
} | ||
|
||
size_t& LastPersistedOutputFilesCount(bool is_proximal_level) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Instead of exposing both const &
and &
, I'd have Get/Set and disallow Get to be editable. This is also consistent with what VersionEdit
does.
Same for NumProcessedOutputRecords
.
For CompactionOutputFiles
, AddToCompactionOutputFiles()
instead of SetCompactionOutputFiles()
. (We already have Clear()
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
db/version_edit.h
Outdated
<< CompactionOutputFiles(false).size(); | ||
oss << ", number proximal_level_compaction_output_files=" | ||
<< CompactionOutputFiles(true).size(); | ||
oss << ", last_persisted_output_file_count=" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
supernit: last_persisted_output_files_count=
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
db/version_edit.cc
Outdated
.empty()) { | ||
continue; | ||
} | ||
// Reserve space in vector to avoid repeated reallocations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd consider using autovector
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
return oss.str(); | ||
} | ||
|
||
private: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Instead of having two of each (one for output level and one for proximal level) for output records, output files, count, etc. I'm wondering if we can group this per level output info, maybe call it SubcompactionPerLevelProgress
,
And then in SubcompactionProgress
can have two fields,
SubcompactionPerLevelProgress output_level_progress
andSubcompactionPerLevelProgress proximal_level_progress
I think this may simplify quite a bit of duplicate functions with is_proximal_level
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea - fixed
db/version_edit.cc
Outdated
} | ||
|
||
// Initialize accumulated progress if this is the first VersionEdit | ||
if (!has_resumable_subcompaction_progress_) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this check is not needed. Also wondering how HasAccumulatedResumableSubcompactionProgress()
is going to be used (except for the tests)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For context, it's for building back all the output files from the delta record before resuming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
a5b0550
to
4c237b3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
4c237b3
to
e236f84
Compare
Context
Resuming compaction is designed to periodically record the progress of an ongoing compaction and can resume from that saved progress after interruptions such as cancellation, database shutdown, or crashes.
This PR introduces the data structures needed to store subcompaction progress in memory, along with serialization and deserialization support to persist and parse this progress to/from "a manifest-like compaction progress file" (the actual creation of such file is in upcoming PRs).
Flow of resuming: DB::OpenAndCompact() -> Compaction progress file -> SubcompactionProgress -> CompactionJob
Flow of persistence: CompactionJob -> SubcompactionProgress -> Compaction progress file -> DB that is called with OpenAndCompact()
Summary
Progress represented by
SubcompactionProgress
will be tracked at the scope of a subcompaction, which is the smallest independent unit of compaction work.The frequency of recording this progress is once every N compaction output files (to be detailed in future PRs).
When recording, all fields, except for the output files metadata in
SubcompactionProgress
, will directly overwrite the corresponding fields from the last saved progress (SeeSubcompactionProgress
andSubcompactionProgressBuilder
for more).As a bonus, this PR refactors the file metadata encoding and decoding utilities into two static helper functions, EncodeToNewFile4() and DecodeNewFile4From(), to support subcompaction progress usage.
Test plan:
SubcompactionProgressTest
unit tests in version_edit_test.cc to verify basic serialization/deserialization and forward compatibility handlingFollow up:
paranoid_file_checks=true
(by default false). Output verification will be done before persistence of progress. As long as this follow-up is done before the landing of the integration PR to create the progress file, we can change the manifest-like compaction progress file format freely.