Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't take archived log size into account when calculating log size for flush #12680

Closed
wants to merge 1 commit into from

Conversation

HypenZou
Copy link
Contributor

@HypenZou HypenZou commented May 21, 2024

Context/Summary:
It seems unreasonable to take the archived log size into account when calculating log size for flush in method CreateCheckpoint. If the user sets WAL_ttl_seconds or WAL_size_limit_MB, the argument log_size_for_flush can easily be reached due to the size of the archived dir. As a result, the flush may always be triggered.
Test
corverd by ./checkpoint_test

Copy link
Contributor

@hx235 hx235 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you update

// log_size_for_flush: if the total log file size is equal or larger than
to say excluding archive log file. Also please add an HISTORY entry.

Please also add a unit test to verify this behavior.

@HypenZou HypenZou force-pushed the checkpoint_log_flush branch 3 times, most recently from 67d70de to e975dd1 Compare June 18, 2024 17:25
@HypenZou
Copy link
Contributor Author

@hx235 thx for review🙏🏻
unit test and history entry has been added

@HypenZou HypenZou force-pushed the checkpoint_log_flush branch 4 times, most recently from 8b4824c to e284e38 Compare June 24, 2024 16:53
@facebook-github-bot
Copy link
Contributor

@ajkr has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@ajkr ajkr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thank you very much!

Comment on lines 995 to 996
ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->SetCallBack(
"DBImpl::FlushForGetLiveFiles", [&](void*) { flushed = true; });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of SyncPoint here is a little excessive, but OK. One downside of SyncPoint is that it's error-prone - the string name could have typo and that could make the test accidentally pass. Another downside is it's low-level - there could be a different code path flushing the memtable.

The more conventional alternative would be to check NumTableFilesAtLevel(0) is the same before and after the checkpoint. You could even make that a little bit more robust by setting disable_auto_compactions=true to prevent compaction out of L0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx for advice, updated it

@facebook-github-bot
Copy link
Contributor

@HypenZou has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@ajkr has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@HypenZou has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@HypenZou has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@ajkr has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@ajkr merged this pull request in 5bb7f95.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants