-
Notifications
You must be signed in to change notification settings - Fork 95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove mutex for write #197
base: 6.4.tikv
Are you sure you want to change the base?
Conversation
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
This reverts commit b84bd77. Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Could you pass facebook's CI, or at least figure out why they failed? I see you have several builds timed out. That usually means certain cases are hanging (they don't output PASSED log nor error log). |
/test |
It seems to be a unstable test. These tests run a long time and may fail because of environment. |
@Little-Wallace I don't think so. There are some unstable ones, but they won't cause nearly half of the tests to fail (usually at most 2). |
The tests did not fail. But the process was killed because out of time |
@Little-Wallace Some tests are hanging, mostly likely on some sync point conditions. It's better to figure out exactly what are those conditions, even though we might not need to fix them. |
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain the bug you just fixed in upstream? So I know where the review process is flawed.
db/error_handler.h
Outdated
@@ -67,6 +61,7 @@ class ErrorHandler { | |||
// A flag indicating whether automatic recovery from errors is enabled | |||
bool auto_recovery_; | |||
bool recovery_in_prog_; | |||
std::atomic<bool> stop_state_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::atomic<bool> stop_state_; | |
std::atomic<bool> db_stopped_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
db/error_handler.h
Outdated
return !bg_error_.ok() && | ||
bg_error_.severity() >= Status::Severity::kHardError; | ||
} | ||
bool IsDBStopped() { return stop_state_.load(std::memory_order_acquire); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comment "Do not require DB mutex held."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
new_options.max_background_compactions, | ||
new_options.max_background_jobs, | ||
new_options.base_background_compactions, | ||
/* parallelize_compactions */ true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/* parallelize_compactions */ true); | |
true /* parallelize_compactions */); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that all code of rocksdb place the comment before param
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're half right. It's a per-file styling and this file happens to use postposition which is rare within the whole codebase...
|
||
// "number <= current_log_number - 1" is equivalent to | ||
// "number < current_log_number". | ||
MarkLogsSynced(current_log_number - 1, true, s); | ||
if (!s.ok()) { | ||
error_handler_.SetBGError(s, BackgroundErrorReason::kFlush); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do you set this error now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be saved at "db/db_impl/db_impl_compaction_flush.cc" L549 and L210
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this function, we will not hold mutex so that we can not save BG error here.
The function |
I fix the problem of facebook#7516 in this commit: facebook@5fe832a There are some other bugs which are bout code only in main branch of facebook/rocksdb, so I did not port them here. |
db/db_impl/db_impl.cc
Outdated
@@ -2140,6 +2148,7 @@ Status DBImpl::CreateColumnFamilyImpl(const ColumnFamilyOptions& cf_options, | |||
ROCKS_LOG_INFO(immutable_db_options_.info_log, | |||
"Created column family [%s] (ID %u)", | |||
column_family_name.c_str(), (unsigned)cfd->GetID()); | |||
single_column_family_mode_.store(false, std::memory_order_release); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why move this line from 2135?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forget why I place it here.... Maybe I can try to place it back to origin
@@ -175,7 +174,9 @@ Status DBImpl::FlushMemTableToOutputFile( | |||
// flushed SST may contain data from write batches whose updates to | |||
// other column families are missing. | |||
// SyncClosedLogs() may unlock and re-lock the db_mutex. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"db_mutex" -> log_write_mutex_
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
ret = s; | ||
{ | ||
InstrumentedMutexLock lock(&log_write_mutex_); | ||
for (auto l : logs_to_free_) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the benefit to make logs_to_free_ be protected by both log_write_mutex_ and mutex_
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes no sense. But i think unlock mutex_
during this function may cause other bad case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall. But I'd prefer seeing a full test report before merging.
@@ -1131,7 +1131,13 @@ class DBImpl : public DB { | |||
} | |||
} | |||
}; | |||
|
|||
struct LogContext { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add newlines around this struct definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
db/db_impl/db_impl.h
Outdated
@@ -1934,6 +1939,7 @@ class DBImpl : public DB { | |||
InstrumentedCondVar atomic_flush_install_cv_; | |||
|
|||
bool wal_in_db_path_; | |||
std::atomic<uint64_t> max_total_wal_size_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put it together with max_total_in_memory_state_
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
db/db_impl/db_impl_write.cc
Outdated
if (max_total_wal_size > 0) { | ||
return max_total_wal_size; | ||
} | ||
return 4 * max_total_in_memory_state_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
max_total_in_memory_state_
is an atomic now, need to specify memory order as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
I have finished the benchmark test. See details in https://docs.google.com/document/d/1f72BGmN1kREorkaCzhbsB0oocTP13IKJw4jy5mfm-TE/edit#heading=h.mtjjjfyw1ei |
Signed-off-by: Little-Wallace <[email protected]>
mutex_.Lock(); | ||
bool need_log_sync = !write_options.disableWAL && write_options.sync; | ||
bool need_log_dir_sync = need_log_sync && !log_dir_synced_; | ||
LogContext log_context; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not init need_log_dir_sync
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need_log_dir_sync
was protected by mutex_
before. This pr will protected this member variable by log_write_mutex_
so it could only be accessed in the method PreprocessWrite
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you mean log_dir_synced_
is protected by mutex_
? Oh, reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rocksdb/db/db_impl/db_impl_write.cc
Line 1194 in 46d7b19
// It's safe to access logs_ with unlocked mutex_ here because: |
change mutex_
to log_write_mutex_
Signed-off-by: Little-Wallace <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -220,10 +216,7 @@ void DBImpl::FindObsoleteFiles(JobContext* job_context, bool force, | |||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the log_sync_cv_
may wait here while holding the mutex
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Oh, you mean condvar::Wait
will implicitly release mutex (a mutex is paired with it when constructed).mutex_
. That's a good catch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it will release the pairing mutex log_write_mutex_
, but not mutex_
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch... I'm thinking about how to solve this problem....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure whether it is still correctly to release mutex_
here because if some other thread calling FindObsoleteFiles
but this thread is only finish half of its task.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mutex_
is released before, I don't see a problem here. But you can't simply re-acquire the mutex
after Wait()
. That will break the lock order constraint.
Maybe you can skip those unsync-ed log files, and clean them in the next round?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I will unlock mutex_
before FindObsoleteLogFiles
method. PTAL
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rest LGTM. Can we add a test to cover the case of the mutex?
I thought there are enough tests in RocksDB to cover this condition? I have passed the CI of facebook to make sure that it is no problem |
Signed-off-by: Little-Wallace [email protected]
Summary
Cherry-pick facebook#7516
This PR is to avoid write operation blocking on mutex when compaction or other method hold the mutex too long.