Currently, for a point lookup, Titan acquires 7 locks:
rocksdb db_mutex: to get snapshot
table cache mutex: to get block-based table reader
block cache mutex: to read blob index
titan db mutex: to get pointer to BlobStorage
BlobStorage mutex: to get blob file metadata
BlobFileCache mutex: to get file reader
blob cache mutex: to read cached blob
By comparison, vanilla rocksdb acquires only 2 locks:
table cache mutex: to get block-based table reader
block cache mutex: to read the data block holding the value
This leaves room to optimize CPU-bound scenarios. For example, just by moving the blob cache access ahead of the blob file metadata lookup, we get a 6% throughput improvement in titandb_bench on an in-memory workload. Here are some ideas to reduce locking overhead:
Move blob cache access to BlobStorage level #140
Currently we check the blob cache in BlobFileReader. If we move the check up to the BlobStorage level, before the BlobStorage::FindFile call, a cache hit avoids both the BlobStorage mutex and the BlobFileCache mutex.
The change is safe because, when we read a blob index from rocksdb, we assume the blob file it points to always exists (otherwise we should return Status::Corruption). So if we have a blob cache hit, we don't need to check blob file existence.
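The reordering can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyBlobStorage`, its counters, and the fields of `BlobIndex` are hypothetical stand-ins, not Titan's real classes, and the BlobFileCache step is left out for brevity. It shows the blob cache being consulted first, so the mutex guarding file metadata is taken only on a miss.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for Titan's blob index: which file, and where in it.
struct BlobIndex {
  uint64_t file_number;
  uint64_t offset;
  uint64_t size;
};

// Toy model of the read path. Each mutex has a counter so the effect of the
// reordering is observable; none of this is Titan's real code.
class ToyBlobStorage {
 public:
  void AddFile(uint64_t file_number, std::string contents) {
    std::lock_guard<std::mutex> guard(storage_mu_);
    files_[file_number] = std::move(contents);
  }

  std::optional<std::string> Get(const BlobIndex& index) {
    const auto key = std::make_pair(index.file_number, index.offset);
    {
      // Step 1: consult the blob cache before touching any file metadata.
      std::lock_guard<std::mutex> guard(cache_mu_);
      ++cache_locks;
      auto hit = blob_cache_.find(key);
      if (hit != blob_cache_.end()) return hit->second;
    }
    std::string blob;
    {
      // Step 2 (miss only): the equivalent of BlobStorage::FindFile.
      std::lock_guard<std::mutex> guard(storage_mu_);
      ++storage_locks;
      auto file = files_.find(index.file_number);
      // A blob index pointing at a missing file would be Status::Corruption.
      if (file == files_.end()) return std::nullopt;
      if (index.offset + index.size > file->second.size()) return std::nullopt;
      blob = file->second.substr(index.offset, index.size);
    }
    // Step 3: populate the cache for later reads.
    std::lock_guard<std::mutex> guard(cache_mu_);
    ++cache_locks;
    blob_cache_[key] = blob;
    return blob;
  }

  int cache_locks = 0;    // blob cache mutex acquisitions
  int storage_locks = 0;  // BlobStorage mutex acquisitions

 private:
  std::mutex cache_mu_;
  std::mutex storage_mu_;
  std::map<std::pair<uint64_t, uint64_t>, std::string> blob_cache_;
  std::map<uint64_t, std::string> files_;
};

// Usage: the second read of the same blob never takes the storage mutex.
inline void DemoCacheHitSkipsStorageLock() {
  ToyBlobStorage storage;
  storage.AddFile(7, "hello blob");
  BlobIndex index{7, 6, 4};
  assert(storage.Get(index).value() == "blob");
  assert(storage.storage_locks == 1);
  assert(storage.Get(index).value() == "blob");
  assert(storage.storage_locks == 1);  // unchanged on a cache hit
}
```

In this model a cache hit costs one lock acquisition, while a miss costs three (cache probe, metadata lookup, cache fill).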
Move BlobFileCache check to before BlobStorage::FindFile call
Similarly, we can query BlobFileCache for the blob file reader first and, on a hit, skip the BlobStorage::FindFile call. That way a cache hit (the common case) saves one BlobStorage mutex lock.
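A rough sketch of the same idea for file readers follows. `ToyFileReader` and `ToyReaderLookup` are hypothetical names, and the sketch assumes that a cached reader is evicted whenever its file is removed; whether Titan guarantees that is not something this example establishes.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// Hypothetical reader type; the real BlobFileReader has a richer interface.
struct ToyFileReader {
  uint64_t file_number;
};

class ToyReaderLookup {
 public:
  void AddFile(uint64_t file_number) {
    std::lock_guard<std::mutex> guard(storage_mu_);
    live_files_.insert(file_number);
  }

  std::shared_ptr<ToyFileReader> GetReader(uint64_t file_number) {
    {
      // Fast path: an already-open reader implies the file was found before.
      std::lock_guard<std::mutex> guard(file_cache_mu_);
      auto it = readers_.find(file_number);
      if (it != readers_.end()) return it->second;
    }
    {
      // Slow path: the FindFile equivalent, taken only on a reader-cache miss.
      std::lock_guard<std::mutex> guard(storage_mu_);
      ++find_file_calls;
      if (live_files_.count(file_number) == 0) return nullptr;
    }
    auto reader = std::make_shared<ToyFileReader>(ToyFileReader{file_number});
    std::lock_guard<std::mutex> guard(file_cache_mu_);
    // emplace keeps an existing entry if another thread opened the file first.
    return readers_.emplace(file_number, std::move(reader)).first->second;
  }

  int find_file_calls = 0;  // how often the storage mutex path was taken

 private:
  std::mutex file_cache_mu_;
  std::mutex storage_mu_;
  std::unordered_map<uint64_t, std::shared_ptr<ToyFileReader>> readers_;
  std::unordered_set<uint64_t> live_files_;
};

// Usage: a repeated lookup returns the cached reader without FindFile.
inline void DemoReaderCacheHit() {
  ToyReaderLookup lookup;
  lookup.AddFile(3);
  auto first = lookup.GetReader(3);
  auto second = lookup.GetReader(3);
  assert(first == second);
  assert(lookup.find_file_calls == 1);
}
```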
Store BlobStorage pointer in ColumnFamilyHandle
rocksdb embeds the cfd pointer in ColumnFamilyHandleImpl. One benefit is that, given the handle, a read doesn't need to acquire db_mutex to obtain the cfd pointer. Similarly, we can embed the BlobStorage pointer in the handle to avoid taking a mutex to obtain it. We can define TitanColumnFamilyHandle as follows and return it to the caller. The struct is safe to use as a rocksdb::ColumnFamilyHandle when calling rocksdb methods.
struct TitanColumnFamilyHandle : public rocksdb::ColumnFamilyHandleImpl {
  // Blob storage of this column family, reachable from the handle without a mutex.
  std::shared_ptr<BlobStorage> blob_storage;
};
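To make the difference concrete, here is a self-contained sketch that uses toy types in place of the rocksdb and Titan classes (every `Toy*` name is hypothetical). It contrasts a mutex-protected map lookup with reading the pointer straight from the handle, assuming every handle passed in was created by the same database object.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <unordered_map>
#include <utility>

struct ToyBlobStorage {
  uint32_t cf_id;
};

// Stand-in for rocksdb::ColumnFamilyHandle.
class ToyColumnFamilyHandle {
 public:
  explicit ToyColumnFamilyHandle(uint32_t id) : id_(id) {}
  virtual ~ToyColumnFamilyHandle() = default;
  uint32_t GetID() const { return id_; }

 private:
  uint32_t id_;
};

// Stand-in for the proposed TitanColumnFamilyHandle.
struct ToyTitanHandle : public ToyColumnFamilyHandle {
  ToyTitanHandle(uint32_t id, std::shared_ptr<ToyBlobStorage> storage)
      : ToyColumnFamilyHandle(id), blob_storage(std::move(storage)) {}
  std::shared_ptr<ToyBlobStorage> blob_storage;
};

class ToyTitanDB {
 public:
  std::unique_ptr<ToyColumnFamilyHandle> CreateColumnFamily(uint32_t id) {
    auto storage = std::make_shared<ToyBlobStorage>(ToyBlobStorage{id});
    std::lock_guard<std::mutex> guard(mu_);
    ++mutex_acquisitions;
    storages_[id] = storage;
    return std::make_unique<ToyTitanHandle>(id, std::move(storage));
  }

  // Map lookup under the db mutex, as on the current read path.
  std::shared_ptr<ToyBlobStorage> FindStorageLocked(ToyColumnFamilyHandle* h) {
    assert(h != nullptr);
    std::lock_guard<std::mutex> guard(mu_);
    ++mutex_acquisitions;
    auto it = storages_.find(h->GetID());
    if (it == storages_.end()) return nullptr;
    return it->second;
  }

  // Proposed: read the pointer from the handle, no mutex involved.
  // Assumes the handle was returned by CreateColumnFamily.
  static std::shared_ptr<ToyBlobStorage> FindStorageFromHandle(
      ToyColumnFamilyHandle* h) {
    assert(h != nullptr);
    return static_cast<ToyTitanHandle*>(h)->blob_storage;
  }

  int mutex_acquisitions = 0;

 private:
  std::mutex mu_;
  std::unordered_map<uint32_t, std::shared_ptr<ToyBlobStorage>> storages_;
};
```

Both lookups return the same object; only the locked variant bumps `mutex_acquisitions`.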
Avoid using weak_ptr
The following pattern appears a lot in the code:
std::weak_ptr<Something> FindSomething() {
  ...
  return ptr;  // ptr is a std::shared_ptr
}
std::shared_ptr<Something> something = FindSomething().lock();
This is not necessary. We can have FindSomething return a shared_ptr instead. Using weak_ptr has no benefit here and incurs one more atomic ref-count operation.
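A minimal sketch of the two variants side by side, using a hypothetical `Registry` class rather than any real Titan type. Both calls hand the caller the same object; the weak_ptr round trip just adds weak-count updates on top of the strong-count increment that the direct return already needs.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>

struct Something {
  int id;
};

// Hypothetical owner of shared objects, looked up by id.
class Registry {
 public:
  void Add(int id) {
    std::lock_guard<std::mutex> guard(mu_);
    items_[id] = std::make_shared<Something>(Something{id});
  }

  // Existing pattern: the shared_ptr is converted to a weak_ptr on return,
  // and the caller immediately converts it back with lock().
  std::weak_ptr<Something> FindWeak(int id) {
    std::lock_guard<std::mutex> guard(mu_);
    auto it = items_.find(id);
    if (it == items_.end()) return {};
    return it->second;
  }

  // Suggested pattern: return the shared_ptr directly.
  std::shared_ptr<Something> Find(int id) {
    std::lock_guard<std::mutex> guard(mu_);
    auto it = items_.find(id);
    if (it == items_.end()) return nullptr;
    return it->second;
  }

 private:
  std::mutex mu_;
  std::map<int, std::shared_ptr<Something>> items_;
};

// Usage: both variants yield the same pointer.
inline void DemoSameResult() {
  Registry registry;
  registry.Add(1);
  std::shared_ptr<Something> via_weak = registry.FindWeak(1).lock();
  std::shared_ptr<Something> direct = registry.Find(1);
  assert(via_weak == direct);
}
```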
Making rocksdb::DBImpl::GetSnapshot lock-free
It is also possible to make rocksdb's GetSnapshot lock-free, though that is a much more involved change.