WritePrepared Txn: rollback_merge_operands hack
Summary:
This is a hack that temporarily fixes the rollback of merge operands for MyRocks. MyRocks uses merge operands without the protection of locks, which violates the assumption behind the rollback algorithm. MyRocks is fine with the operands not being rolled back, since that would only create a gap in the autoincrement column. The hack adds an option that disables the rollback of merge operands by default; the option is enabled only to let the unit tests pass.
Closes facebook#3711

Differential Revision: D7597177

Pulled By: maysamyabandeh

fbshipit-source-id: 544be0f666c7e7abb7f651ec8b23124e05056728
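
The "gap in the autoincrement column" mentioned in the summary can be pictured with a small toy model. This is only a sketch under stated assumptions: the counter type, the max-style merge semantics, and the function names are all hypothetical and are not taken from MyRocks or RocksDB. It only illustrates why leaving a merge operand in place after a rollback skips an id rather than corrupting data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model (not MyRocks code): the counter value is the maximum of
// all merge operands written so far, i.e. an assumed max-style merge operator.
struct ToyAutoIncCounter {
  std::vector<uint64_t> operands;

  void Merge(uint64_t v) { operands.push_back(v); }

  uint64_t Value() const {
    return operands.empty()
               ? 0
               : *std::max_element(operands.begin(), operands.end());
  }
};

// A transaction reserves the next id by merging Value() + 1 into the counter.
uint64_t ReserveId(ToyAutoIncCounter& counter) {
  uint64_t id = counter.Value() + 1;
  counter.Merge(id);
  return id;
}

// Toy rollback: the operand is removed only when the flag is set. With the
// flag off (the new default in this commit) the operand stays behind.
void RollbackId(ToyAutoIncCounter& counter, uint64_t id,
                bool rollback_merge_operands) {
  if (!rollback_merge_operands) return;
  auto it = std::find(counter.operands.begin(), counter.operands.end(), id);
  if (it != counter.operands.end()) counter.operands.erase(it);
}

// Runs three transactions: the first commits, the second rolls back, the
// third commits. Returns the ids that ended up committed.
std::vector<uint64_t> CommittedIds(bool rollback_merge_operands) {
  ToyAutoIncCounter counter;
  std::vector<uint64_t> committed;
  committed.push_back(ReserveId(counter));          // txn 1 commits
  uint64_t aborted = ReserveId(counter);            // txn 2 reserves an id
  RollbackId(counter, aborted, rollback_merge_operands);  // txn 2 rolls back
  committed.push_back(ReserveId(counter));          // txn 3 commits
  assert(committed.size() == 2);
  return committed;
}
```

With the flag off, the aborted transaction's id is never reused, so the committed ids skip one value; with the flag on, the next transaction reuses it.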
Maysam Yabandeh authored and facebook-github-bot committed Apr 12, 2018
1 parent 6f5e644 commit d15397b
Showing 4 changed files with 24 additions and 5 deletions.
4 changes: 3 additions & 1 deletion HISTORY.md
@@ -4,13 +4,15 @@
* Add a BlockBasedTableOption to align uncompressed data blocks on the smaller of block size or page size boundary, to reduce flash reads by avoiding reads spanning 4K pages.

### New Features
-* * Introduce TTL for level compaction so that all files older than ttl go through the compaction process to get rid of old data.
+* Introduce TTL for level compaction so that all files older than ttl go through the compaction process to get rid of old data.
* TransactionDBOptions::write_policy can be configured to enable WritePrepared 2PC transactions. Read more about them in the wiki.

### Bug Fixes
* Fsync after writing global seq number to the ingestion file in ExternalSstFileIngestionJob.
* Fix WAL corruption caused by race condition between user write thread and FlushWAL when two_write_queue is not set.
* Fix `BackupableDBOptions::max_valid_backups_to_open` to not delete backup files when refcount cannot be accurately determined.
* Fix memory leak when pin_l0_filter_and_index_blocks_in_cache is used with partitioned filters
* Disable rollback of merge operands in WritePrepared transactions to work around an issue in MyRocks. It can be enabled back by setting TransactionDBOptions::rollback_merge_operands to true.

### Java API Changes
* Add `BlockBasedTableConfig.setBlockCache` to allow sharing a block cache across DB instances.
8 changes: 8 additions & 0 deletions include/rocksdb/utilities/transaction_db.h
@@ -85,6 +85,14 @@ struct TransactionDBOptions {
   // before the commit phase. The DB then needs to provide the mechanisms to
   // tell apart committed from uncommitted data.
   TxnDBWritePolicy write_policy = TxnDBWritePolicy::WRITE_COMMITTED;
+
+  // TODO(myabandeh): remove this option
+  // Note: this is a temporary option as a hot fix in rollback of writeprepared
+  // txns in myrocks. MyRocks uses merge operands for autoinc column id without
+  // however obtaining locks. This breaks the assumption behind the rollback
+  // logic in myrocks. This hack of simply not rolling back merge operands works
+  // for the special way that myrocks uses this operands.
+  bool rollback_merge_operands = false;
 };

 struct TransactionOptions {
1 change: 1 addition & 0 deletions utilities/transactions/transaction_test.h
@@ -66,6 +66,7 @@ class TransactionTestBase : public ::testing::Test {
     txn_db_options.transaction_lock_timeout = 0;
     txn_db_options.default_lock_timeout = 0;
     txn_db_options.write_policy = write_policy;
+    txn_db_options.rollback_merge_operands = true;
     Status s;
     if (use_stackable_db == false) {
       s = TransactionDB::Open(options, txn_db_options, dbname, &db);
16 changes: 12 additions & 4 deletions utilities/transactions/write_prepared_txn.cc
@@ -218,15 +218,18 @@ Status WritePreparedTxn::RollbackInternal() {
     std::map<uint32_t, const Comparator*>& comparators_;
     using CFKeys = std::set<Slice, SetComparator>;
     std::map<uint32_t, CFKeys> keys_;
+    bool rollback_merge_operands_;
     RollbackWriteBatchBuilder(
         DBImpl* db, WritePreparedTxnDB* wpt_db, SequenceNumber snap_seq,
         WriteBatch* dst_batch,
-        std::map<uint32_t, const Comparator*>& comparators)
+        std::map<uint32_t, const Comparator*>& comparators,
+        bool rollback_merge_operands)
         : db_(db),
           callback(wpt_db, snap_seq,
                    0),  // 0 disables min_uncommitted optimization
           rollback_batch_(dst_batch),
-          comparators_(comparators) {}
+          comparators_(comparators),
+          rollback_merge_operands_(rollback_merge_operands) {}

     Status Rollback(uint32_t cf, const Slice& key) {
       Status s;
@@ -275,7 +278,11 @@ Status WritePreparedTxn::RollbackInternal() {

     Status MergeCF(uint32_t cf, const Slice& key,
                    const Slice& /*val*/) override {
-      return Rollback(cf, key);
+      if (rollback_merge_operands_) {
+        return Rollback(cf, key);
+      } else {
+        return Status::OK();
+      }
     }

     Status MarkNoop(bool) override { return Status::OK(); }
@@ -289,7 +296,8 @@ Status WritePreparedTxn::RollbackInternal() {
    protected:
     virtual bool WriteAfterCommit() const override { return false; }
   } rollback_handler(db_impl_, wpt_db_, last_visible_txn, &rollback_batch,
-                     *wpt_db_->GetCFComparatorMap());
+                     *wpt_db_->GetCFComparatorMap(),
+                     wpt_db_->txn_db_options_.rollback_merge_operands);
   auto s = GetWriteBatch()->GetWriteBatch()->Iterate(&rollback_handler);
   assert(s.ok());
   if (!s.ok()) {
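
The gating added to MergeCF above can be illustrated outside RocksDB with a minimal, self-contained sketch. Everything in it (OpType, Op, ToyRollbackBuilder, KeysToRollBack) is a hypothetical stand-in invented for illustration and is not part of the RocksDB API. It only models which keys the rollback handler would visit: Put and Delete entries are always rolled back, while Merge entries are rolled back only when the flag is set.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not RocksDB types.
enum class OpType { kPut, kDelete, kMerge };

struct Op {
  OpType type;
  std::string key;
};

// Toy analogue of RollbackWriteBatchBuilder: it records which keys the
// rollback batch would restore to their pre-transaction state.
class ToyRollbackBuilder {
 public:
  explicit ToyRollbackBuilder(bool rollback_merge_operands)
      : rollback_merge_operands_(rollback_merge_operands) {}

  void Put(const std::string& key) { Rollback(key); }
  void Delete(const std::string& key) { Rollback(key); }

  // Mirrors the patched MergeCF: the operand is rolled back only when the
  // option is on; otherwise it is skipped and left in place.
  void Merge(const std::string& key) {
    if (rollback_merge_operands_) {
      Rollback(key);
    }
  }

  const std::set<std::string>& keys() const { return keys_; }

 private:
  void Rollback(const std::string& key) { keys_.insert(key); }

  bool rollback_merge_operands_;
  std::set<std::string> keys_;
};

// Replays a prepared batch through the builder and returns the set of keys
// that would be rolled back.
std::set<std::string> KeysToRollBack(const std::vector<Op>& batch,
                                     bool rollback_merge_operands) {
  ToyRollbackBuilder builder(rollback_merge_operands);
  for (const Op& op : batch) {
    switch (op.type) {
      case OpType::kPut:
        builder.Put(op.key);
        break;
      case OpType::kDelete:
        builder.Delete(op.key);
        break;
      case OpType::kMerge:
        builder.Merge(op.key);
        break;
    }
  }
  assert(builder.keys().size() <= batch.size());
  return builder.keys();
}
```

For a batch containing a Put, a Merge and a Delete on three different keys, the function returns two keys with the flag off (the default after this commit) and all three with the flag on, which is the setting the unit tests use.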