
DeleteRange

Abhishek Madan edited this page Dec 17, 2018 · 23 revisions

Overview

DeleteRange is an operation designed to replace the following pattern, in which a user deletes every key in the range [start, end):

...
Slice start, end;
// set start and end
auto it = db->NewIterator(ReadOptions());

// Check Valid() first: key() must not be called once the iterator is exhausted.
for (it->Seek(start); it->Valid() && cmp->Compare(it->key(), end) < 0; it->Next()) {
  db->Delete(WriteOptions(), it->key());
}
...

This pattern requires performing a range scan, which prevents it from being usable on any performance-sensitive write path. To mitigate this, RocksDB provides a native operation to perform this task:

...
Slice start, end;
// set start and end
db->DeleteRange(WriteOptions(), start, end);
...

Under the hood, this creates a range tombstone represented as a single key-value pair, which significantly speeds up write performance. Read performance with range tombstones is competitive with the scan-and-delete pattern. (For a more detailed performance analysis, see the DeleteRange blog post.)
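
To make the "single key-value pair" idea concrete, here is a toy, in-memory sketch of how one range tombstone can hide many point keys. It is not RocksDB code: `ToyStore`, `RangeTombstone`, and the simple counter used as a sequence number are hypothetical names made up for illustration, and the linear scan over tombstones on reads is an assumption chosen for brevity, not a description of RocksDB's read path.

...
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Toy model only; none of these types exist in RocksDB.
struct RangeTombstone {
  std::string start;  // inclusive
  std::string end;    // exclusive
  uint64_t seq;       // position of the DeleteRange call in write order
};

class ToyStore {
 public:
  void Put(const std::string& key, const std::string& value) {
    points_[key] = Entry{value, ++next_seq_};
  }

  // Records exactly one tombstone, however many keys lie in [start, end).
  void DeleteRange(const std::string& start, const std::string& end) {
    assert(!(end < start));  // toy precondition: start must not come after end
    tombstones_.push_back(RangeTombstone{start, end, ++next_seq_});
  }

  // Returns true and fills *value if key is visible, i.e. it exists and no
  // newer tombstone covers it.
  bool Get(const std::string& key, std::string* value) const {
    auto it = points_.find(key);
    if (it == points_.end()) return false;
    for (const RangeTombstone& t : tombstones_) {
      const bool covers = !(key < t.start) && key < t.end;
      if (covers && t.seq > it->second.seq) return false;
    }
    *value = it->second.value;
    return true;
  }

  std::size_t TombstoneCount() const { return tombstones_.size(); }

 private:
  struct Entry {
    std::string value;
    uint64_t seq;
  };
  std::map<std::string, Entry> points_;
  std::vector<RangeTombstone> tombstones_;
  uint64_t next_seq_ = 0;
};
...

In this sketch, calling `DeleteRange("a", "c")` after writing keys `a`, `b`, and `c` appends a single record: `a` and `b` stop being visible, `c` stays readable because the end bound is exclusive, and a later `Put("b", ...)` is visible again because it is newer than the tombstone.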

Internals [WIP]

  • "tombstone fragments"
  • fragmentation algorithm
  • point lookups
  • range scans
  • compactions

Future Work [WIP]

  • tombstone iterator lifetime management
  • memtable caching
  • snapshot-release compactions
  • new format version proposal
