Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added bloom filter support. Improve concurrency and fix some issues. #93

Closed
wants to merge 43 commits into from

Conversation

pcmind
Copy link
Contributor

@pcmind pcmind commented Dec 30, 2017

I did port some missing feature and fix some issues, I hope to be able to contribute back to main repo.
I know it's bigger pull request than anticipated(for that I am sorry), but hope to be useful to everybody.
Most this work was done following original implementation without to much refactoring to simplify comparison.

Work in PR

Removed some JVM global lock

Removed synchronized block done on class object. A synchronized block like
synchronized (MMapTable.class) {...} will use a mutex that is shared across
the entire JVM, and any other database instance would block on same mutex instance.

More concurrent friendly read/write

Previous implementation was based on full locking during read/write.
Now, only one writer can write to database at a time, but the first to enter the
write lock may help some other writers to finish at the same time it write its own data.
Read will release the lock when accessing immutable table and files to enable other
reads to enter.

Added bloom filter support

Added bloom filter policy implementation as the one implemented in original implementation.

In particular, we add a new FilterPolicy class. An instance of this class can be supplied in Options when opening a database. If supplied, the instance is used to generate summaries of keys (e.g., a bloom filter) which are placed in sstables. These summaries are consulted by DB::Get() so we can avoid reading sstable blocks that are guaranteed to not contain the key we are looking for. This change provides one implementation of FilterPolicy based on bloom filters.
The issue #90 also was corrected in this process.

Added LRU block cache to Tables

Added an LRU block cache to cache uncompressed read blocks. Existing, but
unused, Options.cacheSize can now be used to configure the cache size.

Improved abstraction over file access

Tables and log files read/write classes had no abstraction over file access, this lead to
a confusing code duplication. Interface to abstract file access where added and
duplicated classes where merged. Besides a cleaned design, it enabled easier
testing.
As memory mapped files in JVM can have there issues, Options.allowMmapWrites and
Options.allowMmapWrites where added to configure if memory mapped files
are to be used for read or/and write.

Compaction

Implement DbImpl#compactRange(byte[] begin, byte[] end) requested in by issue #30, with support
for full compaction by using compactRange(null, null).
Fix issue with VersionSet#getOverlappingInputs(int level, InternalKey begin, InternalKey end)
that could return wrong file set when used with level = 0 (ranges may overlap in level 0).
Fix issue #85 correcting Level$findFile(InternalKey) by enabling to return index
greater that last possible index.

Fix SnapshotSeekingIterator

Fix SnapshotIterator that could, in some case, return deleted entries.
A simple test to show this incorrect behavior would be:

DbStringWrapper db = new DbStringWrapper(new Options().compressionType(NONE), databaseDir);
db.put("b", "v");
db.delete("b");
db.delete("a");
assertEquals(Iterators.toString(db.iterator()), "[]");

Failing with:

java.lang.AssertionError: expected [[]] but found [[a=]]
Expected :[]
Actual   :[a=]

Other notes

DbimplTest was modified to enable to run all test using distinct configuration set
and more tests where ported and created.

DbBenchmark was modified to enable concurrent benchmark and set more configurations. Added histogram support.

# Conflicts:
#	leveldb/src/main/java/org/iq80/leveldb/impl/TableCache.java
… and reduce code duplication.

Same more work can be done to enable one to externally change file access implementation.
# Conflicts:
#	.travis.yml
#	leveldb-benchmark/src/main/java/org/iq80/leveldb/benchmark/DbBenchmark.java
#	leveldb/src/main/java/org/iq80/leveldb/impl/DbImpl.java
#	leveldb/src/main/java/org/iq80/leveldb/impl/FileChannelLogWriter.java
#	leveldb/src/main/java/org/iq80/leveldb/impl/Level.java
#	leveldb/src/main/java/org/iq80/leveldb/impl/Level0.java
#	leveldb/src/main/java/org/iq80/leveldb/impl/MMapLogWriter.java
#	leveldb/src/main/java/org/iq80/leveldb/impl/TableCache.java
#	leveldb/src/main/java/org/iq80/leveldb/impl/Version.java
#	leveldb/src/main/java/org/iq80/leveldb/impl/WriteBatchImpl.java
#	leveldb/src/main/java/org/iq80/leveldb/table/FileChannelTable.java
#	leveldb/src/main/java/org/iq80/leveldb/table/MMapTable.java
#	leveldb/src/main/java/org/iq80/leveldb/table/Table.java
#	leveldb/src/main/java/org/iq80/leveldb/table/TableBuilder.java
#	leveldb/src/test/java/org/iq80/leveldb/table/TableTest.java
SnapshotSeekingIterator could return deleted entries (it should't) depending on the order they where disposed in source iterator.
File selection in compaction process was not collecting all overlapping files in level 0. This could cause the db to forget that one entry was deleted after a compaction (tombstone removed, but not the earlier entry).
…ompaction as complete without executing it.

A ManualCompaction could be notified as complete, but not executed, if it was set during the execution of an other compaction execution work.
Track snapshots number (and not full version). This change enable to release "version" instance and compaction operation to only delete entries older that older snapshot. This change fix issue dain#48.
@pcmind
Copy link
Contributor Author

pcmind commented Jan 10, 2018

If you have interest in this merge contact me.

@pcmind pcmind closed this Jan 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants