
Conversation

@blambov (Owner) commented Sep 19, 2014

Provides two implementations of commit log segments: one matching the
previous memory-mapped writing method for uncompressed logs, and one
that uses in-memory buffers and compresses sections between sync markers
before writing to the log. Replay is changed to decompress these sections
and track the uncompressed position so that the replay position is
correctly identified.

The compression class and parameters are specified in cassandra.yaml and
stored in the commit log descriptor, which can now vary in size.

Tested by the test-compression target, which now enables LZ4Compression
of commit logs in addition to compression for SSTables.
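The section framing described above can be sketched minimally. This is illustrative only, not Cassandra's actual CommitLogSegment code: `java.util.zip.Deflater` stands in for the pluggable compressor named in cassandra.yaml, and the frame layout (`[uncompressedLength][compressedLength][bytes]`) is an assumption for demonstration, not the on-disk format.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Illustrative sketch: mutations accumulate in an in-memory buffer, and at
// each sync the section since the previous marker is compressed and framed.
// Replay reads the frame back, decompresses it, and advances its notion of
// the uncompressed position by uncompressedLength.
public class CompressedSectionSketch {
    static byte[] compress(byte[] section) {
        Deflater deflater = new Deflater();
        deflater.setInput(section);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        while (!deflater.finished())
            out.write(chunk, 0, deflater.deflate(chunk));
        deflater.end();
        return out.toByteArray();
    }

    static ByteBuffer frameSection(byte[] section) {
        byte[] compressed = compress(section);
        ByteBuffer frame = ByteBuffer.allocate(8 + compressed.length);
        frame.putInt(section.length);    // uncompressed size, tracked by replay
        frame.putInt(compressed.length); // compressed size, locates next marker
        frame.put(compressed);
        frame.flip();
        return frame;
    }

    static byte[] replaySection(ByteBuffer frame) throws DataFormatException {
        int uncompressedLength = frame.getInt();
        byte[] compressed = new byte[frame.getInt()];
        frame.get(compressed);
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] section = new byte[uncompressedLength];
        inflater.inflate(section);
        inflater.end();
        return section;
    }
}
```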



Should this be returning VERSION_22?

@blambov (Owner, Author)

Fixed by merge.

@aweisberg

What is the test coverage, unit or otherwise for compressed/uncompressed playback? It would be good to see the new code produce the mutations out of both compressed and uncompressed segments.

I don't think this kind of performance work should be done without a microbenchmark, so we know where we stand and so the next person to work on the CL can track the impact of their changes. I would really like to see a JMH test added that evaluates the compressed and uncompressed versions for throughput, for tiny mutations as well as "normal" ones (say 1k).

When testing compression I have found that generating compressible data doesn't seem to work well. I never got the advertised compression rates out of Snappy. You might need to grab a real corpus, like some HTML or the text of Alice in Wonderland, to give the compressor something reasonable to chew on.
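The effect is easy to demonstrate. This hedged sketch uses `java.util.zip.Deflater` rather than Snappy, but the observation carries over: uniformly random bytes are essentially incompressible (deflate falls back to stored blocks and the output is slightly *larger* than the input), while repetitive English-like text compresses heavily.

```java
import java.util.Random;
import java.util.zip.Deflater;

// Illustration only: why benchmarks fed random data never see the
// advertised compression ratios, while a real(ish) text corpus does.
public class CorpusDemo {
    static int compressedSize(byte[] input) {
        Deflater d = new Deflater();
        d.setInput(input);
        d.finish();
        byte[] out = new byte[input.length * 2 + 64];
        int total = 0;
        while (!d.finished())
            total += d.deflate(out, total, out.length - total);
        d.end();
        return total;
    }

    public static void main(String[] args) {
        byte[] random = new byte[64 * 1024];
        new Random(42).nextBytes(random);

        StringBuilder sb = new StringBuilder();
        while (sb.length() < 64 * 1024)
            sb.append("Alice was beginning to get very tired of sitting by her sister on the bank. ");
        byte[] text = sb.substring(0, 64 * 1024).getBytes();

        System.out.println("random: " + compressedSize(random) + " / " + random.length);
        System.out.println("text:   " + compressedSize(text) + " / " + text.length);
    }
}
```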

I have some concern about waiting to do compression until the end. It's fine in the common case and configuration, but there are situations where you want to start the flush immediately and it would be nice if it compressed and wrote a few hundred kilobytes at a time. You haven't taken away the mapped option so latency sensitive users can opt for that and it can be enhanced later if there is demand.
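That incremental alternative can be sketched as follows. This is a hedged illustration, not a proposed implementation: `DeflaterOutputStream` in syncFlush mode stands in for a chunk-at-a-time compressor, the in-memory sink stands in for the log file, and the 256 KB chunk size is arbitrary.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.DeflaterOutputStream;

// Hedged sketch: rather than buffering the whole section and compressing at
// sync time, feed the compressor a chunk at a time so writing (and the
// eventual flush) can begin before the section is complete.
public class IncrementalCompression {
    static final int CHUNK = 256 * 1024; // illustrative chunk size

    static byte[] compressInChunks(byte[] section) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for the log file
        // syncFlush=true makes flush() emit compressed output for all input
        // consumed so far, instead of holding it inside the Deflater
        try (DeflaterOutputStream out = new DeflaterOutputStream(sink, true)) {
            for (int off = 0; off < section.length; off += CHUNK) {
                out.write(section, off, Math.min(CHUNK, section.length - off));
                out.flush(); // this chunk's compressed bytes reach the sink now
            }
        }
        return sink.toByteArray();
    }
}
```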

Compression is single threaded, and any time spent in force() can't also be used to do compression. This can really cut into compression throughput, and it is a function of the latency of force(), which varies. With synchronous commits it can have a real impact.

The ICompressor interface and some of the implementations are also a little limiting. You should be able to compress the data directly into the memory mapped file; snappy-java supports that, although LZ4 doesn't appear to. It also makes more sense to compress and then checksum. There is less data to checksum after compression, and another advantage of getting the order right is that you can use a JNI call to a native checksum implementation, which will be much faster. You can use a larger checksum like murmur3 over each compressed block. The checksum will be better than a 4-byte checksum, but use less space overall.

Yet another advantage is that Murmur3 doesn't have a lookup table evicting random cache lines (on every core) as part of every single mutation that flows through C*.
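The compress-then-checksum ordering can be sketched briefly. CRC32 is used here purely for illustration (the comment above argues for something like murmur3 via JNI instead), and `Deflater` stands in for the real compressor; the point is only the ordering.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.CRC32;
import java.util.zip.Deflater;

// Hedged sketch of the ordering argument: compress first, then checksum the
// compressed block. Fewer bytes are hashed, and the checksum covers exactly
// the bytes that land on disk.
public class CompressThenChecksum {
    static byte[] compress(byte[] input) {
        Deflater d = new Deflater();
        d.setInput(input);
        d.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!d.finished())
            out.write(buf, 0, d.deflate(buf));
        d.end();
        return out.toByteArray();
    }

    // Checksum runs over the (smaller) compressed block, not the raw section.
    static long checksumBlock(byte[] compressedBlock) {
        CRC32 crc = new CRC32();
        crc.update(compressedBlock, 0, compressedBlock.length);
        return crc.getValue();
    }
}
```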

I only bring this stuff up because this is a format change and getting it all done at once might make the amount of multi-version support code that has to be written smaller as opposed to making these changes incrementally across multiple versions.

@belliottsmith

I'm not really sure JMH is the right tool for disk bound workloads?

To add to Ariel's excellent points, another thing to consider is recycling the ByteBuffer we use, and potentially also using a DirectByteBuffer. As it stands this will create a non-trivial amount of garbage that may live long enough to be promoted (what with being disk bound).

@aweisberg

JMH certainly doesn't solve the hard problems of disk bound workloads. It doesn't hurt either, and it has some support for parameterizing and reporting. I didn't know about CommitLogStress when I asked about JMH; we don't need to build something new if something is already in place.

I thought about the ByteBuffer pooling and lack of DBB. ICompressor doesn't accept a DBB for input or output. The allocation is large, so it will go straight into the old generation. C* runs CMS pretty regularly, but maybe not that regularly. It might make sense to pool two of them and, if any more are needed beyond that, to allocate and discard. Since the allocation is CL segment size, it feels like a lot of memory to pin for one purpose.
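That pooling scheme could look roughly like this. All names and sizes here are illustrative, not Cassandra's: at most two buffers stay cached (and pinned); acquisitions beyond that are plain allocations that get dropped on release and left to the garbage collector.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hedged sketch of a cap-at-two buffer pool. The size check before offer()
// is check-then-act, so the cap is approximate under contention, which is
// fine for a best-effort bound on pinned memory.
public class SegmentBufferPool {
    private static final int MAX_POOLED = 2;
    private final ConcurrentLinkedQueue<ByteBuffer> pool = new ConcurrentLinkedQueue<>();
    private final int bufferSize;

    public SegmentBufferPool(int bufferSize) {
        this.bufferSize = bufferSize; // would be CL segment size in practice
    }

    public ByteBuffer acquire() {
        ByteBuffer buffer = pool.poll();
        return buffer != null ? buffer : ByteBuffer.allocateDirect(bufferSize);
    }

    public void release(ByteBuffer buffer) {
        buffer.clear();
        if (pool.size() < MAX_POOLED)
            pool.offer(buffer); // otherwise drop it and let GC reclaim it
    }
}
```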

@belliottsmith

CommitLogStress is not very good, but JMH is architected around microbenchmarks of very small operations, which this doesn't really fit.

We now support DBB in LZ4, so there's no reason to not roll it out here. Pooling as many as we need makes sense to me, no reason to constrain ourselves to just 2? Perhaps periodically prune if instances aren't being used. We're currently pinning as much memory as we're ahead of disk by, although admittedly here we only need it so long as we're writing to the page cache. Which is perhaps another optimisation - the buffer should be recycled/discarded as soon as we've compressed its contents.

@iamaleksey

Can you copy those last four comments to JIRA, for future archeologists' pleasure?

blambov pushed a commit that referenced this pull request May 1, 2015
…over case

Patch by jmckenzie; reviewed by branimir for CASSANDRA-9104
blambov pushed a commit that referenced this pull request Jun 13, 2022
The failure detector is now configurable via cassandra.custom_failure_detector_class

(cherry picked from commit 8127d43)

# The commit message #2 will be skipped:

# STAR-842 - fix up
(cherry picked from commit d9625b6)

4 participants