This module creates the persistent storage for a Versioned Hybrid Store (VHS).
The VHS is a combination of a DynamoDB table and an S3 bucket. When we store large values, we write:
- The value as an object in S3, and
- A pointer to the S3 object as an item in DynamoDB
This is a hybrid of two data stores, and the DynamoDB pointers record the version of the value -- thus, Versioned Hybrid Store.
This is Amazon's recommended approach for storing large values in DynamoDB:
"If your application needs to store more data in an item than the DynamoDB size limit permits, you can try compressing one or more large attributes, or you can store them as an object in Amazon Simple Storage Service (Amazon S3) and store the Amazon S3 object identifier in your DynamoDB item."
This module creates the DynamoDB table and S3 bucket for this purpose. It also creates some IAM policies that can be used to read or modify the VHS.
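The write and read paths described above can be sketched in a few lines. This is a minimal illustration, not the real implementation: the two dictionaries stand in for the S3 bucket and the DynamoDB table, and the function names, the pointer fields, and the object key layout are all assumptions made for the example.

```python
import hashlib

# In-memory stand-ins for the two AWS services (assumptions for illustration).
s3_bucket = {}     # object key -> bytes
dynamo_table = {}  # id -> pointer item


def put(id, version, value):
    """Write the value to "S3", then record a pointer to it in "DynamoDB"."""
    # Hypothetical key layout: one object per (id, version, content hash).
    digest = hashlib.sha256(value).hexdigest()
    key = f"{id}/{version}/{digest}"
    s3_bucket[key] = value
    # The pointer item is tiny, however large the value itself is.
    dynamo_table[id] = {"id": id, "version": version, "key": key}


def get(id):
    """Follow the pointer in "DynamoDB" to the object in "S3"."""
    pointer = dynamo_table[id]
    return pointer["version"], s3_bucket[pointer["key"]]


large_value = b"x" * 500_000  # bigger than a single DynamoDB item may be
put("b1234", 1, large_value)
version, value = get("b1234")
```

In this sketch the object is written before the pointer, so a reader never follows a pointer to an object that has not been stored yet.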
Why not use something else?
- Just DynamoDB? Items in DynamoDB have a maximum size of 400 KB, which is too small for our stores.
- Just DynamoDB, but with compressed items? This only delays the issue -- we'd still hit the 400 KB limit, but later.
- Just S3? DynamoDB supports conditional writes to ensure updates are ordered correctly; S3 does not. We need correct ordering, so S3 wouldn't work for us.
- Another database technology? Plenty of databases have larger per-item limits and conditional writes (e.g. Cassandra, MySQL, Aurora), but we were already using DynamoDB and S3. It's easier for us to use a few databases that we understand well than to use many databases that we understand only a little.
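To illustrate why conditional writes matter for ordering, here is a small sketch of the rule "only accept a pointer that is newer than the one already stored". It is an in-memory model only: the `put_if_newer` function, the `ConditionalCheckFailed` exception, and the object keys are hypothetical names for this example. In DynamoDB the equivalent check is evaluated by the service as part of the write itself, which is what makes it safe under concurrent writers; this sketch does not reproduce that atomicity.

```python
class ConditionalCheckFailed(Exception):
    """Raised when a write would replace a newer version with an older one."""


table = {}  # id -> {"version": int, "key": str}; stand-in for a DynamoDB table


def put_if_newer(id, version, key):
    """Store the pointer only if it is newer than the one already stored."""
    existing = table.get(id)
    if existing is not None and existing["version"] >= version:
        raise ConditionalCheckFailed(
            f"{id}: stored version {existing['version']} >= {version}"
        )
    table[id] = {"version": version, "key": key}


put_if_newer("b1234", 2, "b1234/2.json")

# A delayed write of an older version arrives late and is rejected.
try:
    put_if_newer("b1234", 1, "b1234/1.json")
    stale_write_rejected = False
except ConditionalCheckFailed:
    stale_write_rejected = True
```

Without the condition, the late write of version 1 would silently overwrite the pointer to version 2.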
There are two variants of the VHS:
- The `single-version-store` tracks a single version of each value. The DynamoDB table only keeps the pointer to the newest object in S3. (The table has a hash key `id`.)
- The `multi-version-store` tracks every version of a value. The DynamoDB table keeps the pointer to every object in S3. (The table has a hash key `id` and a range key `version`.)
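The difference between the two key schemas can be seen in a short sketch. The dictionaries below are stand-ins for the two DynamoDB tables, and `record_pointer` and the object keys are hypothetical names for this example; only the key names `id` and `version` come from the tables themselves.

```python
# Stand-ins for the two table layouts (plain dicts, not real DynamoDB tables).
single_version_table = {}  # hash key: id
multi_version_table = {}   # hash key: id, range key: version


def record_pointer(id, version, key):
    # Single-version: one item per id, so a new pointer replaces the old one.
    single_version_table[id] = {"version": version, "key": key}
    # Multi-version: one item per (id, version), so older pointers are kept.
    multi_version_table[(id, version)] = {"key": key}


record_pointer("b1234", 1, "b1234/1.json")
record_pointer("b1234", 2, "b1234/2.json")

versions_kept = sorted(v for (i, v) in multi_version_table if i == "b1234")
```

After both writes, the single-version table holds only the pointer to version 2, while the multi-version table still holds pointers to versions 1 and 2.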
We have Scala classes for interacting with a VHS in our shared Scala libraries.