0.21.0
We are excited to announce the release of Dolt 0.21.0. This release contains a host of exciting features, as well as our usual blend of bug fixes and performance improvements.
Squash merge
While working on our own internal data collaboration projects, we realized that a squash command, which condenses a series of changes into a single change set as a courtesy to collaborators, is an essential tool. This is now in Dolt.
NFS Mounted Drives
A user highlighted that Dolt didn't work with NFS mounted drives due to the way it was interacting with the filesystem. We have now fixed this.
Garbage Collection
We now have a dolt gc command for cleaning up unreferenced data. This was requested by several users as a space-saving mechanism in production settings.
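To illustrate the idea behind reclaiming unreferenced data, here is a minimal Go sketch of a copying collector: it walks a chunk graph from a root and copies only what it reaches into a new store, which is the approach PR 928 in the list below describes. The Chunk and Store types and the CopyReachable function are hypothetical stand-ins invented for this example, not Dolt's actual types or API.

```go
package main

import (
	"fmt"
	"sort"
)

// Chunk is a hypothetical stand-in for a stored chunk: an address plus the
// addresses of the chunks it references.
type Chunk struct {
	Addr string
	Refs []string
}

// Store is a hypothetical in-memory chunk store keyed by address.
type Store map[string]Chunk

// CopyReachable walks the chunk graph from root and copies every chunk it
// reaches into a fresh store. Chunks that are never reached are left behind.
func CopyReachable(src Store, root string) Store {
	dst := Store{}
	stack := []string{root}
	for len(stack) > 0 {
		addr := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if _, done := dst[addr]; done {
			continue // already copied
		}
		c, ok := src[addr]
		if !ok {
			continue // dangling reference; a real store would report an error
		}
		dst[addr] = c
		stack = append(stack, c.Refs...)
	}
	return dst
}

func main() {
	src := Store{
		"root":   {Addr: "root", Refs: []string{"a", "b"}},
		"a":      {Addr: "a", Refs: []string{"c"}},
		"b":      {Addr: "b"},
		"c":      {Addr: "c"},
		"orphan": {Addr: "orphan", Refs: []string{"c"}},
	}
	dst := CopyReachable(src, "root")

	kept := make([]string, 0, len(dst))
	for addr := range dst {
		kept = append(kept, addr)
	}
	sort.Strings(kept)
	fmt.Println(kept) // prints [a b c root]; "orphan" is not copied
}
```

Copying into a new store, rather than deleting in place, means the old data stays intact until the new set is complete. The sketch is single-threaded and ignores concurrent writers entirely.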
Performance Improvements
We continue to aggressively pursue performance improvements, most notably a huge improvement in full table scans.
sysbench tooling
As we detailed in a blog post yesterday, we have created tooling that gives our development team and contributors a simple way to measure SQL performance. For example, to compare an arbitrary commit to the current working set (to test whether changes introduce the expected performance benefits):
$ ./run_benchmarks.sh bulk_insert <username> 19f9e571d033374ceae2ad5c277b9cfe905cdd66
This will build Dolt at the appropriate commits, spin up containers with sysbench, and execute the benchmarks.
Documentation Fixes
An open source contributor provided several fixes to our CLI documentation, which we have gratefully merged.
GCP Remotes
We have fixed Google Cloud Platform remotes, prompted by a bug report from a user experimenting with Dolt.
Merged PRs
- 930: Bump go-mysql-server
- 929: store/types: value_store.go: GC implementation uses errgroup instead of atomicerr.
- 928: gc chunks
Implements garbage collection by traversing a Database from its root chunk and copying all reachable chunks to a new set of NBS tables.
While "garbage collection generation" will protect the NBS from corruption by out-of-process writers, GC is not currently thread safe for concurrent use in-process. Getting to online GC will require work around protecting in-progress writes that are not yet reachable from the root chunk.
- 927: /.github/workflows/ci-bats-tests.yaml: skip aws tests if no secrets found
- 925: benchmark tools
- 923: doc corrections
fixed some typos (I think 😊)
- 922: go/util/sremotesrv: grpc.go: Echo the client's NbsVersion in GetRepoMetadata.
- 921: fix gcp remotes
- 920: go/go.mod: Adopt dolthub/fslock fork. Forked version uses Open(RDRW) for lock file on *nix, which works on NFS.
- 918: /.github/workflows/ci-bats-tests.yaml: remove deprecated syntax
- 917: Increase maximum SQL statement length to 100MB (initially 512K)
Signed-off-by: Zach Musgrave [email protected]
- 915: Daylon's suggestions for bheni perf PR Pt. 2
- 914: Fix for reading old dolt_schemas
- 913: squash merge
- 912: go/store/{datas,nbs}: Use application-level temp dir for byte sink chunk files with datas.Puller.
- 911: Daylon's suggestions for bheni perf PR
- 910: Adding "Garbage Collection Generation" to manifest file
This new manifest field will support NomsBlockStore garbage collection and protect against NBS corruption. Storing gcGen in the manifest will support deleting chunks from an NBS in a safe way. NBS instances that see a different gcGen than they saw when they last read the manifest will error and require clients to re-attempt their write.
NBS will now have three forms of write errors (not including IO errors or other kinds of unexpected errors):
  - nbs.errOptimisticLockFailedTables: Another writer landed a manifest update since the last time we read the manifest. The root chunk is unchanged and the set of chunks referenced in the manifest is either the same or has strictly grown. Therefore the NBS can handle this by rebasing on the new set of tables in the manifest and re-attempting to add the same set of novel tables.
  - nbs.errOptimisticLockFailedRoot: Another writer landed a manifest update that includes a new root chunk. The set of chunks referenced in the manifest is either the same or has strictly grown, but it is not known which chunks are reachable from the new root chunk. The NBS has to pass this value to the client and let them decide. If the client is a datas.database it will attempt to rebase, read the head of the dataset it is committing to, and execute its mergePolicy (Dolt passes a noop mergePolicy).
  - chunks.ErrGCGenerationExpired: This is similar to a moved root chunk, but with no guarantees about what chunks remain in the ChunkStore. Any information from CS.Has(ctx, chunk) is now stale. Writers must rewrite all data to the chunkstore.
- 909: use tr to lowercase output instead of {output,,}
lowercasing via parameter expansion ${output,,} is only supported on Bash 4+. I switched to using tr so I could run the tests locally.
- 205: Implemented drop trigger
As discussed, we disallow dropping any triggers that are referenced in other triggers.
- 204: Added trigger statements