An on-disk pythonic embedded key-value store for compressed data storage and distributed data analysis

ayaanhossain/ShareDB

ShareDB in Action | Installation | License | Contributing | Acknowledgements | API

ShareDB is a lightweight, persistent key-value store with a dictionary-like interface, built on top of LMDB. It is intended to replace a Python dictionary when

  1. the key-value information needs to persist locally for later reuse,
  2. the data needs to be shared across multiple processes with minimal overhead, and
  3. the keys and values can be (de)serialized via msgpack or pickle.

A ShareDB instance may be opened simultaneously in child processes for parallel reads, as long as only a single process writes to the instance. Parallel writes across processes are not safe: they are not guaranteed to be written, and may corrupt the instance. ShareDB is primarily developed and tested on Linux, and is compatible with Python 2.7 and with Python 3.6 and above.
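The third condition above can be checked before inserting data. The following is a minimal sketch using only the standard-library pickle module; the helper name `roundtrips_via_pickle` is hypothetical and is not part of ShareDB, and how ShareDB serializes data internally is not shown here.

```python
import pickle

def roundtrips_via_pickle(obj):
    """Hypothetical helper: True if obj survives a pickle dump/load unchanged."""
    try:
        return pickle.loads(pickle.dumps(obj)) == obj
    except Exception:
        # Unpicklable objects (e.g. lambdas) raise during dumps()
        return False

# Candidate keys/values one might want to store
candidates = {
    "list value": ["Ayaan Hossain"],
    "int key": 7,
    "lambda": lambda x: x,
}

# Map each label to whether its object is safe to serialize with pickle
results = {label: roundtrips_via_pickle(obj) for label, obj in candidates.items()}
print(results)
```

Objects that fail such a check, like the lambda here, are unlikely to be usable as keys or values in a store that relies on pickle.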

ShareDB in Action

>>> from ShareDB import ShareDB           # Easy import
>>> print(ShareDB.__version__)            # Check version
1.1.4
>>> myDB = ShareDB(path='./test.ShareDB') # Store ShareDB locally
>>> myDB['Name'] = ['Ayaan Hossain']      # Insert information
>>> myDB.get(key='Name')                  # Retrieve values
['Ayaan Hossain']
>>> # Accelerated batch insertion/update via a single transaction
>>> len(myDB.multiset(kv_iter=zip(range(0, 10), range(10, 20))).sync())
11
>>> 7 in myDB                             # Membership queries work
True
>>> myDB['non-existent key']              # KeyError on invalid get as expected
Traceback (most recent call last):
...
KeyError: "key=non-existent key of <class 'str'> is absent"
>>> myDB.pop(7)                           # Pop a key just like a dictionary
17
>>> list(myDB.multipopitem(num_items=5))  # Or, pop as many items as you need
[(0, 10), (1, 11), (2, 12), (3, 13), (4, 14)]
>>> myDB.remove(5).remove(6).length()     # Chain removal of several keys
3
>>> myDB.clear().length()                 # Or, clear entire ShareDB
0
>>> myDB.drop()                           # Close/delete when you're done
True

ShareDB methods either return the data or result of a query, or return self to facilitate method chaining. The terminal methods .close() and .drop() return a boolean indicating success.
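The return convention described above can be illustrated with a toy class. This is only a sketch of the pattern: `ChainableStore` is a hypothetical in-memory stand-in and does not reflect how ShareDB is implemented.

```python
class ChainableStore:
    """Toy in-memory store illustrating the chaining convention (not ShareDB)."""

    def __init__(self):
        self._data = {}

    def set(self, key, val):
        self._data[key] = val
        return self              # mutator: return self so calls can be chained

    def remove(self, key):
        self._data.pop(key, None)
        return self              # mutator: return self

    def length(self):
        return len(self._data)   # query: return the result itself

    def close(self):
        self._data = None
        return True              # terminal method: boolean indicating success

store = ChainableStore()
# Three inserts and two removals in one chain, ending with a query
remaining = store.set(5, 15).set(6, 16).set(8, 18).remove(5).remove(6).length()
print(remaining)
closed = store.close()
```

A chain ends as soon as a query or terminal method is called, because those return data or a boolean rather than the store.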

Please see the /examples/ directory for full examples of ShareDB usage. Please see the API.md file for API details.

Installation

One-shot installation/upgrade of ShareDB from PyPI.

$ pip install --upgrade ShareDB

Alternatively, clone ShareDB from GitHub,

$ git clone https://github.com/ayaanhossain/ShareDB

navigate into the repo, and install via pip.

$ cd ShareDB
$ pip install .

You can test ShareDB with pytest inside the /tests/ directory.

$ cd tests
$ pytest

Uninstallation of ShareDB is easy with pip.

$ pip uninstall ShareDB

License

ShareDB (c) 2019-2024 Ayaan Hossain.

ShareDB is open-source software released under the MIT License.

See LICENSE file for more details.

Contributing

Please discuss any issues or bugs you're facing, or any changes or features you have in mind, by opening an issue and following the Contributor Covenant. See the COC.md file for details. Please provide detailed information and code snippets to facilitate debugging.

To contribute to ShareDB, please clone this repository, commit your code on a separate new branch, and submit a pull request. Please annotate all new and modified code with detailed comments, and add new unit tests as applicable. Please ensure that modified builds pass the existing unit tests before sending pull requests. For versioning, we use SemVer.

Acknowledgements

ShareDB is maintained by:

ShareDB was originally written to meet data analysis needs in Prof. Howard Salis' Lab at Penn State University.

API

ShareDB API details can be found in the API.md file.