Rename to sdic + added more docs before release
lra committed Jun 6, 2016
1 parent e008048 commit cd5d3e3
Showing 10 changed files with 122 additions and 28 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -2,5 +2,5 @@
*.pyc

# Generated by make release
/sql_data_integrity_checker.egg-info/
/sdic.egg-info/
/dist/
2 changes: 1 addition & 1 deletion Makefile
@@ -9,7 +9,7 @@ test:

clean:
rm -rf dist/
rm -rf sql_data_integrity_checker.egg-info
rm -rf sdic.egg-info

release: clean
python setup.py sdist
114 changes: 101 additions & 13 deletions README.md
@@ -1,42 +1,130 @@
# sql-data-integrity-checker
# sdic

[![CircleCI](https://circleci.com/gh/percolate/sql-data-integrity-checker.svg?style=svg)](https://circleci.com/gh/percolate/sql-data-integrity-checker)
[![codecov](https://codecov.io/gh/percolate/sql-data-integrity-checker/branch/master/graph/badge.svg)](https://codecov.io/gh/percolate/sql-data-integrity-checker)
__A.K.A. SQL Data Integrity Checker__

Asynchronous soft constraints executed against your databases.
Queries that are intended to be run here should produce `bad data`,
or data that should not be in the table that is the object of the query.
[![CircleCI](https://circleci.com/gh/percolate/sdic.svg?style=svg)](https://circleci.com/gh/percolate/sdic)
[![codecov](https://codecov.io/gh/percolate/sdic/branch/master/graph/badge.svg)](https://codecov.io/gh/percolate/sdic)

## One line purpose

`sdic` executes all the SQL queries found in a folder and displays their output.

## More detailed purpose

In any RDBMS, you can set constraints to prevent the application from saving
inconsistent data. E.g. if you want all your users to have an
email, you can set the email column to `NOT NULL`.

This works for simple constraints:

1. It's easy to implement
1. It's cheap for the database to check on every change

But more complex constraints can be either very
expensive to check on every write, or even impossible to express as a
constraint.

With `sdic`, you can write your complex constraints as simple queries, and have
the database run them asynchronously at the frequency you want.

We call them "soft constraints".

## Example

Let's say that you have a `users` table, defined like this:

- `id` Primary Key `NOT NULL`
- `firstname` `NULL`
- `lastname` `NULL`
- `email` `NOT NULL`

Now, let's suppose your application allows users to register with just their
`email` and fill in their `firstname` and `lastname` later on, but we don't
want our users to have only a firstname or only a lastname.

Simply put, our constraint is: make sure every user has either both a `firstname`
and a `lastname` set, or both set to NULL.

With `sdic`, you can add this `enforce_fullname.sql` file and let `sdic` check
nightly that every user complies.

```sql
-- Make sure every user with a name has both a firstname and a lastname
SELECT id, firstname, lastname
FROM users
WHERE
(firstname IS NULL AND lastname IS NOT NULL) OR
(firstname IS NOT NULL AND lastname IS NULL)
LIMIT 10
;
```
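
To see what this query flags, here is a minimal Python sketch that runs it against a throwaway in-memory SQLite table. The sample rows, the `find_bad_rows` helper, the added `ORDER BY`, and the use of `sqlite3` are all assumptions made for illustration; this is not how `sdic` itself connects to your servers.

```python
import sqlite3

# Same soft constraint as enforce_fullname.sql.
# ORDER BY is added here only to make the demo output deterministic.
QUERY = """
SELECT id, firstname, lastname
FROM users
WHERE
    (firstname IS NULL AND lastname IS NOT NULL) OR
    (firstname IS NOT NULL AND lastname IS NULL)
ORDER BY id
LIMIT 10
"""


def find_bad_rows():
    """Build a sample users table (hypothetical data) and return the offending rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users ("
        "id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT, email TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO users (id, firstname, lastname, email) VALUES (?, ?, ?, ?)",
        [
            (1, "Ada", "Lovelace", "ada@example.com"),   # both set: fine
            (2, None, None, "anon@example.com"),         # both NULL: fine
            (3, "Grace", None, "grace@example.com"),     # only firstname: bad
            (4, None, "Turing", "alan@example.com"),     # only lastname: bad
        ],
    )
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in find_bad_rows():
        print(row)
```

Only users 3 and 4 come back: a soft constraint query returns the rows that violate the rule, so an empty result means the data is clean.
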

Put this file in `your-environment/your-server/enforce_fullname.sql`.

Edit the `your-environment/servers.ini` file to tell sdic how to connect to
your server.

Now run `sdic your-environment` and it will output any users that do not comply
with your soft constraint.

You can have as many soft constraints on as many servers and as many
environments as you need.

## Install as a cron

If you want to get an email every night listing all the soft
constraints that have been broken during the last day, just add `sdic` to your
crontab. We like to have it run daily, so we can fix any bug generating bad
data before it becomes a real problem.

Example crontab:
```
MAILTO="[email protected]"
@daily sdic live
```

`[email protected]` is the address that will receive the list of broken soft
constraints every day. Make sure the local MTA on your system is properly
configured. You can test it by running `date | mail -s test [email protected]`.

## Databases supported

Any database supported by [SQLAlchemy](http://www.sqlalchemy.org/) should be
supported, including [PostgreSQL](https://www.postgresql.org/) and
[MySQL](https://www.mysql.com/).

## Install

`pip install sql-data-integrity-checker`
`pip install sdic`

## Configuration

An example configuration is given in the `example-environment` folder.

The script reads from a designated folder, whose path you pass as an argument.
This folder should consist of the following:

1. A `servers.ini` file, which contains the Database URL/s (see`examples` folder)

1. A `servers.ini` file, which contains the Database URLs (see the
`example-environment` folder)
1. A sub-folder, which contains the actual queries in a `.sql` file format

## Usage

A `directory` argument is mandatory:

`sql-data-integrity-checker path/to/your/folder`
`sdic path/to/your/folder`

If you have more than one server in a folder but want to run
only one of them, you can also pass an optional `server` argument:

`sql-data-integrity-checker path/to/your/folder server1`
`sdic path/to/your/folder server1`
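
The folder layout and the optional `server` filter can be pictured with a short Python sketch. The helpers `list_queries` and `build_demo_environment` are hypothetical names introduced here, and the code only illustrates the directory convention described above; it is not taken from `sdic`'s implementation, and the `servers.ini` content is left empty because its format is not shown in this document.

```python
import tempfile
from pathlib import Path


def list_queries(environment, server=None):
    """Map each server sub-folder to the sorted names of its .sql files.

    If `server` is given, only that sub-folder is considered.
    """
    queries = {}
    for server_dir in sorted(p for p in Path(environment).iterdir() if p.is_dir()):
        if server is not None and server_dir.name != server:
            continue
        queries[server_dir.name] = sorted(f.name for f in server_dir.glob("*.sql"))
    return queries


def build_demo_environment(root):
    """Create a throwaway environment: servers.ini plus one sub-folder per server."""
    env = Path(root) / "your-environment"
    for server_name, file_name in [
        ("server1", "enforce_fullname.sql"),
        ("server2", "test_query.sql"),
    ]:
        (env / server_name).mkdir(parents=True)
        (env / server_name / file_name).write_text("SELECT 1;\n")
    (env / "servers.ini").write_text("")  # placeholder only; real format not shown here
    return env


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env = build_demo_environment(tmp)
        print(list_queries(env))                    # every server
        print(list_queries(env, server="server1"))  # a single server
```
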

If a query produces an output, it will look something like this:

```bash
```
-----===== /!\ INCOMING BAD DATA /!\ =====-----
Server: circleci
Server: big-database
File: test_query.sql
SQL Query:
4 changes: 4 additions & 0 deletions example-environment/server1/test_query.sql
@@ -0,0 +1,4 @@
-- This is a query that returns current time.
-- You should never do this, but this can be used to see if sdic is actually
-- printing results.
SELECT NOW();
4 changes: 4 additions & 0 deletions example-environment/server2/test_query.sql
@@ -0,0 +1,4 @@
-- This is a query that returns number 1.
-- You should never do this, but this can be used to see if sdic is actually
-- printing results.
SELECT 1;
File renamed without changes.
2 changes: 0 additions & 2 deletions examples/server1/test_query.sql

This file was deleted.

2 changes: 0 additions & 2 deletions examples/server2/test_query.sql

This file was deleted.

10 changes: 6 additions & 4 deletions sdic/main.py
@@ -1,13 +1,15 @@
#!/usr/bin/env python
"""sql-data-integrity-checker
"""sdic
A.K.A. SQL Data Integrity Checker
Asynchronous soft constraints executed against your databases.
The path to your queries and servers.ini files should be defined as an arg.
Optionally, declare a single server if you have multiple ones
in a directory, but want to only run one.
Usage:
sql_data_integrity_checker <directory> [<server>]
sdic <directory> [<server>]
Options:
-h --help Show this screen.
@@ -167,7 +169,7 @@ def get_servers_from_config(directory):

def main():
args = docopt(__doc__,
version="sql-data-integrity-checker {}".format(VERSION))
version="sdic {}".format(VERSION))

# Check that the given directory exists
if not isdir(args['<directory>']):
@@ -185,7 +187,7 @@ def main():

# Everything's ok, run the main program
with lock:
syslog.openlog('data_integrity_checker')
syslog.openlog('sdic')

has_output = False
if not args['<server>']:
10 changes: 5 additions & 5 deletions setup.py
@@ -1,22 +1,22 @@
"""Setup file to automate the install of Mackup in the Python environment."""
"""Setup file to automate the install of sdic in the Python environment."""
from setuptools import setup
from sdic.constants import VERSION


setup(
name='sql-data-integrity-checker',
name='sdic',
version=VERSION,
author='Laurent Raufaste',
author_email='[email protected]',
url='https://github.com/percolate/sql-data-integrity-checker',
url='https://github.com/percolate/sdic',
description='Asynchronous soft constraints executed against your databases',
keywords='sql mysql postgresql sqlalchemy data integrity constraints',
keywords='sdic sql mysql postgresql sqlalchemy data integrity constraints',
license='GPLv3',
packages=['sdic'],
install_requires=['docopt', 'prettytable'],
entry_points={
'console_scripts': [
'sql-data-integrity-checker=sdic.main:main',
'sdic=sdic.main:main',
],
},
classifiers=[