Commit 0f17d50

Merge pull request #361 from diffix/edon/docs
Docs cleanups
2 parents 11413cc + 761b3f6 commit 0f17d50

4 files changed: +80 -71 lines changed

README.md

Lines changed: 38 additions & 18 deletions
@@ -1,15 +1,13 @@
-# Important notice
-
-This is a pre-release version of the extension and is not intended for general use yet.
-It may be unstable and documentation is limited.
-If you have any questions, please contact us at [[email protected]](mailto:[email protected]).
-
 # PG Diffix
 
 `pg_diffix` is a PostgreSQL extension for strong dynamic anonymization. It ensures that answers to simple SQL queries are anonymous. For more information, visit the [Open Diffix](https://www.open-diffix.org/) website.
 
-Check out the [Admin Tutorial](docs/admin_tutorial.md) for an example on how to set up `pg_diffix`.
-See the [Admin Guide](docs/admin_guide.md) for details on configuring and using the extension.
+**For administrators:** Check out the [admin tutorial](docs/admin_tutorial.md) for an example on how to set up `pg_diffix`.
+See the [admin guide](docs/admin_guide.md) for details on configuring and using the extension.
+To install from source, see the [installation](#installation) section.
+
+**For analysts:** The [banking notebook](docs/banking.ipynb) provides example queries against a real dataset.
+The [analyst guide](docs/analyst_guide.md) describes the SQL features and limitations imposed by `pg_diffix`.
 
 ## Installation
 
@@ -34,7 +32,9 @@ every session start for restricted users. This can be accomplished by configurin
 For example, to automatically load the `pg_diffix` extension for all users connecting to a database,
 you can execute the following command:
 
-`ALTER DATABASE db_name SET session_preload_libraries TO 'pg_diffix';`
+```
+ALTER DATABASE db_name SET session_preload_libraries TO 'pg_diffix';
+```
 
 Once loaded, the extension logs information to `/var/log/postgresql/postgresql-13-main.log` or equivalent.
 
@@ -48,7 +48,9 @@ You might also need to remove the extension from the list of preloaded libraries
 
 For example, to reset the list of preloaded libraries for a database, you can execute the following command:
 
-`ALTER DATABASE db_name SET session_preload_libraries TO DEFAULT;`
+```
+ALTER DATABASE db_name SET session_preload_libraries TO DEFAULT;
+```
 
 ## Testing the extension
 
@@ -67,7 +69,10 @@ or if available, just make your usual PostgreSQL user a `SUPERUSER`.
 
 Or you can use the [PGXN Extension Build and Test Tools](https://github.com/pgxn/docker-pgxn-tools) Docker image:
 
-`docker run -it --rm --mount "type=bind,src=$(pwd),dst=/repo" pgxn/pgxn-tools sh -c 'cd /repo && apt update && apt install -y jq && pg-start 13 && pg-build-test'`.
+```
+docker run -it --rm --mount "type=bind,src=$(pwd),dst=/repo" pgxn/pgxn-tools sh -c \
+'cd /repo && apt update && apt install -y jq && pg-start 13 && pg-build-test'
+```
 
 ## Docker images
 
@@ -82,15 +87,21 @@ The example below shows how to build the image and run a minimally configured co
 
 Build the image:
 
-`make image`
+```
+make image
+```
 
 Run the container in foreground and expose in port 10432:
 
-`docker run --rm --name pg_diffix -e POSTGRES_PASSWORD=postgres -p 10432:5432 pg_diffix`
+```
+docker run --rm --name pg_diffix -e POSTGRES_PASSWORD=postgres -p 10432:5432 pg_diffix
+```
 
 From another shell you can connect to the container via `psql`:
 
-`psql -h localhost -p 10432 -d postgres -U postgres`
+```
+psql -h localhost -p 10432 -d postgres -U postgres
+```
 
 For more advanced usage see the [official image reference](https://hub.docker.com/_/postgres).
 
@@ -108,16 +119,25 @@ Three users are created, all of them with password `demo`:
 
 Build the image:
 
-`make demo-image`
+```
+make demo-image
+```
 
 Run the container in foreground and expose in port 10432:
 
-`docker run --rm --name pg_diffix_demo -e POSTGRES_PASSWORD=postgres -e BANKING_PASSWORD=demo -p 10432:5432 pg_diffix_demo`
+```
+docker run --rm --name pg_diffix_demo -e POSTGRES_PASSWORD=postgres -e BANKING_PASSWORD=demo -p 10432:5432 pg_diffix_demo
+```
 
 Connect to the banking database (from another shell) for anonymized access:
 
-`psql -h localhost -p 10432 -d banking -U trusted_user`
+```
+psql -h localhost -p 10432 -d banking -U trusted_user
+```
 
 To keep the container running you can start it in detached mode and with a restart policy:
 
-`docker run -d --name pg_diffix_demo --restart unless-stopped -e POSTGRES_PASSWORD=postgres -e BANKING_PASSWORD=demo -p 10432:5432 pg_diffix_demo`
+```
+docker run -d --name pg_diffix_demo --restart unless-stopped \
+-e POSTGRES_PASSWORD=postgres -e BANKING_PASSWORD=demo -p 10432:5432 pg_diffix_demo
+```

docs/admin_guide.md

Lines changed: 5 additions & 21 deletions
@@ -1,9 +1,3 @@
-# Important notice
-
-This is a pre-release version of the extension and is not intended for general use yet.
-It may be unstable and documentation is limited.
-If you have any questions, please contact us at [[email protected]](mailto:[email protected]).
-
 # Configuration
 
 This document provides detailed information about the configuration, behavior and recommended usage of `pg_diffix`.
@@ -42,7 +36,7 @@ Trusted users have fewer SQL restrictions than untrusted users, and therefore ha
 
 For example, the command to assign the access level `anonymized_untrusted` to the role `public_access` is:
 
-```SQL
+```
 CALL diffix.mark_role('public_access', 'anonymized_untrusted');
 ```
 
@@ -75,12 +69,12 @@ __NOTE:__ if AID columns are not correctly labeled, the extension may fail to an
 The procedure `diffix.mark_personal(table_name, aid_columns...)` is used to label a table as personal and
 to label its AID columns. For example:
 
-```SQL
+```
 CALL diffix.mark_personal('employee_info', 'employee_id');
 ```
 labels the table `employee_info` as personal, and labels the `employee_id` column as an AID column.
 
-```SQL
+```
 CALL diffix.mark_personal('transactions', 'sender_acct', 'receiver_acct');
 ```
 labels the table `transactions` as personal, and labels the `sender_acct` and `receiver_acct` columns as AID columns.
@@ -158,17 +152,7 @@ Default value is `*`. Any user can change this setting.
 
 ## Restricted features and extensions
 
-**TODO:** I think this kind of information is better put in the notebook tutorial? Or if you want it here it seems incomplete or something. Needs work...
-
-For users other than `direct`, various data and features built into PostgreSQL are restricted. Among others:
-
-1. Issue utility statements like `COPY` and `ALTER TABLE`, beside a few allowlisted ones, are not allowed.
-2. Some of the data in `pg_catalog` tables like `pg_user_functions` is not accessible.
-3. Selected subset of less frequently used PostgreSQL query features like `EXISTS` or `NULLIF` are disabled.
-4. Inheritance involving a personal table is not allowed.
-5. Some of the output of `EXPLAIN` for queries involving a personal table is censored.
-
-**NOTE** If any of the currently blocked features is necessary for your use case, open an issue and let us know.
+For a detailed description of supported SQL features and restrictions, see the [analyst guide](analyst_guide.md).
 
 Row level security (RLS) can be enabled and used on personal tables.
 It is advised that the active policies are vetted from the point of view of anonymity.
@@ -192,7 +176,7 @@ Given that AIDs may not be perfect, some care must be taken in the selection of
 
 For example, imagine the following query in a table where `account_number` is the AID column:
 
-```sql
+```
 SELECT last_name, religion, count(*)
 FROM table
 GROUP BY last_name, religion

docs/admin_tutorial.md

Lines changed: 37 additions & 25 deletions
@@ -1,9 +1,3 @@
-# Important notice
-
-This is a pre-release version of the extension and is not intended for general use yet.
-It may be unstable and documentation is limited.
-If you have any questions, please contact us at [[email protected]](mailto:[email protected]).
-
 # Admin tutorial
 
 This document provides an example on how to install and configure `pg_diffix` to expose a simple dataset
@@ -14,52 +8,70 @@ containing a column named `id`, which uniquely identifies protected entities (th
 
 ## Installation
 
-1. Install the packages required for building the extension:
+1\. Install the packages required for building the extension:
 
-`sudo apt-get install make jq gcc postgresql-server-dev-14`
+```
+sudo apt-get install make jq gcc postgresql-server-dev-14
+```
 
-2. Install PGXN Client tools:
+2\. Install PGXN Client tools:
 
-`sudo apt-get install pgxnclient`
+```
+sudo apt-get install pgxnclient
+```
 
-3. Install the extension:
+3\. Install the extension:
 
-`sudo pgxn install pg_diffix`
+```
+sudo pgxn install pg_diffix
+```
 
 ## Activation
 
-1. Connect to the database as a superuser:
+1\. Connect to the database as a superuser:
 
-`sudo -u postgres psql test_db`
+```
+sudo -u postgres psql test_db
+```
 
-2. Activate the extension for the current database:
+2\. Activate the extension for the current database:
 
-`CREATE EXTENSION pg_diffix;`
+```
+CREATE EXTENSION pg_diffix;
+```
 
-3. Automatically load the extension for all users connecting to the database:
+3\. Automatically load the extension for all users connecting to the database:
 
-`ALTER DATABASE test_db SET session_preload_libraries TO 'pg_diffix';`
+```
+ALTER DATABASE test_db SET session_preload_libraries TO 'pg_diffix';
+```
 
 ## Configuration
 
-1. Label the test data as personal (requiring anonymization):
+1\. Label the test data as personal (requiring anonymization):
 
-`CALL diffix.mark_personal('test_table', 'id');`
+```
+CALL diffix.mark_personal('test_table', 'id');
+```
 
-2. Create an account for the analyst:
+2\. Create an account for the analyst:
 
-`CREATE USER analyst_role WITH PASSWORD 'some_password';`
+```
+CREATE USER analyst_role WITH PASSWORD 'some_password';
+```
 
-3. Give the analyst read-only access to the test database:
+3\. Give the analyst read-only access to the test database:
 
 ```
 GRANT CONNECT ON DATABASE test_db TO analyst_role;
 GRANT SELECT ON ALL TABLES IN SCHEMA public TO analyst_role;
 ```
 
-4. Label the analyst as restricted and trusted:
+4\. Label the analyst as restricted and trusted:
 
-`CALL diffix.mark_role('analyst_role', 'anonymized_trusted');`
+```
+CALL diffix.mark_role('analyst_role', 'anonymized_trusted');
+```
 
 
 __That's it!__ The analyst can now connect to the database and issue (only) anonymizing queries against the test dataset.
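
Note on the tutorial changed above: the activation and configuration steps are all plain SQL, so they could be collected into one script. The following is only a minimal sketch under stated assumptions, not something this commit adds. The helper name `render_setup_sql` and its argument order are hypothetical; the SQL statements themselves are copied from the tutorial. The `CREATE USER` step is deliberately left out so that no password ends up in the script, which means the role is assumed to exist already.

```shell
#!/bin/sh
# Hypothetical helper (not part of pg_diffix): print the tutorial's SQL setup
# statements for a given database, role, table and AID column.
# Assumption: the role was already created with CREATE USER.
render_setup_sql() {
  db="$1"; role="$2"; table="$3"; aid="$4"
  cat <<EOF
CREATE EXTENSION pg_diffix;
ALTER DATABASE ${db} SET session_preload_libraries TO 'pg_diffix';
CALL diffix.mark_personal('${table}', '${aid}');
GRANT CONNECT ON DATABASE ${db} TO ${role};
GRANT SELECT ON ALL TABLES IN SCHEMA public TO ${role};
CALL diffix.mark_role('${role}', 'anonymized_trusted');
EOF
}

# Print the statements using the tutorial's example names.
render_setup_sql test_db analyst_role test_table id
```

To apply the statements rather than just print them, the output could be piped into the superuser session the tutorial already uses, for example `render_setup_sql test_db analyst_role test_table id | sudo -u postgres psql test_db`.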

docs/analyst_guide.md

Lines changed: 0 additions & 7 deletions
@@ -1,9 +1,3 @@
-# Important notice
-
-This is a pre-release version of the extension and is not intended for general use yet.
-It may be unstable and documentation is limited.
-If you have any questions, please contact us at [[email protected]](mailto:[email protected]).
-
 # Analyst guide
 
 This document describes features and restrictions of `pg_diffix` for users with anonymized access to a database.
@@ -12,7 +6,6 @@ mechanisms that Diffix Elm uses to protect personal data.
 
 ## Table of Contents
 
-- [Important notice](#important-notice)
 - [Analyst guide](#analyst-guide)
 - [Table of Contents](#table-of-contents)
 - [Access levels](#access-levels)
