Skip to content

Learning in Public: PostgreSQL database for JSON Tabular Data Packages

License

Notifications You must be signed in to change notification settings

PostgreSqlStan/pgtab

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

pgtab

pgtab is a sample PostgreSQL database which can import, parse, construct, and export JSON Tabular Data Packages.

This is a learning project, not intended to do anything that can't be accomplished with existing external tools.

pgtab installation screenshot

Feedback is welcome. Create a repo issue, a discussion, or tag me on twitter.

Contributions might be welcome.

⚠️ Use at your own risk.

This is a work in progress and subject to change. Don't assume anything here works correctly or represents best practices. Some of my experiments are probably misguided.

Installation

Requirements: PostgreSQL 14 or newer.

Clone this repo and launch psql from the pgtab directory:

git clone [email protected]:PostgreSqlStan/pgtab.git
cd pgtab
psql -U postgres

❗ You don't know me and have no reason to trust me. So, follow the Principle of Least Privilege and use pgtab with a normal (not superuser) account.

Create the database, connect to it with an appropriate account, and run the build.psql script:

create database pgtab owner stan;
\c pgtab stan
\i build.psql

Optionally, run load_packages.psql to import sample data packages from the packages directory:

\i load_packages.psql

Contents

Most of pgtab is organized into two schemas (namespaces):

pgtab=> \dn+
                           List of schemas
  Name   │  Owner   │  Access privileges   │       Description
─────────┼──────────┼──────────────────────┼──────────────────────────
 pgtab   │ stan     │                      │ pgtab: main
 pgtab_i │ stan     │                      │ pgtab: import json

The utility view pgtab._ls lists visible pgtab objects and their descriptions.

pgtab=> table _ls;
 Schema │ Type  │        Name         │                      Description
────────┼───────┼─────────────────────┼───────────────────────────────────────────────────────
 pgtab  │ func  │ pgtab_version       │ return pgtab version
 pgtab  │ func  │ tf_field_pos        │ trigger function: set field.ordinal_position
 pgtab  │ func  │ tf_resource_updated │ trigger function: set resource.last_updated
 pgtab  │ proc  │ go                  │ go to selected package object
 pgtab  │ proc  │ go_                 │ set session variables to view selected package object
 pgtab  │ table │ field               │ resource fields
 pgtab  │ table │ package             │ packages
 pgtab  │ table │ resource            │ package resources
 pgtab  │ view  │ _ls                 │ view: schema objects
 pgtab  │ view  │ jf                  │ view: JSONB fields
 pgtab  │ view  │ jp                  │ view: JSONB package
 pgtab  │ view  │ jr                  │ view: JSONB resources
 pgtab  │ view  │ p                   │ view: package

go() – a psql UI

Other schemas (pgtab_p, pgtab_r, pgtab_f) are used by the go procedure, which makes it easy to examine and edit selected package items with psql without repeatedly typing WHERE qualifiers, joins, or specifying columns.

❕ This procedure relies on session settings and is unlikely to work with most external tools.

Assuming you imported the sample data packages, select the "gdp" package from the main schema (pgtab):

pgtab=> call go('gdp');
   selected
──────────────
 package: gdp

The search_path is changed to show views can be used to examine (and edit) elements of the JSON package:

  • p - selected package
  • r - package resources
  • readme - package readme
  • description - package readme
  • custom - json elements not defined in the database

From a selected package, a resource can be selected:

pgtab=> call go('gdp');
          selected
─────────────────────────────
 package: gdp, resource: gdp

The resource's fields are listed in the f view:

pgtab=> table f;
 # │     name     │  type  │ title │    description     │ format
───┼──────────────┼────────┼───────┼────────────────────┼────────
 1 │ Country Name │ string │ •     │ •                  │ •
 2 │ Country Code │ string │ •     │ •                  │ •
 3 │ Year         │ year   │ •     │ •                  │ •
 4 │ Value        │ number │ •     │ GDP in current USD │ •

To select a field, call go with the field #:

pgtab=> call go(1);
                     selected
──────────────────────────────────────────────────
 package: gdp, resource: gdp, field: Country Name

To move up in the hierachy (package->resource->field) call go with an empty string:

pgtab=> call go('');
          selected
─────────────────────────────
 package: gdp, resource: gdp

Call go without parameters to return to the main schema.

Package and resources are searched with the ~~* operator, which is case-insensitive and allows wildcards.

pgtab=> call go('eu%');
               selected
──────────────────────────────────────
 package: eu-emissions-trading-system

About

Learning in Public: PostgreSQL database for JSON Tabular Data Packages

Topics

Resources

License

Stars

Watchers

Forks