Skip to content

[DEPRECATED] A Python library for storing Data Packages in different storages.

License

Notifications You must be signed in to change notification settings

openknowledge-archive/datapackage-storage-py

Repository files navigation

datapackage-storage-py

Travis Coveralls

A Python library for storing Data Packages in different storages.

Import/Export

See section below how to get tabular storage object.

High-level API is easy to use.

Having Data Package in current directory we can import it to storage:

import datapackage_storage

datapackage_storage.import_package(<storage>, 'descriptor.json')

Also we can export it from storage:

import datapackage_storage

datapackage_storage.export_package(<storage>, 'descriptor.json', 'datapackage_name')

Tabular Storage

On level between the high-level interface and low-level driver package uses Tabular Storage concept:

Tabular Storage

BigQuery

Install jsontableschema-bigquery-py package.

To start using Google BigQuery service:

  • Create a new project - link
  • Create a service key - link
  • Download json credentials and set GOOGLE_APPLICATION_CREDENTIALS environment variable

We can get storage this way:

import io
import os
import json
import jtsbq
from apiclient.discovery import build
from oauth2client.client import GoogleCredentials

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '.credentials.json'
credentials = GoogleCredentials.get_application_default()
service = build('bigquery', 'v2', credentials=credentials)
project = json.load(io.open('.credentials.json', encoding='utf-8'))['project_id']
storage = jtsbq.Storage(service, project, 'dataset', prefix='prefix')

SQL

Install jsontableschema-sql-py package.

SQLAlchemy is used as sql wrapper. We can get storage this way:

import jtssql
from sqlalchemy import create_engine

engine = create_engine('sqlite:///:memory:', prefix='prefix')
storage = jtssql.Storage(engine)

Mappings

datapackage.json -> *not stored*
datapackage.json resources -> storage tables
data/data.csv schema -> storage table schema
data/data.csv data -> storage table data

Documentation

API documentation is presented as docstings:

About

[DEPRECATED] A Python library for storing Data Packages in different storages.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages