Generate and load BigQuery tables based on a Data Package.
This section is intended for end users of the library.
See the section below for how to get a tabular storage object.
The high-level API is easy to use.
With a Data Package in the current directory, we can import it into a BigQuery database:
import dpbq
dpbq.import_package(<storage>, 'descriptor.json')
We can also export it from a BigQuery database:
import dpbq
dpbq.export_package(<storage>, 'descriptor.json')
To start using the Google BigQuery service:
- Create a new project - link
- Create a service key - link
- Download the JSON credentials and set the GOOGLE_APPLICATION_CREDENTIALS environment variable (see the example below)
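For example, assuming the credentials were downloaded as .credentials.json (the same path used in the snippet below), the variable can be set from the shell:
$ export GOOGLE_APPLICATION_CREDENTIALS=.credentials.json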
We can get a storage object this way:
import io
import os
import json
import jtsbq
from apiclient.discovery import build
from oauth2client.client import GoogleCredentials

# Point the Google client libraries at the downloaded credentials
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '.credentials.json'

# Build an authenticated BigQuery service object
credentials = GoogleCredentials.get_application_default()
service = build('bigquery', 'v2', credentials=credentials)

# Read the project id from the credentials file and create the storage
project = json.load(io.open('.credentials.json', encoding='utf-8'))['project_id']
storage = jtsbq.Storage(service, project, 'dataset')
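With the storage object in hand, the high-level calls shown earlier can be used directly; a minimal sketch, assuming a descriptor.json and its data files sit in the current directory:
import dpbq
# Import the local Data Package into BigQuery tables
dpbq.import_package(storage, 'descriptor.json')
# ...or export the BigQuery dataset back to a Data Package
dpbq.export_package(storage, 'descriptor.json')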
See the jsontableschema layer README for more details.
The mapping between a Data Package and BigQuery is as follows:
datapackage.json -> *not stored*
datapackage.json resources -> BigQuery tables
data/data.csv schema -> BigQuery table schema
data/data.csv data -> BigQuery table data
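To make the mapping concrete, here is a minimal sketch of a descriptor, shown as a Python dict; the resource path and field names are illustrative, not part of the library:
# Hypothetical datapackage.json contents, shown as a Python dict
descriptor = {
    'name': 'example',
    'resources': [{
        'path': 'data/data.csv',  # becomes one BigQuery table
        'schema': {  # becomes the BigQuery table schema
            'fields': [
                {'name': 'id', 'type': 'integer'},
                {'name': 'comment', 'type': 'string'},
            ],
        },
    }],
}
# The rows of data/data.csv become the BigQuery table data;
# the top-level descriptor itself is not stored.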
The default Google BigQuery client is used as part of the jsontableschema-bigquery-py package - docs.
This section is intended for technical users collaborating on this project.
To activate the virtual environment, install dependencies, add a pre-commit hook to review and test code, and get the run command as a unified developer interface:
$ source activate.sh
The project follows standard Python style guides. To check the project against them:
$ run review
To run the tests with a coverage check:
$ run test
Coverage data will be written to the .coverage file.