Ness

A Python datalake client.

Requirements

Installation

pip install pyarrow ness

Quickstart

import ness

dl = ness.dl(bucket="mybucket", key="mydatalake")
df = dl.read("mytable")

Sync

# Sync all tables
dl.sync()

# Sync a single table
dl.sync("mytable")

# Sync and read a single table
df = dl.read("mytable", sync=True)

Format

Specify the input data source format with the format argument. The default is parquet:

import ness

dl = ness.dl(bucket="mybucket", key="mydatalake", format="csv")

AWS Profile

Files are synced using the default AWS profile. To use a different one, pass the profile argument:

import ness

dl = ness.dl(bucket="mybucket", key="mydatalake", profile="myprofile")
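When the format and profile come from configuration that may be unset, one option is to build the keyword arguments first and leave out anything that is None, so the defaults described above (parquet, default AWS profile) still apply. This is a minimal sketch: datalake_kwargs is a hypothetical helper, not part of ness, and the only ness call it relies on is the ness.dl(...) signature shown in this README.

```python
def datalake_kwargs(bucket, key, format=None, profile=None):
    """Build keyword arguments for ness.dl, omitting unset options.

    Hypothetical helper, not part of ness. Omitted options fall back
    to the library defaults.
    """
    kwargs = {"bucket": bucket, "key": key}
    if format is not None:
        kwargs["format"] = format
    if profile is not None:
        kwargs["profile"] = profile
    return kwargs


kwargs = datalake_kwargs("mybucket", "mydatalake", profile="myprofile")
print(kwargs)

# With ness installed and AWS credentials configured:
# import ness
# dl = ness.dl(**kwargs)
```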

Command Line

Usage: ness sync [OPTIONS] S3_URI

Options:
  --format TEXT   Data lake source format.
  --profile TEXT  AWS profile.
  --table TEXT    Table name to sync.
  --help          Show this message and exit.

ness sync bucket/key --table mytable
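The same command can be launched from a Python script, for example in a scheduled job. The sketch below only assembles the argument list from the options in the usage text above and hands it to subprocess. build_sync_command and run_sync are hypothetical helper names, not part of ness, and the sketch assumes the ness executable is on PATH.

```python
import subprocess


def build_sync_command(s3_uri, table=None, format=None, profile=None):
    """Return the argv list for `ness sync`, using only documented options."""
    cmd = ["ness", "sync"]
    if format is not None:
        cmd += ["--format", format]
    if profile is not None:
        cmd += ["--profile", profile]
    if table is not None:
        cmd += ["--table", table]
    cmd.append(s3_uri)
    return cmd


def run_sync(s3_uri, **options):
    """Run the sync and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_sync_command(s3_uri, **options), check=True)


# Equivalent to: ness sync bucket/key --table mytable
print(build_sync_command("bucket/key", table="mytable"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems if a table name or profile contains spaces.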