


Graviti Python SDK


Graviti Python SDK is a Python library for accessing a Graviti workspace and managing your datasets. It provides a Pythonic interface to your datasets through the Graviti OpenAPI.


NOTE: This project is still in the pre-alpha stage and may introduce breaking changes in the future.


Installation

Graviti can be installed from PyPI:

pip3 install graviti

Or from source:

git clone https://github.com/Graviti-AI/graviti-python-sdk.git
cd graviti-python-sdk
pip install -e .

Documentation

More information can be found on the documentation site.

Usage

Before using Graviti SDK, please finish the following registration steps:

Get a Dataset

Workspace initialization:

from graviti import Workspace

# Replace YOUR_ACCESSKEY with the AccessKey of your Graviti account.
ws = Workspace("YOUR_ACCESSKEY")
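To avoid hard-coding the AccessKey in source files, it can be read from the environment before the workspace is created. The following is a minimal sketch; the helper `load_access_key` and the variable name `GRAVITI_ACCESS_KEY` are hypothetical conventions chosen for illustration, not something the SDK defines.

```python
import os


def load_access_key(env_var: str = "GRAVITI_ACCESS_KEY") -> str:
    """Return the AccessKey stored in the given environment variable.

    The default variable name is a hypothetical convention, not an SDK setting.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your AccessKey")
    return key


# Usage (requires the graviti package and a valid AccessKey):
#   from graviti import Workspace
#   ws = Workspace(load_access_key())
```

Failing early with a clear message is usually easier to debug than an authentication error raised later by the first request.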

List the datasets in the workspace:

>>> ws.datasets.list()
LazyPagingList [
  Dataset("graviti-example/Graviti-dataset-demo")
]

Get one dataset:

>>> dataset = ws.datasets.get("Graviti-dataset-demo")
>>> dataset
Dataset("graviti-example/Graviti-dataset-demo")(
  (alias): '',
  (default_branch): 'main',
  (created_at): 2022-07-20 04:22:35+00:00,
  (updated_at): 2022-07-20 04:23:45+00:00,
  (is_public): False,
  (storage_config): 'AmazonS3-us-west-1'
)

Switch Between Different Versions

View the current version of the dataset:

>>> dataset.HEAD
Branch("main")(
  (commit_id): '47293b32f28c4008bc0f25b847b97d6f',
  (parent): None,
  (title): 'Commit-1',
  (committer): 'graviti-example',
  (committed_at): 2022-07-20 04:22:35+00:00,
)

List the commit history:

>>> dataset.commits.list()
LazyPagingList [
  Commit("47293b32f28c4008bc0f25b847b97d6f")
]

List all branches:

>>> dataset.branches.list()
LazyPagingList [
  Branch("main"),
  Branch("dev")
]

List all tags:

>>> dataset.tags.list()
LazyPagingList [
  Tag("v1.0")
]
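The three listings above can be combined into a quick overview of a dataset's version history. This is a sketch that uses only the `commits.list()`, `branches.list()`, and `tags.list()` calls shown in this README; the helper name `count_versions` is hypothetical, and it assumes the returned `LazyPagingList` objects can be iterated like ordinary sequences.

```python
from typing import Any, Dict


def count_versions(dataset: Any) -> Dict[str, int]:
    """Count commits, branches, and tags by iterating over each listing."""
    listings = {
        "commits": dataset.commits.list(),
        "branches": dataset.branches.list(),
        "tags": dataset.tags.list(),
    }
    # Iterate instead of calling len(), so only iterability is assumed.
    return {kind: sum(1 for _ in items) for kind, items in listings.items()}


# Usage with a dataset obtained from ws.datasets.get(...):
#   print(count_versions(dataset))
```

Note that iterating a lazily paged list walks through every item, so this may be slow for datasets with a long history.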

Checkout commit/branch/tag:

>>> dataset.checkout("47293b32f28c4008bc0f25b847b97d6f")  # commit id
>>> dataset.HEAD
Commit("47293b32f28c4008bc0f25b847b97d6f")(
  (parent): None,
  (title): 'Commit-1',
  (committer): 'graviti-example',
  (committed_at): 2022-07-20 04:22:35+00:00,
)
>>> dataset.checkout("dev")  # branch name
>>> dataset.HEAD
Branch("dev")(
  (commit_id): '47293b32f28c4008bc0f25b847b97d6f',
  (parent): None,
  (title): 'Commit-1',
  (committer): 'graviti-example',
  (committed_at): 2022-07-20 04:22:35+00:00,
)
>>> dataset.checkout("v1.0")  # tag name
>>> dataset.HEAD
Commit("47293b32f28c4008bc0f25b847b97d6f")(
  (parent): None,
  (title): 'Commit-1',
  (committer): 'graviti-example',
  (committed_at): 2022-07-20 04:22:35+00:00,
)
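When several revisions need to be inspected in a row, the checkout steps above can be wrapped in a small loop. The sketch below relies only on `dataset.checkout()` and `dataset.HEAD` as shown in this README; the helper name `describe_revisions` is hypothetical.

```python
from typing import Any, Dict, Iterable


def describe_revisions(dataset: Any, revisions: Iterable[str]) -> Dict[str, str]:
    """Check out each revision in turn and record the repr of the resulting HEAD."""
    heads = {}
    for revision in revisions:
        dataset.checkout(revision)  # accepts a commit id, branch name, or tag name
        heads[revision] = repr(dataset.HEAD)
    return heads


# Usage with a dataset obtained from ws.datasets.get(...):
#   for revision, head in describe_revisions(dataset, ["dev", "v1.0"]).items():
#       print(revision, "->", head)
```

The dataset stays on the last revision in the list afterwards, so check out the desired branch again before continuing to work with it.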

Get a Sheet

List all sheets:

>>> list(dataset.keys())
['train']

Get a sheet:

>>> dataset["train"]
   filename  box2ds
0  a.jpg     DataFrame(1, 6)
1  b.jpg     DataFrame(1, 6)
2  c.jpg     DataFrame(1, 6)
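Since a dataset exposes `keys()` and item access by sheet name, all sheets can be collected in one pass. This is a minimal sketch built only on those two operations; the helper name `load_sheets` is hypothetical.

```python
from typing import Any, Dict


def load_sheets(dataset: Any) -> Dict[str, Any]:
    """Map every sheet name to the object returned by dataset[name]."""
    return {name: dataset[name] for name in dataset.keys()}


# Usage with a dataset obtained from ws.datasets.get(...):
#   sheets = load_sheets(dataset)
#   print(sheets["train"])
```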

Get the Data

Get the DataFrame:

>>> df = dataset["train"]
>>> df
   filename  box2ds
0  a.jpg     DataFrame(1, 6)
1  b.jpg     DataFrame(1, 6)
2  c.jpg     DataFrame(1, 6)

View the schema of the sheet:

>>> df.schema
record(
  fields={
    'filename': string(),
    'box2ds': array(
      items=label.Box2D(
        coords=float32(),
        categories=['boat', 'car'],
        attributes={
          'difficult': boolean(),
          'occluded': boolean(),
        },
      ),
    ),
  },
)
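To make the nesting in this schema easier to picture, it can be read as a record holding a filename and a list of boxes. The plain-Python sketch below is only an illustration of that shape; the `Box2D` and `Record` dataclasses are hypothetical stand-ins, not SDK types, and the coordinate field names follow the columns of the example data in this README.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Box2D:
    """Illustrative stand-in for one item of the 'box2ds' array."""

    xmin: float
    ymin: float
    xmax: float
    ymax: float
    category: str  # 'boat' or 'car' according to the schema
    difficult: bool
    occluded: bool


@dataclass
class Record:
    """Illustrative stand-in for one row of the sheet."""

    filename: str
    box2ds: List[Box2D] = field(default_factory=list)


# One row holding a single box, using the values from the example data.
first_row = Record("a.jpg", [Box2D(1.0, 1.0, 4.0, 5.0, "boat", False, False)])
print(first_row.box2ds[0].category)
```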

Get the data by rows or columns:

>>> df.loc[0]
filename  a.jpg
box2ds    DataFrame(1, 6)
>>> df["box2ds"]
0  DataFrame(1, 6)
1  DataFrame(1, 6)
2  DataFrame(1, 6)
>>> df.loc[0]["box2ds"]
   xmin  ymin  xmax  ymax  category  attribute
                                     difficult  occluded
0  1.0   1.0   4.0   5.0   boat      False      False
>>> df["box2ds"][0]
   xmin  ymin  xmax  ymax  category  attribute
                                     difficult  occluded
0  1.0   1.0   4.0   5.0   boat      False      False