Graviti Python SDK is a Python library for accessing your Graviti workspace and managing your datasets. It provides a Pythonic interface to your datasets, built on the Graviti OpenAPI.
NOTE: This project is still in the pre-alpha stage and may introduce breaking changes in the future.
Graviti can be installed from PyPI:
pip3 install graviti
Or from source:
git clone https://github.com/Graviti-AI/graviti-python-sdk.git
cd graviti-python-sdk
pip install -e .
More information can be found on the documentation site.
Before using the Graviti SDK, complete the following registration steps:
- Visit Graviti to sign up.
- Visit Graviti Developer Tools to get an AccessKey.
Workspace initialization:
from graviti import Workspace
ws = Workspace("YOUR_ACCESSKEY")  # replace with your own AccessKey
List the datasets in the workspace:
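To avoid hard-coding the AccessKey in source files, it can be read from the environment instead. The following is a minimal sketch, not part of the SDK: the helper names `load_access_key` and `open_workspace` and the environment variable name `GRAVITI_ACCESS_KEY` are assumptions chosen for illustration. Only `Workspace(...)` comes from the SDK itself.

```python
import os


def load_access_key(env_var: str = "GRAVITI_ACCESS_KEY") -> str:
    """Return the AccessKey stored in the given environment variable.

    The variable name is a hypothetical convention, not one defined by the SDK.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your AccessKey.")
    return key


def open_workspace(env_var: str = "GRAVITI_ACCESS_KEY"):
    """Create a Workspace using the AccessKey read from the environment."""
    # Imported lazily so that this module can be loaded without the SDK installed.
    from graviti import Workspace

    return Workspace(load_access_key(env_var))
```

With the variable exported in the shell, `ws = open_workspace()` takes the place of the explicit constructor call.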
>>> ws.datasets.list()
LazyPagingList [
Dataset("graviti-example/Graviti-dataset-demo")
]
Get one dataset:
>>> dataset = ws.datasets.get("Graviti-dataset-demo")
>>> dataset
Dataset("graviti-example/Graviti-dataset-demo")(
(alias): '',
(default_branch): 'main',
(created_at): 2022-07-20 04:22:35+00:00,
(updated_at): 2022-07-20 04:23:45+00:00,
(is_public): False,
(storage_config): 'AmazonS3-us-west-1'
)
View the current version of the dataset:
>>> dataset.HEAD
Branch("main")(
(commit_id): '47293b32f28c4008bc0f25b847b97d6f',
(parent): None,
(title): 'Commit-1',
(committer): 'graviti-example',
(committed_at): 2022-07-20 04:22:35+00:00,
)
List the commit history:
>>> dataset.commits.list()
LazyPagingList [
Commit("47293b32f28c4008bc0f25b847b97d6f")
]
List all branches:
>>> dataset.branches.list()
LazyPagingList [
Branch("main"),
Branch("dev")
]
List all tags:
>>> dataset.tags.list()
LazyPagingList [
Tag("v1.0")
]
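The three listing calls above can be combined into a single overview. This is a small sketch rather than an SDK feature: `version_overview` is a hypothetical helper, and it assumes that the `LazyPagingList` returned by each `list()` call can be iterated like a regular list. It relies only on `repr()` of the items, so it does not assume any attribute names on commits, branches, or tags.

```python
def version_overview(dataset):
    """Collect the printed form of every commit, branch, and tag of a dataset."""
    return {
        # Each list() call is the one shown above; items are rendered with repr().
        "commits": [repr(commit) for commit in dataset.commits.list()],
        "branches": [repr(branch) for branch in dataset.branches.list()],
        "tags": [repr(tag) for tag in dataset.tags.list()],
    }
```

Called as `version_overview(dataset)`, it returns a plain dictionary that is easy to log or compare between two points in time.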
Check out a commit, branch, or tag:
>>> dataset.checkout("47293b32f28c4008bc0f25b847b97d6f") # commit id
>>> dataset.HEAD
Commit("47293b32f28c4008bc0f25b847b97d6f")(
(parent): None,
(title): 'Commit-1',
(committer): 'graviti-example',
(committed_at): 2022-07-20 04:22:35+00:00,
)
>>> dataset.checkout("dev") # branch name
>>> dataset.HEAD
Branch("dev")(
(commit_id): '47293b32f28c4008bc0f25b847b97d6f',
(parent): None,
(title): 'Commit-1',
(committer): 'graviti-example',
(committed_at): 2022-07-20 04:22:35+00:00,
)
>>> dataset.checkout("v1.0") # tag name
>>> dataset.HEAD
Commit("47293b32f28c4008bc0f25b847b97d6f")(
(parent): None,
(title): 'Commit-1',
(committer): 'graviti-example',
(committed_at): 2022-07-20 04:22:35+00:00,
)
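When several revisions need to be inspected in turn, the checkout-then-HEAD pattern above can be wrapped in a loop. The sketch below uses only `dataset.checkout(...)` and `dataset.HEAD` as shown above. The helper name `heads_for` is hypothetical, and the function leaves the dataset checked out at the last revision it visited.

```python
def heads_for(dataset, revisions):
    """Check out each revision in order and record the resulting HEAD."""
    heads = {}
    for revision in revisions:
        # A revision may be a commit id, a branch name, or a tag name.
        dataset.checkout(revision)
        heads[revision] = repr(dataset.HEAD)
    return heads
```

For the example dataset, `heads_for(dataset, ["dev", "v1.0"])` would be one way to compare what a branch and a tag point to.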
List all sheets:
>>> list(dataset.keys())
['train']
Get a sheet:
>>> dataset["train"]
filename box2ds
0 a.jpg DataFrame(1, 6)
1 b.jpg DataFrame(1, 6)
2 c.jpg DataFrame(1, 6)
Get the DataFrame:
>>> df = dataset["train"]
>>> df
filename box2ds
0 a.jpg DataFrame(1, 6)
1 b.jpg DataFrame(1, 6)
2 c.jpg DataFrame(1, 6)
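Since a dataset exposes `keys()` and item access by sheet name, all sheets can be gathered in one pass. This is an illustrative sketch using only those two operations: `load_sheets` is a hypothetical helper, not an SDK function.

```python
def load_sheets(dataset):
    """Return a dictionary mapping each sheet name to its DataFrame."""
    # dataset.keys() yields the sheet names; dataset[name] returns that sheet.
    return {name: dataset[name] for name in dataset.keys()}
```

For the example dataset, the result would have the single key `'train'`.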
View the schema of the sheet:
>>> df.schema
record(
    fields={
        'filename': string(),
        'box2ds': array(
            items=label.Box2D(
                coords=float32(),
                categories=['boat', 'car'],
                attributes={
                    'difficult': boolean(),
                    'occluded': boolean(),
                },
            ),
        ),
    },
)
Get the data by rows or columns:
>>> df.loc[0]
filename a.jpg
box2ds DataFrame(1, 6)
>>> df["box2ds"]
0 DataFrame(1, 6)
1 DataFrame(1, 6)
2 DataFrame(1, 6)
>>> df.loc[0]["box2ds"]
   xmin  ymin  xmax  ymax  category  attribute
                                     difficult  occluded
0  1.0   1.0   4.0   5.0   boat      False      False
>>> df["box2ds"][0]
   xmin  ymin  xmax  ymax  category  attribute
                                     difficult  occluded
0  1.0   1.0   4.0   5.0   boat      False      False