Koza

Data transformation framework

Disclaimer: Koza is in beta; we are looking for beta testers

Transform csv, json, yaml, jsonl, and xml files and convert them to a target csv, json, or jsonl format based on your dataclass model. Koza can also output data in the KGX format.

Documentation: https://koza.monarchinitiative.org/

Highlights
  • Author data transforms in semi-declarative Python
  • Configure source files, expected columns/json properties and path filters, field filters, and metadata in yaml
  • Create or import mapping files to be used in ingests (e.g. ID mappings, type mappings)
  • Create and use translation tables to map between source and target vocabularies
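
As a rough illustration of the translation-table idea, the sketch below maps source terms to target terms with a plain dictionary. It is not Koza's API: the `translate` helper, the `strict` flag, and the table entries are all hypothetical and exist only to show the concept.

```python
# Hypothetical sketch: none of these names come from Koza itself.
# A translation table is modeled as a dict from source terms to target terms.
translation_table = {
    "gene": "biolink:Gene",        # illustrative entries only
    "protein": "biolink:Protein",
}


def translate(term: str, table: dict[str, str], strict: bool = True) -> str:
    """Map a source-vocabulary term to its target-vocabulary equivalent."""
    try:
        return table[term]
    except KeyError:
        if strict:
            # Fail loudly so unmapped terms are noticed during an ingest.
            raise KeyError(f"No translation for source term: {term!r}")
        return term  # non-strict mode passes the term through unchanged


print(translate("gene", translation_table))
```

Failing on unmapped terms by default is one possible design choice; a pass-through mode is handy while a table is still being built.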

Installation

pip install koza

Getting Started

Send a local or remote csv file through Koza to get some basic information (headers, number of rows)

koza validate \
  --file https://raw.githubusercontent.com/monarch-initiative/koza/main/examples/data/string.tsv \
  --delimiter ' '
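
For readers who want to see what "headers, number of rows" means for a space-delimited file, here is a minimal standard-library sketch. It does not reproduce how `koza validate` works internally; the `summarize_delimited` function and the sample data are made up for illustration.

```python
import csv
import io


def summarize_delimited(text: str, delimiter: str = " ") -> dict:
    """Return the header row and the number of data rows in delimited text."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    headers = next(reader, [])          # first line is treated as the header
    row_count = sum(1 for _ in reader)  # remaining lines are data rows
    return {"headers": headers, "rows": row_count}


# Made-up sample data, space-delimited like the command above assumes.
sample = "protein1 protein2 combined_score\nA B 900\nC D 700\n"
print(summarize_delimited(sample))
```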

Sending a json or jsonl formatted file will confirm if the file is valid json or jsonl

koza validate \
  --file ./examples/data/ZFIN_PHENOTYPE_0.jsonl.gz \
  --format jsonl

koza validate \
  --file ./examples/data/ddpheno.json.gz \
  --format json \
  --compression gzip
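
The check described above can be approximated in plain Python as follows. This is a sketch of the general idea (every non-empty line must parse as JSON), not Koza's implementation; `is_valid_jsonl` and its `compressed` parameter are hypothetical names.

```python
import gzip
import json


def is_valid_jsonl(path: str, compressed: bool = False) -> bool:
    """Return True if every non-empty line of the file parses as JSON."""
    opener = gzip.open if compressed else open
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue  # ignore blank lines
            try:
                json.loads(line)
            except json.JSONDecodeError:
                return False
    return True
```

Calling `is_valid_jsonl("./examples/data/ZFIN_PHENOTYPE_0.jsonl.gz", compressed=True)` would run the same kind of check on the gzipped example file, assuming it exists at that path.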

Example: transforming StringDB

koza transform --source examples/string/protein-links-detailed.yaml --global-table examples/translation_table.yaml

koza transform --source examples/string-declarative/protein-links-detailed.yaml --global-table examples/translation_table.yaml