Initializing prototype
kshefchek authored Mar 30, 2021
1 parent 07f003c commit f7333ac
Showing 134 changed files with 5,660 additions and 15 deletions.
29 changes: 17 additions & 12 deletions .github/workflows/test.yml
@@ -1,30 +1,35 @@
# This is a basic workflow to help you get started with Actions

# Builds and runs pytest on ubuntu-latest
# Tests python versions >=3.6
name: CI

# Controls when the action will run.
on:
# Triggers the workflow on push or pull request events but only for the main branch
push:
branches: [ main ]
pull_request:
branches: [ main ]

# Allows you to run this workflow manually from the Actions tab
workflow_dispatch:

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
# This workflow contains a single job called "build"
build:
# The type of runner that the job will run on
# https://github.com/actions/setup-python
test-python3-ubuntu-latest:
name: test py${{ matrix.python-version }} on linux
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
python-version: ['3.6', '3.7', '3.8', '3.9']
env:
PYTHON: ${{ matrix.python-version }}
OS: ubuntu

# Steps represent a sequence of tasks that will be executed as part of the job
steps:
# Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
- uses: actions/checkout@v2

# Runs a single command using the runners shell
- name: set up python
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}

- name: make test
run: make test
3 changes: 3 additions & 0 deletions .gitignore
@@ -127,3 +127,6 @@ dmypy.json

# Pyre type checker
.pyre/

# IDE
.idea
44 changes: 44 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,44 @@
##### Building locally

First create a virtual environment with your favorite tool and activate it, e.g.:
```bash
python3.8 -m venv venv
source venv/bin/activate
```

Install and test with make
```bash
make
```

Or with flit
```bash
pip install flit
flit install --deps develop --symlink
```

##### Linting and Formatting
TODO - write some docs on linting and formatting

Lint with flake8, black, and isort
```bash
make lint
```

Format with autoflake, black, and isort (updates files in place)
```bash
make format
```

##### Build and Publish to PyPI
Building and publishing requires git >= 2.30

Build a wheel and an sdist (tarball) from the package:
```bash
make build
```

Publish to PyPI
```bash
make publish
```
54 changes: 54 additions & 0 deletions Makefile
@@ -0,0 +1,54 @@
# Note that you should be in your virtual environment of choice before running make

MAKEFLAGS += --warn-undefined-variables
MAKEFLAGS += --no-builtin-rules
MAKEFLAGS += --no-builtin-variables

.DEFAULT_GOAL := all
SHELL := bash

.PHONY: all
all: install-flit install-koza install-dev test

.PHONY: install-flit
install-flit:
	pip install flit

.PHONY: install-koza
install-koza: install-flit
	flit install --deps production --symlink

.PHONY: install-dev
install-dev: install-flit
	flit install --deps develop --symlink

.PHONY: test
test: install-flit install-dev
	python -m pytest

.PHONY: build
build:
	flit build

.PHONY: publish
publish:
	flit publish

.PHONY: clean
clean:
	rm -rf `find . -name __pycache__`
	rm -f `find . -type f -name '*.py[co]' `
	rm -rf .pytest_cache
	rm -rf dist

.PHONY: lint
lint:
	flake8 --exit-zero --max-line-length 120 koza/ tests/
	black --check --diff koza tests
	isort --check-only --diff koza tests

.PHONY: format
format:
	autoflake --remove-all-unused-imports --recursive --remove-unused-variables --in-place koza tests --exclude=__init__.py
	isort koza tests
	black koza tests
64 changes: 61 additions & 3 deletions README.md
@@ -1,5 +1,63 @@
# Koza
### Koza

Data ingest framework for the Biolink model
![pupa](docs/img/pupa.png) Data ingest framework for the Biolink model

*Disclaimer*: Koza is in pre-alpha, see the [dev branch](https://github.com/monarch-initiative/koza/tree/dev) for a preview
*Disclaimer*: Koza is in pre-alpha


##### Highlights
Koza allows you to:

- Author transforms with dataclasses in semi-declarative Python
- Configure filters, metadata, and data mappings in yaml
- Import and optionally transform mapping files
- Create an ETL pipeline for multiple sources

While Koza aims to support a declarative programming paradigm, it
also supports procedural programming constructs.
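
To make the "dataclasses in semi-declarative Python" idea concrete, here is a minimal sketch using only the standard library. It is not Koza's API: the `Gene` class, the `transform` function, and the `gene_id` and `symbol` column names are all hypothetical, chosen only to show a declared record type combined with a small procedural filter.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Iterable, Iterator


@dataclass
class Gene:
    """Hypothetical output record; not a class shipped by Koza."""

    id: str
    symbol: str
    category: str = "biolink:Gene"


def transform(rows: Iterable[Dict[str, str]]) -> Iterator[Gene]:
    """Map each source row (a dict keyed by column name) to a Gene."""
    for row in rows:
        # Procedural construct: skip rows that have no identifier.
        if not row.get("gene_id"):
            continue
        # Declarative part: the dataclass fields describe the output shape.
        yield Gene(id=row["gene_id"], symbol=row["symbol"])


# Illustrative input rows (assumed column names).
rows = [
    {"gene_id": "HGNC:5", "symbol": "A1BG"},
    {"gene_id": "", "symbol": "unknown"},
]
genes = list(transform(rows))
print([asdict(gene) for gene in genes])
```

The dataclass carries the shape of the output, while the loop leaves room for the procedural logic mentioned above.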

TODO describe assumptions for source data


###### What is out of scope in the medium term?

- Models other than Biolink

#### Installation

```bash
pip install koza
```

#### Getting Started

Send a TSV file through Koza to get some basic information (headers, number of rows)

```bash
koza run \
--file https://raw.githubusercontent.com/monarch-initiative/koza/dev/tests/resources/source-files/string.tsv \
--delimiter ' '
```
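
For readers who want to see what "headers, number of rows" amounts to, the following standard-library sketch computes the same two facts for a delimited file. It is an illustration under assumptions, not Koza's implementation: the `summarize` helper and the sample column names are hypothetical.

```python
import csv
import io
from typing import List, TextIO, Tuple


def summarize(handle: TextIO, delimiter: str) -> Tuple[List[str], int]:
    """Return the header row and the number of data rows that follow it."""
    reader = csv.reader(handle, delimiter=delimiter)
    headers = next(reader, [])  # empty list if the file has no lines
    row_count = sum(1 for _ in reader)
    return headers, row_count


# In-memory stand-in for a space-delimited source file (assumed columns).
sample = io.StringIO("protein1 protein2 combined_score\nA B 900\nC D 150\n")
headers, row_count = summarize(sample, delimiter=" ")
print(headers, row_count)
```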

Or a JSONL-formatted file:
```bash
koza run \
--file ./tests/resources/source-files/ZFIN_PHENOTYPE_0.jsonl.gz \
--format jsonl
```
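
A gzipped JSONL file holds one JSON object per line, so it can be streamed record by record. The sketch below shows that reading pattern with the standard library; the `read_jsonl` helper and the `id` field are hypothetical and say nothing about how Koza parses the file internally.

```python
import gzip
import io
import json
from typing import IO, Dict, Iterator


def read_jsonl(handle: IO[bytes]) -> Iterator[Dict]:
    """Yield one dict per non-empty line of a gzip-compressed JSONL stream."""
    with gzip.open(handle, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            if line.strip():  # tolerate blank lines between records
                yield json.loads(line)


# In-memory stand-in for a .jsonl.gz file (assumed field name).
raw = b'{"id": 1}\n\n{"id": 2}\n'
buffer = io.BytesIO(gzip.compress(raw))
records = list(read_jsonl(buffer))
print(records)
```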

#### A Small Example - adding configuration and filters


#### A Full Example - Adding transform logic and maps


##### Adding Transform Logic


##### Adding A Map
TODO


##### Example of a procedural transform
TODO
17 changes: 17 additions & 0 deletions config/bioweave.yaml
@@ -0,0 +1,17 @@
name: 'Monarch Ingest'

output: ./out/

curie_map: /foo/bar/baz

sources:
- foo
- bar
- baz

serializations:
- nturtles
- jsonlines
- tsv

config_dir: /foo/bar/baz