Commit 5ac85ce (1 parent: a45df31), committed by Kenneth V. Domingo on Sep 6, 2024. Showing 6 changed files with 201 additions and 53 deletions.
# Giga DataOps Platform: Data Sharing

The Giga DataOps Platform is a data platform developed by Thinking Machines Data Science
in coordination with UNICEF Giga. The objective of the platform is to ingest school data
from various sources, applying concepts from master data management and data governance
in order to produce a single source of truth—the **School Master Data**—which will then
be used by consumers and downstream applications.

This repository contains the code for the **Data Sharing** service of the Platform.

## Table of Contents

1. [Development](development.md)
2. [Deployment](deployment.md)
3. [Support](support.md)

## Jump to other platform services

- [Giga Sync](https://github.com/unicef/giga-data-ingestion)
- [Dagster](https://github.com/unicef/giga-dagster)
- [Datahub](https://github.com/unicef/giga-datahub)
- [Superset](https://github.com/unicef/giga-superset)
- [Trino](https://github.com/unicef/giga-trino)
- [Monitoring](https://github.com/unicef/giga-monitoring)
# Deployment Procedure

CI/CD has been set up with Azure DevOps. To deploy, simply merge changes into the
relevant branch:

- `main` > DEV
- `staging` > STG
- `production` > PRD
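The branch-to-environment mapping above can also be expressed as a small lookup, e.g. for scripts that need to know which environment a merge will deploy to. This is an illustrative sketch, not project code; the function name is made up:

```python
# Branch-to-environment mapping taken from the deployment procedure above.
BRANCH_ENVIRONMENTS = {
    "main": "DEV",
    "staging": "STG",
    "production": "PRD",
}

def target_environment(branch: str) -> str:
    """Return the environment that a merge into `branch` deploys to."""
    try:
        return BRANCH_ENVIRONMENTS[branch]
    except KeyError:
        raise ValueError(f"No deployment is triggered for branch {branch!r}") from None
```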
To manually trigger deployments, go to
the [Pipelines](https://unicef.visualstudio.com/OI-GIGA/_build) page and trigger
the relevant pipeline:

- giga-data-sharing-deploy-dev
- giga-data-sharing-deploy-stg
- giga-data-sharing-deploy-prd
# Development Lifecycle

## Trunk-Based Development

![Trunk-Based Development](./images/trunk-based.png)

The Giga DataOps Platform project follows the concept of Trunk-Based Development,
wherein User Stories are worked on in PRs. PRs then get merged to `main` once approved
by another developer.

The `main` branch serves as the most up-to-date version of the code base.

### Naming Conventions

#### Branch Names

Refer to [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/).

#### PR Title

`[<Feature/Fix/Release/Hotfix>](<issue-id>) <Short desc>`
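As a sketch, the PR title convention above can be checked mechanically with a small regular expression. The pattern and helper below are illustrative, not part of this repository, and the sample issue ID format is an assumption:

```python
import re

# Illustrative pattern for `[<Feature/Fix/Release/Hotfix>](<issue-id>) <Short desc>`.
PR_TITLE_RE = re.compile(
    r"^\[(Feature|Fix|Release|Hotfix)\]"  # change type in square brackets
    r"\((?P<issue_id>[A-Za-z0-9-]+)\)"    # issue ID in parentheses (assumed format)
    r" (?P<desc>.+)$"                     # short description
)

def is_valid_pr_title(title: str) -> bool:
    """Return True if `title` matches the PR naming convention."""
    return PR_TITLE_RE.match(title) is not None
```

For example, `is_valid_pr_title("[Feature](GIGA-123) Add roles endpoint")` passes, while a bare `"Add roles endpoint"` does not.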
#### PR Template

[pull_request_template.md](../.github/pull_request_template.md)

### Development Workflow

- Branch off from `main` to ensure you get the latest code.
- Name your branch according to the Naming Conventions.
- Keep your commits self-contained and your PRs small and tailored to a specific
  feature as much as possible.
- Push your commits, open a PR, and fill in the PR template.
- Request a review from 1 other developer.
- Once approved, rebase/squash your commits into `main`. Rule of thumb:
  - If the PR contains 1 or 2 commits, perform a **Rebase**.
  - If the PR contains several commits that build toward a larger feature, perform a
    **Squash**.
  - If the PR contains several commits that are relatively unrelated (e.g., an
    assortment of bug fixes), perform a **Rebase**.

## Local Development

### File Structure Walkthrough

- `azure/` - Contains all configuration for Azure DevOps pipelines.
- `data_sharing/` - Contains all custom Data Sharing Proxy code.
- `docs/` - Contains all Markdown files for creating Backstage TechDocs.
- `infra/` - Contains all Kubernetes & Helm configuration.
- `scripts/` - Contains custom reusable scripts.

### Pre-requisites

#### Required

- [ ] [Docker](https://docs.docker.com/engine/)
- [ ] [Task](https://taskfile.dev/installation/#install-script)
- [ ] [asdf](https://asdf-vm.com/guide/getting-started.html)
- [ ] [Poetry](https://python-poetry.org/docs/#installation)
- [ ] [Python 3.11](https://www.python.org/downloads/)

#### As-needed

- [ ] [Kubernetes](https://kubernetes.io/docs/tasks/tools/)
  - If you are using Docker Desktop on Windows, you can use the bundled Kubernetes
    distribution.
- [ ] [Helm](https://helm.sh/docs/intro/install/)

Refer to the Development section in the docs
of [unicef/giga-dagster](https://github.com/unicef/giga-dagster/blob/main/docs/development.md#local-development).

### Cloning and Installation

1. `git clone` the repository to your workstation.
2. Run the initial setup:
   ```shell
   task setup
   ```

### Environment Setup

**Data Sharing** has its own `.env` file, the contents of which can be provided upon
request. There are also `.env.example` files which you can use as a reference: copy the
contents of the example file into a new file named `.env` in the same directory, then
supply your own values.

Ensure that the Pre-requisites have already been set up and that all the necessary
command-line executables are in your `PATH`.
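For illustration, a `.env` file is a list of `KEY=VALUE` lines. The two keys shown below appear in the Initial Configuration section of this document; the values are placeholders, and the parser is a minimal sketch rather than project code (libraries such as `python-dotenv` handle quoting and interpolation properly):

```python
# Placeholder .env content; real values are provided upon request.
SAMPLE_ENV = """\
# Data Sharing service settings
ADMIN_API_KEY=my-key
ADMIN_API_SECRET=my-secret
"""

def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    env: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env
```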

### Running the Application

```shell
# Spin up the Docker containers
task
# Follow the Docker logs
task logs
# List all tasks (inspect Taskfile.yml to see the actual commands being run)
task -l
```

#### Housekeeping

At the end of your development tasks, stop the containers to free up resources:

```shell
task stop
```

### Initial Configuration

1. Run the following to create the database tables and seed the initial roles and
   admin token:
   ```shell
   task migrate
   task load-fixtures -- roles api_keys
   ```
2. To [interact](https://github.com/delta-io/delta-sharing/blob/main/PROTOCOL.md) with
   the Delta Sharing server, you can:
   1. Access the Swagger UI at https://localhost:5000 and use the built-in
      **Try it out** examples.
   2. Use an API testing tool like Postman or Insomnia to send requests to the server.
3. To get the initial bearer token, refer to the `.env` you created earlier and look
   for the keys `ADMIN_API_KEY` and `ADMIN_API_SECRET`. The admin bearer token is
   constructed as
   ```text
   ADMIN_API_KEY:ADMIN_API_SECRET
   ```
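Putting the last step together, here is a minimal sketch of building an authenticated request against the server. The `/shares` path is the List Shares endpoint from the Delta Sharing protocol linked above; the key and secret values are placeholders, and the helper name is illustrative:

```python
import urllib.request

# Placeholder credentials; substitute the real values from your .env.
ADMIN_API_KEY = "my-key"
ADMIN_API_SECRET = "my-secret"

# The admin bearer token is simply "<key>:<secret>", as described above.
bearer_token = f"{ADMIN_API_KEY}:{ADMIN_API_SECRET}"

def list_shares_request(base_url: str = "https://localhost:5000") -> urllib.request.Request:
    """Build a List Shares request (GET /shares, per the Delta Sharing protocol)."""
    return urllib.request.Request(
        f"{base_url}/shares",
        headers={"Authorization": f"Bearer {bearer_token}"},
    )
```

Actually sending the request (`urllib.request.urlopen(list_shares_request())`) requires the containers from Running the Application to be up; with a local self-signed certificate you may also need to relax TLS verification.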
# Support

For any questions or help needed regarding this project, you may contact the project
team:

**UNICEF Giga**

- [Shilpa Arora](mailto:[email protected]) - Product Owner
- [Rishabh Raj](mailto:[email protected]) - Platform Engineer
- [Brian Musisi](mailto:[email protected]) - Data Engineer

**Thinking Machines**

- [Billie Zulueta](mailto:[email protected]) - Project Manager
- [AJ Tamayo](mailto:[email protected]) - Project Lead
- [Gerlito Chagas](mailto:[email protected]) - Data Operations Engineer/Software Engineer

**Former project team members**

- Lia Mabaquiao - Project Manager
- Kenneth Domingo - Tech Lead/Software Engineer
- Flo Barot - Senior Data Engineer
- Dana Redeña - Senior Enterprise Solutions Engineer
- Bianca Caugma - Data Strategy Consultant
- Renz Jaranilla - Data Strategy Consultant
- Erin Cheng - Data Engineer
- Renz Togonon - Analytics Engineer
- Sofia Pineda - Data Engineer/Enterprise Solutions Engineer
- Tiff Gamboa - Data Engineer
- Aveline Germar - Analytics Engineer