This app was created for the Airbyte + MotherDuck Hackathon. It uses Open Government Data (OGD) of the City of Zurich, Switzerland: Open Data Zurich.
It displays things to do, where to sleep, eat, and drink, and what to see in and around Zurich, Switzerland. In total there are 11 different categories to explore.
The app ingests open API data from Zurich Tourism using Airbyte, stores it on MotherDuck, transforms it there with dbt, and serves it in Streamlit.
https://zurichlocal.streamlit.app/
- Data is loaded from the City of Zurich Open Data API (11 endpoints, one per category) into the "raw" schema on MotherDuck. The Airbyte connection was built in the Airbyte UI and exported to YAML. MotherDuck is accessed through the new Airbyte MotherDuck connector, and the ingestion is run with PyAirbyte.
- The data is transformed using dbt and DuckDB. Several custom dbt macros process the raw data, and the final marts schema combines the processed sources into a star schema.
- Data from the marts schema is loaded into a Streamlit dashboard using DuckDB and Python and displayed on a folium map.
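Under stated assumptions, the ingestion step above can be sketched with PyAirbyte roughly as follows. The manifest wiring, the `stream_table_name` helper, and the file name `connector.yaml` are illustrative assumptions, not the project's actual code:

```python
"""Sketch of the PyAirbyte -> MotherDuck ingestion (assumptions in comments)."""
import os


def stream_table_name(category: str) -> str:
    """Hypothetical helper mapping a category label to a raw table name."""
    return category.strip().lower().replace(" & ", "_and_").replace(" ", "_")


def run_ingestion(manifest_path: str = "connector.yaml") -> None:
    # Imports are local so the pure helper above works without PyAirbyte installed.
    import airbyte as ab
    import yaml

    # Assumption: the connector exported from the Airbyte UI is loaded as a
    # declarative manifest; the exact wiring may differ in the real project.
    with open(manifest_path) as f:
        manifest = yaml.safe_load(f)
    source = ab.get_source(
        "source-declarative-manifest",
        config={"__injected_declarative_manifest": manifest},
    )
    source.check()
    source.select_all_streams()

    # PyAirbyte's MotherDuckCache writes records into the "raw" schema;
    # parameter names here are assumptions about the cache configuration.
    cache = ab.caches.MotherDuckCache(
        database="zuerich_db",
        schema_name="raw",
        api_key=os.environ["MD_ACCESS_TOKEN"],  # from the .env file
    )
    source.read(cache=cache)
```

Running `run_ingestion()` requires Docker (for the connector), a MotherDuck token, and network access, so it is kept out of module import.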
Streamlit is definitely not the best choice for this type of app (map, list, and detail views), but it is a quick way to display the data, which was the focus of this challenge.
The stack and architecture were chosen to reflect a classic data engineering workflow, even though the amount of data used in the app is quite small.
The shared MotherDuck database:
ATTACH 'md:_share/zuerich_db/8bc853bd-f2a3-4856-897b-a65e423ccb81'
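For exploring the shared database from Python, the ATTACH statement above can be wrapped in a short duckdb script. The helper name is illustrative, and a MotherDuck token must be configured in the environment:

```python
# Share URL taken from the README; requires the `motherduck_token` env var.
SHARE_URL = "md:_share/zuerich_db/8bc853bd-f2a3-4856-897b-a65e423ccb81"


def attach_share(share_url: str = SHARE_URL):
    """Attach the shared MotherDuck database and return the connection.

    The duckdb import is local so this module loads without duckdb installed.
    """
    import duckdb

    con = duckdb.connect("md:")  # "md:" opens a MotherDuck session
    con.execute(f"ATTACH '{share_url}'")
    return con
```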
- Install Python 3.11, e.g. using pyenv
- Install Poetry
- Install Docker (needed for the Airbyte connector)
In the project directory, run
$ poetry install --no-root
- Create a MotherDuck account and copy your access token from the MotherDuck settings.
- Create a database called "zuerich_db" on MotherDuck.
- Create a .env file in the project root and set MD_ACCESS_TOKEN=<your token>
- Run the Airbyte import into your MotherDuck instance:
$ python src/ingest/ingest_data.py
- Run the dbt transformations on MotherDuck
$ python src/transform/dbt_runner.py
- Run the Streamlit dashboard
$ python -m streamlit run src/dashboard/app.py
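The dashboard step above can be sketched as a small Streamlit script. The marts table and column names (`marts.dim_place`, `name`, `latitude`, `longitude`) are assumptions about the star schema, and streamlit-folium's `st_folium` is one way to embed the map; the real app may differ:

```python
"""Sketch of the Streamlit + folium dashboard; marts column names are assumed."""


def rows_to_markers(rows):
    """Turn (name, lat, lon) result rows into folium-ready marker dicts.

    The (name, lat, lon) column layout is an assumption about the marts schema.
    """
    return [
        {"name": name, "location": [lat, lon]}
        for name, lat, lon in rows
        if lat is not None and lon is not None
    ]


def render_map():
    # Heavy imports stay local so the pure helper above is importable anywhere.
    import duckdb
    import folium
    import streamlit as st
    from streamlit_folium import st_folium

    con = duckdb.connect("md:zuerich_db")  # needs motherduck_token in the env
    rows = con.execute(
        "SELECT name, latitude, longitude FROM marts.dim_place"  # assumed table
    ).fetchall()

    m = folium.Map(location=[47.3769, 8.5417], zoom_start=13)  # Zurich center
    for marker in rows_to_markers(rows):
        folium.Marker(marker["location"], popup=marker["name"]).add_to(m)
    st.title("Zurich Local")
    st_folium(m)
```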
- Testing - test the data for correctness and completeness, test the pipelines, test the dashboard.
- Orchestration - e.g. use Dagster to schedule and run the data ingestion and transformation pipelines.
- CI/CD - use GitHub Actions to build container images of the data pipeline steps.