Skip to content

Things to do, where to sleep, eat and drink and what to see in and around Zurich, Switzerland.

Notifications You must be signed in to change notification settings

lukasstolz/zurich_local

Repository files navigation

Ruff pre-commit

zurich local map

This app was created for the Airbyte + Motherduck Hackathon. It uses Open Government Data (OGD) of the city of Zurich, Switzerland: Open Data Zurich.

It displays things to do, where to sleep, eat and drink and what to see in and around Zurich, Switzerland. In total there are 11 different categories to explore.

The app ingests open API data from zurich tourism using Airbyte, stores the data on Motherduck, transforms it on MD with dbt and serves it in streamlit.

Demo

https://zurichlocal.streamlit.app/

Stack

image

  1. Data is loaded from the City of Zurich Open Data API (11 API endpoints, one per category) into the "raw" schema on Motherduck. The Airbyte API Connection was built in the Airbyte UI and exported to YAML. Mothderduck is accessed using the new Airbyte Motherduck connector. The ingestion is run through PyAirbyte.
  2. The data is transformed using DBT and DuckDB. Several custom DBT macros process the raw data. The final marts schema combines the processed sources into a star schema.
  3. Data from the marts schema is loaded into a Streamlit dashboard using duckdb and python and displayed on a folium map.

Streamlit is definitely not the best choice for this type of app (map, list and detail views), but it is a quick way to display the data, which was the main part of this challenge.

The stack and architecture were chosen to reflect a classic data engineering workflow, even though the amount data used in the app is quite small.

The used Motherduck DB:

ATTACH 'md:_share/zuerich_db/8bc853bd-f2a3-4856-897b-a65e423ccb81'

Installation

In the project dir run

$ poetry install --no-root

Usage

  1. Create a Motherduck account and database. Copy your access token from the Motherduck settings.

  2. Create a Database called "zuerich_db" on Motherduck

  3. Create a .env file in the project root and set MD_ACCESS_TOKEN=<your token>

  4. Run the Airbyte import to your Motherduck instance:

$ python src/ingest/ingest_data.py
  1. Run the DBT transformations on Motherduck
$ python src/transform/dbt_runner.py
  1. Run the streamlit dashboard
$ python -m streamlit run src/dashboard/app.py

ToDo

  1. Testing - Test data for correctness and completeness, test pipelines, test dashboard.
  2. Orchestration - e.g. use Dagster to schedule and run the data ingesition and transformation pipelines.
  3. CI/CD - use github actions to build container images of the data pipeline steps.

About

Things to do, where to sleep, eat and drink and what to see in and around Zurich, Switzerland.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages