Bicycle traffic data from several cities in a single database

kedio-labs/velo-city-db

VéloCityDB

██╗   ██╗███████╗██╗      ██████╗  ██████╗██╗████████╗██╗   ██╗██████╗ ██████╗ 
██║   ██║██╔════╝██║     ██╔═══██╗██╔════╝██║╚══██╔══╝╚██╗ ██╔╝██╔══██╗██╔══██╗
██║   ██║█████╗  ██║     ██║   ██║██║     ██║   ██║    ╚████╔╝ ██║  ██║██████╔╝
╚██╗ ██╔╝██╔══╝  ██║     ██║   ██║██║     ██║   ██║     ╚██╔╝  ██║  ██║██╔══██╗
 ╚████╔╝ ███████╗███████╗╚██████╔╝╚██████╗██║   ██║      ██║   ██████╔╝██████╔╝
  ╚═══╝  ╚══════╝╚══════╝ ╚═════╝  ╚═════╝╚═╝   ╚═╝      ╚═╝   ╚═════╝ ╚═════╝ 

This project creates an SQLite database gathering bicycle traffic measurements from several cities.

The database itself is not released as an artifact because of its large size. Instead, you can use this project to create that database locally.

Cities gather their bicycle traffic differently. For each measurement, VéloCityDB standardises that variety of data into the following dimensions:

  • City name
  • Measurement location
    • These are usually traffic counting terminals spread across the city
  • Hourly traffic count
  • Measurement timestamp
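
As an illustration of the four dimensions above, a single standardised measurement could be modelled roughly as follows. This is only a sketch: the class name, field names and types are assumptions made for this example and do not reflect the project's actual schema or code.

```kotlin
import java.time.LocalDateTime

// Hypothetical model of one standardised measurement.
// Names and types are illustrative assumptions, not the project's schema.
data class TrafficMeasurement(
    val city: String,               // City name
    val location: String,           // Measurement location, e.g. a counting terminal
    val hourlyTrafficCount: Int,    // Bicycles counted during that hour
    val timestamp: LocalDateTime    // Start of the measured hour
)

fun main() {
    // Made-up sample values, for illustration only.
    val sample = TrafficMeasurement(
        city = "Paris",
        location = "Example counting terminal",
        hourlyTrafficCount = 42,
        timestamp = LocalDateTime.of(2023, 1, 1, 8, 0)
    )
    println(sample)
}
```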

Note

VéloCityDB is a great building block for use cases such as data analysis and dashboards.

For inspiration, you can have a look at query and visualisation examples here.

Cities currently supported are, in alphabetical order:

| Country | City | Data Source | Data Format | Data Licence |
|---|---|---|---|---|
| France | Bordeaux | Capteur de trafic vélo - historique horaire | CSV | Licence Ouverte / Open Licence |
| France | Nantes | Comptages vélo de Nantes Métropole | CSV | Open Data Commons Open Database License (ODbL) |
| France | Paris | Comptage vélo - Données compteurs | CSV | Open Data Commons Open Database License (ODbL) |
| France | Rennes | Comptages vélo | CSV | Open Data Commons Open Database License (ODbL) |
| France | Strasbourg | strasbourgvelo.fr (processed by fetching data from SIRAC - flux trafic en temps réel) | CSV | Open Data Commons Open Database License (ODbL) |
| United Kingdom | Camden - London | Camden Cycle Counters Phase 2 | CSV | Open Government Licence v3.0 |

How to run

This app downloads bicycle traffic data in CSV format for all supported cities and ingests it into an SQLite database. Please be patient: the whole process can take some time!

Once you have created the database, you can have a look at some query examples here.

Via Docker (Batteries included)

# build the docker image
make docker-build

# run it
make docker-run

The database should be created in a new data directory under the project root.

On a local machine (Bring your own batteries)

You will need Java 11 or above.

VéloCityDB is released as a fat JAR that works on Linux, macOS and Windows.

Clone this repository and run the following:

# Build the fat JAR
make build-fatjar

# Run VéloCityDB
make run

The Makefile does not currently support running the app with specific flags. To pass flags, download the latest JAR from the releases page and run the following:

mkdir data

# See all available flags
java -jar velo-city-db-0.2.0-standalone.jar --help
# Ingest into a new database that will be created under ./data
java -jar velo-city-db-0.2.0-standalone.jar --data-directory-path "$(pwd)/data"
# Force re-download of existing CSV files
java -jar velo-city-db-0.2.0-standalone.jar --override-csv-files true --data-directory-path "$(pwd)/data"

Once the process has completed, you should have a new SQLite database located in the directory named data.

How to run tests

# all tests
make test

# unit tests only
make test-unit

# integration tests only
make test-integration

How to build a fat JAR

make build-fatjar

How to add a new city

First of all, thank you for your interest in this project and in adding a new city to the database!

The codebase is in Kotlin. Contributions of all technical levels are welcome.

The best way to contribute is by raising a PR that includes both business logic and tests!

The easier path ("Easy peasy lemon squeezy")

If a CSV version of the bicycle traffic data is available in a single file, you can follow the way currently supported cities are ingested into VéloCityDB. This consists of:

  • Adding the relevant city name and endpoint to src/main/resources/data-sources.yml
    • The city name must be Pascal-cased, with alphanumeric characters only, e.g. MyNewCity.
  • Writing a CSV parser for that city in the package located at src/main/kotlin/parse.
    • The CSV parser class name must follow the pattern MyNewCityCsvParser, where MyNewCity is exactly the name used in data-sources.yml.
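
The naming rules above can be checked with a few lines of Kotlin before opening a PR. This is a minimal sketch, not code from the project: `isValidCityName` and `expectedParserClassName` are hypothetical helpers, and the regular expression is only an approximation of "Pascal case, alphanumeric only" (it checks for a leading uppercase letter followed by letters and digits).

```kotlin
// Approximation of "Pascal case, alphanumeric characters only":
// one uppercase letter followed by any number of letters or digits.
// Hypothetical helper, not part of VéloCityDB.
val cityNamePattern = Regex("[A-Z][A-Za-z0-9]*")

// Regex.matches requires the whole input to match the pattern.
fun isValidCityName(name: String): Boolean = cityNamePattern.matches(name)

// The parser class name is the city name followed by "CsvParser".
fun expectedParserClassName(cityName: String): String = cityName + "CsvParser"

fun main() {
    println(isValidCityName("MyNewCity"))          // true
    println(isValidCityName("my-new-city"))        // false
    println(expectedParserClassName("MyNewCity"))  // MyNewCityCsvParser
}
```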

The more involved path ("I shall design a turbocharger all by myself")

If you need to introduce new logic, you can either convert the data into a CSV file so that you can piggyback on the CSV parsing code described above, or create completely new logic. :)
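
For the first option, converting records into CSV text needs only the Kotlin standard library. The sketch below is illustrative: `RawCount`, its fields and the column names are assumptions made for this example, and the real columns depend on what your CSV parser expects.

```kotlin
// Hypothetical record read from a non-CSV source (e.g. JSON or an API response).
data class RawCount(val location: String, val timestamp: String, val count: Int)

// Quote a field when it contains a comma, a double quote or a line break,
// doubling any embedded double quotes.
fun escapeCsv(field: String): String =
    if (field.any { it == ',' || it == '"' || it == '\n' || it == '\r' })
        "\"" + field.replace("\"", "\"\"") + "\""
    else field

// Build CSV text with a header row followed by one line per record.
fun toCsv(rows: List<RawCount>): String {
    val header = "location,timestamp,count"
    val lines = rows.map { row ->
        listOf(row.location, row.timestamp, row.count.toString())
            .joinToString(",") { escapeCsv(it) }
    }
    return (listOf(header) + lines).joinToString("\n")
}

fun main() {
    // Made-up sample values, for illustration only.
    val rows = listOf(RawCount("Main St, North", "2023-01-01T08:00", 12))
    println(toCsv(rows))
}
```

The resulting text can then be written to a single file and handled by a city-specific CSV parser as in the easier path.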

What's in the name?

VéloCity is a play on words involving:

  • On the one hand, Vélo - French for bicycle - and City
  • On the other hand, the word velocity, which can be roughly defined as "the speed and direction of motion of an object" (thank you, Wikipedia).

Wishlist

  • Add more cities
  • Ingest geolocation data where available
  • For convenience, create a Docker container that packs a pre-configured analytics tool
    • e.g. a tool such as Metabase with pre-configured queries and dashboards to have a quick overview of VéloCityDB

License

3-Clause BSD License.

See the file named LICENSE at the root of the project.
