A Capstone project for a Springboard Data Engineering Bootcamp operated by Washington University

jeff-abe-98/Taxi-Trips

Taxi Trips

A Capstone project for a Springboard Data Engineering Bootcamp operated by Washington University

Contents

  • Initial Data Collection files
    • data_collections.py
      • A file containing extraction functions
    • data_collection_nb.ipynb
      • A Python notebook used to pull the datasets with the extraction functions

/Pipeline Prototype/

/etl/

Contains the etl package, with functions for extraction, transformation, and loading in their respective modules. Some of these functions are limited to pull only a smaller portion of the data for prototyping purposes.
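One way to limit an extraction function to a smaller portion of the data is an optional row cap. The following is a minimal sketch under that assumption; `extract_rows` and its `limit` parameter are hypothetical names for illustration and are not taken from the etl package.

```python
from itertools import islice


def extract_rows(source, limit=None):
    """Yield rows from any iterable source.

    `limit` is a hypothetical prototyping switch: when it is set, iteration
    stops after that many rows; when it is None, every row is yielded.
    """
    yield from islice(source, limit)


if __name__ == "__main__":
    # Stand-in for a real data source such as an API response or a file reader.
    fake_source = ({"trip_id": i} for i in range(1000))
    sample = list(extract_rows(fake_source, limit=5))
    print(len(sample))
```

Keeping the cap as a parameter means the same function can serve both the prototype and a full run by changing a single argument.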

*_etl.py

ETL Python scripts that run the extraction, transformation, and loading steps, passing data between one another via Queues
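The queue-based hand-off between stages can be sketched as follows. This is an illustrative example only, assuming thread-based stages and the standard library `queue.Queue`; the function names, the `duration_s` field, and the sentinel convention are assumptions and may differ from what the actual `*_etl.py` scripts do.

```python
import queue
import threading

_DONE = object()  # sentinel that tells the next stage the stream has ended


def extract(out_q, records):
    """Put each raw record on the queue, then signal completion."""
    for record in records:
        out_q.put(record)
    out_q.put(_DONE)


def transform(in_q, out_q):
    """Read raw records, add a derived field, and pass them downstream."""
    while True:
        record = in_q.get()
        if record is _DONE:
            out_q.put(_DONE)
            break
        # Hypothetical transformation: convert trip duration to minutes.
        out_q.put({**record, "duration_min": record["duration_s"] / 60})


def load(in_q, sink):
    """Append transformed records to a sink (a list stands in for a database)."""
    while True:
        record = in_q.get()
        if record is _DONE:
            break
        sink.append(record)


def run_pipeline(records):
    """Run the three stages concurrently, connected by two queues."""
    raw_q, clean_q = queue.Queue(), queue.Queue()
    sink = []
    threads = [
        threading.Thread(target=extract, args=(raw_q, records)),
        threading.Thread(target=transform, args=(raw_q, clean_q)),
        threading.Thread(target=load, args=(clean_q, sink)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sink


if __name__ == "__main__":
    trips = [{"trip_id": 1, "duration_s": 600}, {"trip_id": 2, "duration_s": 90}]
    for row in run_pipeline(trips):
        print(row)
```

Because each stage only reads from one queue and writes to the next, a stage can begin working on early records while the previous stage is still producing later ones.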

*.ipynb

Python notebooks that were used to prototype processes before writing the etl package

/logs/

Contains etl log files

crontab

An example crontab file for scheduling runs of the etl scripts

Problem Statement

A phenomenon I have heard of in NYC is that it can be faster to get somewhere by bike than by car. This is believable, but NYC is a large place, and for a visitor or new resident it may be difficult to tell when this holds. This project aims to let a visitor or new resident of NYC check whether their trip is likely to be faster by bike or by taxi, and what the weather would be like if they chose to bike.

Data Sources
