DATA PIPELINE USING AIRFLOW
Apache Airflow is an open-source workflow management platform for data pipelines. It started at Airbnb in October 2015 as a solution to manage the company's increasingly complex workflows. Airflow represents each workflow as a directed acyclic graph (DAG): tasks and their dependencies are defined in Python, and Airflow handles the scheduling and execution. DAGs can be run either on a defined schedule or triggered manually.
The pipeline steps are as follows:
GATHER DATA-->TRANSFORM-->MONITOR
The code for the pipeline is in the MYDAG.PY file.
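The three steps above can be sketched as plain Python callables. The function names and sample data below are illustrative assumptions, not the contents of MYDAG.PY; in the actual DAG file each function would be wrapped in an Airflow PythonOperator task and chained with the `>>` dependency operator (gather >> transform >> monitor).

```python
# Illustrative sketch of the GATHER DATA --> TRANSFORM --> MONITOR steps.
# All names and data here are hypothetical stand-ins for the real pipeline.

def gather_data():
    # Stand-in for the real extraction step (e.g. an API call or file read).
    return [1, 2, 3]

def transform(records):
    # Stand-in transformation: double each value.
    return [r * 2 for r in records]

def monitor(records):
    # Stand-in monitoring check: fail loudly if the transform produced nothing.
    assert records, "pipeline produced no records"
    return len(records)

if __name__ == "__main__":
    raw = gather_data()
    clean = transform(raw)
    count = monitor(clean)
    print(f"processed {count} records")
```

In MYDAG.PY these callables would be registered as tasks inside a `DAG` object, which is how Airflow learns the dependency order and takes over scheduling.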
A demonstration of the pipeline can be found at this link: