FlightDataAnalysis

This project reads large datasets of airplane departure and arrival times in order to predict delays, provide near real-time computations, and visualize the results. It is built on Apache Hadoop, using MapReduce.

Its purpose is to measure airline on-time performance in the United States of America.

Summary

Air travel is one of the most important and efficient ways to travel. However, many travellers frequently encounter flight delays. Which airline carrier has the most delays? What are the main reasons flights are delayed?

The aim of this project is to produce a visualisation of the airline performance data.

These are the analyses made from the data:

  • Best airlines for avoiding delays
  • Cancellation counts and the reasons for them
  • Recommendations based on the above findings
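
The first two analyses can be pictured as a map step that emits key/value pairs per flight record and a reduce step that aggregates them. The following is only a minimal in-memory sketch in Python, not the project's actual Hadoop job. The column names (`UniqueCarrier`, `ArrDelay`, `Cancelled`, `CancellationCode`) are assumed to match the dataset's CSV header, the function names are hypothetical, and the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def map_record(row):
    """Map step: emit ((kind, name), value) pairs for one flight record.

    Column names are assumptions about the dataset's CSV header.
    """
    if row["Cancelled"] == "1":
        # Count one cancellation under its reason code.
        yield ("cancel", row["CancellationCode"] or "unknown"), 1
    elif row["ArrDelay"] not in ("", "NA"):
        # Emit the arrival delay (minutes) under the carrier code.
        yield ("delay", row["UniqueCarrier"]), float(row["ArrDelay"])


def reduce_all(pairs):
    """Reduce step: group values by key, then average delays and sum counts."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)

    avg_delay = {}
    cancellations = {}
    for (kind, name), values in grouped.items():
        if kind == "delay":
            avg_delay[name] = sum(values) / len(values)
        else:
            cancellations[name] = sum(values)
    return avg_delay, cancellations


def analyse(csv_text):
    """Run the map and reduce steps over CSV text that includes a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    pairs = (pair for row in reader for pair in map_record(row))
    return reduce_all(pairs)


# Made-up sample rows, for illustration only (not taken from the dataset).
SAMPLE = """UniqueCarrier,ArrDelay,Cancelled,CancellationCode
AA,10,0,
AA,30,0,
WN,-5,0,
WN,NA,1,B
AA,NA,1,A
WN,NA,1,B
"""

if __name__ == "__main__":
    avg_delay, cancellations = analyse(SAMPLE)
    # Carriers sorted from lowest to highest average arrival delay.
    for carrier in sorted(avg_delay, key=avg_delay.get):
        print(carrier, avg_delay[carrier])
    print(cancellations)
```

In a real Hadoop job the grouping by key would be done by the framework's shuffle phase rather than by a dictionary in memory, and the mapper and reducer would run as separate tasks across the cluster.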

Dataset

http://stat-computing.org/dataexpo/2009/the-data.html