rn-snehapriya/Air-Crash-Data-Analysis

Air-Crash-Data-Analysis

The aim of this analysis is to identify the main reasons why aircraft crash. The cause of each accident is first classified using the Naive Bayes algorithm, and the labelled data is then analysed to find patterns and trends in the crashes. Several causes are analysed, such as aircraft model problems, weather, and pilot error. The findings could support additional restrictions, more thorough pre-flight checks, better pilot training, and changes to aircraft technology. The dataset used for the analysis is from Kaggle and covers accidents from 1908 to 2009.
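To illustrate the classification step, here is a minimal sketch of a multinomial Naive Bayes text classifier in plain Python. It is not the notebook's code. It assumes that each accident has a free-text summary and that the causes are categories such as "weather", "mechanical", and "pilot error". The function names (`tokenize`, `train`, `predict`) and the training rows are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a summary and split it into alphabetic word tokens."""
    cleaned = "".join(c if c.isalpha() else " " for c in text.lower())
    return cleaned.split()


def train(samples):
    """Count documents per cause and words per cause from (summary, cause) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the cause with the highest log-posterior under Naive Bayes."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        # Log prior: share of training summaries with this cause.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Log likelihood with add-one (Laplace) smoothing.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training rows, NOT taken from the Kaggle dataset.
TRAINING = [
    ("engine failure shortly after takeoff", "mechanical"),
    ("structural failure of the wing in flight", "mechanical"),
    ("crashed in heavy fog and thunderstorm", "weather"),
    ("icing and severe turbulence in storm", "weather"),
    ("pilot misjudged altitude on approach", "pilot error"),
    ("crew continued descent below minimums pilot error", "pilot error"),
]

model = train(TRAINING)
predicted = predict(model, "aircraft lost in thunderstorm and fog")
print(predicted)
```

In practice a library implementation such as scikit-learn's `MultinomialNB` would be the more usual choice; the hand-written version only makes the prior and smoothed likelihood terms visible.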

  • Exploratory data analysis was done in Jupyter Notebook.
  • Big data technology used was Hadoop.
  • Query language used was Hive (HiveQL).
  • Data visualization for further analysis was done in Tableau.

Note: The dataset produced by the Naive Bayes classification (labelled.csv) is downloaded and then uploaded into HDFS for further analysis using Hadoop.
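As a rough illustration of the kind of aggregation that could be run over labelled.csv once it is in HDFS, the sketch below counts crashes per decade and cause in plain Python, much as a GROUP BY query would. The column names (`Date`, `Summary`, `Cause`), the date format, and the sample rows are assumptions made for this example; the README does not describe the actual schema or queries.

```python
import csv
import io
from collections import Counter

# Invented rows standing in for labelled.csv; the real columns may differ.
SAMPLE = """Date,Summary,Cause
01/15/1951,Crashed in heavy fog,weather
06/02/1958,Severe icing on climb,weather
03/09/1958,Descended below safe altitude,pilot error
11/20/1972,Engine failure after takeoff,mechanical
"""


def crashes_per_decade_and_cause(csv_file):
    """Count rows grouped by (decade, cause), assuming MM/DD/YYYY dates."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        year = int(row["Date"].rsplit("/", 1)[1])
        decade = year // 10 * 10
        counts[(decade, row["Cause"])] += 1
    return counts


counts = crashes_per_decade_and_cause(io.StringIO(SAMPLE))
for (decade, cause), n in sorted(counts.items()):
    print(decade, cause, n)
```

To run this against a real file, pass an open file handle instead of the `io.StringIO` wrapper.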
