Installation

Jupyter Notebook (https://jupyter.org/install) is the only prerequisite for running the project.

Project Motivation

This notebook predicts the sale price of houses using the Ames Housing dataset, which provides 79 features that can be used as predictors. The dataset was obtained from the Kaggle Housing Prices Competition (https://www.kaggle.com/c/home-data-for-ml-course). The score obtained on the test data ranks in the top 2% of the Kaggle competition.

File Description

housing.ipynb is the main file; it contains the statistical analysis of the dataset and the prediction model. data_description.txt describes each feature of the dataset and, for categorical features, lists the possible values. train.csv contains 80 columns (the features plus the sale price), while test.csv contains only the features and is used to generate the predictions that are submitted to Kaggle.
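The relationship between the two CSV files can be checked with a short sketch like the one below. It is not taken from the notebook: the helper names `read_header` and `target_columns` are hypothetical, the column name `SalePrice` is assumed from the Kaggle data, and the two in-memory files are tiny stand-ins for train.csv and test.csv.

```python
import csv
import io


def read_header(f):
    """Return the column names from the first row of an open CSV file."""
    return next(csv.reader(f))


def target_columns(train_header, test_header):
    """Return the columns present in the training data but absent from the test data."""
    test_cols = set(test_header)
    return [col for col in train_header if col not in test_cols]


# Tiny in-memory stand-ins for train.csv and test.csv (made-up rows, assumed column names).
train_file = io.StringIO("Id,LotArea,SalePrice\n1,8450,208500\n")
test_file = io.StringIO("Id,LotArea\n1461,11622\n")

extra = target_columns(read_header(train_file), read_header(test_file))
print(extra)
```

With the real files, the same helpers could be called on `open("train.csv", newline="")` and `open("test.csv", newline="")`; the only column missing from the test data should then be the sale price.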

How to Interact with the Project

Download the whole directory and run the Jupyter Notebook file housing.ipynb. It outputs a submission.csv file that can be submitted to Kaggle.
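For readers who want to see the shape of such a file, here is a minimal sketch of writing a submission with Python's standard `csv` module. It is an illustration, not the notebook's own code: the function name `write_submission` is hypothetical, the `Id` and `SalePrice` column names are assumed from the Kaggle submission format, and the ids and prices are made-up values.

```python
import csv
import io


def write_submission(out, ids, predictions):
    """Write one header row, then one (Id, SalePrice) row per prediction."""
    writer = csv.writer(out)
    writer.writerow(["Id", "SalePrice"])  # assumed column names
    for house_id, price in zip(ids, predictions):
        writer.writerow([house_id, price])


# Made-up ids and predicted prices, written to an in-memory buffer for demonstration.
buffer = io.StringIO()
write_submission(buffer, [1461, 1462], [120000.0, 155000.5])
submission_text = buffer.getvalue()
print(submission_text)
```

To produce a file on disk, the buffer could be replaced with `open("submission.csv", "w", newline="")`.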

Acknowledgments

Many ideas in this project were taken from the following notebooks:

https://www.kaggle.com/artyomkolas/housing-prices-nanpredct-featurselect-top-1

https://www.kaggle.com/angqx95/data-science-workflow-top-2-with-tuning