Welcome to my final project for the IBM Data Analysis with Python course. In this project, I served as a Data Analyst for a Real Estate Investment Trust that aims to invest in residential real estate.
- 📜 📈 Project Overview
- 📂 About the Dataset
- 🎯 Project Goals
- 📁 Files
- ✅ Conclusion
- 📝 License
In this project, I analyzed a housing dataset to build predictive models for house prices. Acting as a Data Analyst for a Real Estate Investment Trust, I applied data preprocessing, feature engineering, and regression modeling to explore the data and predict prices. Model performance was evaluated with R² scores, showing a practical application of data science to a real-world problem.
The dataset contains house sale prices for King County, USA, including Seattle, for homes sold between May 2014 and May 2015. You can access the dataset here: House Sales in King County, USA.
1. Data Cleaning: Addressing unnecessary columns and managing missing values.
2. Exploratory Data Analysis (EDA): Producing visualizations such as box plots and regression plots, along with statistical summaries.
3. Feature Engineering: Applying polynomial transformations.
4. Modeling: Building Ridge regression models and evaluating their performance through R² scores.
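
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper names `clean` and `fit_ridge`, the dropped column names (`id`, `Unnamed: 0`), the mean imputation, the 15% test split, and the `degree` and `alpha` values are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop identifier-like columns and fill numeric gaps with column means.

    The column names dropped here are assumed, not taken from the notebook.
    """
    df = df.drop(columns=["id", "Unnamed: 0"], errors="ignore")
    numeric = df.select_dtypes(include="number").columns
    df[numeric] = df[numeric].fillna(df[numeric].mean())
    return df


def fit_ridge(df, features, target="price", degree=2, alpha=0.1):
    """Fit scale -> polynomial features -> Ridge and return (model, test R²).

    degree, alpha, and the split size are illustrative defaults (assumptions).
    """
    X_train, X_test, y_train, y_test = train_test_split(
        df[features], df[target], test_size=0.15, random_state=1
    )
    model = make_pipeline(
        StandardScaler(),
        PolynomialFeatures(degree=degree),
        Ridge(alpha=alpha),
    )
    model.fit(X_train, y_train)
    # For a regression pipeline, score() returns R² on the given data.
    return model, model.score(X_test, y_test)
```

With the dataset loaded via `pd.read_csv("kc_house_data.csv")`, a call such as `fit_ridge(clean(df), ["sqft_living", "bedrooms"])` would return the fitted pipeline and its held-out R²; the feature list here is only an example.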
- kc_house_data.csv: The dataset, House Sales in King County, USA.
- House_Sales_in_King_Count_USA.ipynb: Jupyter notebook with code and analysis.
- README.md: Documentation for the project.
This project effectively demonstrates my application of data science techniques and methodologies to predict housing prices, achieving satisfactory model performance. Thank you for exploring this project!
This project is licensed under the Apache License 2.0. For further details, please refer to the LICENSE file.