meeks627/House_price_Prediction

House Price Prediction

A short, easy-to-follow project that trains a regression model to predict house prices using the classic Boston housing dataset. The main work is in the Colab notebook:

  • Project_4_House_Price_Prediction.ipynb

Table of contents

  • About
  • Notebook(s)
  • Requirements
  • Quick start (Colab and local)
  • Reproducible run (commands/snippets)
  • Notes about the dataset
  • How the model is trained
  • Evaluation & results

About

This project demonstrates the end-to-end steps of a supervised regression task:

  • Load a housing dataset
  • Explore and visualize the data
  • Split the data into train and test sets
  • Train an XGBoost regression model
  • Evaluate model performance

Notebook(s)

  • Project_4_House_Price_Prediction.ipynb — the primary notebook with all the code and visualizations.

Requirements

Minimal Python environment:

  • Python >= 3.8
  • numpy
  • pandas
  • matplotlib
  • seaborn
  • scikit-learn
  • xgboost

You can install the essentials with pip:

pip install numpy pandas matplotlib seaborn scikit-learn xgboost jupyter

Quick start

Run in Google Colab (recommended if you don't want to set up a local environment)

  1. Open the notebook in Colab.
  2. Run the notebook cells in order.

Run locally

  1. Clone the repository:
git clone https://github.com/meeks627/House_price_Prediction.git
cd House_price_Prediction
  2. Install dependencies (see Requirements).
  3. Start Jupyter and open the notebook:
jupyter notebook Project_4_House_Price_Prediction.ipynb
  4. Run the cells top-to-bottom.

Reproducible run / key snippets

  • The notebook uses sklearn.datasets.load_boston() to load the dataset:
import sklearn.datasets
house_price_dataset = sklearn.datasets.load_boston()

Note: sklearn.datasets.load_boston was deprecated in scikit-learn 1.0 and removed in 1.2. If the import fails, either:

  • Install a scikit-learn version that still includes load_boston (e.g., pip install scikit-learn==1.1.3), OR
  • Use fetch_openml to retrieve the Boston dataset:
from sklearn.datasets import fetch_openml
boston = fetch_openml(name="boston", version=1, as_frame=True)
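# Some columns (e.g. CHAS, RAD) may load with a categorical dtype;
# cast them to numeric if XGBoost rejects them.
X = boston.data
y = boston.target  # median home value (MEDV)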

Model training (as implemented in the notebook)

  • The notebook trains an XGBoost regressor:
from xgboost import XGBRegressor
model = XGBRegressor()
model.fit(X_train, Y_train)
preds = model.predict(X_test)
  • Typical evaluation metrics for this model, all computable with the sklearn.metrics module: MAE, MSE, RMSE, and R^2.

Notes about the dataset

  • The project uses the Boston housing dataset: 13 features, with the median home value (MEDV, in thousands of dollars) as the target.
  • The target is capped at 50.0, so entries at that value are censored rather than exact prices. Check the notebook for how they are handled and interpreted.
  • The Boston dataset has been deprecated in scikit-learn due to ethical concerns; for production work consider using alternative datasets (e.g., California housing) or a custom dataset.
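
One way to see how much the 50.0 cap matters is to count the targets sitting at that value and, optionally, set them aside before training. The sketch below shows the idea on a handful of made-up values. The prices array and the choice to drop capped rows are assumptions for illustration, not what the notebook necessarily does.

```python
import numpy as np

# Hypothetical target values (thousands of dollars); 50.0 is the cap noted above.
prices = np.array([24.0, 21.6, 50.0, 33.4, 50.0, 18.9])
CAP = 50.0

# Boolean mask of entries that sit at the cap.
capped = np.isclose(prices, CAP)
n_capped = int(capped.sum())

# One possible handling: keep only the uncapped targets.
uncapped_prices = prices[~capped]
print(f"{n_capped} of {prices.size} targets are capped at {CAP}")
```

If rows are dropped this way, the same mask has to be applied to the feature matrix so that features and targets stay aligned.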

Evaluation & results

  • The notebook includes visualizations (including a correlation heatmap), a train/test split, model training, and evaluation.
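
For readers who want to reproduce a correlation heatmap outside the notebook, here is a minimal sketch with pandas and seaborn. The data frame is synthetic: the column names RM and LSTAT are borrowed from the Boston data for readability, while the values, the colour map and the output file name are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script also runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)

# Hypothetical stand-in data: price rises with RM and falls with LSTAT.
rooms = rng.normal(6.0, 0.7, 200)
lstat = rng.normal(12.0, 5.0, 200)
df = pd.DataFrame({
    "RM": rooms,
    "LSTAT": lstat,
    "price": 5.0 * rooms - 0.5 * lstat + rng.normal(0.0, 2.0, 200),
})

# Pairwise Pearson correlations between all columns.
corr = df.corr()

# Annotated heatmap of the correlation matrix, saved to disk.
fig, ax = plt.subplots(figsize=(5, 4))
sns.heatmap(corr, annot=True, fmt=".2f", cmap="Blues", square=True, ax=ax)
fig.tight_layout()
fig.savefig("correlation_heatmap.png")
```

With the real dataset, df would be the feature frame with the target added as an extra column, so the last row of the heatmap shows how each feature correlates with price.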
