The purpose of this project is to analyze rental property data for Taipei City, provided by a real estate company, and to determine how much the company should charge for its properties.
- Inferential Statistics
- Machine Learning
- Data Visualization
- Predictive Modeling
- etc.
- R
- Python
- D3
- PostgreSQL, MySQL
- pandas, Jupyter
- HTML
- JavaScript
- etc.
We were given a dataset by a real estate company, with the task of cleaning up its errors and analyzing the rental properties. The dataset contains information about each property: the house's age, its location, and the price per unit area. It also gives the number of convenience stores and the distance to the nearest MRT station.
We use the data to see how these variables influence rental housing prices. We explored the distribution of the variables in the dataset and plotted histograms, scatter plots, and a heatmap to visualize the data. To predict rental prices, we fit a linear regression model and use it to explain and assess the prices.
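The regression step can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: the data is synthetic, the variable names (`house_age`, `mrt_distance`, `n_stores`) and the coefficients used to generate prices are made up, and the fit uses NumPy's least-squares solver rather than any particular modeling library.

```python
import numpy as np

# Synthetic stand-in data (NOT the project's dataset); all values are invented.
rng = np.random.default_rng(0)
n = 200
house_age = rng.uniform(0, 40, n)        # assumed to be in years
mrt_distance = rng.uniform(50, 3000, n)  # assumed distance units
n_stores = rng.integers(0, 11, n)        # count of convenience stores

X = np.column_stack([house_age, mrt_distance, n_stores])
# Hypothetical "true" relationship plus noise, used only to generate prices.
price = 45.0 - 0.3 * house_age - 0.005 * mrt_distance + 1.2 * n_stores
price = price + rng.normal(0, 2, n)


def fit_linear_regression(X, y):
    """Ordinary least squares; returns [intercept, coef_1, ..., coef_k]."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def r_squared(X, y, beta):
    """Share of the variance in y that the fitted model explains."""
    pred = beta[0] + X @ beta[1:]
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


beta = fit_linear_regression(X, price)
print("intercept and coefficients:", beta)
print("R^2:", r_squared(X, price, beta))
```

Each coefficient reads as the estimated change in price for a one-unit change in that variable with the others held fixed, which is what makes the model usable for explaining prices as well as predicting them.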
- Numeric column interpreted as a string
- Column not relevant to the analysis
- Rows with missing values
- Long column names
- Unknown units (distance, currency)
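The issues listed above could be handled in pandas along these lines. This is a sketch under assumptions: the raw column names, the sample values, and the shortened names are all hypothetical, and the `_m` suffix only records a guessed unit (meters) that would need to be confirmed with the data provider.

```python
import pandas as pd

# Hypothetical raw extract; column names and values are invented for illustration.
raw = pd.DataFrame({
    "No": [1, 2, 3, 4],
    "House Age In Years": [32.0, 19.5, None, 13.3],
    "Distance To The Nearest MRT Station": ["84.9", "306.6", "562.0", "1,164.8"],
    "Number Of Convenience Stores": [10, 9, 5, 0],
    "House Price Of Unit Area": [37.9, 42.2, 47.3, 54.8],
})


def clean(df):
    df = df.copy()

    # Numeric column interpreted as a string: strip separators, then convert.
    col = "Distance To The Nearest MRT Station"
    as_text = df[col].astype(str).str.replace(",", "", regex=False)
    df[col] = pd.to_numeric(as_text, errors="coerce")

    # Column not relevant to the analysis: the row counter carries no signal.
    df = df.drop(columns=["No"])

    # Rows with missing values: drop them (imputing would be an alternative).
    df = df.dropna().reset_index(drop=True)

    # Long column names: shorten them. The unit suffix is an assumption.
    df = df.rename(columns={
        "House Age In Years": "house_age",
        "Distance To The Nearest MRT Station": "mrt_distance_m",
        "Number Of Convenience Stores": "n_stores",
        "House Price Of Unit Area": "price_per_unit_area",
    })
    return df


cleaned = clean(raw)
print(cleaned.dtypes)
print(cleaned)
```

Using `errors="coerce"` turns any value that still cannot be parsed into a missing value, so the later `dropna` step removes it instead of the conversion failing partway through.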