In this project I used K-nearest neighbors (KNN) and Random Forest to check whether there is a correlation between the average_rating of books on Goodreads and:
- the language of the book
- num_pages
- ratings_count
- text_reviews_count
During the project I ran into an imbalanced dataset and used an up-sampling method to try to solve the problem.
The project is divided into the following sections:
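As a rough illustration of the up-sampling idea mentioned above, here is a minimal sketch of random over-sampling using only the Python standard library. The function name `upsample_minority_classes`, the `rating_class` label, and the toy rows are all hypothetical and are not taken from the project's notebook; it also assumes the ratings have been grouped into discrete classes. In a pandas/scikit-learn workflow, `sklearn.utils.resample` with `replace=True` could play the same role.

```python
import random
from collections import Counter, defaultdict


def upsample_minority_classes(rows, label_key, seed=0):
    """Randomly duplicate rows of the smaller classes until every class
    has as many rows as the largest one (random over-sampling)."""
    rng = random.Random(seed)

    # Group the rows by class label.
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)

    # Size of the largest (majority) class.
    target = max(len(group) for group in by_class.values())

    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap up to the majority size.
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Toy data: hypothetical rating classes, not taken from the Goodreads data.
books = (
    [{"num_pages": 300 + i, "rating_class": "medium"} for i in range(8)]
    + [{"num_pages": 150 + i, "rating_class": "low"} for i in range(2)]
    + [{"num_pages": 500, "rating_class": "high"}]
)

balanced_books = upsample_minority_classes(books, "rating_class")
class_counts = Counter(b["rating_class"] for b in balanced_books)
print(class_counts)
```

If a sketch like this is used before training, it is usually safer to up-sample only the training split, so that duplicated rows do not leak into the test set.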
1. Data Overview
2. Data Cleaning
3. Data Adjusting
4. Applying Machine Learning Model
4.1. K-nearest neighbors
5. The Problem
6. Random Forest
7. Up-sampling the minority classes as a solution
8. KNN after re-sampling and comparison