- Python 3.8+
- pandas
- NumPy
- scikit-learn
- gensim
- nltk
- textblob
- yellowbrick
The project analyzes the Amazon Fine Food Reviews dataset using several text mining techniques. It starts with an exploratory analysis of the data, followed by text preprocessing. Classification models are then trained to label reviews from their text, and clustering models are used to group similar reviews.
Google Drive folder with the trained models and the datasets: https://drive.google.com/drive/u/1/folders/1veNClNl7CxCTFHVNY2Fp29hcMoEj-the
The project aims to answer the following questions:
- Can a review be classified as good or bad from its text?
- Is it possible to predict the user’s rating from the text of the review?
- Is it possible to group similar reviews?
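
The questions above can be illustrated with a minimal scikit-learn sketch. This is not the project's actual code: the toy reviews, the rule that a rating of 4 or more counts as "good", and the choice of TF-IDF features with logistic regression and k-means are all assumptions made for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in for the dataset: (review text, star rating).
# These rows are invented for illustration, not taken from the real data.
reviews = [
    ("great taste, my dog loves these treats", 5),
    ("delicious coffee, great aroma", 5),
    ("great snack, delicious and fresh", 4),
    ("terrible taste, stale and awful", 1),
    ("awful coffee, terrible smell", 1),
    ("stale snack, awful and bland", 2),
]
texts = [text for text, _ in reviews]
ratings = [rating for _, rating in reviews]

# Assumption: ratings of 4 or 5 are "good", everything else is "bad".
labels = ["good" if rating >= 4 else "bad" for rating in ratings]

# Question 1: classify a review as good or bad from its text.
classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
classifier.fit(texts, labels)
predicted = classifier.predict(["delicious great", "stale terrible"])
print(list(predicted))

# Question 3: group similar reviews without using any labels.
vectors = TfidfVectorizer().fit_transform(texts)
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
print(list(clusters))
```

For the second question, the same pipeline could be fitted on `ratings` instead of `labels`, which turns the task into a five-class problem. On the real dataset a held-out test split would be needed to judge any of these models.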