Retweet-Prediction-during-COVID-19

Note: This is a work in progress; the work done so far is still being built upon.

My work on this problem is adapted from the 29th ACM International Conference on Information and Knowledge Management (CIKM2020) Analytics Retweet Prediction challenge (https://www.cikm2020.org/covid-19-retweet-prediction-challenge/). The goal is to “predict the popularity of COVID-19-related tweets in terms of the number of their retweets,” that is, to predict a tweet’s retweet count from the other features associated with it.

Steps Followed:

Exploratory Data Analysis
Data Preprocessing to remove columns that are not required and stop words; duplicates are also handled here
Normalization of numerical columns
Model Training and Testing
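
The preprocessing and normalization steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the column names (`text`, `followers`, `retweets`, `tweet_id`), the tiny `STOP_WORDS` set, and the choice of min-max scaling as the normalization method are all assumptions made for the example.

```python
# Hypothetical stop-word list; a real project would use a much fuller one.
STOP_WORDS = {"the", "a", "an", "is", "of", "to", "and", "in"}


def remove_stop_words(text):
    """Lowercase the text and drop tokens found in STOP_WORDS."""
    return " ".join(t for t in text.lower().split() if t not in STOP_WORDS)


def drop_duplicates(rows, key):
    """Keep only the first row seen for each value of `key`."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def min_max_normalize(values):
    """Scale values linearly into [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def preprocess(rows, keep_columns, numeric_columns, text_column="text"):
    """Drop duplicates and unneeded columns, clean text, scale numeric columns."""
    rows = drop_duplicates(rows, key=text_column)
    cleaned = []
    for row in rows:
        kept = {c: row[c] for c in keep_columns}
        kept[text_column] = remove_stop_words(kept[text_column])
        cleaned.append(kept)
    for col in numeric_columns:
        scaled = min_max_normalize([r[col] for r in cleaned])
        for r, value in zip(cleaned, scaled):
            r[col] = value
    return cleaned


# Hypothetical sample rows, for illustration only.
rows = [
    {"tweet_id": 1, "text": "The virus is spreading", "followers": 100, "retweets": 5},
    {"tweet_id": 2, "text": "The virus is spreading", "followers": 100, "retweets": 5},
    {"tweet_id": 3, "text": "Stay safe and wash hands", "followers": 300, "retweets": 20},
]
clean = preprocess(rows, keep_columns=["text", "followers", "retweets"],
                   numeric_columns=["followers"])
```

In this sketch the target column (`retweets`) is deliberately left unscaled; whether to transform the target is a separate modelling decision.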

The following improvements can be (and are in the process of being) made:

Hyperparameter tuning for the models built.
Instead of a single train-test split, we could use cross-validation to (a) select a model from the various models trained and (b) tune hyperparameters.
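
As a rough illustration of the cross-validation idea, the sketch below scores each candidate model with k-fold cross-validation and picks the one with the lowest average error. It uses only the Python standard library. The two candidate models (`fit_mean`, `fit_linear`), the single-feature setup, and the use of mean squared error are assumptions made for the example, not the models or metric used in this project.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_mean(xs, ys):
    """Baseline model: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """One-feature least-squares line y = a * x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return lambda x: mean_y
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def cross_val_score(fit, xs, ys, k=5, seed=0):
    """Average held-out MSE of `fit` over k folds."""
    folds = k_fold_indices(len(xs), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit([xs[j] for j in train_idx], [ys[j] for j in train_idx])
        preds = [model(xs[j]) for j in test_idx]
        scores.append(mse([ys[j] for j in test_idx], preds))
    return sum(scores) / len(scores)


def select_model(candidates, xs, ys, k=5):
    """Return the name of the candidate with the lowest CV error, plus all scores."""
    scores = {name: cross_val_score(fit, xs, ys, k) for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores


# Synthetic data, for illustration only.
xs = list(range(20))
ys = [3 * x + 1 for x in xs]
best, scores = select_model({"mean": fit_mean, "linear": fit_linear}, xs, ys, k=5)
```

Hyperparameter tuning fits the same pattern: each hyperparameter setting becomes one entry in `candidates`, and the setting with the lowest cross-validated error is kept.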
