The sub-problems I tried to solve are:
- Building a testing data set by extracting tweets from Twitter that contain specific keywords, using the tweepy API
- Building a classification model that would predict whether an input text can be classified
- Writing an algorithm that would process input text and classify it as "Hate Content", "Not Hate Content", or "Neutral/unidentifiable" using the model defined above.
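To make the third bullet concrete, here is a minimal sketch of how a trained model's output could be mapped onto the three labels. It is not the model from my script: it assumes a tiny bag-of-words naive Bayes classifier with add-one smoothing as a stand-in, and the names `train`, `classify`, `TOY_EXAMPLES`, and the `threshold` value are all hypothetical. The training texts are neutral placeholders, not real tweets.

```python
import math
from collections import Counter

LABELS = ("Hate Content", "Not Hate Content")
FALLBACK = "Neutral/unidentifiable"

# Hypothetical placeholder data; a real data set would come from labelled tweets.
TOY_EXAMPLES = [
    ("placeholder_slur go away", "Hate Content"),
    ("i despise placeholder_slur people", "Hate Content"),
    ("lovely weather today", "Not Hate Content"),
    ("great game last night", "Not Hate Content"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train(examples):
    """Count words and documents per label from (text, label) pairs."""
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    for text, label in examples:
        word_counts[label].update(tokenize(text))
        doc_counts[label] += 1
    return word_counts, doc_counts


def classify(text, model, threshold=0.75):
    """Return one of the two labels, or FALLBACK when the model is unsure.

    Assumes every label has at least one training example.
    """
    word_counts, doc_counts = model
    vocab = set().union(*word_counts.values())
    tokens = [t for t in tokenize(text) if t in vocab]
    if not tokens:
        # Nothing the model has seen before, so it cannot judge the text.
        return FALLBACK

    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in LABELS:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # log prior
        for t in tokens:
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((word_counts[label][t] + 1) / (total_words + len(vocab)))
        log_scores[label] = score

    # Turn log scores into normalized probabilities.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    best = max(weights, key=weights.get)
    confidence = weights[best] / sum(weights.values())
    return best if confidence >= threshold else FALLBACK


model = train(TOY_EXAMPLES)
print(classify("placeholder_slur people", model))
print(classify("hello world", model))
```

The threshold is the design choice that produces the third label: a text is reported as "Neutral/unidentifiable" either when none of its words appear in the training data or when neither class is clearly more likely than the other.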
Since this is my first time building a text classification (NLP) model completely from scratch, I could not complete the second bullet. However, I have listed some more potential ideas at the end of my script that could be implemented if I had the time.
Thanks for the opportunity! This was an exciting challenge, and I have a lot to learn.
Pranati Modumudi