star-wars-text-analysis

A text analysis project on a collection of script dialogue between characters from Star Wars Episodes IV, V, and VI.

Getting started

Star Wars is a popular film franchise set in a galaxy far, far away. This is a collection of script dialogue between characters from the original trilogy (Episodes IV-VI), the first three movies released. Since it's a holiday (and just because Star Wars is awesome), this data should serve as a fun way to practice text mining and linguistic analysis.

The source files are listed below:

SW_EpisodeIV.txt - Script of Episode IV: A New Hope, with columns character and dialogue.

SW_EpisodeV.txt - Script of Episode V: The Empire Strikes Back, with columns character and dialogue.

SW_EpisodeVI.txt - Script of Episode VI: Return of the Jedi, with columns character and dialogue.
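
As a rough illustration of working with the two columns, here is a minimal Python sketch of loading one script into (character, dialogue) pairs. It is not the project's own code, which is written in R. The file layout is an assumption: a header row followed by space-separated, double-quoted fields, possibly with a leading row id. The sample rows and the `read_script` helper are hypothetical.

```python
import csv
import io

# Made-up sample mimicking the ASSUMED layout: a header row, then
# space-separated, double-quoted fields (row id, character, dialogue).
SAMPLE = '''"character" "dialogue"
"1" "THREEPIO" "Did you hear that?"
"2" "LUKE" "Hurry up! Come with me!"
'''

def read_script(handle):
    """Return a list of (character, dialogue) pairs from an open text file."""
    reader = csv.reader(handle, delimiter=" ", quotechar='"')
    next(reader)  # skip the header row
    # Keep the last two fields so an optional leading row id is ignored.
    return [(row[-2], row[-1]) for row in reader if len(row) >= 2]

rows = read_script(io.StringIO(SAMPLE))
for character, dialogue in rows:
    print(f"{character}: {dialogue}")
```

To try it on a real file, pass an open file object instead of the in-memory sample, for example `read_script(open("SW_EpisodeIV.txt", newline=""))`, and adjust the delimiter if the actual layout differs.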

Software used

I used R's tidytext, tm, and wordcloud packages for the text analysis, dplyr for cleansing the data, and ggplot2 for plotting.
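
For readers unfamiliar with this kind of analysis, the sketch below shows the basic idea behind a word-frequency count: tokenize the dialogue, drop stop words, and tally what remains. It is a plain-Python analogue written for illustration, not the tidytext/tm code used in this project. The sample rows, the tiny stop-word list, and the `word_counts` helper are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical (character, dialogue) rows standing in for a parsed script.
ROWS = [
    ("LUKE", "I want to learn the ways of the Force."),
    ("OBI-WAN", "The Force will be with you, always."),
]

# A deliberately tiny stop-word list, for illustration only.
STOP_WORDS = {"i", "to", "the", "of", "will", "be", "with", "you"}

def word_counts(rows):
    """Count lowercase word tokens across all dialogue, skipping stop words."""
    counts = Counter()
    for _character, dialogue in rows:
        tokens = re.findall(r"[a-z']+", dialogue.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts

counts = word_counts(ROWS)
print(counts.most_common(3))
```

Counts like these are the usual input for a word cloud or a bar chart of the most frequent terms.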

Finally, I created an IPython notebook containing the text analysis and also checked in an R Markdown file.

Author

Sridhar Varanasi