Skip to content

Latest commit

 

History

History
 
 

subreddit-algebra

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 

Subreddit Algebra

This directory contains the code and data behind the story: Dissecting Trump's Most Rabid Online Following

The raw data (an online cache of Reddit comments going back to 2005) is from Google's BigQuery and more information about the data can be found here.

Details about the three files of code in this folder:

File Description
processData.sql SQL code for filtering, processing and formatting Reddit comment data from Google's BigQuery. (Note that if you click on the raw data link above, this SQL query will automatically be loaded).
subredditVectorAnalysis.R Conducts a latent semantic analysis of over 50,000 subreddits that creates a vector representation of each one based on commenter co-occurence. It also implements "subreddit algebra:" the ability to add and subtract different subreddits to reveal how they relate to one another.
computeUserOverlap.sql A separate SQL query used for computing the user overlap between r/The_Donald and other subreddits