Social network analysis of Twitter data with Hadoop, Flume, Hive and R (igraph).

buhrmann/tweetonomy

Tweetonomy

A Hadoop/R environment for analyzing the structure and dynamics of Twitter communities. It is currently used to study the interaction between the media and the ideological communities that form around the major political parties in Spain.

Tweets are piped as JSON through a Flume source into Hadoop's filesystem (HDFS). The custom Flume source uses Twitter4j to follow a number of accounts and to track a list of keywords. Hive is set up to create daily-partitioned tables for the collected tweets. At periodic intervals, Hive queries aggregate, summarize and extract the information needed to build the social network of retweets and mentions, and to associate content with each node in that network. In R, the igraph package is then used to analyze the communities in the tweet network, to produce time series capturing the rise and fall of popular content (e.g. hashtags), and to analyze how communities and the media influence each other.
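The network-building step described above can be illustrated with a minimal sketch. It is not the repository's implementation, which does this aggregation in Hive; it only shows the idea of turning raw tweet JSON into a weighted edge list of retweets and mentions. The field names (`user.screen_name`, `retweeted_status`, `entities.user_mentions`) assume the classic Twitter API v1.1 tweet layout, and the function names `extract_edges` and `build_edge_weights` are hypothetical. Treating a retweet as a single retweet edge, rather than also counting its mentions, is likewise an assumption.

```python
import json
from collections import Counter


def extract_edges(tweet):
    """Return (source, target, kind) tuples for one decoded tweet.

    Assumes the Twitter API v1.1 field layout; the schema actually
    stored by the project's Flume source is not shown in the README.
    """
    author = tweet["user"]["screen_name"]
    retweeted = tweet.get("retweeted_status")
    if retweeted:
        # Assumption: a retweet contributes one retweet edge only,
        # so the retweeted user is not double-counted as a mention.
        return [(author, retweeted["user"]["screen_name"], "retweet")]
    mentions = tweet.get("entities", {}).get("user_mentions", [])
    return [(author, m["screen_name"], "mention") for m in mentions]


def build_edge_weights(json_lines):
    """Count how often each (source, target, kind) edge occurs.

    `json_lines` is any iterable of strings holding one JSON tweet
    per line; blank lines are skipped.
    """
    weights = Counter()
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        weights.update(extract_edges(json.loads(line)))
    return weights


# Made-up sample input, for illustration only.
sample = [
    '{"user": {"screen_name": "alice"}, '
    '"retweeted_status": {"user": {"screen_name": "bob"}}}',
    '{"user": {"screen_name": "alice"}, '
    '"entities": {"user_mentions": [{"screen_name": "carol"}]}}',
]

for (source, target, kind), weight in sorted(build_edge_weights(sample).items()):
    print(source, target, kind, weight)
```

A weighted edge list of this shape (source, target, weight) is the kind of table that igraph can load on the R side, for example with `graph_from_data_frame`.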
