Juan-Pablo Velez edited this page Nov 17, 2013 · 13 revisions

Tweedr is an application that seeks to use information from social media to better inform disaster relief efforts.

The problem: a flood

Disasters are chaotic, creating massive changes in the local environment and delivering damage unpredictably throughout the affected area. At the same time, disasters often dramatically reduce the ability to communicate and travel, by downing telephone lines, cutting power, and wrecking roads and bridges.

However, recovery efforts need the most up-to-date information available, to help them:

  • Know what needs are most important to address.
    • Different areas will have different needs. One neighborhood might need help erecting temporary shelter, while another only needs food and water supplies.
  • Decide what organizations are best suited to address different groups of needs.
  • Plan recovery missions.
    • If some roads are impassable, the entire trip might need to be adjusted.

This lack of information puts one more step between relief and the disaster victims: relief workers must perform their own damage reconnaissance while out in the field, then return to headquarters to report what they found and plan their next mission.

The solution: a flood of text

Social media can fill in some of these information gaps. Mobile communication usually outlives landlines and other modes of communication, and in recent disasters the volume of messages (SMSes and tweets) sent from the disaster site has been enormous. For example, one dataset we are working with, from Hurricane Sandy, consists of five million tweets containing the token "sandy" produced between October 27th and November 7th, 2012.

Effective use of this information requires processing large amounts of natural-language text in a short amount of time. Currently, this means that workers at the Red Cross (or some other relief organization) have to sift through thousands of tweets, SMSes, and calls to extract useful information that will help them direct or carry out on-the-ground relief efforts.

Disaster relief agencies, like the Red Cross, UN, or FEMA, need to quickly assimilate information coming in from various sources, so that they know what's damaged and where. This tool aims to expedite on-site efforts by enhancing the relief agencies' knowledge with information extracted from these messages.

Natural Language Processing (NLP) is one way to help with this processing. More general machine learning (ML) methods can help to geolocate tweets based on text and the social graph. The solution that we have developed here, Tweedr, is primarily an API and a UI:

  1. An API / Pipeline that can process a stream of text (along with metadata) and add useful annotations on top of that text, using machine learning to learn from previous disasters.
  2. A user interface for effectively viewing and consuming these annotations in aggregate.
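As a rough illustration of the first component, the sketch below shows one way a pipeline could attach annotations to a stream of messages. It is not Tweedr's actual API: the `Message` class, the two annotator functions, and `annotate_stream` are all hypothetical names, and the keyword check stands in for a model trained on previous disasters.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Iterator, List


@dataclass
class Message:
    """One incoming message: raw text, its metadata, and accumulated annotations."""
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)
    annotations: Dict[str, object] = field(default_factory=dict)


# An annotator reads a message and returns new annotations to merge into it.
Annotator = Callable[[Message], Dict[str, object]]


def token_annotator(message: Message) -> Dict[str, object]:
    """Split the text into lowercase whitespace-delimited tokens."""
    return {"tokens": message.text.lower().split()}


def keyword_annotator(message: Message) -> Dict[str, object]:
    """Hypothetical stand-in for a trained model: flag damage-related words."""
    keywords = {"flooded", "damage", "collapsed"}
    tokens = message.text.lower().split()
    # Strip trailing punctuation so "flooded!" still matches "flooded".
    return {"mentions_damage": any(t.strip(".,!?") in keywords for t in tokens)}


def annotate_stream(
    messages: Iterable[Message], annotators: List[Annotator]
) -> Iterator[Message]:
    """Lazily run every annotator over each message as it arrives."""
    for message in messages:
        for annotator in annotators:
            message.annotations.update(annotator(message))
        yield message
```

Because `annotate_stream` is a generator, it can consume an unbounded stream one message at a time, and a user interface could aggregate the resulting `annotations` dictionaries as they are produced.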

The impact

Better-informed disaster relief providers can more efficiently address problems, e.g.,

  • prioritize tasks
  • deliver supplies
  • route vehicles

Goals of Tweedr

We have crowdsourced annotations of tweets sent during Hurricane Sandy and the Joplin tornado.

We also have a limited number of token-level labels, marking sequences of tokens as useful for a variety of sub-tasks.

With this set of labeled data, our goal is to filter tweets from a new disaster for usefulness, and then to classify the useful ones into previously seen classes.

Example

In one example of a target annotation domain, each tweet was assigned one of a number of categories:

  • Casualties and damage
  • Caution and advice
  • Donations of money, goods or services
  • People missing, found or seen
  • Unknown
  • Information source

For this type of data, our immediate goal is to replicate human annotations via machine learning.
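To make "replicating human annotations" concrete, here is a minimal sketch of a text classifier trained on labeled tweets. It uses a small multinomial Naive Bayes model with add-one smoothing, chosen only for illustration; this page does not say which model Tweedr uses. The `NaiveBayes` class and the training sentences are made up for the example, while the category names come from the list above.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.label_counts = Counter()             # training documents per label
        self.token_counts = defaultdict(Counter)  # token frequencies per label
        self.vocabulary = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.label_counts[label] += 1
            for token in tokenize(text):
                self.token_counts[label][token] += 1
                self.vocabulary.add(token)
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            label_total = sum(self.token_counts[label].values())
            for token in tokenize(text):
                count = self.token_counts[label][token]
                score += math.log((count + 1) / (label_total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training examples standing in for crowdsourced annotations.
train_texts = [
    "two people injured and homes destroyed",
    "roof collapsed several injured",
    "stay indoors and avoid flooded roads",
    "boil water before drinking stay safe",
    "donate blankets and food to the shelter",
    "text to donate money for relief",
]
train_labels = [
    "Casualties and damage",
    "Casualties and damage",
    "Caution and advice",
    "Caution and advice",
    "Donations of money, goods or services",
    "Donations of money, goods or services",
]

model = NaiveBayes().fit(train_texts, train_labels)
```

Calling `model.predict(...)` on an unseen tweet returns the most probable category. In practice, the model's agreement with human annotators would be measured on held-out labeled tweets that were not used for training.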

See Ontology for a more structured listing of the categories we are interested in.

Related work

The Related Work page lists existing applications with similar goals and related papers.

Getting involved

Want to get involved? Start with Deploying Tweedr on your own server, then check out the Contributing page.

Partners

University of Chicago logo + QCRI logo

Data Science for Social Good partnered with the Qatar Computing Research Institute (QCRI) to make this project happen.
