Bayesian Aggregation of Categorical Distributions with Applications in Crowdsourcing

This repository contains the code, manuscript, datasets, and other materials accompanying our research paper published at the International Joint Conference on Artificial Intelligence (IJCAI).

Table of Contents

1. Motivation
2. The Datasets
3. Our Model
4. Software Implementation
- 4.1. Getting the code
- 4.2. Dependencies
5. Citation
6. License

1. Motivation

Consider the following exemplar news article:

Figure 1. Source: BBC News, 2017.

Although the title may give us a coarse indication of the content of the article (e.g. Politics), a careful reading of the text reveals that about 66% of the article is about Diplomacy, 28% about Arms Control, 3% about Geopolitics, and 3% about Foreign Policy.

Figure 2. Example judgment of proportions.

Such categorisation is valuable in areas such as information retrieval and recommendation, as it allows for finer-grained searches and organisation than classification into single categories. Other examples include labelling the proportions of sentiments (e.g. surprise or joy), or labelling images in which multiple objects are present at the same time.

Despite years of advances in automated classification, humans still perform better on such tasks. As a result, crowdsourcing has become an increasingly popular way to leverage human annotators of various abilities and domain experience to perform tasks that would be too difficult or expensive to process computationally or with experts, but that require only simple instructions to complete.

However, collecting reliable judgments from unknown members of a crowd (also called workers) remains a challenging task. It is well known that crowdsourcing platforms suffer from malicious participants (also called spammers) who provide judgments randomly, regardless of the document. Such spammers can constitute up to 45% of the population of workers. This increases the cost of acquiring judgments and degrades the accuracy of the aggregation.

In this work, we introduce a new method for aggregating judgments of proportions across multiple categories that, for the first time, accounts for spammers.

2. The Datasets

This repository uses a total of three datasets to evaluate accuracy, including two novel sets of crowdsourced judgments totalling 796 annotations about the proportions of objects in images and of colours in countries' flags.

SemEval-2007. Each worker was presented with a list of news headlines and was asked to give numeric judgments between zero and a hundred for each of six sentiments. A total of 1,000 judgments are available across 100 news headlines.

Figure 3. Example of annotated news headline taken from the SemEval-2007 dataset.
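Since the judgments in this dataset are scores between zero and a hundred rather than proportions, they presumably need to be normalised before being treated as a categorical distribution. The snippet below is a minimal sketch of one way to do that. The function name `to_proportions`, the uniform fallback for all-zero scores, and the example values are assumptions made for illustration and are not taken from the repository.

```python
def to_proportions(scores):
    """Normalise non-negative scores so that they sum to one."""
    total = sum(scores.values())
    if total == 0:
        # No signal at all: fall back to a uniform distribution (an assumption).
        return {label: 1.0 / len(scores) for label in scores}
    return {label: value / total for label, value in scores.items()}


# Hypothetical judgment for one headline; labels and values are illustrative only.
judgment = {"anger": 10, "disgust": 0, "fear": 20,
            "joy": 5, "sadness": 50, "surprise": 15}
proportions = to_proportions(judgment)
print(proportions)
```

Other normalisations are possible; this one simply preserves the relative weight the worker gave to each sentiment.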

IAPR-TC12. Each worker was presented with images and was asked to estimate the proportion of each of six regions in them (e.g. landscape/nature or man-made). We collected a total of 336 judgments from a set of 16 images.

Figure 4. Example of a judgment in a rural scene from the IAPR-TC12 dataset performed with a pie chart.

Colours. Twenty-three participants were asked to judge the proportions of 10 colours in the flags of 20 countries. We crowdsourced a total of 460 judgments of proportions.

Figure 5. Example of flags taken from the Colours dataset.

3. Our Model

Our proposed model (which we call multi-category independent Bayesian classifier combination, or MBCC for short) builds on the strengths of prior approaches to aggregating distributions while at the same time accounting for spammers. In particular, we extend independent Bayesian classifier combination (IBCC) and associate with each document a categorical distribution representing the proportions of each category.

The factor graph below illustrates the generative process (that is, the process by which our model assumes the workers' judgments of proportions were generated), from which we learn both the proportions per document and the accuracy of each worker. This is a typical factor graph in which each node represents a random variable and each connection a probabilistic conditional dependency.

Figure 6. Factor graph of MBCC.

- We start by sampling a confusion matrix for each worker. Each row \(\pi\) of a confusion matrix is distributed according to a Dirichlet distribution with hyperparameter \(\alpha\).
- We then sample a categorical distribution \(\Lambda\) for each document, which represents the aggregated judgment of the proportions by all workers. This categorical distribution \(\Lambda\) is similarly drawn from a Dirichlet prior with hyperparameter \(\epsilon\).
- We then repeatedly sample this distribution \(\Lambda\) \(n\) times to obtain multiple discrete categories \(z\).
- We then use those samples \(z\) as indices into the workers' confusion matrices \(\pi\), and sample discrete judgments \(c\) from the appropriate row of the confusion matrix of each worker.
- Finally, we find the most likely categorical distributions \(\Lambda\) which generated the samples \(c\) for all documents and workers.
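The sampling steps above can be sketched as a short forward simulation. This is only an illustration using the Python standard library, not the Infer.NET implementation in this repository, and it does not cover the final inference step. The function names, the use of one hyperparameter vector per confusion-matrix row, the choice to share the samples \(z\) across workers, and all example values are assumptions.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Draw one category index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate(num_workers, num_docs, num_categories, n, alpha, epsilon, seed=0):
    rng = random.Random(seed)
    # One confusion matrix per worker; row j is drawn from Dirichlet(alpha[j]).
    pi = [[sample_dirichlet(alpha[j], rng) for j in range(num_categories)]
          for _ in range(num_workers)]
    # One categorical distribution Lambda per document, drawn from Dirichlet(epsilon).
    Lambda = [sample_dirichlet(epsilon, rng) for _ in range(num_docs)]
    # n discrete categories z per document, drawn from that document's Lambda.
    # Assumption: the same z samples are shared by all workers.
    z = [[sample_categorical(Lambda[d], rng) for _ in range(n)]
         for d in range(num_docs)]
    # Each worker emits a judgment c from the row of their matrix indexed by z.
    c = [[[sample_categorical(pi[k][z[d][i]], rng) for i in range(n)]
          for d in range(num_docs)]
         for k in range(num_workers)]
    return pi, Lambda, z, c


# Illustrative settings: 4 categories, priors favouring the diagonal,
# i.e. workers who tend to report the category they were shown.
K = 4
alpha = [[5.0 if i == j else 1.0 for j in range(K)] for i in range(K)]
epsilon = [1.0] * K
pi, Lambda, z, c = generate(num_workers=3, num_docs=2, num_categories=K,
                            n=10, alpha=alpha, epsilon=epsilon)
print(Lambda[0], c[0][0])
```

In this sketch a spammer would correspond to a worker whose confusion-matrix rows are all close to the same distribution, so that the judgment \(c\) carries little information about \(z\).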

4. Software Implementation

All source code used to generate the results and figures in the paper is in the src and scripts directories. The data used in this study is provided in data, and the sources for the manuscript text and figures are in manuscript. The poster and presentation can be found in poster/poster.pdf and poster/presentation.pdf respectively.

4.1. Getting the code

You can download a copy of all the files in this repository by cloning the git repository:

git clone https://github.com/alexandry-augustin/mbcc.git

or download a zip archive.

4.2. Dependencies

The model was developed on Ubuntu Linux using MonoDevelop as IDE.

You’ll need a working Python environment and the Infer.NET 2.6 library to run the code.

5. Citation

If you use our code or dataset, please cite as follows:

@inproceedings{augustin2017mbcc,
  title={Bayesian aggregation of categorical distributions with applications in crowdsourcing},
  author={Augustin, Alexandry and Venanzi, Matteo and Hare, J and Rogers, A and Jennings, NR},
  year={2017},
  organization={AAAI Press/International Joint Conferences on Artificial Intelligence}
}

6. License

All source code is made available under the MIT license. You can freely use and modify the code, without warranty, so long as you provide attribution to the authors. See LICENSE for the full license text.

The manuscript text is not open source. The authors reserve the rights to the article content, which has been published in the proceedings of the International Joint Conference on Artificial Intelligence (IJCAI).
