This dataset includes the official statistics on the 11,538 athletes (6,333 men and 5,205 women) and 306 events at the 2016 Olympic Games in Rio de Janeiro. The data was taken from the Rio 2016 website, which has since been deleted. You can read more about that in a blog post, "Data for the 2016 Olympic Games in Rio de Janeiro".
The athlete data is stored in athletes.csv, with one athlete per row and twelve columns. Empty cells are null values.
id
- Athlete id
- Integer between 1 and 1,000,000,000
- Unique
- No null values
name
- Athlete's full name
- String up to forty characters in length
- Not unique
- No null values
nationality
- Athlete's nationality
- One of the IOC's 206 three-letter country codes, or ROT for members of the Refugee Olympic Team. Kuwaiti athletes' nationality is given as IOA due to the suspension of the Kuwait Olympic Committee
- Not unique
- No null values
sex
- Athlete's sex
- One of two lower-case string values:
male
female
- Not unique
- No null values
date_of_birth
- Athlete's date of birth
- YYYY-MM-DD format
- Not unique
- No null values
height
- Athlete's height, in metres
- Floating-point number
- Not unique
- Contains null values
weight
- Athlete's weight, in kilograms
- Integer
- Not unique
- Contains null values
sport
- The sport in which the athlete competes, as defined by the IOC
- One of 28 lower-case string values
aquatics
archery
athletics
badminton
basketball
boxing
canoe
cycling
equestrian
fencing
football
golf
gymnastics
handball
hockey
judo
modern pentathlon
rowing
rugby sevens
sailing
shooting
table tennis
taekwondo
tennis
triathlon
volleyball
weightlifting
wrestling
- Not unique
- No null values
gold
- Number of gold medals won by the athlete
- Integer
- Not unique
- No null values
silver
- Number of silver medals won by the athlete
- Integer
- Not unique
- No null values
bronze
- Number of bronze medals won by the athlete
- Integer
- Not unique
- No null values
info
- Free-form English-language description of the athlete
- String
- Unique (excluding null values)
- Contains null values
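
The column definitions above can be turned into a small typed reader. The following is a minimal sketch, not part of the original dataset tooling. It assumes that athletes.csv starts with a header row using the column names listed above, and that integer columns are written without a decimal point. The function names and the sample rows are invented for illustration.

```python
import csv
import io
from datetime import date

# Columns that hold numbers; every other column stays a string.
INT_COLUMNS = ("id", "weight", "gold", "silver", "bronze")
FLOAT_COLUMNS = ("height",)


def parse_athlete(row):
    """Convert one csv.DictReader row to typed values; empty cells become None."""
    parsed = {}
    for column, cell in row.items():
        if cell == "":
            parsed[column] = None  # empty cells are null values
        elif column in INT_COLUMNS:
            parsed[column] = int(cell)
        elif column in FLOAT_COLUMNS:
            parsed[column] = float(cell)
        elif column == "date_of_birth":
            parsed[column] = date.fromisoformat(cell)  # YYYY-MM-DD
        else:
            parsed[column] = cell
    return parsed


def read_athletes(handle):
    """Read every athlete from an open text file or file-like object."""
    return [parse_athlete(row) for row in csv.DictReader(handle)]


# Invented sample rows (not real athletes), used only to exercise the parser.
SAMPLE = """id,name,nationality,sex,date_of_birth,height,weight,sport,gold,silver,bronze,info
1,Example One,ROT,female,1990-01-31,1.70,60,athletics,0,1,0,
2,Example Two,IOA,male,1988-12-05,,,shooting,1,0,0,Hypothetical note
"""

athletes = read_athletes(io.StringIO(SAMPLE))

# Medals summed per athlete: each member of a medal-winning team is counted,
# so this total can exceed the number of medals in an official medal table.
medals = sum(a["gold"] + a["silver"] + a["bronze"] for a in athletes)
```

To read the real file, pass an open handle instead of the sample, for example one obtained from open("athletes.csv", newline="", encoding="utf-8"); the UTF-8 encoding is an assumption. Only height, weight and info are documented as nullable, so a None in any other column suggests a damaged row.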
The event data is stored in events.csv, with one event per row and six columns. There are 306 events across 50 disciplines in 28 sports: 161 for men, 136 for women, and 9 mixed sex.
id
- Event id
- Integer between 1 and 1,000,000
- Unique
- No null values
sport
- The sport under which the event is categorised, as defined by the IOC
- One of 28 lower-case string values (see athletes.csv column definitions for possible values)
- Not unique
- No null values
discipline
- The event's sport sub-category; when a sport is not sub-categorised (e.g. football), simply a duplicate of the sport column (column 2)
- String up to 21 characters in length
- Not unique
- No null values
name
- Name of the event
- String up to 35 characters in length
- Not unique
- No null values
sex
- Sex of the athletes competing in the event
- One of two lower-case string values:
male
female
- Not unique
- No null values
venues
- Names of the venues at which the event occurs; comma-separated list when an event occurs at more than one venue
- String up to 112 characters in length
- Not unique
- No null values
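
Because the venues column packs several names into one cell, it usually needs splitting before use. The sketch below is one way to do that, under stated assumptions: events.csv starts with a header row using the column names above, multi-venue cells are quoted in the CSV, and no individual venue name contains a comma. The function names and sample rows are invented for illustration.

```python
import csv
import io
from collections import Counter


def split_venues(cell):
    """Split a venues cell into a list of venue names, trimming whitespace."""
    return [venue.strip() for venue in cell.split(",") if venue.strip()]


def read_events(handle):
    """Read every event from an open text file or file-like object."""
    events = []
    for row in csv.DictReader(handle):
        row["id"] = int(row["id"])
        row["venues"] = split_venues(row["venues"])
        events.append(row)
    return events


# Invented sample rows (not real events), used only to exercise the reader.
SAMPLE = """id,sport,discipline,name,sex,venues
1,football,football,Example tournament,female,"Venue A, Venue B"
2,aquatics,swimming,Example race,male,Venue C
"""

events = read_events(io.StringIO(SAMPLE))

# Tally events by the value of the sex column, whatever values it contains.
events_per_sex = Counter(event["sex"] for event in events)
```

Run against the full file, the tally in events_per_sex should sum to the 306 events described above.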