For this month's data jam, we will be working with data from Monday at the Houston Hobby airport. This data is sourced from FlightAware's API.
We have the data available in csv, each in their respective directories.
The data are found in the following files:
- weather.csv: weather.csv -- Weather at the airport for the time-period
Column | Description |
---|---|
airport | Airport code; will be 'KHOU' for this dataset |
time | Epoch time (see here for more info) |
cloud_friendly | Cloudy/Clear/Etc. |
cloud_altitude | Height of clouds (99999 if Clear) |
cloud_type | Abbreviated Cloud description |
conditions | None |
pressure | barometric pressure (inHg) |
temp_air | air temperature (Celsius) |
temp_dewpoint | dewpoint (Celsius) |
temp_relhum | relative humidity |
visibility | visibility range (statute miles) |
wind_friendly | e.g. 'Windy' |
wind_direction | wind direction (knots); 360 means true north |
wind_speed | wind speeds (knots) |
wind_speed_gust | gust speeds (knots) |
raw_data | METAR data used by pilots, e.g.'KIAH 250453Z 36015G22KT 10SM CLR 16/04 A2995 RMK AO2 SLP142 T01610044' |
- flights.csv: flights.csv -- Flight-specific data
Column | Description |
---|---|
flight_id | unique flight id string |
ident | 'ANA7211' |
actual_ident | 'UAL128' |
departuretime | Epoch time (see here for more info) |
arrivaltime | Epoch time (see here for more info) |
origin | Origin airport code |
destination | Destination airport code |
aircrafttype | Aircraft type (e.g. 'B787') |
meal_service | 'Business: Dinner / Economy: Dinner' |
seats_cabin_first | number of passenger seats avail. on flight in first class |
seats_cabin_business | number of passenger seats avail. on flight in business class |
seats_cabin_coach | number of passenger seats avail. on flight in coach class |
- routes.csv: flights.csv -- Flight-specific route plan data
Column | Description |
---|---|
flight_id | unique flight id string |
order | 1 |
name | e.g. 'KAYEX' |
type | 'Waypoint' -- or 'Origin Airport' or 'Reporting Point' or 'VOR-TAC (NAVAID)' or 'Destination Airport' or probably some others |
latitude | e.g. 36.4875 |
longitude | e.g. -120.9478611 |
- tracks.csv: flights.csv -- Flight-specific location tracking data
Column | Description |
---|---|
flight_id | unique flight id string |
timestamp | Epoch time (see here for more info) |
latitude | e.g. 37.63875 |
longitude | e.g. -122.3621 |
groundspeed | ground speed (knots) |
altitude | altitude (hundreds of feet) |
altitudeStatus | None |
updateType | e.g. 'TA' |
altitudeChange | e.g. 'C' |
make_csv_url <- function(name){
url <- paste('https://raw.githubusercontent.com/houstondatavis/data-jam-february-2017/data-pipeline/', name, '.csv', sep='')
return(url)
}
flights <- read.csv(make_csv_url('flights'))
routes <- read.csv(make_csv_url('routes'))
tracks <- read.csv(make_csv_url('tracks'))
weather <- read.csv(make_csv_url('weather'))
# http://stackoverflow.com/questions/32400867/pandas-read-csv-from-url#answer-32400969
import pandas as pd
import io
import requests
def make_csv_url(name):
return "https://raw.githubusercontent.com/houstondatavis/data-jam-february-2017/data-pipeline/" + name + ".csv"
flights = pd.read_csv(
io.StringIO(
requests.get(make_csv_url('flights')).content.decode('utf-8')
)
)
routes = pd.read_csv(
io.StringIO(
requests.get(make_csv_url('routes')).content.decode('utf-8')
)
)
tracks = pd.read_csv(
io.StringIO(
requests.get(make_csv_url('tracks')).content.decode('utf-8')
)
)
weather = pd.read_csv(
io.StringIO(
requests.get(make_csv_url('weather')).content.decode('utf-8')
)
)