San Francisco Trees

The data this week comes from San Francisco's open data portal.

There are dozens of tree species, and many other intresting features to explore in this dataset! I did drop a few columns that were either > 75% missing or redundant, feel free to check out the source for the fully original dataset.

Also - make sure to follow @tidypod - they'll have some interesting #TidyTuesday updates to come this week!

Some interesting articles:

Trees of Life in SF
Landmark trees
Non-native trees
Friends of the urban forest
SF Tree Guide

Get the data here

# Get the Data

sf_trees <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-01-28/sf_trees.csv')

# Or read in with tidytuesdayR package (https://github.com/thebioengineer/tidytuesdayR)
# PLEASE NOTE TO USE 2020 DATA YOU NEED TO UPDATE tidytuesdayR from GitHub

# Either ISO-8601 date or year/week works!

# Install via devtools::install_github("thebioengineer/tidytuesdayR")

tuesdata <- tidytuesdayR::tt_load('2020-01-28') 
tuesdata <- tidytuesdayR::tt_load(2020, week = 5)


sf_trees <- tuesdata$sf_trees

Data Dictionary

`sf_trees.csv`

A full data dictionary is available at: the source but it's fairly sparse.

variable	class	description
tree_id	double	Unique ID
legal_status	character	LegalLegal staus: Permitted or DPW maintained
species	character	Tree species includes common name after the :: separator
address	character	Street Address
site_order	double	Order of tree at address where multiple trees are at same address. Trees are ordered in ascending
address order
site_info	character	Site Info - Where the tree resides
caretaker	character	Agency or person that is primary caregiver to tree -- Owner of Tree
date	double	Date Planted (NA if before 1955)
dbh	double	Diameter at breast height
plot_size	character	Dimension of plot - typically in feet
latitude	double	Latitude
longitude	double	Longitude

Cleaning Script


library(tidyverse)
library(here)
library(tidytuesdaymeta)
library(pryr)
library(visdat)
library(skimr)
library(lubridate)
library(leaflet)

create_tidytuesday_folder()

raw_df <- read_csv(here::here("2020", "2020-01-28", "Street_Tree_Map.csv"),
                   col_types = 
                   cols(
                     TreeID = col_double(),
                     qLegalStatus = col_character(),
                     qSpecies = col_character(),
                     qAddress = col_character(),
                     SiteOrder = col_double(),
                     qSiteInfo = col_character(),
                     PlantType = col_character(),
                     qCaretaker = col_character(),
                     qCareAssistant = col_character(),
                     PlantDate = col_character(),
                     DBH = col_double(),
                     PlotSize = col_character(),
                     PermitNotes = col_character(),
                     XCoord = col_double(),
                     YCoord = col_double(),
                     Latitude = col_double(),
                     Longitude = col_double(),
                     Location = col_character()
                   )) %>% 
  janitor::clean_names()

small_df <- raw_df %>% 
  select(-x_coord,-y_coord,-q_care_assistant, -permit_notes) %>% 
  filter(plant_type != "Landscaping") %>% 
  select(-plant_type) %>% 
  separate(plant_date, into = c("date", "time"), sep = " ") %>% 
  mutate(date = parse_date(date, "%m/%d/%Y")) %>% 
  select(-time, -location) %>% 
  arrange(date) %>% 
  rename(legal_status = q_legal_status,
         species = q_species,
         address = q_address,
         site_info = q_site_info,
         caretaker = q_caretaker)

small_df %>% skimr::skim()

small_df %>% 
  write_csv(here::here("2020", "2020-01-28", "sf_trees.csv"))

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

readme.md

readme.md

San Francisco Trees

Get the data here

Data Dictionary

`sf_trees.csv`

Cleaning Script

Files

readme.md

Latest commit

History

readme.md

File metadata and controls

San Francisco Trees

Get the data here

Data Dictionary

sf_trees.csv

Cleaning Script

`sf_trees.csv`