---
title: "Geographic processing"
date: "`r format(Sys.time(), '%d %B %Y')`"
output:
  github_document:
    toc: true
always_allow_html: true
urlcolor: blue
---
```{r include=FALSE}
knitr::opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE,
                      cache = FALSE)
library(dplyr); library(tidyr); library(readr); library(stringr); library(tibble)
library(tigris)
library(sf)
library(tidycensus)
library(here)

# erase the combined area of `y` from every feature of `x`
st_erase <- function(x, y) st_difference(x, st_union(st_combine(y)))

state <- c("MN")
# either type in specific county names, or assign `county <- NULL` for all counties within the state
county <- c("Anoka", "Carver", "Dakota", "Hennepin", "Ramsey", "Scott", "Washington")
# state <- c("WI")#c("MN")

# Demographic variables
acs_year <- 2021 # end year of the ACS 5-year estimates (2017-2021)
census_year <- 2020
```
# Process geographies
For the Twin Cities, Growing Shade nests block groups (the core level of analysis) into larger neighborhood- and city-level geographies. This step does not transfer easily to other regions, so it will likely need to be tailored when applying the methods elsewhere.

**NOTE:** this script **DOES** rely on some parameters set in the "global" `01_tutorial.Rmd` script, so please be sure to run that before running this script! It is okay if the tutorial script encounters an error and can't run all the way through; you'll still be saving information about which state/counties to use here!
### Neighborhoods and city levels
Because the map shows census tracts, cities, or neighborhoods depending on the user input, we need a crosswalk that relates block groups to the city and neighborhood levels.

If a geography cannot be downloaded with a code-based method, download it manually and put it in the `data-raw` folder. For the Twin Cities, neighborhoods need to be downloaded manually.

- [Minneapolis](https://opendata.minneapolismn.gov/datasets/communities/explore?location=44.970861%2C-93.261718%2C12.85)
- [St. Paul](https://information.stpaul.gov/City-Administration/District-Council-Shapefile-Map/dq4n-yj8b)
- [Brooklyn Park](https://gis.brooklynpark.org/neighborhoodinfo/) (but we aren't including their neighborhoods yet)

Adjust the code below as necessary to ensure that both `nhood_geo` (neighborhoods) and `ctu_geo` (cities/townships) have columns named `GEO_NAME` and `geometry`. The neighborhood data should also have a `city` column (i.e., "Minneapolis" or "St. Paul" for the Twin Cities region).
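
The column requirements above can be checked before building any crosswalk. The following is a minimal sketch, not part of the original script: `check_geo_columns()` is a hypothetical helper, and the toy tables are plain data frames with a placeholder `geometry` column so that the example runs without any spatial files.

```r
# Hypothetical helper: stop early if a geography table lacks the columns
# that the crosswalk code expects.
check_geo_columns <- function(x, required = c("GEO_NAME", "geometry")) {
  missing_cols <- setdiff(required, names(x))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Toy stand-ins for `ctu_geo` and `nhood_geo` (assumed data, not real geographies)
toy_ctu <- data.frame(
  GEO_NAME = c("City A", "City B"),
  geometry = c("placeholder", "placeholder")
)
toy_nhood <- data.frame(
  GEO_NAME = "Downtown",
  geometry = "placeholder"
) # deliberately lacks the `city` column

# Passes silently and returns TRUE invisibly
ctu_check <- check_geo_columns(toy_ctu)

# Neighborhoods additionally need `city`; capture the error message here
nhood_check <- tryCatch(
  check_geo_columns(toy_nhood, required = c("GEO_NAME", "city", "geometry")),
  error = function(e) conditionMessage(e)
)
```

With real data, the same call would be made on `ctu_geo` and `nhood_geo` right after they are created.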

After the raw geographies are downloaded, you need to make a crosswalk that relates block groups to neighborhoods and cities. For this step, it is useful to remove major river features: boundaries around rivers are often poorly aligned, and removing rivers makes generating the crosswalk much cleaner. At least in the Twin Cities, several block groups legitimately do fall within multiple cities, so this step is admittedly a bit complicated. A simpler alternative is to just use the city/township in which the majority of the block group falls.
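
The simpler majority-area alternative could look roughly like the sketch below. It is an illustration under assumptions, not the method this script uses: `make_rect()`, `toy_ctu`, and `toy_bg` are hypothetical toy rectangles with no coordinate reference system. With real data, both layers would first be transformed to a projected CRS (the script uses EPSG:26915) so that areas are meaningful.

```r
library(sf)
library(dplyr)

# Hypothetical helper: an axis-aligned rectangle of height 1
make_rect <- function(xmin, xmax, ymin = 0, ymax = 1) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmax, ymin), c(xmax, ymax), c(xmin, ymax), c(xmin, ymin)
  )))
}

# Two toy "cities" side by side
toy_ctu <- st_sf(
  GEO_NAME = c("City A", "City B"),
  geometry = st_sfc(make_rect(0, 2), make_rect(2, 4))
)

# Three toy "block groups"; bg2 straddles the city boundary at x = 2
toy_bg <- st_sf(
  bg_id = c("bg1", "bg2", "bg3"),
  geometry = st_sfc(make_rect(0, 1), make_rect(1.5, 2.2), make_rect(2.5, 3.5))
)

# Cut block groups by city, then measure the area of every piece
pieces <- st_intersection(toy_bg, toy_ctu)
pieces$overlap_area <- as.numeric(st_area(pieces))

# Keep, for each block group, the city holding the largest share of its area
majority_crosswalk <- pieces %>%
  st_drop_geometry() %>%
  group_by(bg_id) %>%
  slice_max(overlap_area, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(bg_id, GEO_NAME)
```

Here `bg2` overlaps both cities but is assigned only to "City A", which holds most of its area. The trade-off is that block groups that legitimately span several cities lose that information.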

If this section doesn't apply to other regions, it should be easy enough to remove the corresponding elements from the application's user interface.
```{r block-group-geo}
bg_geo <- block_groups(
  state = state,
  county = county,
  year = census_year
)
```
```{r nhood-ctu-geo}
# neighborhoods -----------
minneap <- read_sf(here::here("data-raw", "minneapolis neighborhoods", "Minneapolis_Communities.shp")) %>%
  rename(GEO_NAME = CommName) %>%
  mutate(Shape_Area = as.numeric(st_area(.))) %>%
  mutate(city = "Minneapolis") %>%
  st_transform(4326)

stpaul <- read_sf(here::here("data-raw", "stpaul neighborhoods", "geo_export_0c076f52-d6ff-4546-b9fa-bd9980de6e8a.shp")) %>%
  mutate(Shape_Area = as.numeric(st_area(.))) %>%
  rename(GEO_NAME = name2) %>%
  mutate(city = "St. Paul") %>%
  # shorten the district council names to common neighborhood names
  mutate(GEO_NAME = case_when(
    GEO_NAME == "CapitolRiver Council" ~ "Downtown",
    GEO_NAME == "Thomas-Dale/Frogtown" ~ "Frogtown",
    GEO_NAME == "West Side Community Organization" ~ "West Side",
    GEO_NAME == "West 7th Federation/Fort Road" ~ "West 7th-Fort Road",
    GEO_NAME == "Highland" ~ "Highland Park",
    GEO_NAME == "Summit Hill Association" ~ "Summit Hill",
    GEO_NAME == "Eastview-Conway-Battle Creek-Highwood Hills" ~ "Battle Creek-Conway-Eastview-Highwood Hills",
    GEO_NAME == "The Greater East Side" ~ "Greater East Side",
    GEO_NAME == "Como" ~ "Como Park",
    TRUE ~ GEO_NAME
  )) %>%
  st_transform(4326)

nhood_geo <- bind_rows(minneap, stpaul) %>%
  select(GEO_NAME, city)

#### ctus -----------
temp <- tempfile()
temp2 <- tempfile()
download.file(
  "https://resources.gisdata.mn.gov/pub/gdrs/data/pub/us_mn_state_metc/bdry_metro_counties_and_ctus/shp_bdry_metro_counties_and_ctus.zip",
  destfile = temp
)
unzip(zipfile = temp, exdir = temp2)
list.files(temp2)

# read the city/township boundaries from the unzipped folder
ctu_geo <- sf::read_sf(file.path(temp2, "CTUs.shp")) %>%
  transmute(GEO_NAME = CTU_NAME)

# clean up the extracted files
files <- list.files(temp2, full.names = TRUE)
file.remove(files)
```
```{r ctu-nhood-crosswalk}
temp <- tempfile()
download.file(
  "https://resources.gisdata.mn.gov/pub/gdrs/data/pub/us_mn_state_metc/water_lakes_rivers/gpkg_water_lakes_rivers.zip",
  destfile = temp
)
river_lake_all <- sf::read_sf(unzip(temp, files = "water_lakes_rivers.gpkg"))

# river layer to erase major boundary rivers-------
# these rivers are boundaries
river_lake_buffer <- river_lake_all %>%
  filter(NAME_DNR %in% c("Mississippi", "Minnesota", "St. Croix"))

# fxns to make easy -----
# find crosswalks: which block groups overlap each feature of `x`
find_crosswalks <- function(x) {
  crosswalk <- x %>%
    st_transform(26915) %>% # UTM zone 15N, units are meters
    st_buffer(-150) %>% # shrink the perimeter of the geography inward
    st_erase(river_lake_buffer %>%
      st_buffer(200) %>% # buffer out rivers
      st_union() %>%
      st_buffer(0)) %>%
    st_intersection(bg_geo %>%
      dplyr::select(GEOID) %>%
      rename(bg_id = GEOID) %>%
      st_transform(26915)) %>%
    st_drop_geometry()
  return(crosswalk)
}

ctu_crosswalk <- find_crosswalks(ctu_geo) %>%
  # minnesota river is squirrely: keep Blakeley Twp. only for block group 271390813001
  mutate(flag = case_when(
    GEO_NAME == "Blakeley Twp." & bg_id != "271390813001" ~ "remove",
    TRUE ~ "keep"
  )) %>%
  filter(flag != "remove") %>%
  dplyr::select(-flag)

nhood_crosswalk <- find_crosswalks(nhood_geo)

wide_ctu_crosswalk_1 <- ctu_crosswalk %>%
  aggregate(GEO_NAME ~ bg_id, paste, collapse = ", ") %>%
  rename(jurisdiction = GEO_NAME) %>%
  group_by(bg_id) %>%
  count() %>%
  left_join(ctu_crosswalk) %>%
  add_column(cities = 999) %>%
  dplyr::select(bg_id, GEO_NAME, cities, n) %>%
  # collect each block group's cities into a list column named `999`
  pivot_wider(names_from = cities, values_from = GEO_NAME, values_fn = list) %>%
  # spread the list into columns `999,1`, `999,2`, ...
  unnest_wider(`999`, names_sep = ",")

wide_ctu_crosswalk <- wide_ctu_crosswalk_1 %>%
  # the number of city columns pasted together is hard-coded to four
  mutate(jurisdiction = paste(`999,1`, `999,2`, `999,3`, `999,4`, sep = ", ")) %>%
  dplyr::select(bg_id, jurisdiction) %>%
  mutate(jurisdiction = str_replace_all(jurisdiction, ", NA", ""))
save(ctu_crosswalk, nhood_crosswalk, wide_ctu_crosswalk, file = paste0(here::here(), "/data-raw/geography_data.rda"))
# library(tibble); library(dplyr); library(tidyr); library(stringr)
# test <- tibble(id = c(1, 2, 3, 4, 5, 6),
# city = c("a", "a", "b", "c", "c", "c"))
#
# count(test, city) %>%
# left_join(test) %>%
# add_column(junk = NA_real_) %>%
# pivot_wider(names_from = junk, values_from = id, values_fn = list) %>%
# unnest_wider(`NA`) %>%
# mutate(jurisdiction = paste(`...1`, `...2`, `...3`,
# collapse = ", ")) %>%
# dplyr::select(city, jurisdiction) %>%
# mutate(jurisdiction = str_replace_all(jurisdiction, ", NA", ""))
```
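
The `wide_ctu_crosswalk` step in the chunk above pastes a fixed set of four city columns together. If the only goal is one comma-separated `jurisdiction` string per block group, a shorter route may be to collapse the long crosswalk directly. The sketch below assumes that goal; `toy_crosswalk` is made-up data standing in for `ctu_crosswalk`, and this is an alternative illustration rather than the script's own approach.

```r
library(dplyr)

# Toy long-format crosswalk: one row per block group / city pair (assumed data)
toy_crosswalk <- data.frame(
  bg_id = c("bg1", "bg2", "bg2", "bg3", "bg3", "bg3"),
  GEO_NAME = c("City A", "City A", "City B", "City B", "City C", "City D")
)

# One row per block group, with every overlapping city in a single string
wide_toy_crosswalk <- toy_crosswalk %>%
  group_by(bg_id) %>%
  summarise(jurisdiction = paste(GEO_NAME, collapse = ", "), .groups = "drop")
```

Because `paste(collapse = ", ")` works on however many rows a group has, this version does not depend on the maximum number of cities per block group and never produces `NA` placeholders that need to be stripped afterwards.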