-
Notifications
You must be signed in to change notification settings - Fork 6
/
Basic_Mapping.Rmd
795 lines (516 loc) · 26.4 KB
/
Basic_Mapping.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
---
title: "Basic Mapping"
subtitle: "R Notes"
author: "Luc Anselin and Grant Morrison^[University of Chicago, Center for Spatial Data Science -- [email protected],[email protected]]"
date: "08/08/2018"
output:
html_document:
fig_caption: yes
self_contained: no
toc: yes
toc_depth: 4
css: tutor.css
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
<br>
## Introduction
This notebook cover the functionality of the [Basic Mapping](https://geodacenter.github.io/workbook/3a_mapping/lab3a.html) section of the GeoDa workbook. We refer to that document for details on the methodology, references, etc. The goal of these notes is to approximate as closely as possible the operations carried out using GeoDa by means of a range of R packages.
The notes are written with R beginners in mind, more seasoned R users can probably skip most of the comments
on data structures and other R particulars. Also, as always in R, there are typically several ways to achieve a specific objective, so what is shown here is just one way that works, but there often are others (that may even be more elegant, work faster, or scale better).
For this notebook, we will use socioeconomic data about NYC from the GeoDa website. Our goal in this lab is show how to implement exploratory data analysis methods with three or more variables.
### Objectives
After completing the notebook, you should know how to carry out the following tasks:
- Read a shapefile with **sf**
- Set a projection with **sf**
- Make choropleth maps of different types
- Customize choropleth maps
- Calculate and plot centroids
- Make a cartogram
- Make conditional maps
#### R Packages used
- **tmap**: To make various choropleth maps and customize them
- **sf**: To read in the shapefile and add centroids
- **cartogram**: To make a cartogram
- **ggplot2**: Plot centroids
#### R Commands used
Below follows a list of the commands used in this notebook. For further details
and a comprehensive list of options, please consult the
[R documentation](https://www.rdocumentation.org).
- **Base R**: `install.packages`, `library`, `setwd`, `head`, `quantile`
- **tmap**: `tm_shape`, `tm_fill`, `tm_layout`, `tm_legend`, `tm_borders`, `tm_facets`
- **sf**: `st_read`, `plot`, `st_crs`, `st_set_geometry`, `st_centroid`
- **cartogram**: `cartogram_dorling`
- **ggplot2**: `ggplot`, `geom_sf`
## Preliminaries
Before starting, make sure to have the latest version of R and of packages that are compiled for the matching version of R (this document was created using R 3.5.1 of 2018-07-02). Also, optionally, set a working directory, even though we will not
actually be saving any files.^[Use `setwd(directorypath)` to specify the working directory.]
### Load packages
First, we load all the required packages using the `library` command. If you don't have some of these in your system, make sure to install them first as well as
their dependencies.^[Use
`install.packages(packagename)`.] You will get an error message if something is missing. If needed, just install the missing piece and everything will work after that.
```{r}
library(tmap)
library(sf)
library(cartogram)
library(ggplot2)
```
### Obtaining the Data from the GeoDa website
To get the data for this notebook, you will and to go to [NYC Data](https://geodacenter.github.io/data-and-lab/nyc/) The download format is a zipfile, so you will need to unzip it by double clicking on the file in your file finder. From there move the resulting folder titled: nyc into your working directory to continue. Once that is done, you can use the **sf** function: `st_read()` to read the shapefile into your R environment.
```{r}
nyc.bound <-st_read("nyc/nyc.shp")
plot(nyc.bound)
```
## Setting the Projection
```{r}
st_crs(nyc.bound)
st_crs(nyc.bound) <- 2263
plot(nyc.bound)
```
## Basic Choropleth Mapping and Design
We will be covering choropleth mapping and customization in the **tmap** library.
We will not come close to covering all of the customization options that
**tmap** provides. For more information on **tmap** customizations, check
out the documentation at [tmap documentation](https://cran.r-project.org/web/packages/tmap/tmap.pdf)
### tmap choropleth maps and customizations
Here we will start with the commands to get a bare bones choropleth map.
We start by inputting the shapefile into the `tm_shape` function. This gives
us our first layer, we use the `+` operator to add additional layers. Next,
we use `tm_fill` and input our desired variable as a string. These two
lines of code give us our first choropleth map with the default style of
pretty breaks.
```{r}
tm_shape(nyc.bound) +
tm_fill("rent2008")
```
#### Title Format
Now we will begin exploring formatting options for **tmap** maps. We start
with title formatting. We add an another layer with the `+` operator.
The `tm_layout` function allows us to update, the title name, position, size,
along with many other things. Here we will only demonstrate how to change
the name, set the position and change the size of the font. To change the
title, we use the `title =` parameter from the `tm_layout` function.
For the position of the position of the title, we need `title.position =`.
This parameter takes a vector of two strings specifying the position in a
distinct format. The first item of the vector must be `left`, `right`, or
`center`. The second must be `top` or `bottom`.
```{r}
tm_shape(nyc.bound) +
tm_fill("rent2008") +
tm_layout(title = "Rent 2008 Choropleth Map", title.size = 1.5, title.position = c("right","bottom"))
```
#### Color Scheme
Here we will go over some of the color customizations available in **tmap**. We
use the `palette =` parameter from the `tm_fill` function. We start with a basic
color scheme change by specifying the one of the built in palettes. Built in
palettes include "Reds", "Greens", and many more on found in the **tmap** documentation.
```{r}
tm_shape(nyc.bound) +
tm_fill("rent2008", palette = "Reds")
```
Now we will move on to more customizable color palettes. With this feature, we can
mimic the color scheme used in GeoDa for the choropleth extreme value plots. To do
this, we enter a vector with the desired colors for the palette. This will output
a spectrum of colors in the map that range between the colors specified in the vector.
For instance if we did `c("red", "blue"), the color spectrum would move from red to
purple, then to blue, with in between shades.
```{r}
tm_shape(nyc.bound) +
tm_fill("rent2008", palette = c("blue", "white", "red"))
```
#### Borders
We can update the borders with the `tm_borders` function. Here we will
specify the color, line width, and line type. The `col =` parameter
allows us to set the border color. `lwd =` sets the line width. `lty =`
sets the line type.
```{r}
tm_shape(nyc.bound) +
tm_fill("rent2008") +
tm_borders(col = "black", lwd =.8, lty = "solid")
```
#### Legend Format
We can change the default formatting of the legend in our maps with the `tm_layout`
function. We will mainly go over the positioning of the legend, since it can often
times get in the way of the plot. We will first go over how to specify the location
of the legend within the plot.
We use the `legend.position =` parameter to set the psoition in the plot. This parameter
takes a vector with two strings indicating the desired position. The first value must
be: "left", "right", or "center". The second value must be "top", "bottom", or "center"
In our case, the default position was the best position for the legend, but we will
move it to a worse position just to show how to.
```{r}
tm_shape(nyc.bound) +
tm_fill("rent2008") +
tm_layout(legend.position = c("right", "bottom"))
```
There will be times, when there is no good position for the legend inside of the plot.
In these case we can move the legend outside. For this, we use the logical operator,
`legend.outside =` and we set it equal to TRUE.
```{r}
tm_shape(nyc.bound) +
tm_fill("rent2008") +
tm_layout(legend.outside = TRUE)
```
Now that we have moved the legend outside of the map, we can chose the position. For this we use `legend.outside.position =`. We have four choices for this parameter, "top", "bottom", "right", and "left".
```{r}
tm_shape(nyc.bound) +
tm_fill("rent2008") +
tm_layout(legend.outside = TRUE, legend.outside.position = "bottom")
```
#### Basemaps
**tmap** allows us to add a basemap to our choropleth map. This doesn't work
unless you have the mode changed to `view` instead of `plot`. We do this
with the `tmap_mode` function before moving on to the map. `view` mode also
adds some interactivity to the map in that we will now be able to zoom in and
out.
```{r}
tmap_mode("view")
```
Now we can add a basemap. This is done in the layout function with the `basemap =`
parameter. We input `OpenStreetMap` inside a vector to get a basemap from
Open Street Map.
```{r}
tm_shape(nyc.bound) +
tm_fill("rent2008") +
tm_layout(title = "Pretty Breaks Map", basemaps = c("OpenStreetMap"))
```
#### All Together
```{r}
tm_shape(nyc.bound) +
tm_fill("rent2008", palette = c("blue", "white", "red")) +
tm_borders(col = "black", lwd =.8, lty = "solid") +
tm_layout(title = "Pretty Breaks Map", basemaps = c("OpenStreetMap"),
title.size = 1.5, title.position = c("right","bottom"))
```
Now we switch the mode back to plot, as we are done using basemaps for now.
```{r}
tmap_mode("plot")
```
## Different Interval Styles
An important feature of the `tm_fill` function is the `style =` parameter.
We can change the intervals for our maps with this feature. This is very
useful in exploring spatial patterns and distributions of variables. The
default interval style is Pretty Breaks, but this is not always the best way
to observe patterns in the data. **tmap** provides a variety of options for
these intervals along with a custom option, which can be used to create
mapping function for styles not available in **tmap**.
### Common map classifications
#### Quantile Map
Our first map is the style quantile. This map divides the observtions
equally into 5 intervals and then maps them accordingly. This option is
good for looking the upper or lower percentiles of a variable on a map,
but is not good for identifying extreme values.
```{r}
tm_shape(nyc.bound) +
tm_fill("rent2008", style = "quantile") +
tm_layout(title = "Quantile Map") +
tm_borders(col = "black", lwd =.8, lty = "solid")
```
#### Natural Breaks Map
A natural breaks map uses a nonlinear algorithm to group observations such
that the within-group homogeneity is maximized. This is a clustering
algorithm in one dimension to determine the break points that yield groups
with the largest internal similarity.
```{r}
tm_shape(nyc.bound) +
tm_fill("rent2008", style = "jenks") +
tm_layout(title = "Natural Breaks Map") +
tm_borders(col = "black", lwd =.8, lty = "solid")
```
#### Equal Interval Map
This map uses the equal interval style. An equal interval map is a map
that follows the same priciple as a histogram to organize the observations
into categories that divide the range of variables into equal interval
bins. This results in intervals of equal size, but the number of
observations in the bins will typically not be equal.
```{r}
tm_shape(nyc.bound) +
tm_fill("rent2008", style = "equal") +
tm_layout(title = "Equal Interval Map") +
tm_borders(col = "black", lwd =.8, lty = "solid")
```
### Custom Breaks Map
To make a custom breaks map we use `breaks =` in the `tm_fill` function to set the breaks. All we need to
do is input a vector including our desired breaks. The numbers in the vector need to be in order from
least to greatest to avoid errors.
```{r}
tm_shape(nyc.bound) +
tm_fill("kids2000", breaks = c(5,15,25,35,45,55,65)) +
tm_layout(title = "Custom Breaks Map") +
tm_borders(col = "black", lwd =.8, lty = "solid")
```
### Extreme Value Maps
Extreme value maps are choropleth maps that highlight extreme values at lower and upper
ends of the scale. We will go through three different types of these maps: a standard
deviation map, a box map and a percentile map.
GeoDa supports all three of these maps, but **tmap** does not. We will have to use
the custom breaks functionality in **tmap** to make the other two maps.
#### Standard Deviation Map
```{r}
tm_shape(nyc.bound) +
tm_fill("rent2008", style = "sd", palette = c("blue", "white", "red")) +
tm_layout(title = "Standard Deviation Map") +
tm_borders(col = "black", lwd =.8, lty = "solid")
```
#### Box Map
A box map is the mapping counterpart of a box plot. It is like a quantile map, but
the four categories are extended to identify upper and lower outliers.
To make this, we will take advantage of the `breaks =` component of the `tm_fill`
function to make a new map style that is not supported by **tmap**
currently. We will be making a box plot map, which separtes the map
intervals like those of a box plot. These intervals will include lower
outliers, < 25%, 25% - 50%, 50% - 75%, > 75%, and upper outliers. We will
need to make a function that calculates these breaks and then evaluates
wheter or not outlier are present, as to avoid errors when using the
**tmap** functions. This can seem a little daunting at first, but is very
simple when separated into pieces. We start by getting a dataframe without
geometry, so we can select our desired variable. This is done with
`st_set_geometry` and we set it to NULL. Then we can use the `quantile`
function to get the min, max, median 1st quartile and third quartile. They
are inorder from least to greatest, so we assign descriptive names to each
by indexing the values. Now we assign values to our breaks: b1, b2, b3,
etc. The only important thing to note at this stage is that at b2 and b6
we use a formula to get the break points for outliers, which is 1.5 times
the interquartile range plus or minus the median. Next we set up an
if/else structure to test for the presence of lower and upper outliers,
then sets the breaks accordingly in **tmap** mapping functions.
```{r}
box_map <- function(shp, varname, mapTitle){
#need a datframe without geometry column for quantile()
df <- shp %>% st_set_geometry(NULL)
#gives us the quartiles and min and max
quartiles <- quantile(df[,varname])
#give shortened names to each value
min <- quartiles[1]
q1 <- quartiles[2]
m <- quartiles[3]
q3 <- quartiles[4]
max <- quartiles[5]
#set our potential break points
b1 <- min
b2 <- m - 1.5 * (q3 - q1)
b3 <- q1
b4 <- m
b5 <- q3
b6 <- m + 1.5 * (q3 - q1)
b7 <- max
# if/else if/ else structure to deal with the presence of absense of upper and lower outliers
if (b2 < b1 & b6 > b7){
tm_shape(shp) +
tm_fill(varname,
breaks = c(b1,b3,b4,b5,b7),
labels = c("< 25%", "25% - 50%", "50% - 75%", "> 75%"),
palette = c("blue", "white", "red")) +
tm_layout(title = mapTitle) +
tm_borders(col = "black", lwd =.8, lty = "solid")
} else if (b2 > b1 & b6 > b7){
tm_shape(shp) +
tm_fill(varname,
breaks = c(b1,b2,b3,b4,b5,b7),
labels = c("lower outlier", "< 25%", "25% - 50%", "50% - 75%", "> 75%"),
palette = c("blue", "white", "red")) +
tm_layout(title = mapTitle) +
tm_borders(col = "black", lwd =.8, lty = "solid")
} else if (b2 < b1 & b6 < b7){
tm_shape(shp) +
tm_fill(varname,
breaks = c(b1,b3,b4,b5,b6,b7),
labels = c("< 25%", "25% - 50%", "50% - 75%", "> 75%", "upper outlier"),
palette = c("blue", "white", "red")) +
tm_layout(title = mapTitle) +
tm_borders(col = "black", lwd =.8, lty = "solid")
} else {
tm_shape(shp) +
tm_fill(varname,
breaks = c(b1,b2,b3,b4,b5,b6,b7),
labels = c("lower outlier", "< 25%", "25% - 50%", "50% - 75%","> 75%", "upper outlier"),
palette = c("blue", "white", "red")) +
tm_layout(title = mapTitle) +
tm_borders(col = "black", lwd =.8, lty = "solid")
}
}
#test our function
box_map(nyc.bound, "kids2000", "Box Map")
```
#### Percentile Map
The percentile map is a variant of a quantile map that would start off with 100 categories. However, rather than having these 100 categories, the map classification is reduced to six ranges, the lowest 1%, 1-10%, 10-50%, 50-90%, 90-99% and the top 1%.
The process for making a percentile map is similar to that for the box map. We
begin by getting a dataframe without geometry, so we can run the `quantile` function.
We then use the quantile function to get breakpoints at the appropriate percentiles.
we can do this by inputting a vector with the desired percentiles in decimal form.
This results in a numeric vector, which we will index to get our desired breakpoints.
We assign short names such as **b1** and **b2**, as we will be typing these alot
later in the function.
We will need an if/else structure to deal with the cases where our input variable has
a 1st percentile that is the same as the min and/or a 99th percentile that is the
same as the max. In each branch of the if/else structure we use breakpoints and labels
that correspond with the test case. In each case we use `breaks =` to set the breaks
and `labels =` to set the labels that fit our breaks
```{r}
percentile_map <- function(shp, varname, mapTitle){
#need a datframe without geometry column for quantile()
df <- shp %>% st_set_geometry(NULL)
#gives us the desired percentiles
breaks <- quantile(df[,varname], c(0,.01,.1,.5,.9,.99,1))
#assign the break points
b1 <- breaks[1]
b2 <- breaks[2]
b3 <- breaks[3]
b4 <- breaks[4]
b5 <- breaks[5]
b6 <- breaks[6]
b7 <- breaks[7]
# if else structure to deal with 1st and 0th percentile being equal and/or 99th and 100th being equal
if (b1 == b2 & b6 == b7){
tm_shape(shp) +
tm_fill(varname,
breaks = c(b1,b3,b4,b5,b7),
labels = c("0% - 10%", "10% - 50%", "50% - 90%", "90% - 100%"),
palette = c("blue", "white", "red")) +
tm_layout(title = mapTitle) +
tm_borders(col = "black", lwd =.8, lty = "solid")
} else if (b1 == b2 & b6 != b7){
tm_shape(shp) +
tm_fill(varname,
breaks = c(b1,b3,b4,b5,b6,b7),
labels = c("0% - 10%", "10% - 50%", "50% - 90%", "90% - 99%", "> 99%"),
palette = c("blue", "white", "red")) +
tm_layout(title = mapTitle) +
tm_borders(col = "black", lwd =.8, lty = "solid")
} else if (b1 != b2 & b6 == b7){
tm_shape(shp) +
tm_fill(varname,
breaks = c(b1,b2,b3,b4,b5,b6,b7),
labels = c("< 1%", "1% - 10%", "10% - 50%", "50% - 90%", "90% - 100%"),
palette = c("blue", "white", "red")) +
tm_layout(title = mapTitle) +
tm_borders(col = "black", lwd =.8, lty = "solid")
} else {
tm_shape(shp) +
tm_fill(varname,
breaks = c(b1,b2,b3,b4,b5,b6,b7),
labels = c("< 1%", "1% - %10", "10% - 50%", "50% - 90%","90% - 99%", "> 99%"),
palette = c("blue", "white", "red")) +
tm_layout(title = mapTitle) +
tm_borders(col = "black", lwd =.8, lty = "solid")
}
}
#testing our function
percentile_map(nyc.bound, "rent2008", "percentile map")
```
### Plotting Categorical Variables
The dataset used for this notebook doesn't contain any categorical variables. We
will need to make a new column in order to demonstrate **tmap** functionality
for categorical variables.
We will use the `cut_number` function to create five categories for our categorical plot.
We pass the variable, e.g., **kids20000**, and the number of categories, say `n = 5`. This creates the new variable as an R `factor`, giving the intervals that resulted from the cut.
For example, we create a new variable **kidscat** using a quantile classification with five categories.
```{r}
nyc.bound$kidscat <- cut_number(nyc.bound$hhsiz00,n=5)
nyc.bound$kidscat
```
Now that we have our categorical variable, we can map it with **tmap**. To plot
a categorical variable, we just use `cat` in the `style =` parameter.
```{r}
# colors that match the geoda scheme are "salmon", "lightgreen", "lightblue" and the ones in the function
tm_shape(nyc.bound) +
tm_fill("kidscat", style = "cat", palette = c("lightblue","royalblue","lightgreen","forestgreen","salmon")) +
tm_layout(title = "Categorical Map") +
tm_borders(col = "black", lwd =.8, lty = "solid")
```
## Saving maps
To save a map in **tmap**, you first store the plot in a variable, then use the
`save_tmap` function. The `tm =` parameter takes the variable name of your map,
then the `filename =` parameter takes your desired file name. It is important to
note that the default save will be a png image. To get other formats, you include
the extension in the filename, for instance "tmap_choropleth.jpg".
```{r}
tm <- tm_shape(nyc.bound) +
tm_fill("kids2000")
save_tmap(tm = tm, filename = "tmap_choropleth")
```
## Adding Centroids to an sf object
Centroids are the shape centers of of a polygon. We can get these points by passing
our **sf** object: **nyc.bound** to the `st_centroid` function. From there we use
the pipe operator and `st_geometry` to get just the points. From there we store that
information in `centroids`. We can add this to **nyc.bound**, but we only need them
for a short section of this notebook.
```{r}
centroids <- st_centroid(nyc.bound) %>% st_geometry()
```
### Plotting Centroids
We have to use **ggplot2** to plot these points because the **tmap** function: `tm_dots` does not support the point structure made by **sf**. Plotting these points
is simple and similar to **tmap** mapping. We start with the `ggplot` function. Then we add the original **sf** object **nyc.bound** as a data source in `geom_sf`. From
here we also add the points using `geom_sf` and specifying **centroids** as the data
source.
```{r}
ggplot() +
geom_sf(data = nyc.bound, fill = "white") +
geom_sf(data = centroids, color = "red")
```
### Saving Centroids
To save our centroids, we will use the **sf** library. We use `st_write` to create a
new shapefile. The parameters necessary are the **sf** object, the desired name of the
file, and the driver. We use `ESRI Shapefile` in this case.
```{r}
st_write(centroids, "nyc_centroids", driver = "ESRI Shapefile")
```
## Conditional Maps
Conditional maps are a major feature of the functionality of **tmap**, where they are
referred to as facetting, or small multiples. This is implemented in the `tm_facet`
function.
There is one major difference between the approach taken in GeoDa and that in
**tmap**. In GeoDa, the conditioning variables are typically continuous, and different
types of classifications can be applied to them to obtain the actual condition. For
example, in the GeoDa Workbook illustration, the variables **hhsiz08** and **forhis08**
are used as conditioning variables for respectively, the x-axis and the y-axis. In
**tmap** the conditioning is based on a categorical variable that needs to be
available in the data set. The facetting formula does not evaluate functions, so the
conditioning categories need to be computed beforehand.
There are three so-called helper functions to make this easy: `cut_interval`,
`cut_width`, and `cut_number`. The closest to the median (2 quantiles) conditioning illustrated in the GeoDa Workbook is the `cut_number` function. We pass the variable, e.g., **hhziz00**, and the number of categories, say `n = 2`. This creates the new variable as an R `factor`, giving the intervals that resulted from the cut.
For example, we create a new variable **cut.hhsiz** using a quantile classification with two categories (as in the GeoDa Workbook, the variable will be split on the median
value), by setting `n=2`. Since we only have 55 observations, we can easily list the
full set of values to verify.
```{r}
nyc.bound$cut.hhsiz <- cut_number(nyc.data$hhsiz08,n=2)
nyc.bound$cut.hhsiz
```
And, similarly for **cut.yrhom**:
```{r}
nyc.bound$cut.forhis <- cut_number(nyc.data$forhis08,n=2)
nyc.bound$cut.forhis
```
After examining our cuts we can see that our results differ slightly from the GeoDa
conditional map example. The **cut.hhsiz** is the same as the example and **cut.forhis**
is is about .9 higher. When we map the results we will get a similar map to the GeoDa
example, with maybe a few differences.
We use the `box_map` function we created earlier in the notebook to replicate the results
frome the GeoDa work book. We use the add operator and `tm_facets` to get our conditional plot. We can do this with the function we made because the ending result
is a **tmap** choropleth. We use the `by =` parameter to specify our conditioning variables inside a vector. We can only input one or two conditioning vectors.
```{r}
box_map(nyc.bound, "rent2008", "Conditional Map") +
tm_facets(by = c("cut.hhsiz", "cut.forhis"))
```
## Cartograms
A cartogram is a map type where the original layout of the areal unit is replaced by a geometric form (usually a circle, rectangle, or hexagon) that is proportional to the value of the variable for the location. This is in contrast to a standard choropleth map, where the size of the polygon corresponds to the area of the location in question.
GeoDa implements a circular cartogram, in which the areal units are represented as circles, whose size (and color) is proportional to the value observed at that location. The changed shapes remove the misleading effect that the area of the unit might have on perception of magnitude.
We will implement the the same circular cartogram in this section with the **cartogram**
package. We start with the`cartogram_dorling` function, which will give us an **sf** object of circular
polygons based on our inputed variable. We store the result in **carto**.
```{r}
carto <- cartogram_dorling(nyc.bound, "kids2000")
```
With our new **sf** obeject, we can now map the cartogram using **tmap**. The process
is the same as with all of our maps. We start with our **sf** object in `tm_shape`,
then us `tm_fill` to specify our fill variable and other customizations. We use the
same variable for the circle sizes as the the fill color to keep the plot simple.
We can use different variables, but it makes the plot fairly hard to interpret. It is also important to note that we move the legend outside of the map because there is no room for it inside the map.
```{r}
tm_shape(carto) +
tm_fill("rent2008", palette = c("blue", "white", "red"), style = "quantile") +
tm_borders(col = "black", lwd =.8, lty = "solid") +
tm_layout(title = "Cartogram", legend.outside = TRUE,
legend.outside.position = "bottom")
```