---
title: |
| Open Discussion of Various Topics
date: '`r format(Sys.Date(), "%B %d, %Y")`'
author: |
| ICPSR 2021 Session 1
| Jake Bowers, Ben Hansen, Tom Leavitt
bibliography:
- 'BIB/MasterBibliography.bib'
- 'BIB/master.bib'
- 'BIB/misc.bib'
- 'BIB/refs.bib'
fontsize: 10pt
geometry: margin=1in
graphics: yes
biblio-style: authoryear-comp
colorlinks: true
biblatexoptions:
- natbib=true
output:
beamer_presentation:
slide_level: 2
keep_tex: true
latex_engine: xelatex
citation_package: biblatex
template: icpsr.beamer
includes:
in_header:
- defs-all.sty
md_extensions: +raw_attribute-tex_math_single_backslash+autolink_bare_uris+ascii_identifiers+tex_math_dollars
pandoc_args: [ "--csl", "chicago-author-date.csl" ]
---
```{r setup1_env, echo=FALSE, include=FALSE}
library(here)
source(here::here("rmd_setup.R"))
## Print code by default
opts_chunk$set(echo = TRUE)
```
```{r setup2_loadlibs, echo=FALSE, include=FALSE}
## Load all of the libraries that we will use when we compile this file
## We are using the renv system. So these will all be loaded from a local library directory
library(dplyr)
library(ggplot2)
library(coin)
library(RItools)
library(optmatch)
```
## Today
1. Agenda: Talk about a few topics you mentioned last time and/or any topics
   that you came to class wanting to discuss: TSCS/longitudinal data,
   interference, and a pointer to regression-based sensitivity analysis.
2. Questions arising from the reading or assignments or life?
# But first, review
## What have we done so far recently? {.fragile}
1. Assessed the sensitivity of a design that adjusted well for $\mathbf{x}$ but
   could not adjust directly for unmeasured confounders, $\mathbf{u}$:
   1. Rosenbaum's approach focusing on $\Gamma$ (assuming a strong relationship
      with $Y$), briefly mentioning how $\Gamma$ can be decomposed into
      $\delta$ (the increase in odds of a positive pair difference, i.e., the
      effect on outcomes) and $\lambda$ (the increase in odds of a treatment
      difference, i.e., the effect on treatment/the causal driver) to aid
      interpretation.
   2. Fogarty's approach, also using $\Gamma$, but in the context of testing
      hypotheses about no average effect.
\begin{center}
\begin{tikzcd}[column sep=large, row sep=large,every arrow/.append style=-latex]
\mathbf{x}=\{x_1,x_2,\ldots\} \arrow[from=1-1,to=2-1, "\text{0 by adjustment}" description] \arrow[from=1-1,to=2-4, "?" description] & & &\\
Z \arrow[from=2-1,to=2-4, "\tau" description] & & & Y \\
\mathbf{u} \arrow[from=3-1,to=2-1, "\Gamma \text{ or } \lambda" description] \arrow[from=3-1,to=2-4, "\delta" description]
\end{tikzcd}
\end{center}
## What have we done so far recently?
2. Introduced the ideas behind difference-in-differences designs (like an
   interrupted time series but with a control group) and their focus on the
   estimand $ATT_{\text{Population}}$ (with pointers to other estimators;
   difference-in-differences estimators refer to this kind of design and focus
   on this estimand). A toy example follows after this list.
   1. Addresses problems with the simple interrupted time-series approach.
   2. Using a unit as its own counterfactual makes sense (but raises questions
      about history, trends, or other $\mathbf{u}$ that could intervene
      between measurements).
   3. Adding the non-intervention or control unit(s) helps under some
      assumptions (or pre-sumptions) about using this group to impute
      potential outcomes for the focal/treatment group.
   4. Teaser about estimating the trends rather than assuming them (see
      Leavitt's work and citations therein).
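To make the logic concrete, here is a minimal 2x2 difference-in-differences
sketch with made-up data (every name and number below is hypothetical): under
the parallel-trends assumption, the coefficient on the treatment-by-period
interaction estimates the ATT.
```{r}
## Toy data: 50 treated and 50 control units, two periods; true ATT is 2.
set.seed(2021)
did_dat <- expand.grid(unit = 1:100, post = c(0, 1))
did_dat$trt <- as.numeric(did_dat$unit <= 50)
did_dat$Y <- with(did_dat, 1 + 0.5 * trt + 1 * post + 2 * trt * post + rnorm(200))
## The interaction coefficient is the DiD estimate of the ATT.
coef(lm(Y ~ trt * post, data = did_dat))["trt:post"]
```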
# Longitudinal Data
## Two simple questions
1. What is the impact of an event on a trajectory?
2. How do trajectories differ between types? ("treated types" and "control
types" or any types).
## Trajectories as Types
We can learn about #2 using existing tools: multiple observations can be
collapsed into a trajectory, and we can work to clarify comparisons of a
categorical outcome between units. Here we create "types" directly. One could
imagine doing some latent class or cluster analysis, etc., to categorize types
of trajectories.
```{r}
## Trajectories Y_i_t for three units
time_length <- 7
times <- seq_len(time_length)
Y_1_t <- c(1, 0, 1, 1, 0, 0, 1)
Y_2_t <- c(0, 0, 0, 0, 0, 0, 1)
Y_3_t <- c(0, 1, 1, 1, 1, 1, 1)
longdat1 <- data.frame(
id = c(rep(1, time_length), rep(2, time_length), rep(3, time_length)),
Y = c(Y_1_t, Y_2_t, Y_3_t),
times = rep(times, 3),
type = c(rep(1, time_length), rep(0, time_length), rep(1, time_length))
)
longdat1
widedat1 <- longdat1 %>%
group_by(id) %>%
summarize(
Trajectory = paste(Y, collapse = ""),
Type = unique(type)
)
widedat1
```
- One could further classify trajectories as "Rising" versus "Falling" versus "Stable" and simplify the analysis (see the sketch below).
- One could add "time of switching to type=1" as another variable if you observe this and it matters.
- One could also model transition probabilities between states.
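As a minimal sketch of the first idea (the classification rule here is made up
for illustration): label a trajectory "Rising" if its OLS slope over time is
positive, "Falling" if negative, and "Stable" otherwise.
```{r}
## Classify each unit's trajectory by the sign of its slope over time.
traj_class <- longdat1 %>%
  group_by(id) %>%
  summarize(slope = coef(lm(Y ~ times))[["times"]]) %>%
  mutate(Shape = case_when(
    slope > 0 ~ "Rising",
    slope < 0 ~ "Falling",
    TRUE ~ "Stable"
  ))
traj_class
```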
## What is the impact of an event on a trajectory?
```{r echo=FALSE}
load("/Users/jwbowers/Documents/PROJECTS/jakebowers.org/ICPSR/ps4wl.rda")
```
What if we have an event occurring in the middle of a trajectory? What is the
causal effect of this event? Here the question is about the effect of having a
child on political participation (see the larger literature on gender and
political participation). The structure could be the same for the effect of
"treaty signing", etc.
```{r}
## One person:
ps4wl %>%
filter(id65 == 8) %>%
select(one_of(c(
"id65", "year", "female", "allacts", "cumallacts",
"futureacts", "kidbornN", "kage1", "kage2", "maxdegP"
)))
```
What would be a good comparison person (or persons) for person 8 if we could find that person?
## Basics of matching for event data:
Here, I select only rows where people have a child or rows before people have a child (or rows where no child was born):
```{r}
wrk.df <- ps4wl %>%
filter((ps4wl$k1born == 1) | (ps4wl$k1born == 0 & (ps4wl$k1year < 0 | is.na(ps4wl$k1year)))) %>%
select(one_of(c(
"id65", "year", "maxdegP", "pmdeg65", "female", "allacts", "cumallacts",
"futureacts", "k1year", "kage1",
"statehs", "havekids", "hsact65", "civics65", "churP", "pintP", "ybyear", "k1born", "kid1"
)))
## For the purposes of design, we do not want years after birth of first child.
wrk.df %>% filter(id65 == 8)
```
Let's look for people who have not yet had a child but who have had a similar previous trajectory of political participation:
```{r}
## But we want allacts and cumallacts for the year before the kid was born in the treatment group.
wrk.df$baselineacts <- wrk.df$allacts
wrk.df$baselineacts[wrk.df$k1born == 1] <- wrk.df$allacts[which(wrk.df$k1born == 1) - 1]
wrk.df$baselinecumacts <- wrk.df$cumallacts
wrk.df$baselinecumacts[wrk.df$k1born == 1] <- wrk.df$cumallacts[which(wrk.df$k1born == 1) - 1]
## Fix this for those people who had a kid in 1965 (i.e., don't use someone else's 1997 data for them).
wrk.df$baselineacts[wrk.df$k1born == 1 & wrk.df$year == 65] <- wrk.df$allacts[wrk.df$k1born == 1 & wrk.df$year == 65]
wrk.df$baselinecumacts[wrk.df$k1born == 1 & wrk.df$year == 65] <- wrk.df$cumallacts[wrk.df$k1born == 1 & wrk.df$year == 65]
actsDist <- match_on(k1born ~ baselineacts + baselinecumacts,
data = wrk.df,
within = exactMatch(k1born ~ female, wrk.df)
)
```
We want to match people who are close in year:
```{r}
## Make a distance matrix by hand
yearDist <- with(wrk.df, outer(year[k1born == 1], year[k1born == 0], "-"))
dimnames(yearDist) <- list(
rownames(wrk.df[wrk.df$k1born == 1, ]),
rownames(wrk.df[wrk.df$k1born == 0, ])
)
```
Don't match the same person to themselves for purposes of design (maybe use
their own pre-baby participation as a change score after matching). (Why or
why not would we try difference-in-differences here, with or without matching?)
```{r}
id65Dist <- with(wrk.df, outer(id65[k1born == 1], id65[k1born == 0], function(x, y) {
as.numeric(x == y)
}))
dimnames(id65Dist) <- list(
rownames(wrk.df[wrk.df$k1born == 1, ]),
rownames(wrk.df[wrk.df$k1born == 0, ])
)
```
A Mahalanobis distance:
```{r}
## Make a version of the dataset with no missing data
wrk.df.noNAs <- fill.NAs(wrk.df)
## Make a Mahalanobis distance matrix
mhDist <- match_on(k1born ~ baselineacts + baselinecumacts + maxdegP + pmdeg65 +
pmdeg65.NA + hsact65 + civics65 + churP + churP.NA,
data = wrk.df.noNAs,
within = exactMatch(k1born ~ female, data = wrk.df.noNAs)
)
summary(mhDist)
```
Also a propensity score:
```{r}
## A propensity to have kids year score
## Look at cov scaling
## afmla <- k1born~baselineacts+baselinecumacts+maxdegP+pmdeg65+
## pmdeg65.NA+hsact65+civics65+churP+churP.NA
## summary(wrk.df.noNAs[,all.vars(afmla)])
library(lme4)
pscore_glm <- glmer(k1born ~ female + I(scale(baselineacts)) + I(scale(baselinecumacts)) + maxdegP + pmdeg65 +
hsact65 + civics65 + churP + churP.NA + (1 | id65),
family = binomial(link = "logit"), data = wrk.df.noNAs
)
## res <- allFit(pscore_glm)
wrk.df.noNAs$psscore <- predict(pscore_glm)
psDist <- match_on(k1born ~ psscore,
data = wrk.df.noNAs,
within = exactMatch(k1born ~ female, data = wrk.df.noNAs)
)
summary(psDist)
```
Now create the overall distance matrix reflecting these decisions:
```{r}
## make self-matches infinite
baselineactsDist.pen1 <- actsDist + caliper(id65Dist, width = 0)
sometrtnms <- rownames(actsDist)[1:10]
as.matrix(id65Dist)[sometrtnms, grep("^8\\.", colnames(id65Dist), value = TRUE)]
as.matrix(baselineactsDist.pen1)[sometrtnms, grep("^8\\.", colnames(baselineactsDist.pen1), value = TRUE)]
## Don't match to controls in the future:
baselineactsDist.pen2 <- baselineactsDist.pen1 + caliper(yearDist, width = 0, compare = `>`)
## Require matches close in year
baselineactsDist.pen3 <- baselineactsDist.pen2 + caliper(yearDist, width = 3)
## And not too distant in mhDist or psDist
baselineactsDist.pen4 <- baselineactsDist.pen3 + caliper(mhDist, quantile(as.vector(mhDist), .9)) + caliper(psDist, quantile(as.vector(psDist), .9))
```
We have 85 moments where a child is born, and we want to find moments where a
child was not born, for people like those who decided to have a child, near
the same moments in their lives and at the same moments in history.
```{r}
## How many "treated" observations?
table(wrk.df.noNAs$k1born, exclude = c())
pm1 <- pairmatch(baselineactsDist.pen4,
  data = wrk.df.noNAs,
  remove.unmatchables = TRUE
)
summary(pm1)
fm1 <- fullmatch(baselineactsDist.pen4,
  data = wrk.df.noNAs, min.controls = 1
)
summary(fm1, min.controls = 0, max.controls = Inf)
wrk.df.noNAs$pm1 <- pm1
wrk.df.noNAs %>% filter(pm1 == 1.1)
```
## Where to go from here
See Chapter 13, "Risk Set Matching", of @rosenbaum2010 and associated references.
The following also looks promising (I haven't read it all closely):
- Maybe promising review: @thomas2020matching
- @blackwell2018telescopematching
- @imai2019matching (Also, Kim and Imai and colleagues have a series of papers
  on "fixed effects" models that address the same design topic:
  observational studies with multiple observations per unit, and a desire to
  estimate average causal effects using stratification but **not** using
  future observations (say) to estimate effects of past events.)
- Consider a Mahalanobis distance on all past outcomes? (See Nielson on
  "Matching with Time-Series Cross-Sectional Data".) A minimal sketch follows
  below.
# Interference
## Overview
Just to demonstrate what is behind the presumption of "no interference" and to
show just one very simple approach (that you already know).
## Statistical inference with interference?
\includegraphics[width=.3\textwidth]{images/complete-graph.pdf}
\includegraphics[width=.95\textwidth]{images/interference-example.pdf}
On estimation, see Sobel, Aronow, Basse, Eckles, Feller, Samii, Hudgens,
Ogburn, VanderWeele, Toulis, Kao, Coppock, Sicar, Raudenbush, Hong, and
Volfovsky. The key question for that work: what function of potential outcomes
can we estimate using observed data?
## Statistical inference with interference?
\includegraphics[width=.3\textwidth]{images/complete-graph.pdf}
\includegraphics[width=.95\textwidth]{images/interference-example-2.pdf}
Introducing the potential outcome when no treatment has been applied to any
node, aka \textbf{the uniformity trial} $\equiv \by_{i,0000}$ (Rosenbaum, 2007).
## Imagine an experiment on a network:
\includegraphics[width=.95\textwidth]{images/fishersutvaNetwork.pdf}
## A model of propagation
- The direct effect of treatment is $\beta$ (it is a multiplicative effect).
- The treatment effect flows from treated to control units, increasing with
  the number of treated neighbors, with rate of growth of the effect $\tau$.
\includegraphics[width=.7\textwidth]{images/fishersutvaModel.pdf}
## Evidence about the parameters
\includegraphics[width=.5\textwidth]{images/ssr2d.pdf}
The plot shows the proportion of $p$-values less than .05 for randomization
tests of joint hypotheses about $\tau$ and $\beta$. Darker values mean less
rejection. The truth is at $\tau = .5$, $\beta = 2$. All tests reject the
truth no more than 5% of the time at $\alpha = .05$. All simulations use
permutation-based reference distributions.
## Summary of Hypotheses as Causal Models
A causal model relates potential outcomes to each other, and a research design
relates potential outcomes to observed data (and to sources of uncertainty).
For example:
- For no effects at all: $y_{i,0} = \HH(Y_i,Z_i,\tau_i) = Y_{i}$.
- For constant, additive effects: $y_{i,0} = \HH(Y_i,Z_i,\tau_i) = Y_{i} - Z_i \tau$.
- For vector valued outcomes in a network with nonlinear propagation of causal effects:
\begin{equation}
\yu = \HH(\by_{\bz}, \bzero, \beta, \tau) = ( \beta + (1 - z_i) (1 - \beta) \exp( - \tau^2 \bz^{T} \bS ) )^{-1} \by_{\bz}
\end{equation}
In fact *any function* that produces vectors of $\yu$ could be used to
represent these kinds of causal models (including agent based models ---
imagining agents revealing potential outcomes depending on network, treatment,
and covariates).
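For instance, a minimal sketch implementing $\HH$ for the propagation model
above (the four-node network, assignment, and outcomes here are made up):
```{r}
## Invert the propagation model to recover uniformity-trial outcomes:
## yu_i = y_{z,i} / (beta + (1 - z_i) * (1 - beta) * exp(-tau^2 * s_i)),
## where s_i is the number of treated neighbors of node i.
h_fn <- function(y_z, z, beta, tau, S) {
  s <- as.vector(z %*% S) # treated neighbors of each node
  y_z / (beta + (1 - z) * (1 - beta) * exp(-tau^2 * s))
}
S <- matrix(c(
  0, 1, 1, 0,
  1, 0, 0, 1,
  1, 0, 0, 1,
  0, 1, 1, 0
), nrow = 4) # a made-up adjacency matrix
z <- c(1, 0, 0, 1) # treatment assignment
y_z <- c(4, 2, 2, 4) # observed outcomes
h_fn(y_z, z, beta = 2, tau = 0.5, S)
```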
## A General Testing-Based Causal Inference Algorithm
1. Write a model converting uniformity trial potential outcomes (like $\yu$ or
simply $y_{i,0}$) into observed data (like $\by_{\bz}$ or simply $Y_i$):
What is the mechanism by which the treatment changes the outcomes of the
units? (This is a structural model of potential outcomes. It could be an
agent-based model.)
2. Solve for $\yu$. What adjustment of the observed data does this model
   imply? $\HH(\by_{\bz}, \bzero, \theta_0) = \yu$, like $y_{i,0} = Y_i - Z_i
   \tau$ for the simple constant, additive effects model (sketched in code
   after this list).
3. Select a test statistic that is effect-increasing in all relevant
dimensions (like the sum of squared residuals test statistic or the KS-test
statistic for certain models, or, I conjecture, an energy-statistic).
4. Compute $p$-values for substantively meaningful range of $\theta$. Or
calculate boundaries of regions. (Perhaps collapse or aggregate the
rejection regions to aid interpretation.)
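Here is a minimal sketch of steps 1--4 for the simple constant, additive
effects model, using a toy completely randomized design (all data below are
simulated for illustration):
```{r}
set.seed(1234)
n <- 40
Z <- sample(rep(c(0, 1), n / 2)) # a toy complete randomization
Y <- rnorm(n) + Z * 0.5 # simulated outcomes; the true effect is 0.5
test_tau <- function(tau, nsim = 1000) {
  y0 <- Y - Z * tau # Step 2: adjust observed data to the hypothesized y_{i,0}
  stat <- function(z) mean(y0[z == 1]) - mean(y0[z == 0]) # Step 3
  null_dist <- replicate(nsim, stat(sample(Z))) # re-randomize
  mean(abs(null_dist) >= abs(stat(Z))) # Step 4: two-sided p-value
}
taus <- seq(-0.5, 1.5, by = 0.25)
round(rbind(tau = taus, p = sapply(taus, test_tau)), 3)
```
Hypothesized values of $\tau$ far from the truth should be rejected (small
$p$-values), while values near the truth should not be.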
## Where to go from here
**Neyman:** Estimating some average causal effects by weighting by the
probability of exposure to the treatment via the network [@aronowsamii2017]
(see the associated @toulis2013estimation).
\medskip
There is a lot of work here now, for example on estimating average effects
while remaining agnostic about the network. Check out work by Hudgens
<http://www.bios.unc.edu/~mhudgens/>, Ogburn <https://www.eogburn.com>,
Basse <https://gwbasse.com>, Eckles <https://www.deaneckles.com>, Toulis
<https://www.ptoulis.com/working-papers>, Volfovsky
<https://volfovsky.github.io>.
Check out:
- <https://cran.r-project.org/web/packages/inferference/index.html>
- <https://github.com/szonszein/interference> (See for example <https://rdrr.io/github/szonszein/interference/man/estimates.html>)
\medskip
**Fisher:** Testing causal theories that generate hypotheses about effects (as
they propagate across a network): @bowers2013sutva, @bowers2016research,
@bowers2018models.
# Sensitivity Analysis for Regression Models
## Sensitivity Analysis for Regression Models: A pointer
Read @cinellihazlett2020, @hosman2010, and @chaudoinetal2018.
@cinellihazlett2020 is the current state of the art (see their easy-to-use R
package, YouTube videos, etc.). A minimal sketch using their package follows.
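As a teaser, a minimal sketch using their `sensemakr` package and its bundled
`darfur` example data (this follows the package's introductory example;
`sensemakr` is not part of this course's renv library, so the chunk is not
evaluated here):
```{r, eval=FALSE}
library(sensemakr)
data("darfur")
lm_mod <- lm(peacefactor ~ directlyharmed + age + farmer_dar + herder_dar +
  pastvoted + hhsize_darfur + female + village, data = darfur)
## Benchmark the strength of a hypothetical unobserved confounder against `female`.
sens <- sensemakr(lm_mod,
  treatment = "directlyharmed",
  benchmark_covariates = "female", kd = 1:3
)
summary(sens)
plot(sens) # contour plot of bias-adjusted estimates
```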
# Further Topics and Discussion?
## Questions, Thoughts?
## References