Fix last section and remove by hand numbers
jwbowers committed Aug 14, 2024
1 parent 1654a5b commit 8977ec6
Showing 2 changed files with 39 additions and 33 deletions.
20 changes: 10 additions & 10 deletions guides/causal-inference/x-cause-y.en.qmd
@@ -1,6 +1,6 @@
---
title: "10 Strategies for Figuring out whether X Causes Y"
author:
- name: "Macartan Humphreys"
url: https://macartan.github.io
image: x-cause-y.png
@@ -17,19 +17,19 @@ The strategy used in randomized control trials (or randomized interventions, ran

A second strategy used more in lab settings and also in the physical sciences is to use experimental control to ensure that two units are identical to each other in all relevant respects except for treatment. For example if you wanted to see if a heavy ball falls faster than a lighter ball you might make sure that they have the same shape and size and drop them both at the same time, under the same weather conditions, and so on. You then attribute any differences in outcomes to the feature that you did not keep constant between the two units. This strategy is fundamentally different to that used in randomized trials. In randomized trials you normally give up on the idea of keeping everything fixed and seek instead to make sure that natural variation—on variables that you can or cannot observe—does not produce bias in your estimates; in addition you normally seek to assess average effects across a range of background conditions rather than for a fixed set of background conditions. The merits of the control approach depend on your confidence that you can indeed control all relevant factors; if you cannot, then a randomized approach may be superior.

# Natural experiments (as-if randomization)

Sometimes researchers are not able to randomize, but causal inference is still possible because nature has done the randomization for you. The key feature of the "natural experiment" approach is that you have reason to believe that variation in some natural treatment is "as-if random." For example say that seats in a school are allocated by lottery. Then you might be able to analyze the effects of school attendance as if it were a randomized control trial. One clever study of the effects of conflict on children by @blattman_annan_2010 used the fact that the Lord’s Resistance Army (LRA) in Uganda abducted children in a fairly random fashion. Another clever study on Disarmament, Demobilization, and Reintegration (DDR) programs by @gilligan_et_al_2012 used the fact that an NGO’s operations were interrupted because of a contract dispute, which resulted in a “natural” control group of ex-combatants that did not receive demobilization programs. See @dunning_2012 for a guide to finding and analyzing natural experiments.

# Before/after comparisons

Often the first thing that people look to in order to work out causal effects is the comparison of units before and after treatment. Here you use the past as a control for the present. The basic idea is very intuitive: you flip the light switch and you see the light go off; attributing the change to your action seems easy even in the absence of any randomization or control. But for many social interventions the approach is not that reliable, especially in changing environments. The problem is that things get better or worse for many reasons unrelated to the treatments or programs you are interested in. In fact it is possible that, because of all the other things that are changing, things can get worse in a program area even if the program had a positive effect (they get worse but are still not as bad as they would have been without the program!). A more sophisticated approach than a simple before/after comparison is called "difference in differences": you compare the before/after difference in treatment areas with that in control areas. This is a good approach, but you still need to be sure that you have good control groups and, in particular, that control and treatment groups are not likely to change differently for reasons other than the treatment.
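
To make the difference-in-differences logic concrete, here is a minimal sketch in R on simulated data; the variable names, trend, and effect size are illustrative assumptions, not from any study cited here.

```r
# Difference-in-differences sketch on simulated (made-up) data
set.seed(1)
n <- 1000
treated <- rbinom(n, 1, 0.5)           # program vs. comparison areas
y_pre   <- rnorm(n, mean = 10)         # outcome before the program
trend   <- -1                          # everything gets worse over time
effect  <- 2                           # assumed true program effect
y_post  <- y_pre + trend + effect * treated + rnorm(n)

# Naive before/after comparison in program areas: biased by the common trend
mean(y_post[treated == 1]) - mean(y_pre[treated == 1])

# Difference in differences: before/after change in program areas
# minus the before/after change in comparison areas
(mean(y_post[treated == 1]) - mean(y_pre[treated == 1])) -
  (mean(y_post[treated == 0]) - mean(y_pre[treated == 0]))
```

The first comparison recovers roughly trend plus effect (about 1 here), while the difference in differences recovers roughly the effect itself (about 2), provided the two groups would have trended in parallel absent the program.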

# Ex Post Controlling I: Regression

Perhaps the most common approach to causal identification in applied statistical work is the use of multiple regression to control for possible confounders. The idea is to try to use whatever information you have about why treatment and control areas are not readily comparable and adjust for these differences statistically. This approach works well to the extent that you can figure out and measure the confounders and how they are related to treatment, but is not good if you don’t know what the confounders are. In general we just don’t know what all the confounders are and that exposes this approach to all kinds of biases (indeed if you control for the wrong variables it is possible to *introduce* bias where none existed previously).
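
A minimal sketch of this logic in R, with a single simulated confounder; the data-generating numbers and effect size are assumptions for illustration only.

```r
# Regression adjustment for an observed confounder (illustrative sketch)
set.seed(2)
n <- 1000
x <- rnorm(n)                          # observed confounder
d <- rbinom(n, 1, plogis(x))           # treatment is more likely when x is high
y <- 1 * d + 2 * x + rnorm(n)          # assumed true treatment effect is 1

coef(lm(y ~ d))["d"]       # unadjusted: biased upward because x drives both d and y
coef(lm(y ~ d + x))["d"]   # adjusting for x recovers approximately 1
# If x were unobserved (or the wrong variables were controlled), the bias would remain.
```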

# Ex Post Controlling II: Matching and Weighting

A variety of alternative approaches seek to account for confounding variables by carefully matching treatment units to one or many control units. Matching has some advantages over regression (for example, estimates can be less sensitive to choices of functional form), but the basic idea is nevertheless similar, and indeed matching methods can be implemented in a regression framework using appropriate weights. Like regression, at its core, this strategy depends on a conviction that there are no important confounding variables that the researcher is unaware of or is unable to measure. Specific methods include:

@@ -45,22 +45,22 @@ A variety of alternative approaches seek to account for confounding variables by
* Stable balancing weights [@zubizarreta_2015], and the use of
* Synthetic controls [@abadie_et_al_2015].
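
As a rough illustration of the matching idea (not of any particular method in the list above), here is a base-R sketch of one-to-one nearest-neighbor matching on a single simulated covariate; all names and numbers are invented.

```r
# Nearest-neighbor matching on one covariate (toy sketch, matching with replacement)
set.seed(3)
n <- 500
x <- rnorm(n)
d <- rbinom(n, 1, plogis(x))
y <- 1 * d + 2 * x + rnorm(n)          # assumed true effect is 1

treated  <- which(d == 1)
controls <- which(d == 0)

# For each treated unit, find the control unit with the closest value of x
matched <- sapply(treated, function(i) controls[which.min(abs(x[controls] - x[i]))])

# Average treated-minus-matched-control difference (an estimate of the ATT)
mean(y[treated] - y[matched])
```

The methods cited above implement far more careful versions of this idea, with multivariate distances, calipers, balance checks, and weighting.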

# Instrumental variables (IV)

Another approach to identifying causal effects is to look for a feature that explains why a given group got a treatment but which is otherwise unrelated to the outcome of interest. Such a feature is called an instrument. For example say you are interested in the effect of a livelihoods program on employment, and say it turned out that most people who got access to the livelihoods program did so because they were a relative of a particular program officer. Now suppose that being a relative of the program officer does not affect job prospects in any way other than through its effect on getting access to the livelihoods program. If so, then you can work out the effect of the program by understanding the effect of being a relative of the program officer on job prospects. This has been a fairly popular approach but enthusiasm for it has waned a bit, basically because it is hard to find a good instrument. One smart application is the set of studies on the effects of poverty on conflict that use rainfall in Africa as an instrument for income/growth. While there are worries that the correlation between conflict and poverty may be due to the fact that conflict causes poverty, it does not seem plausible that conflict causes rainfall! So using rainfall as an instrument here gave a lot more confidence that really there is a causal, and not just correlational, relationship between poverty and conflict [@miguel_et_al_2004].
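
A minimal simulated sketch of the instrumental-variables logic in R; the instrument, the take-up process, and the effect size are illustrative assumptions rather than features of the studies cited above.

```r
# Instrumental variables sketch: Wald estimator and two-stage least squares by hand
set.seed(4)
n <- 2000
z <- rbinom(n, 1, 0.5)                 # instrument (e.g., relative of a program officer)
u <- rnorm(n)                          # unobserved confounder
d <- rbinom(n, 1, plogis(z + u))       # program take-up depends on z and on u
y <- 1 * d + u + rnorm(n)              # assumed true program effect is 1; u also affects y

coef(lm(y ~ d))["d"]                   # naive regression is biased because of u

cov(y, z) / cov(d, z)                  # Wald/IV estimator: effect of z on y over effect of z on d

d_hat <- fitted(lm(d ~ z))             # equivalent two-stage least squares by hand
coef(lm(y ~ d_hat))["d_hat"]
```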

# Regression discontinuity designs (RDD)

The regression discontinuity approach works as follows. Say that some program is going to be made available to a set of potential beneficiaries. These potential beneficiaries are all ranked on a set of relevant criteria, such as prior education levels, employment status, and so on. These criteria can be quantitative; but they can also include qualitative information such as assessments from interviews. These individual criteria are then aggregated into a single score and a threshold is identified. Candidates scoring above this threshold are admitted to the program, while those below are not. "Project" and "comparison" groups are then identified by selecting applicants that are close to this threshold on either side. Using this method we can be sure that treated and control units are similar, at least around the threshold. Moreover, we have a direct measure of the main feature on which they differ (their score on the selection criteria). This information provides the key to estimating a program effect from comparing outcomes between these two groups. The advantage of this approach is that all that is needed is that the implementing agency uses a clear set of criteria (which can be turned into a score) upon which they make treatment assignment decisions. The disadvantage is that really reliable estimates of impact can only be made for units right around the threshold. For overviews of RDD, see @skovron_titiunik_2015 and @lee_lemieux_2013; for two interesting applications, see @manacorda_et_al_2011 on Uruguay and @samii_2013 on Burundi.
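
A minimal RDD sketch in R with a simulated score and threshold; the bandwidth and functional form here are ad hoc assumptions, whereas applied work chooses them much more carefully (see the overviews cited above).

```r
# Regression discontinuity sketch: local comparison around the admission threshold
set.seed(5)
n <- 2000
score <- runif(n, -1, 1)               # aggregated selection score, threshold at 0
d <- as.numeric(score >= 0)            # admitted to the program if score >= 0
y <- 0.5 * score + 1 * d + rnorm(n, sd = 0.5)   # assumed effect at the threshold is 1

dat  <- data.frame(y, d, score)
near <- subset(dat, abs(score) <= 0.25)          # ad hoc bandwidth around the threshold

# Local linear regression with separate slopes on each side of the threshold
coef(lm(y ~ d * score, data = near))["d"]
# The estimate speaks only to units close to the threshold.
```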

# Process tracing

In much qualitative work researchers try to establish causality by looking not just at whether being in a program is associated with better outcomes but also by (a) looking for steps in the process along the way that would tell you whether a program had the effects you think it had and (b) looking for evidence of other outcomes that should be seen if (or perhaps: if and only if) the program was effective. For example, look not just at whether people in a livelihoods program got a job but at whether they got trained in something useful, got help from people in the program to find an employer in that area, and so on. If all these steps are there, that gives confidence that the relationship is causal and not spurious. If a program was implemented but no one actually took part in it, this might give grounds to suspect that any correlation between treatment and outcomes is spurious. The difficulty with this approach is that it can be hard to know whether any piece of within-case evidence has probative value. For example a program may have positive (or negative) effects through lots of processes that you don't know anything about, and processes that you think are important might not be. See @humphreys_jacobs_2015 for a description of the Bayesian logic underlying process tracing and illustrations of how to combine it with other statistical approaches.
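
The Bayesian logic mentioned above can be sketched in a few lines of R; the prior and the clue probabilities below are invented numbers, purely to show how observing a within-case clue can shift beliefs about causation.

```r
# Bayesian updating on a within-case clue (toy numbers, for intuition only)
prior <- 0.5                    # prior belief that the program caused the outcome
p_clue_if_causal   <- 0.8       # chance of observing the clue (e.g., training occurred) if it did
p_clue_if_spurious <- 0.2       # chance of observing the clue if the relationship is spurious

posterior <- (p_clue_if_causal * prior) /
  (p_clue_if_causal * prior + p_clue_if_spurious * (1 - prior))
posterior   # 0.8: the clue is probative only because its probability differs across hypotheses
```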

# Front Door Strategies (Argument from mechanisms)

A final approach, conceptually close to process tracing, is to make use of mechanisms. Say you know, as depicted in the picture below, that $A$ can cause $C$ only through $B$. Say moreover that you know that no third variable causes both $B$ and $C$ (other than, perhaps, via $A$) and no third variable causes both $A$ and $B$. Then covariation between $A$ and $B$ and between $B$ and $C$ can be used to assess the effect of $A$ on $C$. The advantage is that causality can be established even in the presence of confounders — for example even if, as in the picture below, unobserved variables cause both $A$ and $C$. The difficulty however is that the strategy requires a lot of confidence in your beliefs about the structure of causal relations. For more see @pearl_2000.
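
A linear simulated sketch of the front-door idea in R, with the structure in the figure below: $A$ affects $C$ only through $B$, and an unobserved variable confounds $A$ and $C$. The coefficients and variable names are assumptions chosen for illustration.

```r
# Front-door sketch: A -> B -> C with an unobserved confounder of A and C
set.seed(7)
n <- 5000
u <- rnorm(n)                    # unobserved confounder of A and C
a <- u + rnorm(n)
b <- 0.7 * a + rnorm(n)          # A affects C only through B
c_out <- 0.5 * b + u + rnorm(n)  # true effect of A on C is 0.7 * 0.5 = 0.35

coef(lm(c_out ~ a))["a"]         # naive regression is biased by u

a_on_b <- coef(lm(b ~ a))["a"]          # A -> B is unconfounded by assumption
b_on_c <- coef(lm(c_out ~ b + a))["b"]  # conditioning on A blocks the back-door path via u
unname(a_on_b * b_on_c)                 # front-door estimate, approximately 0.35
```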

![](x-cause-y_dag.png)

# References {.unnumbered .unlisted}
52 changes: 29 additions & 23 deletions guides/research-questions/effect-types_en.qmd
@@ -16,8 +16,7 @@ on [10 Things to Know About Causal
Inference](https://methods.egap.org/guides/causal-inference/causal-inference_en.html).


# Causal effects as comparisons between potential outcomes

If a person who took an aspirin would have had a worse headache if she had not
taken the aspirin, then we say that the aspirin caused an improvement in her
@@ -36,8 +35,8 @@ multiple people), or perhaps some other quantity like a ratio. This guide
describes a few common causal effects that focus on averages.


# Individual treatment effects


A first quantity of interest is the individual treatment effect. For this
estimand a researcher would be interested in the effect of a treatment on each
@@ -64,8 +63,7 @@ Testing](https://egap.org/resource/10-things-to-know-about-hypothesis-testing/)
to learn about hypothesis testing for causal inference.]


# Average treatment effects

Some researchers want to learn about the average treatment effect (ATE) across all observations in our experiment, which we can define as the average of the individual-level additive treatment effects across all units in the study.
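
A small R sketch of this definition, using simulated potential outcomes (which we could never observe together in practice); the numbers are arbitrary.

```r
# The ATE as the average of individual-level treatment effects
set.seed(8)
n <- 100
y0 <- rnorm(n, mean = 10)        # potential outcome under control
y1 <- y0 + rnorm(n, mean = 2)    # potential outcome under treatment; effects vary by unit
mean(y1 - y0)                    # the estimand: average of the individual effects

# We only ever observe one potential outcome per unit; under random assignment
# the difference in observed group means is an unbiased estimator of the ATE
z <- sample(rep(c(0, 1), n / 2))
y_obs <- ifelse(z == 1, y1, y0)
mean(y_obs[z == 1]) - mean(y_obs[z == 0])
```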

@@ -76,8 +74,7 @@ See [10 Strategies for Figuring out if X Causes Y](https://methods.egap.org/guid

[^1]: See also Chapter 2 of @gerber_green_2012 or Chapter 2 of @angrist_pischke_2008 for more information

# Population and sample average treatment effects

When defining the average treatment effect, it isn't immediately clear which
units should be included in the *all* part of *all units*. Often we want to use
@@ -96,8 +93,7 @@ SATE and PATE are equal in expectation.[^3]

[^3]: See @imbens_wooldridge_2007.

# Conditional average treatment effects

Sometimes the theory or policy intervention motivating the study suggests that the treatment effect should differ for different sorts of people.
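
A minimal sketch of subgroup (conditional) average effects in R; the subgroup variable and effect sizes are invented for illustration.

```r
# Conditional average treatment effects by subgroup (simulated sketch)
set.seed(9)
n <- 2000
female <- rbinom(n, 1, 0.5)
z <- rbinom(n, 1, 0.5)                       # randomized treatment
y <- 1 * z + 1 * z * female + rnorm(n)       # assumed effects: 1 for men, 2 for women

cate_male   <- mean(y[z == 1 & female == 0]) - mean(y[z == 0 & female == 0])
cate_female <- mean(y[z == 1 & female == 1]) - mean(y[z == 0 & female == 1])
c(cate_male = cate_male, cate_female = cate_female)
```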

@@ -117,8 +113,7 @@ Informally, the ATT is the effect for those that we treated; ATC is what the eff

Those interested in further reading on conditional average treatment effects should see chapter 9 of @gerber_green_2012.

# Intent-to-treat effects

Outside of a controlled laboratory setting, the subjects we assign to treatment
often are not the same as the subjects who actually receive the treatment. For
@@ -144,9 +139,7 @@ Which is the average effect conditional on the subject being a complier, a type

[^6]: Detailed discussion of both ITT and CACE can be found in chapters 5 and 6 of @gerber_green_2012.
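
A simulated sketch of the ITT and the complier average causal effect (CACE) under one-sided noncompliance; the compliance rate and effect size below are assumptions for illustration, not results from any study.

```r
# ITT and CACE (Wald ratio) with one-sided noncompliance (simulated sketch)
set.seed(10)
n <- 2000
complier <- rbinom(n, 1, 0.6)            # 60% would take the treatment if assigned
z <- rbinom(n, 1, 0.5)                   # random assignment
d <- z * complier                        # receipt: only assigned compliers are treated
y <- 1 * d + 0.5 * complier + rnorm(n)   # assumed effect of receiving treatment is 1

itt   <- mean(y[z == 1]) - mean(y[z == 0])   # effect of assignment on the outcome
itt_d <- mean(d[z == 1]) - mean(d[z == 0])   # effect of assignment on take-up
c(ITT = itt, CACE = itt / itt_d)             # ITT is about 0.6; CACE is about 1
```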


# Quantile Average Treatment Effects

The ATE focuses on the effect for a typical person, but we often also care about the distributional consequences of our treatment. We want to know not just whether our treatment raised average income, but also whether it made the distribution of income in the study more or less equal.
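
As a rough sketch of this distributional focus, the R code below compares outcome quantiles across arms in simulated data. Differences in marginal quantiles equal quantile treatment effects only under additional assumptions (such as rank preservation); everything here is illustrative.

```r
# Differences in outcome quantiles by treatment arm (simulated sketch)
set.seed(11)
n <- 2000
z  <- rbinom(n, 1, 0.5)
y0 <- exp(rnorm(n))                              # skewed incomes under control
y1 <- y0 * ifelse(y0 > median(y0), 1.5, 1.0)     # treatment mainly helps the better-off
y  <- ifelse(z == 1, y1, y0)

quantile(y[z == 1], c(.25, .5, .75)) - quantile(y[z == 0], c(.25, .5, .75))
# The gap grows at higher quantiles: this treatment widens the income distribution
# even though it raises the average.
```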

@@ -160,18 +153,15 @@ If these assumptions are justified for our data, we can obtain consistent estima

[^19]: See Koenker, Roger, and Kevin Hallock. 2001. “Quantile Regression: An Introduction.” *Journal of Economic Perspectives* 15 (4): 43–56, for a concise overview of quantile regression.

# Average marginal component effect

For a conjoint experiment (see [10 Things to Know About Survey Experiments](https://methods.egap.org/guides/data-collection/survey-experiments_en.html)), one might be interested in the marginal effect of changing one attribute. The average marginal component effect (AMCE) gives the average causal effect of changing an attribute from one value to another, while holding equal the joint distribution of the other attributes in the design, averaged over this distribution.

The probabilities associated with each factor are also informed by the choice of estimand. For the uniform AMCE (uAMCE), each factor is independently and uniformly marginalized. In contrast, the population AMCE (pAMCE) is marginalized over the target population distribution of profiles. The AMCE is always defined with respect to the distribution used for the random assignment; so if you change the randomization distribution, the interpretation changes.[^7]
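
A minimal simulated conjoint sketch in R: with attributes independently and uniformly randomized, regressing the choice indicator on attribute dummies recovers the (uniform) AMCEs. The attributes, levels, and effect sizes below are made up.

```r
# AMCE sketch: uniformly randomized conjoint attributes, estimated by regression
set.seed(12)
n <- 5000                                           # candidate profiles shown to respondents
gender <- factor(sample(c("male", "female"), n, replace = TRUE),
                 levels = c("male", "female"))
party  <- sample(c("independent", "left", "right"), n, replace = TRUE)

# Probability a profile is chosen; assumed AMCE of female (vs. male) is 0.10
p_choose <- 0.4 + 0.10 * (gender == "female") + 0.05 * (party == "left")
chosen   <- rbinom(n, 1, p_choose)

coef(lm(chosen ~ gender + party))   # "genderfemale" is roughly 0.10
```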

[^7]: For more on the AMCE and conjoint experiments, see @bansak2022using and @liu2023multiple.


# Direct and indirect effects and Eliminated Effects

When discussing the effects of a treatment, one could be interested in the total effect of the treatment on the outcome, as assumed above, or components of the effect of the treatment. Specifically, one might be interested in the direct effect of the treatment or any number of indirect effects of a treatment. Largely, interest in direct and indirect effects comes from an interest in [mechanisms](https://methods.egap.org/guides/research-questions/mechanisms_en.html). It might be of interest to a researcher *how* the effect happens, rather than simply knowing whether all of the mechanisms together produce an effect.
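
A toy linear sketch of the distinction between a total effect and a controlled direct effect, with a single simulated mediator. In this simulation there is no treatment-mediator interaction and no mediator-outcome confounding, so simple regression adjustment works; real mediation analysis needs much stronger assumptions (see @acharya2018analyzing).

```r
# Total effect versus controlled direct effect in a linear mediation sketch
set.seed(13)
n <- 5000
z <- rbinom(n, 1, 0.5)                 # randomized treatment
m <- 0.8 * z + rnorm(n)                # mediator moved by treatment
y <- 0.5 * z + 1.0 * m + rnorm(n)      # direct path 0.5; indirect path 0.8 * 1.0 = 0.8

coef(lm(y ~ z))["z"]        # total effect: about 1.3
coef(lm(y ~ z + m))["z"]    # controlled direct effect: about 0.5 (in this simple setting)
```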

@@ -211,14 +201,30 @@ This effect can be observed, as both the total effect and the controlled direct

[^8]: See @acharya2018analyzing.

# Treatment Effects for Binary Outcomes: Log-Odds Treatment Effects and Attributable Effects

Although a difference in proportions is an easy-to-interpret quantity (and it is
the ATE when the outcome is binary), at least two other effects are in common
use when outcomes are binary.

Average treatment effects seem a bit hard to interpret when outcomes are not
continuous. For example, a very common binary outcome in the study of elections
is coded as 1 when subjects voted, and 0 when they did not. The average effect
might be 0.2, but what does it really mean to say that a treatment increased
voting by 0.2 for an individual?

## Log-Odds Treatment Effects

A common quantity of causal interest for dichotomous outcomes is our
treatment’s effect on the log-odds of success, defined for the experimental
pool as:

$$\Delta = \log\frac{E(Y_i(1))}{1-E(Y_i(1))} - \log\frac{E(Y_i(0))}{1-E(Y_i(0))}$$

@freemand_2008b shows that the coefficient from a logistic regression adjusting for covariates in a randomized experiment produces biased estimates of this causal effect. The basic intuition for Freedman’s argument comes from the fact that taking the log of averages is not the same as taking the average of logs, and so the treatment coefficient estimated from a logistic regression conditioning on covariates will not provide a consistent estimator of this effect on the log-odds of success. Instead, Freedman recommends computing predicted probabilities, varying each subject’s treatment status while maintaining their observed covariate profile, to produce a consistent estimator of the log-odds effect.
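
A small R sketch of the predicted-probability approach just described, on simulated data: fit the covariate-adjusted logit, predict each subject's probability with treatment switched on and then off while keeping their observed covariates, average, and take the difference in log-odds. The data-generating numbers are assumptions.

```r
# Log-odds effect via averaged predicted probabilities (Freedman-style sketch)
set.seed(14)
n <- 2000
x <- rnorm(n)
z <- rbinom(n, 1, 0.5)
y <- rbinom(n, 1, plogis(-0.5 + 1 * z + 1 * x))

fit <- glm(y ~ z + x, family = binomial)
dat <- data.frame(x, z)

p1 <- mean(predict(fit, newdata = transform(dat, z = 1), type = "response"))
p0 <- mean(predict(fit, newdata = transform(dat, z = 0), type = "response"))

log(p1 / (1 - p1)) - log(p0 / (1 - p0))   # estimate of Delta
coef(fit)["z"]                            # the raw logit coefficient is a different quantity
```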

## Attributable Effects

An alternative approach with binary outcomes is to infer about the *sum* of successes rather than the difference in proportions. @rosenbaum_2002a introduces this quantity in the context of matched observational studies and @hansen_bowers_2009 use it in a randomized field experiment where voting or not voting is measured at the individual level.

Consider a simple case with a dichotomous outcome and treatment. Let $A$ be the number of outcomes attributable to treatment, that is, the number of cases in which $Y_i$ equaled 1 among treated subjects which would not have occurred had these units been assigned to control. For a range of $A$’s, we adjust the observed contingency table of outcomes among the treated, and compare this resulting distribution to a known null distribution (the distribution of outcomes we would have observed had treatment had no effect). The resulting range of $A$’s for which our test continues to reject the null hypothesis of no effect provides a range of effects that are attributable to our treatment.
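
The loop below is a loose sketch of that procedure on made-up counts: for each hypothesized $A$, remove $A$ successes from the treated cell of the 2x2 table and test the adjusted table against the null of no effect. The counts, the choice of a one-sided Fisher test, and the cutoff are all illustrative assumptions; see @rosenbaum_2002a and @hansen_bowers_2009 for the exact tests and their interpretation.

```r
# Attributable-effects sketch with made-up counts (illustrative only)
n_treat <- 100; n_ctrl <- 100          # group sizes
succ_treat <- 60                       # observed successes among treated units
succ_ctrl  <- 40                       # observed successes among control units

rejects <- sapply(0:succ_treat, function(A) {
  # Remove A treated successes hypothesized to be caused by treatment,
  # then test whether the adjusted table still shows an excess among the treated
  tab <- matrix(c(succ_treat - A, n_treat - (succ_treat - A),
                  succ_ctrl,      n_ctrl - succ_ctrl),
                nrow = 2, byrow = TRUE)
  fisher.test(tab, alternative = "greater")$p.value < 0.05
})
(0:succ_treat)[rejects]   # hypothesized values of A for which the test still rejects no effect
```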
