Built site for gh-pages
Quarto GHA Workflow Runner committed Aug 14, 2024
1 parent 74742e1 commit 0b3850e
Showing 39 changed files with 665 additions and 129 deletions.
2 changes: 1 addition & 1 deletion .nojekyll
@@ -1 +1 @@
4e29381a
764b61a0
56 changes: 28 additions & 28 deletions guides.html

Large diffs are not rendered by default.

9 changes: 9 additions & 0 deletions guides/analysis-procedures/covariates_en.html
@@ -288,6 +288,8 @@ <h1 data-number="4"><span class="header-section-number">4</span> Why to do it</h
<p>Controlling for these covariates tends to improve precision if the covariates are predictive of potential outcomes. Take a look at the following example, which is loosely based on <span class="citation" data-cites="gine_mansuri_2012">Giné and Mansuri (<a href="#ref-gine_mansuri_2012" role="doc-biblioref">2012</a>)</span>, an experiment on female voting behavior in Pakistan. In this experiment, the authors randomized an information campaign to women in Pakistan to study its effects on their turnout behavior, the independence of their candidate choice, and their political knowledge. They carried out a baseline survey which provided them with several covariates.</p>
<p>The following code imitates this experiment by creating fake data for four of the covariates they collect: whether the woman owns an identification card, whether the woman has formal schooling, the woman’s age, and whether the woman has access to TV. It also creates two <a href="https://methods.egap.org/guides/causal-inference/causal-inference_en.html">potential outcomes</a> (the outcomes that would occur if she were assigned to treatment and if not) for a measure of the extent to which a woman’s choice of candidate was independent of the opinions of the men in her family. The potential outcomes are correlated with all four covariates, and the built-in “true” treatment effect on the independence measure here is 1. To figure out whether our estimator is biased or not, we simulate 10,000 replications of our experiment. On each replication, we randomly assign treatment and then regress the observed outcome <span class="math inline">\(Y\)</span> on the treatment indicator <span class="math inline">\(Z\)</span>, with and without controlling for covariates. Thus, we are simulating two methods (unadjusted and covariate-adjusted) for estimating the ATE. To estimate the bias of each method, we take the difference between the average of the 10,000 simulated estimates and the “true” treatment effect.</p>
<div class="cell">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="fu">rm</span>(<span class="at">list=</span><span class="fu">ls</span>())</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">20140714</span>)</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a>N <span class="ot">=</span> <span class="dv">2000</span></span>
@@ -332,6 +334,7 @@ <h1 data-number="4"><span class="header-section-number">4</span> Why to do it</h
<span id="cb1-42"><a href="#cb1-42" aria-hidden="true" tabindex="-1"></a><span class="co"># Margin of error (at 95% confidence level) for each estimated bias</span></span>
<span id="cb1-43"><a href="#cb1-43" aria-hidden="true" tabindex="-1"></a><span class="fl">1.96</span> <span class="sc">*</span> sd.of.unadj <span class="sc">/</span> <span class="fu">sqrt</span>(Replications)</span>
<span id="cb1-44"><a href="#cb1-44" aria-hidden="true" tabindex="-1"></a><span class="fl">1.96</span> <span class="sc">*</span> sd.of.adj <span class="sc">/</span> <span class="fu">sqrt</span>(Replications)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
</div>
<p>Both methods—with and without covariates—yield the true treatment effect of 1 on average. When we ran the regression without covariates, our estimated ATE averaged 1.0008 across the 10,000 replications, and with covariates, it averaged 1.0003. Notice that the regression-adjusted estimate is essentially unbiased even though our regression model is misspecified—we control for age linearly when the true data generating process involves the log of age.<a href="#fn4" class="footnote-ref" id="fnref4" role="doc-noteref"><sup>4</sup></a></p>
<p>The real gains come in the precision of our estimates. The standard error (the standard deviation of the sampling distribution) of our estimated ATE when we ignore covariates is 0.121. When we include covariates in the model, our estimate becomes a bit tighter: the standard error is 0.093. Because our covariates are prognostic of the outcome, including them in the regression explains some of the noise in the data, which tightens our estimate of the ATE.</p>
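<p>To put the two standard errors on a more intuitive scale, the short sketch below converts them into a variance ratio and an approximate "equivalent sample size" multiplier. It assumes the usual large-sample rule that the standard error shrinks in proportion to one over the square root of the sample size; the variable names are ours and do not appear in the guide's code.</p>

```r
# Standard errors reported in the text above.
se_unadjusted <- 0.121
se_adjusted   <- 0.093

# Ratio of sampling variances (adjusted relative to unadjusted).
variance_ratio <- (se_adjusted / se_unadjusted)^2

# Assumption: SE is proportional to 1 / sqrt(N). Under that assumption, this is
# roughly how many times larger the unadjusted experiment would need to be
# to reach the adjusted estimator's precision.
sample_size_multiplier <- (se_unadjusted / se_adjusted)^2

round(c(variance_ratio = variance_ratio,
        sample_size_multiplier = sample_size_multiplier), 2)
```

<p>Under that assumption, adjusting for covariates here is worth roughly as much as enlarging the unadjusted experiment by about two thirds.</p>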
@@ -342,6 +345,8 @@ <h1 data-number="5"><span class="header-section-number">5</span> When will it he
<p>Covariate adjustment will be most helpful when your covariates are strongly predictive (or “prognostic”) of your outcomes. Covariate adjustment essentially enables you to make use of information about relationships between baseline characteristics and your outcome so that you can better identify the relationship between treatment and the outcome. But if the baseline characteristics are only weakly correlated with the outcome, covariate adjustment won’t do you much good. The covariates you will want to adjust for are the ones that are strongly correlated with outcomes.</p>
<p>The following graph demonstrates the relationship between how prognostic your covariate is and the gains you get from adjusting for it. On the x-axis is the sample size, and on the y-axis is the <a href="https://en.wikipedia.org/wiki/Root-mean-square_deviation">root mean squared error</a> (RMSE), the square root of the average squared difference between the estimator and the true ATE. We want our RMSE to be small, and covariate adjustment should help us reduce it.</p>
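<p>In symbols, the RMSE described above can be sketched as follows. The notation is ours rather than the guide's: \(\hat{\tau}\) is the estimator, \(\tau\) is the true ATE, and \(\hat{\tau}_r\) is the estimate from simulated replication \(r\) of \(R\).</p>
<p><span class="math display">\[
\mathrm{RMSE}(\hat{\tau}) = \sqrt{\mathrm{E}\left[(\hat{\tau} - \tau)^2\right]} = \sqrt{\mathrm{Bias}(\hat{\tau})^2 + \mathrm{Var}(\hat{\tau})}, \qquad \widehat{\mathrm{RMSE}} = \sqrt{\frac{1}{R}\sum_{r=1}^{R}(\hat{\tau}_r - \tau)^2}
\]</span></p>
<p>When an estimator is approximately unbiased, the bias term is close to zero, so its RMSE is close to its standard error and a lower RMSE mainly reflects a gain in precision.</p>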
<div class="cell">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb2"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="fu">rm</span>(<span class="at">list=</span><span class="fu">ls</span>())</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(MASS) <span class="co"># for mvrnorm()</span></span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">1234567</span>)</span>
@@ -383,6 +388,7 @@ <h1 data-number="5"><span class="header-section-number">5</span> When will it he
<span id="cb2-39"><a href="#cb2-39" aria-hidden="true" tabindex="-1"></a> <span class="fu">expression</span>(<span class="fu">paste</span>(rho, <span class="st">"=0"</span>)), <span class="fu">expression</span>(<span class="fu">paste</span>(rho, <span class="st">"=0.5"</span>)),</span>
<span id="cb2-40"><a href="#cb2-40" aria-hidden="true" tabindex="-1"></a> <span class="fu">expression</span>(<span class="fu">paste</span>(rho, <span class="st">"=0.9"</span>))),</span>
<span id="cb2-41"><a href="#cb2-41" aria-hidden="true" tabindex="-1"></a> <span class="at">col=</span><span class="fu">c</span>(<span class="st">"black"</span>, <span class="st">"yellow"</span>,<span class="st">"orange"</span>, <span class="st">"red"</span>), <span class="at">lty =</span> <span class="dv">1</span>, <span class="at">lwd=</span><span class="dv">2</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
</div>
<p><img src="covariates_rmse.png" class="img-fluid"></p>
<p>The black line shows the RMSE when we don’t adjust for a covariate. The red line shows the RMSE when we adjust for a highly prognostic covariate (the correlation between the covariate and the outcome is 0.9). The red line is always below the black line, which is to say that the RMSE is lower when you adjust for a prognostic covariate. The orange line represents the RMSE when we adjust for a moderately prognostic covariate (the correlation between the covariate and the outcome is 0.5). We still get gains in precision relative to the black line, but they are not nearly as large as with the red line. Finally, the yellow line shows what happens if you control for a covariate that is not at all predictive of the outcome. The yellow line is almost identical to the black line. You get no improvement in precision by controlling for a non-prognostic covariate; in fact, you pay a slight penalty because you use up a degree of freedom, which is especially costly when the sample size is small. This exercise demonstrates that you’ll get the largest gains in precision by controlling for covariates that strongly predict outcomes.</p>
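<p>A rough rule of thumb helps explain the ordering of the lines. As a large-sample approximation (our assumption here, not something computed by the guide's code), adjusting for a single covariate whose correlation with the outcome is \(\rho\) leaves a fraction \(1 - \rho^2\) of the outcome variance unexplained, so the RMSE of an unbiased adjusted estimator is about \(\sqrt{1 - \rho^2}\) times the unadjusted RMSE. The sketch below evaluates that factor for the three correlations in the graph.</p>

```r
# Correlations between the covariate and the outcome shown in the graph.
rho <- c(0, 0.5, 0.9)

# Approximate ratio of adjusted to unadjusted RMSE (large-sample rule of thumb;
# it ignores the small-sample cost of the extra degree of freedom).
rmse_ratio <- sqrt(1 - rho^2)

data.frame(rho = rho, approx_rmse_ratio = round(rmse_ratio, 3))
```

<p>By this approximation a correlation of 0.5 trims the RMSE by only about 13 percent, while a correlation of 0.9 cuts it by more than half, which is consistent with the gap between the orange and red lines.</p>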
@@ -411,6 +417,8 @@ <h1 data-number="7"><span class="header-section-number">7</span> When not to do
<p>Suppose, for example, that Giné and Mansuri had collected data on how many political rallies a woman attended after receiving the treatment. In estimating the treatment effect on independence of political choice, you may be tempted to include this variable as a covariate in your regression. But including this variable, even if it strongly predicts the outcome, may distort the estimated effect of the treatment.</p>
<p>Let’s create this fake variable, which is correlated (like the outcome measure) with baseline covariates and also with treatment. Here, by construction, the treatment effect on the number of political rallies attended is 2. When we include the rallies variable as a covariate, the estimated average treatment effect on independence of candidate choice averages 0.54 across the 10,000 replications. Recall that the true treatment effect on this outcome is 1. This is severe bias, all because we controlled for a post-treatment covariate!<a href="#fn7" class="footnote-ref" id="fnref7" role="doc-noteref"><sup>7</sup></a> This bias arises because the covariate is itself affected by treatment and is therefore correlated with it.</p>
<div class="cell">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb3"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Create post-treatment covariate that's correlated with pre-treatment covariates</span></span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a>rallies0 <span class="ot">=</span> <span class="fu">round</span>(.<span class="dv">5</span><span class="sc">*</span>owns.id.card <span class="sc">+</span> has.formal.schooling <span class="sc">+</span> <span class="fl">1.5</span><span class="sc">*</span>TV.access <span class="sc">+</span> <span class="fu">log</span>(age))</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a>rallies1 <span class="ot">=</span> rallies0 <span class="sc">+</span> <span class="dv">2</span></span>
@@ -428,6 +436,7 @@ <h1 data-number="7"><span class="header-section-number">7</span> When not to do
<span id="cb3-15"><a href="#cb3-15" aria-hidden="true" tabindex="-1"></a><span class="fu">mean</span>(post.adjusted.estimates) <span class="sc">-</span> true.treatment.effect</span>
<span id="cb3-16"><a href="#cb3-16" aria-hidden="true" tabindex="-1"></a><span class="co"># Margin of error (at 95% confidence level) for the estimated bias</span></span>
<span id="cb3-17"><a href="#cb3-17" aria-hidden="true" tabindex="-1"></a><span class="fl">1.96</span> <span class="sc">*</span> <span class="fu">sd</span>(post.adjusted.estimates) <span class="sc">/</span> <span class="fu">sqrt</span>(Replications)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
</div>
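<p>The same problem can be seen in a stripped-down, self-contained sketch. Everything in it is hypothetical and separate from the simulation above: <code>x</code> is a baseline covariate, <code>u</code> an unobserved trait, <code>m</code> a post-treatment measure that treatment shifts by 2, and the true effect of <code>z</code> on <code>y</code> is 1.</p>

```r
# Hypothetical data-generating process, for illustration only.
set.seed(1)
n <- 5000
x <- rnorm(n)                 # pre-treatment covariate
u <- rnorm(n)                 # unobserved trait affecting both m and y
z <- rbinom(n, 1, 0.5)        # randomly assigned treatment
m <- x + 2 * z + u            # post-treatment covariate, shifted by treatment
y <- 1 * z + x + u + rnorm(n) # outcome; the true treatment effect is 1

# Unadjusted difference in means, estimated by regression: close to 1.
est_unadjusted <- unname(coef(lm(y ~ z))["z"])

# Adjusting for the post-treatment covariate: badly biased.
est_post_adjusted <- unname(coef(lm(y ~ z + m))["z"])

c(unadjusted = est_unadjusted, post_adjusted = est_post_adjusted)
```

<p>In this made-up design, holding <code>m</code> fixed compares treated and control units that differ systematically in <code>x + u</code>, so the coefficient on <code>z</code> targets -1 rather than 1. Random assignment protects the unadjusted comparison, but it does not protect a comparison that conditions on something treatment has changed.</p>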
<p>Just because you should not adjust for post-treatment covariates does not mean you cannot collect covariate data post-treatment, but you must exercise caution. Some measures could be collected post-treatment but are unlikely to be affected by treatment (e.g., age and gender). Be careful about measures that may be subject to evaluation-driven effects, though: for example, treated women may be more acutely aware of the expectation of political participation and may retrospectively report that they were more politically active than they actually were several years prior.</p>
</section>
15 changes: 15 additions & 0 deletions guides/analysis-procedures/how-to-analyze-experiments_en.html
@@ -285,6 +285,8 @@ <h1 data-number="10"><span class="header-section-number">10</span> # Write up de
<h1 data-number="11"><span class="header-section-number">11</span> Example code</h1>
<p>In this example, we use data from <span class="citation" data-cites="gaikwad_nellis_2021">Gaikwad and Nellis (<a href="#ref-gaikwad_nellis_2021" role="doc-biblioref">2021</a>)</span>. They employ a door-to-door field experiment in two Indian cities to increase the political inclusion of migrants. The treatment provided intensive assistance in applying for a voter identification card. They conducted a simple randomization in which each of 2306 migrants was assigned to treatment or control with 50 percent probability. They look at the impact of the treatment on several outcomes, one of which is whether an individual voted in India’s 2019 national election.</p>
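<p>As a minimal sketch of what simple randomization means here, the snippet below assigns each of 2306 hypothetical units to treatment independently with probability 0.5. It does not use the study's data; the seed and variable names are arbitrary choices of ours.</p>

```r
# Hypothetical illustration of simple (coin-flip) randomization.
set.seed(2021)   # arbitrary seed, for reproducibility only
n_units <- 2306

# Each unit is assigned independently: 1 = treatment, 0 = control.
treat <- rbinom(n_units, size = 1, prob = 0.5)

# Group sizes are random under simple randomization, so they will
# usually be close to, but not exactly, an even split.
table(treat)
prop.table(table(treat))
```

<p>Because each assignment is an independent draw, the realized share treated varies from one randomization to the next, which is one reason to tabulate the assignment before analysis.</p>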
<div class="cell">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(tidyverse)</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(magrittr)</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(kableExtra)</span>
@@ -308,8 +310,11 @@ <h1 data-number="11"><span class="header-section-number">11</span> # Example cod
<span id="cb1-21"><a href="#cb1-21" aria-hidden="true" tabindex="-1"></a> <span class="fu">summarise</span>(<span class="at">n =</span> <span class="fu">n</span>()) <span class="sc">%&gt;%</span></span>
<span id="cb1-22"><a href="#cb1-22" aria-hidden="true" tabindex="-1"></a> <span class="fu">mutate</span>(<span class="at">freq =</span> n <span class="sc">/</span> <span class="fu">sum</span>(n)) <span class="sc">%&gt;%</span></span>
<span id="cb1-23"><a href="#cb1-23" aria-hidden="true" tabindex="-1"></a> <span class="fu">kable</span>()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
</div>
<div class="cell">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb2"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(skimr)</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a><span class="do">### step 2 </span><span class="al">###</span></span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a><span class="co"># outcome and covariate inspection</span></span>
@@ -370,8 +375,11 @@ <h1 data-number="11"><span class="header-section-number">11</span> # Example cod
<span id="cb2-58"><a href="#cb2-58" aria-hidden="true" tabindex="-1"></a> <span class="fu">scale_x_discrete</span>(<span class="at">labels =</span> <span class="fu">c</span>(<span class="st">"Original"</span>)) <span class="sc">+</span></span>
<span id="cb2-59"><a href="#cb2-59" aria-hidden="true" tabindex="-1"></a> <span class="fu">xlab</span>(<span class="st">""</span>) <span class="sc">+</span></span>
<span id="cb2-60"><a href="#cb2-60" aria-hidden="true" tabindex="-1"></a> <span class="fu">ylab</span>(<span class="st">"Income (000s INR)"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
</div>
<div class="cell">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb3"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="do">### step 3 </span><span class="al">###</span></span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a><span class="co"># dealing with outliers</span></span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a></span>
@@ -411,8 +419,11 @@ <h1 data-number="11"><span class="header-section-number">11</span> # Example cod
<span id="cb3-37"><a href="#cb3-37" aria-hidden="true" tabindex="-1"></a> <span class="fu">scale_x_discrete</span>(<span class="at">labels =</span> <span class="fu">c</span>(<span class="st">"Original"</span>, <span class="st">"Winsorized"</span>)) <span class="sc">+</span></span>
<span id="cb3-38"><a href="#cb3-38" aria-hidden="true" tabindex="-1"></a> <span class="fu">xlab</span>(<span class="st">""</span>) <span class="sc">+</span></span>
<span id="cb3-39"><a href="#cb3-39" aria-hidden="true" tabindex="-1"></a> <span class="fu">ylab</span>(<span class="st">"Income (000s INR)"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
</div>
<div class="cell">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb4"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="do">### step 4 </span><span class="al">###</span></span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a><span class="co"># checking for imbalance</span></span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a></span>
@@ -531,8 +542,11 @@ <h1 data-number="11"><span class="header-section-number">11</span> # Example cod
<span id="cb4-116"><a href="#cb4-116" aria-hidden="true" tabindex="-1"></a><span class="co"># Rbeta.hat &lt;- coef(t1bal)[-1]</span></span>
<span id="cb4-117"><a href="#cb4-117" aria-hidden="true" tabindex="-1"></a><span class="co"># RVR &lt;- vcovHC(t1bal, type = 'HC0')[-1,-1]</span></span>
<span id="cb4-118"><a href="#cb4-118" aria-hidden="true" tabindex="-1"></a><span class="co"># W_obs &lt;- as.numeric(Rbeta.hat %*% solve(RVR, Rbeta.hat))</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
</div>
<div class="cell">
<details class="code-fold">
<summary>Code</summary>
<div class="sourceCode cell-code" id="cb5"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="do">### step 4 </span><span class="al">###</span></span>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a><span class="co"># checking for attrition</span></span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a></span>
@@ -581,6 +595,7 @@ <h1 data-number="11"><span class="header-section-number">11</span> # Example cod
<span id="cb5-46"><a href="#cb5-46" aria-hidden="true" tabindex="-1"></a> </span>
<span id="cb5-47"><a href="#cb5-47" aria-hidden="true" tabindex="-1"></a><span class="do">## </span><span class="al">TODO</span><span class="do">: add in balance (missingness on treatment by covariate interaction, then F stat?)</span></span>
<span id="cb5-48"><a href="#cb5-48" aria-hidden="true" tabindex="-1"></a> <span class="co"># need to wait to hear back about this part</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
</div>
</section>
<section id="references" class="level1" data-number="12">