SDID and DR-learner example formatting changes (#233)
* Changed header formatting on SDID notebook to be consistent with existing notebooks

* Changed header formatting on DR-learner notebook to be consistent with existing notebooks

* added outline and moved around headers

* updated sdid notebook headers and minor rearranging
SamWitty authored Aug 7, 2023
1 parent 19a395c commit c979555
Showing 2 changed files with 94 additions and 18 deletions.
49 changes: 41 additions & 8 deletions docs/source/dr_learner.ipynb
@@ -1,5 +1,33 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Doubly robust estimation with Chirho"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Outline\n",
"\n",
"- [Setup](#setup)\n",
"- [Overview: Robust causal inference with cut modules](#overview:-robust-causal-inference-with-cut-modules)\n",
"- [Example: Synthetic data generation from a high-dimensional generalized linear model](#example:-synthetic-data-generation-from-a-high-dimensional-generalized-linear-model)\n",
"- [Effect estimation using cut modules](#effect-estimation-using-cut-modules)\n",
"- [References](#references)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup"
]
},
{
"cell_type": "code",
"execution_count": 1,
@@ -27,11 +55,10 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Doubly robust estimation with Chirho\n",
"## Overview: Robust causal inference with cut modules\n",
"\n",
"In this notebook, we implement a Bayesian analogue of the DR-Learner estimator in Kennedy (2022). The DR-Learner estimator is a doubly robust estimator for the conditional average treatment effect (CATE). It works by regressing a \"pseudo-outcome\" on covariates, where the \"pseudo-outcome\" is constructed by approximating the outcome and propensity score functions. The DR-Learner estimator is doubly robust in the sense that it is consistent if either the outcome or propensity score models are correctly specified. Moreover, as long as the outcome and propensity score models can be estimated at $O(N^{-1/4})$ rates, the DR-Learner estimator can estimate the CATE at the *parametric* $O(N^{-1/2})$ fast rate.\n",
"\n",
@@ -137,7 +164,7 @@
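As an aside, the pseudo-outcome construction that the cell above describes can be sketched in a few lines of NumPy. The function name `dr_pseudo_outcome` and its arguments are illustrative only, not part of the notebook, and this is the frequentist AIPW form from Kennedy (2022) rather than the notebook's Bayesian implementation:

```python
import numpy as np

def dr_pseudo_outcome(y, a, mu0, mu1, pi):
    """DR-learner / AIPW pseudo-outcome (Kennedy, 2022).

    y:   observed outcomes
    a:   binary treatment indicators
    mu0: outcome-model predictions E[Y | X, A=0]
    mu1: outcome-model predictions E[Y | X, A=1]
    pi:  propensity-score predictions P(A=1 | X)
    """
    mu_a = np.where(a == 1, mu1, mu0)
    # Inverse-propensity-weighted residual plus outcome-model contrast.
    return (a - pi) / (pi * (1 - pi)) * (y - mu_a) + mu1 - mu0
```

The CATE estimate then comes from a second-stage regression of this pseudo-outcome on the covariates; its average is an estimate of the ATE.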
"cell_type": "markdown",
"metadata": {},
"source": [
"### Synthetic data generation from a high-dimensional generalized linear model\n",
"## Example: Synthetic data generation from a high-dimensional generalized linear model\n",
"\n",
"We use the classes below to generate synthetic data from a high-dimensional generalized linear model. Further, we will use these classes to implement the standard outcome-regression approach to estimating the CATE. That is, we regress $Y$ on $X$ and $A$ to obtain an estimate of $E[Y | X, A=1] - E[Y | X, A=0]$. This approach is called the \"plug-in\" approach in Kennedy (2022)."
]
@@ -188,7 +215,7 @@
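For intuition, the plug-in approach just described (fit one outcome model per treatment arm, then difference the predictions) can be sketched with ordinary least squares standing in for the notebook's Bayesian GLM. All names here are hypothetical:

```python
import numpy as np

def plugin_cate(X, a, y, x_test):
    """Plug-in CATE sketch: fit a linear outcome model within each
    treatment arm by least squares, then return the difference of
    predictions E[Y | X, A=1] - E[Y | X, A=0] at x_test."""
    def fit_predict(mask):
        # Least-squares fit of y on [1, X] within one arm.
        Z = np.column_stack([np.ones(mask.sum()), X[mask]])
        beta, *_ = np.linalg.lstsq(Z, y[mask], rcond=None)
        return np.column_stack([np.ones(len(x_test)), x_test]) @ beta
    return fit_predict(a == 1) - fit_predict(a == 0)
```

As Kennedy (2022) shows, this plug-in estimator can be badly behaved when the outcome models are estimated slowly, which is what motivates the pseudo-outcome construction.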
"cell_type": "markdown",
"metadata": {},
"source": [
"### Below we generate synthetic data as in Figure 4b of Kennedy (2022)."
"Below we generate synthetic data as in Figure 4b of Kennedy (2022)."
]
},
{
@@ -223,6 +250,13 @@
"D_test = {\"X\": X_test, \"A\": A_test, \"Y\": Y_test}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Effect estimation using cut modules"
]
},
{
"cell_type": "code",
"execution_count": 5,
@@ -350,7 +384,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Look at how well each method estimates average treatment effect"
"Next, we look at how well each method estimates the average treatment effect."
]
},
{
@@ -381,7 +415,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Because we use Bayesian inference, we also get uncertainity estimates for ATE\n"
"Because we use Bayesian inference, we also get uncertainty estimates for the ATE.\n"
]
},
{
@@ -433,15 +467,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# References\n",
"## References\n",
"\n",
"Kennedy, Edward. \"Towards optimal doubly robust estimation of heterogeneous causal effects\", 2022. https://arxiv.org/abs/2004.14497.\n",
"\n",
"Carmona, Chris U., Geoff K. Nicholls. \"Semi-Modular Inference: enhanced learning in multi-modular models by tempering the influence of components\", 2020. https://arxiv.org/abs/2003.06804.\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": []
63 changes: 53 additions & 10 deletions docs/source/sdid.ipynb
@@ -1,5 +1,37 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "92303b22",
"metadata": {},
"source": [
"# Causal effect estimation in panel data"
]
},
{
"cell_type": "markdown",
"id": "49e2f7b6",
"metadata": {},
"source": [
"## Outline\n",
"\n",
"- [Setup](#setup)\n",
"- [Overview: Robust Causal Inference with Panel Data](#overview:-robust-causal-inference-with-panel-data)\n",
"- [Example: California Smoking Cessation](#example:-california-smoking-cessation)\n",
"- [Causal Query: Counterfactual Prediction](#causal-query:-counterfactual-prediction)\n",
"- [Effect estimation with ordinary Bayesian inference](#effect-estimation-with-ordinary-bayesian-inference)\n",
"- [Robust effect estimation with modular Bayesian inference](#robust-effect-estimation-with-modular-bayesian-inference)\n",
"- [References](#references)"
]
},
{
"cell_type": "markdown",
"id": "add55da8",
"metadata": {},
"source": [
"## Setup"
]
},
{
"cell_type": "code",
"execution_count": 1,
@@ -32,10 +64,17 @@
"id": "39828795",
"metadata": {},
"source": [
"# Causal effect estimation in panel data\n",
"\n",
"In this notebook, we implement the synthetic difference-in-differences (SDID) estimator proposed in [1]. The SDID estimator combines the strengths of difference-in-differences and synthetic control methods through a two-stage weighted regression. \n",
"## Overview: Robust Causal Inference with Panel Data\n",
"\n",
"In this notebook, we implement the synthetic difference-in-differences (SDID) estimator proposed in [1]. The SDID estimator combines the strengths of difference-in-differences and synthetic control methods through a two-stage weighted regression. \n"
]
},
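Given unit and time weights, the SDID point estimate described in the overview cell reduces to a weighted difference-in-differences. The sketch below assumes the weights have already been fit (in [1] they come from the two-stage weighted regressions); the function and argument names are illustrative:

```python
import numpy as np

def sdid_estimate(Y_co, y_tr, omega, lam):
    """SDID point estimate given pre-fit weights (sketch).

    Y_co:  controls-by-time outcome matrix
    y_tr:  treated unit's outcome series
    omega: unit weights over control units (sum to 1)
    lam:   time weights over pre-treatment periods (sum to 1);
           columns beyond len(lam) are post-treatment.
    """
    T_pre = len(lam)
    # Weighted pre/post difference for the treated unit...
    treated_did = y_tr[T_pre:].mean() - lam @ y_tr[:T_pre]
    # ...minus the omega-weighted pre/post difference for controls.
    control_did = omega @ (Y_co[:, T_pre:].mean(axis=1) - Y_co[:, :T_pre] @ lam)
    return treated_did - control_did
```

With uniform weights over all controls and all pre-periods this collapses to the ordinary difference-in-differences estimator, which is the sense in which SDID interpolates between DiD and synthetic control.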
{
"cell_type": "markdown",
"id": "8e9171f3",
"metadata": {},
"source": [
"## Example: California Smoking Cessation\n",
"As in [1], we analyze the California Smoking Cessation dataset [2] to estimate the effect of cigarette taxes in California. Specifically, in 1989, California passed Proposition 99, which increased cigarette taxes; we will estimate the impact this policy had on cigarette consumption. The dataset consists of cigarette consumption for 39 states from 1970 to 2000, 38 of which are control units.\n",
"\n",
"We start by loading and visualizing the dataset."
@@ -245,14 +284,16 @@
"id": "66cb6c41",
"metadata": {},
"source": [
"### We would like to estimate a counterfactual: had California not raised cigarette taxes, what would have cigarette consumption been?\n",
"## Causal Query: Counterfactual Prediction\n",
"\n",
"In this setting, we would like to estimate a counterfactual: had California not raised cigarette taxes, what would cigarette consumption have been?\n",
"\n",
"To estimate this effect, we implement a Bayesian analogue of the Synthetic Difference-in-Differences (SDID) estimator proposed in [1]. "
]
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": null,
"id": "d7566e08",
"metadata": {},
"outputs": [],
@@ -355,7 +396,7 @@
"id": "a10e36ed",
"metadata": {},
"source": [
"### Let's visualize our Bayesian SDID probabilistic model."
"Let's visualize our Bayesian SDID probabilistic model."
]
},
{
@@ -555,7 +596,9 @@
"id": "9741babe",
"metadata": {},
"source": [
"### First, we estimate $\\tau$ (the effect of Proposition 99) by performing joint Bayesian inference over all latents parameters in the model. We report the marginal approximate posterior over $\\tau$."
"## Effect estimation with ordinary Bayesian inference\n",
"\n",
"First, we estimate $\\tau$ (the effect of Proposition 99) by performing joint Bayesian inference over all latent parameters in the model. We report the marginal approximate posterior over $\\tau$."
]
},
{
@@ -717,7 +760,7 @@
"id": "5d72fef0",
"metadata": {},
"source": [
"### Robustification with Modular Bayesian Inference\n",
"## Robust effect estimation with modular Bayesian inference\n",
"\n",
"From the figure above, we see that the estimated synthetic control has non-trivial deviations from California during the pre-treatment period. To robustify our causal effect estimates, we use modular Bayesian inference and compute the \"cut posterior\" for $\\tau$ [3]. Specifically, we define \"module one\" as all observed and latent variables associated with the time and synthetic control weights. We define \"module two\" as the latent variables used to compute the response likelihood. \n",
"\n",
@@ -771,7 +814,7 @@
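The cut-posterior idea in this cell, letting module-one data alone determine the first-stage latents and then estimating the second-stage latents conditionally, with no feedback, can be illustrated with a toy conjugate model. This is a hypothetical normal-normal example, not the notebook's Pyro implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def cut_posterior_draws(x1, x2, n=1000):
    """Two-stage 'cut' sampling in a toy normal-normal model.

    Module one: x1_i ~ N(theta, 1), theta ~ N(0, 1).
    Module two: x2_i ~ N(theta + tau, 1), tau ~ N(0, 1).
    """
    # Stage 1: the posterior for theta uses ONLY module-one data,
    # so misspecification in module two cannot contaminate it.
    n1 = len(x1)
    post_var1 = 1.0 / (1.0 + n1)
    theta = rng.normal(post_var1 * x1.sum(), np.sqrt(post_var1), size=n)
    # Stage 2: for each theta draw, sample tau | theta, x2 (conjugate).
    n2 = len(x2)
    post_var2 = 1.0 / (1.0 + n2)
    tau = rng.normal(post_var2 * (x2.sum() - n2 * theta), np.sqrt(post_var2))
    return theta, tau
```

In the SDID notebook, module one plays the role of the time and synthetic-control weights, and module two plays the role of the response likelihood that yields $\tau$.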
"id": "1a20e980",
"metadata": {},
"source": [
"### Below we see that the synthetic control unit estimated from the cut posterior is a better fit to the treated unit (California) during the pre-treatment period "
"Below we see that the synthetic control unit estimated from the cut posterior is a better fit to the treated unit (California) during the pre-treatment period."
]
},
{
@@ -841,7 +884,7 @@
"id": "b681f030",
"metadata": {},
"source": [
"# References\n",
"## References\n",
"1. https://www.aeaweb.org/articles?id=10.1257/aer.20190159\n",
"2. https://www.tandfonline.com/doi/abs/10.1198/jasa.2009.ap08746\n",
"3. https://projecteuclid.org/journals/bayesian-analysis/volume-4/issue-1/Modularization-in-Bayesian-analysis-with-emphasis-on-analysis-of-computer/10.1214/09-BA404.full\n",