Bayesian Causal Forest with Instrumental Variable [BayesIV]

In this repository we provide the code for the BCF-IV and BCF-ITT functions of the paper "Heterogeneous causal effects with imperfect compliance: a Bayesian machine learning approach" by F.J. Bargagli-Stoffi, K. De Witte and G. Gnecco published in The Annals of Applied Statistics.

The article has also been covered and summarized in two blog posts on R-bloggers and YoungStatS. Check them out for a concise summary of the main novelties introduced in the paper.

Getting Started

Install the latest development version:

library(devtools)
install_github("fbargaglistoffi/BCF-IV", ref="master")

Import:

library("BayesIV")

Attention: BayesIV depends on the bcf package, which has been removed from CRAN due to an unaddressed issue. To run BayesIV, install bcf manually from GitHub, following its installation guidelines.

BCF-IV

The bcf_iv function discovers and estimates, in an interpretable manner, effect heterogeneity in settings where the assignment mechanism is irregular (e.g., instrumental variable and fuzzy regression discontinuity scenarios). It is built to discover and estimate heterogeneity in the Complier Average Causal Effect (CACE). The function takes as inputs:

  • y: the outcome variable;
  • w: the reception of the treatment variable (binary);
  • z: the assignment to the treatment variable (binary);
  • x: the covariate matrix;
  • binary: TRUE if the outcome variable is binary, FALSE otherwise (default: FALSE);
  • n_burn: the number of iterations discarded by the BCF-IV algorithm for the burn-in (default: 500);
  • n_sim: the number of iterations used by the BCF-IV algorithm to get the posterior distribution of the estimands (default: 500);
  • inference_ratio: the ratio of observations to be assigned to the inference subsample (default: 0.5);
  • max_depth: the maximal depth of the tree generated by the function (default: 2);
  • cp: complexity parameter for the generated CART (default: 0.01);
  • minsplit: minimum observations needed to perform a binary split in the tree (default: 10);
  • adj_method: p-value adjustment method, options are "holm", "bonferroni", "hochberg", "hommel", "BH", "BY", "fdr", "none" (default: "holm");
  • seed: random seed for reproducible results (default: 42).
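
The adj_method options carry the same names as the methods accepted by base R's p.adjust, so the effect of each choice can be previewed on its own. The sketch below assumes, without confirmation from the text above, that a p.adjust-style correction is applied across the discovered subgroups; the p-values are made up for illustration.

```r
# Hypothetical raw p-values for three discovered subgroups (illustrative only)
p_raw <- c(0.01, 0.04, 0.03)

# Holm step-down adjustment, the default named by adj_method
p_holm <- p.adjust(p_raw, method = "holm")

# Benjamini-Hochberg adjustment ("fdr" is an alias for "BH" in p.adjust)
p_bh <- p.adjust(p_raw, method = "BH")

print(data.frame(p_raw, p_holm, p_bh))
```

Holm controls the family-wise error rate and is the more conservative of the two; "BH" controls the false discovery rate and tends to flag more subgroups.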

The bcf_iv function returns the discovered sub-population, the conditional complier average treatment effect (CCACE), the p-value for this effect, the p-value for a weak-instrument test, the adjusted p-value, the proportion of compliers, the conditional intention-to-treat effect (CITT) and the proportion of compliers in the node.
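
To see how the returned quantities relate, recall that under the usual instrumental variable assumptions the complier effect equals the intention-to-treat effect divided by the proportion of compliers. The sketch below illustrates this ratio on simulated data with one-sided noncompliance. It is not the package's internal estimator, and all variable names and the data-generating process are assumptions made for illustration.

```r
set.seed(42)
n <- 5000

z <- rbinom(n, 1, 0.5)           # random assignment to treatment (the instrument)
complier <- rbinom(n, 1, 0.75)   # latent complier status; non-compliers never take the treatment
w <- z * complier                # treatment actually received
y <- 2 * w + rnorm(n)            # outcome with a true effect of 2 for the treated

# Intention-to-treat effect: difference in mean outcomes by assignment
itt <- mean(y[z == 1]) - mean(y[z == 0])

# Proportion of compliers: difference in treatment uptake by assignment
pi_c <- mean(w[z == 1]) - mean(w[z == 0])

# Complier average causal effect as the ratio of the two
cace <- itt / pi_c

print(c(ITT = itt, prop_compliers = pi_c, CACE = cace))
```

Computing the same three quantities within a subgroup defined by the covariates gives conditional versions analogous to the CITT, the proportion of compliers in the node and the CCACE.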

BCF-ITT

The bcf_itt function discovers the heterogeneity in the intention-to-treat (ITT) effect and then estimates both the conditional ITT and the conditional CACE for the discovered subgroups. The function takes as inputs:

  • y: the outcome variable;
  • w: the reception of the treatment variable (binary);
  • z: the assignment to the treatment variable (binary);
  • x: the covariate matrix;
  • binary: TRUE if the outcome variable is binary, FALSE otherwise (default: FALSE);
  • max_depth: the maximal depth of the generated CART (default: 2);
  • n_burn: the number of iterations discarded by the BCF-IV algorithm for the burn-in (default: 500);
  • n_sim: the number of iterations used by the BCF-IV algorithm to get the posterior distribution of the estimands (default: 500);
  • inference_ratio: the ratio of observations to be assigned to the inference subsample (default: 0.5);
  • seed: random seed for reproducible results (default: 42).
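
The inference_ratio argument controls how many observations are set aside for the inference subsample, with the remainder available for discovering the subgroups. A minimal sketch of such a split, assuming a simple random partition (the package's own splitting code may differ, and the index names here are hypothetical):

```r
set.seed(42)
n <- 1000
inference_ratio <- 0.5

# Randomly choose the rows assigned to the inference subsample
inference_idx <- sample(seq_len(n), size = floor(inference_ratio * n))

# The remaining rows form the discovery subsample
discovery_idx <- setdiff(seq_len(n), inference_idx)

print(c(inference = length(inference_idx), discovery = length(discovery_idx)))
```

Keeping the two subsamples disjoint means the subgroups are found on one part of the data and their effects are estimated on the other.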

The bcf_itt function returns the discovered sub-population, the conditional complier average treatment effect (CCACE), the conditional intention-to-treat effect (CITT), the p-value for this effect, the p-value for a weak-instrument test, the adjusted p-value, the proportion of compliers and the proportion of compliers in the node.

Examples

# Generate the dataset
dataset <- generate_dataset(n = 1000, 
                            p = 10, 
                            rho = 0, 
                            null = 0, 
                            effect_size = 2, 
                            compliance = 0.75)
y <- dataset[["y"]]
w <- dataset[["w"]]
z <- dataset[["z"]]
X <- dataset[["X"]]

# BCF-IV
bcf_iv(y, w, z, X, 
       n_burn = 2000, 
       n_sim = 2000, 
       inference_ratio = 0.5, 
       binary = FALSE, 
       max_depth = 2, 
       adj_method = "holm")

# BCF-ITT
bcf_itt(y, w, z, X, 
        n_burn = 2000, 
        n_sim = 2000, 
        inference_ratio = 0.5, 
        binary = FALSE, 
        max_depth = 2)

For more exhaustive synthetic examples, see the examples/ folder.

Reference

  • Bargagli-Stoffi, F.J., De Witte, K. and Gnecco, G., 2022. Heterogeneous causal effects with imperfect compliance: a Bayesian machine learning approach. The Annals of Applied Statistics, 16(3), pp.1986-2009. [paper] [preprint]
@article{bargagli2022heterogeneous,
  title={{Heterogeneous causal effects with imperfect compliance: a Bayesian machine learning approach}},
  author={Bargagli-Stoffi, Falco J and De Witte, Kristof and Gnecco, Giorgio},
  journal={The Annals of Applied Statistics},
  volume={16},
  number={3},
  pages={1986--2009},
  year={2022},
  publisher={Institute of Mathematical Statistics}
}