P2C2M2 provides functions to read the default output of BEAST 2 (Bouckaert et al. 2019) and StarBEAST2 (Ogilvie et al. 2017) and to conduct posterior predictive checks of coalescent models (Reid et al. 2014). It builds on the functions originally provided in the P2C2M package (Gruenstaeudl et al. 2016) for parsing tree and XML files from BEAST and *BEAST. It also implements estimation of type-I (false-positive) error rates using pseudo-observed datasets (“pods”) sampled from the posterior predictive distribution.
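The idea behind estimating a type-I error rate from pods can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal code: the helper `is_significant`, the normal draws standing in for a summary statistic, and the values of `n_pods` and `n_reps` are all hypothetical. Because each pod is drawn from the same distribution as the simulations, any check that comes out significant is a false positive, and the proportion of such pods approximates the error rate.

```r
# Hypothetical illustration; not P2C2M2's internal implementation.
set.seed(1)

# Two-tailed check: does the central (1 - alpha) interval of the
# difference values exclude zero?
is_significant <- function(dif, alpha = 0.05) {
  ci <- stats::quantile(dif, probs = c(alpha / 2, 1 - alpha / 2))
  unname(ci[1] > 0 || ci[2] < 0)
}

n_pods <- 1000  # number of pseudo-observed datasets (assumed value)
n_reps <- 10    # difference values per pod (assumed value)

# Pod and simulation statistics come from the same distribution, so the
# model is correct by construction and every rejection is a false positive.
false_pos <- replicate(n_pods, {
  pod_stat <- stats::rnorm(n_reps)  # stand-in statistic for the pod
  sim_stat <- stats::rnorm(n_reps)  # stand-in statistic for the simulations
  is_significant(pod_stat - sim_stat)
})

# Proportion of pods flagged as significant
error_rate <- mean(false_pos)
error_rate
```

With real data the stand-in draws would be replaced by the summary statistics computed on the pods and on the posterior predictive trees.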
You can install the development version of P2C2M2 from GitHub with:
# install.packages("devtools")
devtools::install_version("rPython", version = "0.0-6", repos = "http://cran.us.r-project.org")
In order to generate the tree files annotated with the required branch rate estimates, the XML file has to be modified by hand to include the following code before running BEAST 2:
#For strict clocks:
Insert the following:
id="TreeWithMetaDataLogger.t:Locus_Name" and before
#For UCLN or RLC clocks:
Insert the following:
id="TreeWithMetaDataLogger.t:Locus_Name" and before
This basic example runs the full pipeline with a single function call on the provided dataset, testing the multispecies coalescent model and estimating false-positive error rates per locus and across loci:
# The absolute path to the input directory is set
inPath <- system.file("extdata", "thomomys/", package="P2C2M2")
# The name of the XML file generated by BEAUti and located in
# "inPath" is set
inFile <- "thomomys.xml"
# Posterior predictive simulations with a setting of 10 simulation
# replicates are performed
suppressWarnings(thomomys <- p2c2m2.complete(inPath, inFile, num.reps=10, error.rate = TRUE, save.metadata = TRUE))
#> bindings, loci, and alignments read
#> locus 26 is done
#> locus 29 is done
#> locus 47 is done
#> locus 53 is done
#> locus 59 is done
#> locus 64 is done
#> locus 72 is done
#> gene trees done
#> species trees done
#> Calculating metrics for 91 trees of the posterior distribution
#> Calculating metrics for the trees of the post. predictive distribution
#> Calculating differences btw. the posterior and the post. predictive distribution
#> Calculating differences btw. 1000 pods and the post. predictive distribution to estimate error rates
#> $perGene
#> LCWT[2] NDC[2]
#> 26 "-2.18 (±5.52) n.s." "6.71 (±3.52) n.s."
#> 29 "-2.05 (±5.86) n.s." "4.58 (±4) n.s."
#> 47 "-2.25 (±7.4) n.s." "4.11 (±3.55) n.s."
#> 53 "-1.96 (±5.34) n.s." "4.18 (±3.59) n.s."
#> 59 "-0.42 (±8.7) n.s." "4.53 (±4.25) n.s."
#> 64 "-2.35 (±6.31) n.s." "6.63 (±3.47) n.s."
#> 72 "-0.69 (±6.48) n.s." "1.67 (±4.9) n.s."
#> $acrGenes
#> LCWT[2] NDC[2]
#> Sum "-11.9 (±26.71) n.s." "32.41 (±14.28) *"
#> Mean "-1.7 (±3.82) n.s." "4.63 (±2.04) *"
#> Median "0 (±0) n.s." "4.86 (±1.65) *"
#> Mode "0 (±0) 0" "6.29 (±1.99) *"
#> CV "5.02 (±70.76) n.s." "1.31 (±2.48) *"
#> $perGene.error
#> LCWT[2] NDC[2]
#> 26 0.080 0.039
#> 29 0.065 0.065
#> 47 0.078 0.073
#> 53 0.069 0.052
#> 59 0.067 0.060
#> 64 0.072 0.056
#> 72 0.067 0.061
#> $acrGenes.error
#> LCWT[2] NDC[2]
#> Sum 0.092 0.069
#> Mean 0.092 0.069
#> Median 0.701 0.232
#> Mode 0.539 0.207
#> CV 0.052 0.069
#> [1] "Differences between the posterior and the posterior predictive distributions per locus and across loci. Each cell contains the following information in said order: mean, standard deviation, significance level. Error rates (if estimated with option error.rate=TRUE) are based on differences between the pods and the posterior predictive distributions. Codes in square brackets indicate the number of tails. Alpha values are automatically adjusted for the number of tails."
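The cell layout described in the legend (mean, standard deviation, significance level) can be illustrated with a short sketch. The helper `summarize_dif` below is hypothetical and covers only the two-tailed case; it assumes that significance means the central 95% quantile interval of the difference values excludes zero, which matches the quantiles drawn in the plotting example but is not confirmed to be the package's exact rule.

```r
# Hypothetical helper, not part of P2C2M2: build one summary cell from a
# vector of difference values (two-tailed case only).
summarize_dif <- function(dif, alpha = 0.05) {
  # Alpha is split across both tails
  ci <- stats::quantile(dif, probs = c(alpha / 2, 1 - alpha / 2))
  # Assumed rule: significant when the interval excludes zero
  sig <- if (ci[1] > 0 || ci[2] < 0) "*" else "n.s."
  sprintf("%s (±%s) %s", round(mean(dif), 2), round(stats::sd(dif), 2), sig)
}

# Toy vectors, not taken from the thomomys dataset
cell_ns  <- summarize_dif(c(-3, -1, 0, 2, 4))  # interval spans zero
cell_sig <- summarize_dif(c(5, 6, 7, 8, 9))    # interval lies above zero
cell_ns
cell_sig
```

Read this way, a cell marked "n.s." means that zero lies within the interval of difference values, so the check gives no evidence of a poor model fit for that locus and statistic.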
Plot distributions of difference values for each locus based on the NDC statistic:
graphics::par(mfrow=c(2,3), mar=c(5, 2, 5, 2) + 0.1)
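for (i in 1:6){
  # Assumed plotting call: draw the density of the difference values first,
  # since abline() only adds lines to an existing plot
  graphics::plot(stats::density(thomomys$metaData$NDC$dif[,i]), main = "", xlab = "", ylab = "")
  # Dotted reference line at zero (no difference)
  graphics::abline(v = 0, lty = "dotted", lwd = 1.5, col = "black")
  # Red lines mark the 2.5% and 97.5% quantiles of the difference values
  graphics::abline(v = stats::quantile(thomomys$metaData$NDC$dif[,i], probs = c(0.025, 0.975)), lwd = 1, col = "red")
}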
mtext("Difference values",side=1,cex=1.1,line=3)
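The same information shown in the plots can be tabulated numerically. The sketch below uses a simulated matrix as a stand-in for `thomomys$metaData$NDC$dif`, assuming (as the column indexing above suggests) that columns correspond to loci and rows to difference values; the matrix contents, column names, and object names `bounds` and `excludes_zero` are hypothetical.

```r
set.seed(42)

# Simulated stand-in for thomomys$metaData$NDC$dif (assumed layout:
# rows = difference values, columns = loci)
dif <- matrix(stats::rnorm(600, mean = 1, sd = 2), ncol = 6,
              dimnames = list(NULL, paste0("locus", 1:6)))

# 2.5% and 97.5% quantiles per locus (the red lines in the plots)
bounds <- apply(dif, 2, stats::quantile, probs = c(0.025, 0.975))

# TRUE when the interval does not contain zero (the dotted line)
excludes_zero <- bounds[1, ] > 0 | bounds[2, ] < 0

data.frame(lower = bounds[1, ], upper = bounds[2, ], excludes_zero)
```

To apply this to real results, replace the simulated `dif` with the matrix stored in the object returned when `save.metadata = TRUE` is set.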
