From 35dfb75b4c3b3c1eadb3c65ca40ed86607a23af3 Mon Sep 17 00:00:00 2001
From: Joe Thorley
Date: Thu, 2 May 2024 06:20:51 -0700
Subject: [PATCH 01/13] add article

---
 .Rbuildignore                               |   1 +
 vignettes/articles/confidence-intervals.Rmd | 186 ++++++++++++++++++++
 2 files changed, 187 insertions(+)
 create mode 100644 vignettes/articles/confidence-intervals.Rmd

diff --git a/.Rbuildignore b/.Rbuildignore
index 649cc6656..f4e277d7b 100644
--- a/.Rbuildignore
+++ b/.Rbuildignore
@@ -21,3 +21,4 @@
 ^COMPLIANCE\.yaml$
 ^CRAN-SUBMISSION$
 ^CODE_OF_CONDUCT\.md$
+^vignettes/articles$
diff --git a/vignettes/articles/confidence-intervals.Rmd b/vignettes/articles/confidence-intervals.Rmd
new file mode 100644
index 000000000..9f8fa0204
--- /dev/null
+++ b/vignettes/articles/confidence-intervals.Rmd
@@ -0,0 +1,186 @@
+---
+title: "Confidence Intervals for Hazard Concentrations"
+author: "ssdtools Team"
+date: "`r Sys.Date()`"
+bibliography: references.bib
+csl: my-style.csl
+latex_engine: MathJax
+mainfont: Arial
+mathfont: Courier
+---
+
+```{r, include = FALSE}
+library(ssdtools)
+load("confidence_intervals.RData")
+knitr::opts_chunk$set(
+  collapse = TRUE,
+  comment = "#>",
+  fig.height = 4,
+  fig.width = 6
+)
+```
+
+## Bootstrap confidence intervals
+
+Bootstrapping is a resampling technique used to obtain estimates of summary statistics. The team have explored the use of alternative methods for obtaining the confidence interval of *HCx* estimates. This included using the closed-form expression for the variance-covariance matrix of the parameters of the Burr III distribution, coupled with the delta-method, as well as an alternative bootstrap method for the inverse Pareto distribution based on statistical properties of the parameters [@fox_methodologies_2021]. In both cases, it appeared that these methods can give results similar to other traditional bootstrapping approaches in much less time, and are therefore potentially worth further investigation. However, implementation of such methods across all the distributions now available in ssdtools would be a substantial undertaking.
+
+The revised version of ssdtools retains the computationally intensive bootstrapping method to obtain confidence intervals and an estimate of the standard error. We recommend a minimum bootstrap sample of 1,000 (the current default - see the argument nboot in *?ssd_hc()*). However, more reliable results can be obtained using samples of 5,000 or 10,000. We recommend larger bootstrap samples for final reporting.
+
+## Parametric versus non-parametric bootstrapping
+
+Burrlioz 2.0 uses a non-parametric bootstrap method to obtain confidence intervals on the *HCx* estimate. Non-parametric bootstrapping is carried out by repeatedly resampling the raw data with replacement and refitting the distribution many times. The 95% confidence limits are then obtained by calculating the lower 0.025 and upper 0.975 quantiles of the resulting *HCx* estimates across all the bootstrap samples (typically >1000). This type of bootstrap takes into account uncertainty in the distribution fit based on uncertainty in the data.
+
+The ssdtools package by default uses a parametric bootstrap. Instead of resampling the data, parametric bootstrapping draws a random set of new data (of the same sample size as the original) from the fitted distribution to repeatedly refit the distribution. Upper and lower 95% bounds are again calculated as the lower 0.025 and upper 0.975 quantiles of the resulting *HCx* estimates across all the bootstrap samples (again, typically >1000). This will capture the possible uncertainty that may occur for a sample of that size from the given distribution, but it assumes no uncertainty in the original fit, so it does not account for uncertainty in the input data.
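+
+To make the distinction concrete, the following minimal base-R sketch contrasts the two approaches for a single lognormal SSD. It is purely illustrative and is not how ssdtools implements bootstrapping: the toxicity values in `conc` are made up, and the crude moment-based parameter estimates stand in for a proper maximum likelihood fit.
+
+```{r bootstrap-sketch}
+set.seed(99)
+conc <- rlnorm(10, meanlog = 2, sdlog = 1) # hypothetical toxicity data
+n <- length(conc)
+nboot <- 1000
+
+# crude lognormal "fit" and HC5 for a sample (illustration only)
+hc5_lnorm <- function(x) qlnorm(0.05, mean(log(x)), sd(log(x)))
+
+# non-parametric: resample the raw data with replacement, then refit
+hc5_np <- replicate(nboot, hc5_lnorm(sample(conc, n, replace = TRUE)))
+
+# parametric: draw new data from the fitted distribution, then refit
+meanlog <- mean(log(conc))
+sdlog <- sd(log(conc))
+hc5_p <- replicate(nboot, hc5_lnorm(rlnorm(n, meanlog, sdlog)))
+
+# 95% confidence limits in each case
+quantile(hc5_np, probs = c(0.025, 0.975))
+quantile(hc5_p, probs = c(0.025, 0.975))
+```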
+
+The new TMB version of ssdtools has the capacity to do bootstrapping either using the Burrlioz non-parametric method, or the original parametric method of ssdtools (based on fitdistrplus [@fitdistrplus]).
+
+Using simulation studies, the ssdtools team examined bias and compared the resulting coverage of the parametric and non-parametric bootstrapping methods [@fox_methodologies_2021]. They found that coverage was better using the parametric bootstrapping method, and this has been retained as the default bootstrapping method in the update to ssdtools.
+
+## Bootstrapping model-averaged SSDs
+
+Bootstrapping to obtain confidence intervals for individual fitted distributions is relatively straightforward. However, obtaining bootstrap confidence intervals for model-averaged SSDs requires careful consideration, as the procedure is subject to the same pitfalls evident when obtaining model-averaged *HCx* estimates. The [Model Average SSDs](https://poissonconsulting.github.io/ssdtools/articles/A_model_averaging.html) vignette contains a detailed explanation of the fallacy of using the summed weighting of individual *HCx* values (i.e. a weighted arithmetic average), and how this can lead to spurious results. Model-averaged estimates and/or confidence intervals (including standard errors) can be calculated either by treating the distributions as constituting a single mixture distribution, or by 'taking the mean' of the individual distributions. When calculating the model-averaged estimates, treating the distributions as constituting a single mixture distribution ensures that *ssd_hc()* is the inverse of *ssd_hp()*, and the same applies to model-averaged confidence intervals.
+
+The revised version of ssdtools supports three weighting methods for obtaining bootstrap confidence intervals and an estimate of the standard error, and these are discussed in detail below.
+
+### Weighted arithmetic mean
+
+The early versions of ssdtools provided model-averaged confidence intervals (cis) and standard errors (se) that were calculated as weighted arithmetic means of the upper and lower ci and se values obtained via bootstrap simulation from each of the individual candidate distributions independently. This method is incorrect, may lead to spurious results (as described above), and has been shown via simulation studies to result in confidence intervals with very low coverage. The current version of ssdtools retains the functionality to reproduce the original behavior of ssdtools.
+
+```{r hc1, eval=FALSE}
+fit <- ssd_fit_dists(data = ssddata::ccme_silver)
+set.seed(99)
+
+# Using the original ssdtools weighted arithmetic mean
+hc1 <- ssd_hc(fit, ci = TRUE, multi_est = FALSE, multi_ci = FALSE, weighted = FALSE)
+```
+
+```{r}
+hc1
+```
+
+Use of this method for obtaining ci and se values is not recommended and is only retained for legacy comparison purposes. It is both technically incorrect and computationally inefficient.
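+
+As a simple illustration of why averaging across distributions is problematic, the following base-R sketch compares the weighted arithmetic mean of the *HC5* values from two hypothetical lognormal distributions with the *HC5* of the corresponding mixture distribution (the concentration at which the weighted average of the cumulative distribution functions equals 0.05). The parameter values and weights are invented purely for illustration and do not come from ssdtools.
+
+```{r mixture-vs-mean}
+w <- c(0.5, 0.5)   # hypothetical model weights
+meanlog <- c(0, 2) # hypothetical lognormal parameters
+sdlog <- c(1, 1)
+
+# weighted arithmetic mean of the individual HC5 values
+hc5_mean <- sum(w * qlnorm(0.05, meanlog, sdlog))
+
+# HC5 of the mixture distribution: solve sum(w * F_i(x)) = 0.05
+pmix <- function(x) sum(w * plnorm(x, meanlog, sdlog))
+hc5_mix <- uniroot(function(x) pmix(x) - 0.05, lower = 1e-6, upper = 100)$root
+
+c(mean = hc5_mean, mixture = hc5_mix)
+```
+
+The two answers differ, which is why the mixture-based calculations described below are preferred.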
+
+### Weighted mixture distribution
+
+A more theoretically correct way of obtaining ci and se values is to consider the model average set as a mixture distribution (see above, and the [Model Average SSDs](https://poissonconsulting.github.io/ssdtools/articles/A_model_averaging.html) vignette). When we consider the model set as a mixture distribution, bootstrapping is achieved by resampling from the model set according to the AICc-based model weights. A method for sampling from mixture distributions has been implemented in ssdtools, via the function *ssd_rmulti()*, which will generate random samples from a mixture of any combination of distributions currently implemented in `ssdtools`. Setting "multi_ci = TRUE" in the *ssd_hc()* call will ensure that bootstrap samples are drawn from a mixture distribution, instead of individual candidate distributions.
+
+When bootstrapping from the mixture distribution, a question arises as to whether the model weights should be re-estimated for every bootstrap sample or fixed at the values estimated from the models fitted to the original sample of toxicity data. This is an interesting question that may warrant further investigation; however, our current view is that they should be fixed at their nominal values, in the same way that the component distributions to be used in bootstrapping are informed by the fit to the sample toxicity data. Using simulation studies we explored the coverage and bias of ci values obtained with and without fixing the distribution weights, and results indicate little difference.
+
+If treating the distributions as a single mixture distribution when calculating model-averaged confidence intervals (i.e. with "multi_ci = TRUE"), then setting "weighted = FALSE" specifies that the original model weights are used. Setting "weighted = TRUE" will result in bootstrapping that re-estimates the weights for each bootstrap sample.
+
+The following code can be used to obtain confidence intervals for *HCx* estimates via bootstrapping from the weighted mixture distribution (using *ssd_rmulti()*), with and without fixed weight values respectively.
+
+```{r hc2, eval=FALSE}
+# Using the rmulti bootstrapping method with fixed weights
+hc2 <- ssd_hc(fit, ci = TRUE, multi_est = TRUE, multi_ci = TRUE, weighted = FALSE)
+```
+
+```{r}
+hc2
+```
+
+```{r hc3, eval=FALSE}
+# Using the rmulti bootstrapping method with re-estimated weights
+hc3 <- ssd_hc(fit, ci = TRUE, multi_est = TRUE, multi_ci = TRUE, weighted = TRUE)
+```
+
+```{r}
+hc3
+```
+
+Use of this method (with or without fixed weights) is theoretically correct, but is computationally very inefficient.
+
+### Weighted bootstrap sample
+
+The developers of `ssdtools` investigated a third method for obtaining confidence intervals for the model-averaged SSD. This method bootstraps from each of the distributions individually, taking a weighted sample from each, and then combining these into a pooled bootstrap sample for estimation of the ci and se values. Pseudo code for this method is as follows (a schematic sketch is given after the list):
+
+* For each distribution in the `fitdists` object, the proportional number of bootstrap samples to draw (`nboot_vals`) is found using `round(nboot * weight)`, where `nboot` is the total number of bootstrap samples and weight is the AICc-based model weight for each distribution based on the original `ssd_fitdist` fit.
+
+* For each of the `nboot_vals` for each distribution, a random sample of size N (the number of data points included in the original SSD fit) is drawn based on the estimated parameters from the original data for that distribution.
+
+* The random sample is re-fitted using that distribution.
+
+* *HCx* is estimated from the re-fitted bootstrap fit.
+
+* The *HCx* estimates for all `nboot_vals` for each distribution are then pooled across all distributions, and *quantile()* is used to determine the lower and upper confidence bounds for this pooled weighted bootstrap sample of *HCx* values.
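+
+The following base-R sketch illustrates the idea schematically. It is not the ssdtools implementation: the data, the two candidate distributions (lognormal and gamma), the crude moment-based refits, and the model weights `w` are all invented for illustration only.
+
+```{r pooled-weighted-sketch}
+set.seed(99)
+conc <- rlnorm(10, meanlog = 2, sdlog = 1) # hypothetical toxicity data
+n <- length(conc)
+nboot <- 1000
+w <- c(lnorm = 0.7, gamma = 0.3)           # made-up AICc-based weights
+nboot_vals <- round(nboot * w)             # bootstrap samples per distribution
+
+# crude refit-and-estimate-HC5 functions and parametric draws for each distribution
+hc5 <- list(
+  lnorm = function(y) qlnorm(0.05, mean(log(y)), sd(log(y))),
+  gamma = function(y) qgamma(0.05, shape = mean(y)^2 / var(y), rate = mean(y) / var(y))
+)
+draw <- list(
+  lnorm = function() rlnorm(n, mean(log(conc)), sd(log(conc))),
+  gamma = function() rgamma(n, shape = mean(conc)^2 / var(conc), rate = mean(conc) / var(conc))
+)
+
+# draw, refit and estimate HC5 for each distribution, then pool and take quantiles
+hc5_pooled <- unlist(lapply(names(w), function(d) {
+  replicate(nboot_vals[[d]], hc5[[d]](draw[[d]]()))
+}))
+quantile(hc5_pooled, probs = c(0.025, 0.975))
+```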
+
+This method does not draw random samples from the mixture distribution using *ssd_rmulti()* (thus "multi_ci = FALSE"). While mathematically the method shares some properties with obtaining *HCx* estimates via summing the weighted values (weighted arithmetic mean), simulation studies have shown that, as a method for obtaining confidence intervals, this pooled weighted sample method yields similar ci values and coverage to the *ssd_rmulti()* method, and is computationally much faster.
+
+This method is currently the default method in ssdtools, and can be implemented by setting "multi_ci = FALSE" and "weighted = TRUE" in the *ssd_hc()* call.
+
+```{r hc4, eval=FALSE}
+# Using a weighted pooled bootstrap sample
+hc4 <- ssd_hc(fit, ci = TRUE, multi_est = FALSE, multi_ci = FALSE, weighted = TRUE)
+```
+
+```{r}
+hc4
+```
+
+Here, the argument "weighted = TRUE" specifies to take bootstrap samples from each distribution proportional to its weight (so that they sum to `nboot`).
+
+## Comparing bootstrapping methods
+
+We have undertaken extensive simulation studies comparing the implemented methods, and the results of these are reported elsewhere. For illustrative purposes, here we compare upper and lower confidence intervals using only a single example data set, the Silver data set from the Canadian Council of Ministers of the Environment (CCME).
+
+Using the default settings for ssdtools, we compared the upper and lower confidence intervals for the four bootstrapping methods described above. Estimated upper confidence limits are relatively similar among the four methods. However, the lower confidence interval obtained using the weighted arithmetic mean (the method implemented in earlier versions of ssdtools) is much higher than for the other three methods, potentially accounting for the relatively poor coverage of this method in our simulation studies.
+
+```{r fig.width=7,fig.height=5}
+library(ggplot2)
+library(ggpubr)
+p1 <- ggplot(compare_dat, aes(method, ucl, fill = method)) +
+  geom_bar(stat="identity", position=position_dodge()) +
+  theme_classic() +
+  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
+p2 <- ggplot(compare_dat, aes(method, lcl, fill = method)) +
+  geom_bar(stat="identity", position=position_dodge()) +
+  theme_classic() +
+  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
+
+ggarrange(p1, p2, common.legend = TRUE)
+```
+
+Given the similarity of upper and lower confidence intervals of the weighted bootstrap sample method compared to the potentially more theoretically correct, but computationally more intensive, weighted mixture method (via *ssd_rmulti()*), we also compared the time taken to undertake bootstrapping across the methods.
+
+Using the default 1,000 bootstrap samples, the elapsed time to undertake bootstrapping for the mixture method was `r t2["elapsed"]` seconds, compared to `r t4["elapsed"]` seconds for the weighted bootstrap sample.
This means that the weighted bootstrap method is ~ `r round(t2["elapsed"]/t4["elapsed"])` times faster, representing a considerable computational saving across many SSDs. For this reason, this method is currently set as the default method for confidence interval estimation in ssdtools. + + +```{r fig.width=7,fig.height=5} +p3 <- ggplot(compare_dat, aes(method, time, fill = method)) + + geom_bar(stat="identity", position=position_dodge()) + + ylab("Elapsed time (seconds)") + + theme_classic() + + theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +p3 +``` + +## References + +
+ +```{r, results = "asis", echo = FALSE} +cat(licensing_md()) +``` From 45ca6ba5b26e26f5101c19e9bead800a25683684 Mon Sep 17 00:00:00 2001 From: Joe Thorley Date: Thu, 2 May 2024 06:38:28 -0700 Subject: [PATCH 02/13] ci.RData --- vignettes/articles/confidence_intervals.RData | Bin 0 -> 764 bytes 1 file changed, 0 insertions(+), 0 deletions(-) create mode 100644 vignettes/articles/confidence_intervals.RData diff --git a/vignettes/articles/confidence_intervals.RData b/vignettes/articles/confidence_intervals.RData new file mode 100644 index 0000000000000000000000000000000000000000..1290541d9282c4f6129743c91bfdfcd2e8d8c92c GIT binary patch literal 764 zcmV&N7#LWXfE-2! z76wj`f{bKCASn!Fas#mdk`#MlS!z*YdMc2~2NJelITHkC!TBfV|2p9>m>MTxnoB}JLZFqIDd4ltDvo$uez z-K5JG26Z?iRGb0LZ>-5ViN(ce#Afp zCgtbDq*x1(IPArVxdk~O7lE9~^dIUnW|%Q0NjdQ;X)v3BJeV_65=#>G(m-L2B{+cT z!iem25pKk!wxI}-E;LU^C9QLX^QWm#3OJ$F2 zlOJ@E%KRUDnxtZ68&fN?PpwKlaa-C75!sqIZL5Uc;rv%jeRCX@3P&S*03+LkY9)T+ zx9!u^p2GQ)dN?QKxgkZilvC}6B|TJ#?DyZ#-4tur9*u0$BAclMRJ((Miq%1=rmZ=K{-~)OcVX zf|e|-P#3U3mGG9QW~OJ9q^2n3rY6D)2=Uy^ijvZzR0WN+%nG0sL`|_K%u!HjD3>2@ z1SIQ2%PtmJ282bwgKe=f5WoWwrViF_hP9JmRVZ5N$^vU|!`fG{su!(PW`ULBu&NtD u!%BCoae-REv4CR@qdY_|B|)Vq_IO7rv#=I}e;~2N@DBjMG~Kq#3;+PT5?Js6 literal 0 HcmV?d00001 From 8f7b251643ec925bed011d4abe28ff4131451326 Mon Sep 17 00:00:00 2001 From: Joe Thorley Date: Thu, 2 May 2024 06:53:55 -0700 Subject: [PATCH 03/13] copy style and refs --- vignettes/articles/my-style.csl | 19 +++ vignettes/articles/references.bib | 270 ++++++++++++++++++++++++++++++ 2 files changed, 289 insertions(+) create mode 100644 vignettes/articles/my-style.csl create mode 100644 vignettes/articles/references.bib diff --git a/vignettes/articles/my-style.csl b/vignettes/articles/my-style.csl new file mode 100644 index 000000000..ef13d9347 --- /dev/null +++ b/vignettes/articles/my-style.csl @@ -0,0 +1,19 @@ + + diff --git a/vignettes/articles/references.bib b/vignettes/articles/references.bib new file mode 100644 index 000000000..203d3bbbe --- /dev/null +++ b/vignettes/articles/references.bib @@ -0,0 +1,270 @@ + +@article{fox_recent_2021, + title = {Recent {Developments} in {Species} {Sensitivity} {Distribution} {Modeling}}, + volume = {40}, + issn = {0730-7268, 1552-8618}, + url = {https://setac.onlinelibrary.wiley.com/doi/10.1002/etc.4925}, + doi = {10.1002/etc.4925}, + language = {en}, + number = {2}, + urldate = {2021-01-25}, + journal = {Environmental Toxicology and Chemistry}, + author = {Fox, D.R. and Dam, R.A. and Fisher, R. and Batley, G.E. and Tillmanns, A.R. and Thorley, J. and Schwarz, C.J. and Spry, D.J. and McTavish, K.}, + month = feb, + year = {2021}, + keywords = {ssdtools}, + pages = {293--308}, + file = {Fox et al. - 2021 - Recent Developments in Species Sensitivity Distrib.pdf:/Users/joe/Zotero/storage/MMSH9Q2S/Fox et al. - 2021 - Recent Developments in Species Sensitivity Distrib.pdf:application/pdf}, +} + +@article{verdonck_limitations_2003, + title = {Limitations of {Current} {Risk} {Characterization} {Methods} in {Probabilistic} {Environmental} {Risk} {Assessment}}, + volume = {22}, + issn = {0730-7268, 1552-8618}, + url = {http://doi.wiley.com/10.1897/02-435}, + doi = {10.1897/02-435}, + abstract = {In probabilistic environmental risk assessment, the likelihood and the extent of adverse effects occurring in ecological systems because of exposure(s) to substances are estimated. It is based on the comparison of an exposure/environmental concentration distribution, with a species sensitivity distribution derived from toxicity data. 
The calculation of a probabilistic risk can be performed in many ways (e.g., area under the curve in joint probability curves). However, several (hypothetical) examples and some theoretical considerations illustrate that the current risk characterisation methods have an integrative character and they focus on the statistical comparison of two distributions without properly considering the environmental interpretation of these underlying distributions. Several scenarios with varying exposure/environmental concentration distribution and species sensitivity distribution standard deviations are discussed.}, + language = {en}, + number = {9}, + urldate = {2019-11-21}, + journal = {Environmental Toxicology and Chemistry}, + author = {Verdonck, Frederik A.M. and Aldenberg, Tom and Jaworska, Joanna and Vanrolleghem, Peter A.}, + year = {2003}, + pages = {2209}, + file = {Verdonck et al. - 2003 - LIMITATIONS OF CURRENT RISK CHARACTERIZATION METHO.pdf:/Users/joe/Zotero/storage/I4DFHG9K/Verdonck et al. - 2003 - LIMITATIONS OF CURRENT RISK CHARACTERIZATION METHO.pdf:application/pdf} +} + +@article{burr_cumulative_1942, + title = {Cumulative {Frequency} {Functions}}, + volume = {13}, + issn = {00034851}, + number = {2}, + journal = {The Annals of Mathematical Statistics}, + author = {Burr, Irving W.}, + year = {1942}, + note = {Publisher: Institute of Mathematical Statistics}, + pages = {215--232}, + file = {Burr - 1942 - Cumulative Frequency Functions.pdf:/Users/joe/Zotero/storage/IMIF252X/Burr - 1942 - Cumulative Frequency Functions.pdf:application/pdf} +} + +@article{shao_estimation_2000, + title = {Estimation for hazardous concentrations based on {NOEC} toxicity data: an alternative approach}, + abstract = {A common task in environmental studies is to determine toxicant concentrations at which a certain proportion (typically 95 per cent) of the biological species is protected. Extrapolation techniques need to be employed for small sample sizes. By definition, our interest focuses on the lower tail of the NOEC (no observed effect concentration) distribution, which is very sensitive to the choice of the underlying distribution. In this paper we investigate the use of the three-parameter Burr Type III distribution because of its flexibility and ease-of-use. The Constrained Maximum Likelihood (CML) method was used to estimate parameters. Collinearity between parameter estimates was overcome by reparameterisation techniques. As an alternative to the computation of adjustment factors, we estimate the lower confidence limits of percentile estimates using the Delta-method. When the NOEC sample sizes are small, we employ Bootstrapping, a computer intensive technique. Our technique is easily extended to mixtures of the three-parameter Burr type III distributions, which can be used to model multimodal distributions.
Copyright \# 2000 John Wiley \& Sons, Ltd.}, + language = {en}, + author = {Shao, Quanxi}, + year = {2000}, + keywords = {ssdtools}, + pages = {13}, + file = {Shao - 2000 - Estimation for hazardous concentrations based on N.pdf:/Users/joe/Zotero/storage/VFI2YKDE/Shao - 2000 - Estimation for hazardous concentrations based on N.pdf:application/pdf} +} + + +@article{aldenberg_confidence_1993, + title = {Confidence {Limits} for {Hazardous} {Concentrations} {Based} on {Logistically} {Distributed} {NOEC} {Toxicity} {Data}}, + volume = {25}, + issn = {01476513}, + url = {https://linkinghub.elsevier.com/retrieve/pii/S0147651383710067}, + doi = {10.1006/eesa.1993.1006}, + language = {en}, + number = {1}, + urldate = {2019-12-08}, + journal = {Ecotoxicology and Environmental Safety}, + author = {Aldenberg, T. and Slob, W.}, + month = feb, + year = {1993}, + pages = {48--63} +} + +@techreport{schwarz_improving_2019, + address = {Victoria, BC}, + title = {Improving {Statistical} {Methods} for {Modeling} {Species} {Sensitivity} {Distributions}}, + number = {WSS2019-07}, + institution = {Province of British Columbia}, + author = {Schwarz, Carl and Tillmanns, Angeline}, + month = jul, + year = {2019}, + keywords = {ssdtools}, + file = {Schwarz and Tillmanns - 2019 - Improving Statistical Methods for Modeling Species.pdf:/Users/joe/Zotero/storage/M9Z9M872/Schwarz and Tillmanns - 2019 - Improving Statistical Methods for Modeling Species.pdf:application/pdf} +} + +@article{dalgarno_shinyssdtools_2021, + title = {shinyssdtools: {A} web application for fitting {Species} {Sensitivity} {Distributions} ({SSDs})}, + volume = {6}, + issn = {2475-9066}, + shorttitle = {shinyssdtools}, + url = {https://doi.org/10.21105/joss.02848}, + doi = {10.21105/joss.02848}, + abstract = {The species sensitivity distribution (SSD) is the most widely used method for getting water quality benchmarks to characterize effects of chemical contaminants for water quality or ecological risk assessment (Fox et al., 2020). This typically involves estimating the concentration of a chemical that affects 5\% of the species considered (Posthuma et al., 2001). 
The ssdtools R package (Thorley \& Schwarz, 2018) has recently advanced SSD methods by providing model averaging using information-theoretic criteria and the construction of confidence intervals using bootstrapping (Fox et al., 2020).}, + language = {en}, + number = {57}, + urldate = {2021-01-26}, + journal = {Journal of Open Source Software}, + author = {Dalgarno, Seb}, + month = jan, + year = {2021}, + pages = {2848}, + file = {Dalgarno - 2021 - shinyssdtools A web application for fitting Speci.pdf:/Users/joe/Zotero/storage/CJPKHJ73/Dalgarno - 2021 - shinyssdtools A web application for fitting Speci.pdf:application/pdf} +} + +@book{posthuma_species_2001, + title = {Species sensitivity distributions in ecotoxicology}, + publisher = {CRC press}, + author = {Posthuma, Leo and Suter II, Glenn W and Traas, Theo P}, + year = {2001}, + url = {https://www.routledge.com/Species-Sensitivity-Distributions-in-Ecotoxicology/Posthuma-II-Traas/p/book/9781566705783} +} + +@book{model_averaging, + title = {Model Selection and Multimodel Inference - A Practical Information-Theoretic Approach}, + publisher = {Springer}, + author = {Burnham, Kenneth and Anderson, David}, + year = {2002}, + url = {https://link.springer.com/book/10.1007/b97636} +} + +@book{fletcher, + title = {Model Averaging}, + publisher = {Springer}, + author = {Fletcher, David}, + year = {2018}, + url = {https://link.springer.com/book/10.1007/978-3-662-58541-2} +} + +@Article{fitdistrplus, + title = {{fitdistrplus}: An {R} Package for Fitting Distributions}, + author = {Marie Laure Delignette-Muller and Christophe Dutang}, + journal = {Journal of Statistical Software}, + year = {2015}, + volume = {64}, + number = {4}, + pages = {1--34}, + url = {http://www.jstatsoft.org/v64/i04/} + } + +@Book{ggplot2, + author = {Hadley Wickham}, + title = {{ggplot2}: Elegant Graphics for Data Analysis}, + publisher = {Springer-Verlag New York}, + year = {2016}, + isbn = {978-3-319-24277-4}, + url = {https://ggplot2.tidyverse.org} + } + + @Manual{r, + title = {R: A Language and Environment for Statistical Computing}, + author = {{R Core Team}}, + organization = {R Foundation for Statistical Computing}, + address = {Vienna, Austria}, + year = {2018}, + url = {https://www.R-project.org/} + } + +@book{burnham_model_2002, + address = {New York, NY}, + title = {Model {Selection} and {Multimodel} {Inference}}, + isbn = {978-0-387-95364-9}, + url = {https://link.springer.com/10.1007/b97636}, + language = {en}, + urldate = {2018-11-23}, + publisher = {Springer New York}, + editor = {Burnham, Kenneth P. 
and Anderson, David R.}, + year = {2002}, + doi = {10.1007/b97636} +} + +@techreport{schwarz_improving_2019, + address = {Victoria, BC}, + title = {Improving {Statistical} {Methods} for {Modeling} {Species} {Sensitivity} {Distributions}}, + number = {WSS2019-07}, + institution = {Province of British Columbia}, + author = {Schwarz, Carl and Tillmanns, Angeline}, + month = jul, + year = {2019}, + file = {Schwarz and Tillmanns - 2019 - Improving Statistical Methods for Modeling Species.pdf:/Users/joe/Zotero/storage/M9Z9M872/Schwarz and Tillmanns - 2019 - Improving Statistical Methods for Modeling Species.pdf:application/pdf} +} + +@book{wickham_r_2016, + address = {Sebastopol, CA}, + edition = {First edition}, + title = {R for data science: import, tidy, transform, visualize, and model data}, + isbn = {978-1-4919-1039-9 978-1-4919-1036-8}, + shorttitle = {R for data science}, + url = {https://r4ds.had.co.nz}, + abstract = {"This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience"--}, + publisher = {O'Reilly}, + author = {Wickham, Hadley and Grolemund, Garrett}, + year = {2016}, + note = {OCLC: ocn968213225}, + keywords = {Big data, Computer programs, Data mining, Databases, Electronic data processing, Information visualization, R (Computer program language), Statistics} +} + +@techreport{fox_methodologies_2021, + title = {Joint investigation into {statistical} {methodologies} underpinning the {derivation} of {toxicant} {guideline} {values} in {Australia} and {New Zealand}}, + institution = {Environmetrics Australia and Australian Institute of Marine Science}, + author = {Fox, David R and Fisher, Rebecca and Thorley, Joseph L and Schwarz, Carl}, + month = mar, + year = {2021}, + url={https://environmetrics.net/docs/FOX%20and%20FISHER%20Final_final_report_rev2.3.pdf?189db0&189db0} +} + + +@article{newman_2000, + title = {Applying {species-sensitivity distributions} in {ecological} {risk assessment}: Assumptions of +{distribution type} and sufficient {numbers of species}}, + volume = {19}, + url = {https://setac.onlinelibrary.wiley.com/doi/abs/10.1002/etc.5620190233}, + doi = {10.1002/etc.5620190233}, + language = {en}, + urldate = {2024-02-08}, + journal = {Environmental Toxicology and Chemistry}, + author = {Newman, Michael C. and Ownby, David R. and Mézin, Laurent C. A. and Powell, David C. and Christensen, Tyler R. L. and Lerberg, Scott B. 
and Anderson, Britt-Anne }, + month = feb, + year = {2000}, + pages = {508--515} +} + +@techreport{Zajdlik_2005, + title = {Statistical analysis of the SSD approach for development of Canadian water quality guidelines}, + language = {en}, + year = {2005}, + number={CCME Project 354‐200/5}, + author={Zajdlik, B.}, + institution={Zajdlik and Associates} +} + +@article{chapman_2007, + title = {Methods of uncertainty analysis}, + journal={In: A H (ed) EUFRAM Concerted Action to Develop a European Framework for Probabilistic Risk Assessment of the Environmental Impacts of Pesticides, Vol 2, Detailed Reports on Role, Methods, Reporting and Validation}, + author = {Chapman, PF RM and Hart, A and Roelofs, W and Aldenberg, T and Solomon, K and Tarazona, J LM and Byrne, P and Powley, W and Green, J and Ferson, S and Galicia, H}, + year = {2007} +} + +@article{fox_2016, + title = {Contemporary Methods for Statistical Design and Analysis}, + url = {https://shop.elsevier.com/books/marine-ecotoxicology/blasco/978-0-12-803371-5}, + isbn={9780128033715}, + language = {en}, + urldate = {2024-02-08}, + journal = {In: Blasco J, Chapman PM, Campana O, Hampel M (eds) Marine Ecotoxicology.}, + author = {Fox, David R}, + month = aug, + year = {2016}, + publisher={Academic Press} +} + +@article{warne2018, + author = {Warne, M and Batley, GE and van Dam, RA and Chapman, JC and Fox, DR and Hickey, CW and Stauber, JL }, + title = {Revised Method for Deriving Australian and New Zealand Water Quality Guideline Values for Toxicants – update of 2015 version}, + journal = {Prepared for the revision of the Australian and New Zealand Guidelines for Fresh and Marine Water Quality. Australian and New Zealand Governments and Australian state and territory governments, Canberra, 48 pp}, + year = {2018}, + type = {Journal Article} +} + +@article{fisher2019, + author = {Fisher, Rebecca and van Dam, Rick and Batley, Graeme and Fox, David and Harford, Andrew and Humphrey, Chris and King, Cath and Menendez, Patricia and Negri, Andrew and Proctor, Abigail}, + title = {KEY ISSUES IN THE DERIVATION OF WATER QUALITY GUIDELINE VALUES: A WORKSHOP REPORT}, + year = {2019}, + type = {Journal Article} +} + + From 09d7c0fd60d3a21d1badf3f16ad4a6ca78173420 Mon Sep 17 00:00:00 2001 From: Joe Thorley Date: Thu, 2 May 2024 07:31:10 -0700 Subject: [PATCH 04/13] remove vignette C as now article --- vignettes/C_confidence_intervals.Rmd | 192 --------------------------- vignettes/confidence_intervals.RData | Bin 764 -> 0 bytes 2 files changed, 192 deletions(-) delete mode 100644 vignettes/C_confidence_intervals.Rmd delete mode 100644 vignettes/confidence_intervals.RData diff --git a/vignettes/C_confidence_intervals.Rmd b/vignettes/C_confidence_intervals.Rmd deleted file mode 100644 index e132f7083..000000000 --- a/vignettes/C_confidence_intervals.Rmd +++ /dev/null @@ -1,192 +0,0 @@ ---- -title: "Obtaining Confidence Intervals" -author: "ssdtools Team" -date: "`r Sys.Date()`" -bibliography: references.bib -output: rmarkdown::html_vignette -#output: rmarkdown::pdf_document -csl: my-style.csl -latex_engine: MathJax -mainfont: Arial -mathfont: Courier -vignette: > - %\VignetteIndexEntry{Obtaining Confidence Intervals} - %\VignetteEngine{knitr::rmarkdown} - %\VignetteEncoding{UTF-8} ---- - -```{r, include = FALSE} -library(ssdtools) -load("confidence_intervals.RData") -knitr::opts_chunk$set( - collapse = TRUE, - comment = "#>", - fig.height = 4, - fig.width = 6 -) -``` - -## Bootstrap confidence intervals - -Bootstrapping is a resampling technique used to 
obtain estimates of summary statistics. -The team have explored the use of alternative methods for obtaining the confidence interval of *HCx* estimates. -This included using the closed-form expression for the variance-covariance matrix of the parameters -of the Burr III distribution, coupled with the delta-method, as well as an alternative bootstrap method -for the inverse Pareto distribution based on statistical properties of the parameters [@fox_methodologies_2021]. In both cases, it -appeared that these methods can give results similar to other traditional bootstrapping approaches -in much less time, and are therefore potentially worth further investigation. However, -implementation of such methods across all the distributions now available in ssdtools would be a -substantial undertaking. - -The revised version of ssdtools retains the computationally intensive bootstrapping method to obtain confidence intervals and an estimate of standard errors. We recommend a minimum bootstrap sample of 1,000 (the current default - see argument nboot in *?ssd_hc()*). However, more reliable results can be obtained using samples of 5,000 or 10,000. We recommend larger bootstrap samples for final reporting. - -## Parametric versus non-parametric bootstrapping - -Burrlioz 2.0 uses a non-parametric bootstrap method to obtain confidence intervals on the *HCx* -estimate. Non-parametric bootstrapping is carried out by repeatedly resampling the raw data with -replacement, and refitting the distribution many times. The 95% confidence limits are then obtained -by calculating the lower 0.025th and upper 0.975th quantiles of the resulting *HCx* estimates across all56 -the bootstrap samples (typically >1000). This type of bootstrap takes into account uncertainty in the -distribution fit based on uncertainty in the data. - -The ssdtools package by default uses a parametric bootstrap. Instead of resampling the data, -parametric bootstrapping draws a random a set of new data (of the same sample size as the original) -from the fitted distribution to repeatedly refit the distribution. Upper and lower 95% bounds are again -calculated as the lower 0.025th and upper 0.975th quantiles of the resulting *HCx* estimates across all the -bootstrap samples (again, typically >1000). This will capture the possible uncertainty that may occur -for a sample size from a given distribution, but it assumes no uncertainty in that original fit, so it is not -accounting for uncertainty in the input data. - -The new TMB version of ssdtools has the capacity to do bootstrapping either using the Burrlioz -non-parametric method, or the original parametric method of ssdtools (based on -fitdistrplus [@fitdistrplus]). - -Using simulation studies the ssdtools team examined bias and compared the resulting coverage -of the parametric and non-parametric bootstrapping methods [@fox_methodologies_2021]. They found that coverage was better using the parametric bootstrapping method, and this has been retained as the default bootstrapping method in the update to ssdtools. - - -## Bootstrapping model-averaged SSDs - -Bootstrapping to obtain confidence intervals for individual fitted distributions is relatively straightforward. However, obtaining bootstrap confidence intervals for model-averaged SSDs requires careful consideration, as the procedure is subject to the same pitfalls evident when obtaining model-averaged *HCx* estimates. 
The [Model Average SSDs](https://poissonconsulting.github.io/ssdtools/articles/A_model_averaging.html) vignette contains a detailed explanation of the fallacy of using the summed weighting of individual *HCx* values (as weighted arithmetic average), and how this can lead to spurious results. Model-averaged estimates and/or confidence intervals (including standard error) can be calculated by treating the distributions as constituting a single mixture distribution versus 'taking the mean'. When calculating the model-averaged estimates treating the distributions as constituting a single mixture distribution ensures that *ssd_hc()* is the inverse of *ssd_hp()*, and this applies for model-averaged confidence intervals. - -The revised version of ssdtools supports three weighting methods for obtaining bootstrap confidence intervals and an estimate of the standard error, and these are discussed in detail below. - -### Weighted arithmetic mean - -The early versions of ssdtools provided model-averaged confidence intervals (cis) and standard errors (se) that were calculated as weighted arithmetic means of the upper and lower cis and se values obtained via bootstrap simulation from each of the individual candidate distributions independently. This method is incorrect and may lead to spurious results (as described above) and has been shown via simulations studies to result in confidence intervals with very low coverage. The current version of ssdtools retains the functionality to reproduce the original behavior of ssdtools. - -```{r hc1, eval=FALSE} -fit <- ssd_fit_dists(data = ssddata::ccme_silver) -set.seed = 99 - -# Using the original ssdtools weighted arithmetic mean -hc1 <- ssd_hc(fit, ci = TRUE, multi_est = FALSE, multi_ci = FALSE, weighted = FALSE) -``` - -```{r} -hc1 -``` -Use of this method for obtaining ci and se values is not recommended and only retained for legacy comparison purposes. It is both technically incorrect, and computationally inefficient. - -### Weighted mixture distribution - -A more theoretically correct way of obtaining ci and se values is to consider the model average set as a mixture distribution (see above, and the [Model Average SSDs](https://poissonconsulting.github.io/ssdtools/articles/A_model_averaging.html) vignette). When we consider the model set as a mixture distribution, bootstrapping is achieved by resampling from the model set according to the AICc based model weights. A method for sampling from mixture distributions has been implemented in ssdtools, via the function *ssd_rmulti()*, which will generate random samples from a mixture of any combination of distributions currently implemented in `ssdtools`. Setting "multi_ci = TRUE" in the *ssd_hc()* call will ensure that bootstrap samples are drawn from a mixture distribution, instead of individual candidate distributions. - -When bootstrapping from the mixture distribution, a question arises whether the model weights should be re-estimated for every bootstrap sample, or fixed at the values estimated from the models fitted to the original sample of toxicity data? This is an interesting question that may warrant further investigation, however our current view is that they should be fixed at their nominal values in the same way that the component distributions to be used in bootstrapping are informed by the fit to the sample toxicity data. Using simulation studies we explored the coverage and bias of ci values obtained without and without fixing the distribution weights, and results indicate little difference. 
- -If treating the distributions as a single mixture distribution when calculating model average confidence intervals (i.e. with "multi_ci = TRUE"), then setting "weighted = FALSE" specifies to use the original model weights. Setting "weighted = TRUE" will result in bootstrapping that will re-estimate weights for each bootstrap sample. - -The following code can be used to obtain confidence intervals for *HCx* estimates via bootstrapping from the weighted mixture distribution (using *ssd_rmutli()*), with and without fixed weight values respectively. - - -```{r hc2, eval=FALSE} -# Using the rmulti boostrapping method with fixed weights -hc2 <- ssd_hc(fit, ci = TRUE, multi_est = TRUE, multi_ci = TRUE, weighted = FALSE) -``` - -```{r} -hc2 -``` - -```{r hc3, eval=FALSE} -# Using the rmulti boostrapping method with fixed weights -hc3 <- ssd_hc(fit, ci = TRUE, multi_est = TRUE, multi_ci = TRUE, weighted = TRUE) -``` - -```{r} -hc3 -``` - -Use of this method (without or without fixed weights) is theoretically correct, but is computationally very inefficient. - -### Weighted bootstrap sample - -The developers of `ssdtools` investigated a third method for obtaining confidence intervals for the model-averaged SSD. This method bootstraps from each of the distributions individually, taking a weighted sample from each, and then combining these into a pooled bootstrap sample for estimation of te ci and se values. Psuedo code for this method is as follows: - - * For each distribution in the `fitdists` object, the proportional number of bootstrap samples to draw (`nboot_vals`) is found using `round(nboot * weight)`, where `nboot` is the total number of bootstrap samples and weight is the AICc based model weights for each distribution based on the original `ssd_fitdist` fit. - -* For each of the `nboot_vals` for each distribution, a random sample of size N is drawn (the total number of original data points included in the original SSD fit) based on the estimated parameters from the original data for that distribution. - -* The random sample is re-fitting using that distribution. - -* *HCx* is estimated from the re-fitted bootstrap fit. - -* The *HCx* estimates for all `nboot_vals` for each distribution are then pooled across all distributions, and *quantile()* is used to determine the lower and upper confidence bounds for this pooled weighted bootstrap sample of *HCx* values. - -This method does not draw random samples from the mixture distribution using *ssd_rmulti* (thus "multi_ci = FALSE"). While mathematically the method shares some properties with obtaining *HCx* estimates via summing the weighted values (weighted arithmetic mean), simulation studies have shown that, as a method for obtaining confidence intervals, this pooled weighted sample method yields similar ci values and coverage the *ssd_rmulti()* method, and is computationally much faster. - -This method is currently the default method in ssdtools, and can be implemented by setting "multi_ci = FALSE" and "weighted = TRUE" in the ssd_hc() call. - -```{r hc4, eval=FALSE} -# Using a weighted pooled bootstrap sample -hc4 <- ssd_hc(fit, ci = TRUE, multi_est = FALSE, multi_ci = FALSE, weighted = TRUE) -``` - -```{r} -hc4 -``` - -Here, the argument "weighted = TRUE" specifies to take bootstrap samples from each distribution proportional to its weight (so that they sum to nboot). - -## Comparing bootrapping methods - -We have undertaken extensive simulation studies comparing the implemented methods, and the results of these are reported elsewhere. 
For illustrative purposes, here we compare upper and lower confidence intervals using only a single example data set, the Silver data set from the Canadian Council of Ministers of the Environment (ccme). - -Using the default settings for ssdtools, we compared the upper and lower confidence intervals for the four bootstrapping methods described above. Estimate upper confidence limits are relatively similar among the four methods. However, the lower confidence interval obtained using the weighted arithmetic mean (the method implemented in earlier versions of ssdtools) is much higher than the other three methods, potentially accounting for the relatively poor coverage of this method in our simulation studies. - -```{r fig.width=7,fig.height=5} -library(ggplot2) -library(ggpubr) -p1 <- ggplot(compare_dat, aes(method, ucl, fill = method)) + - geom_bar(stat="identity", position=position_dodge()) + - theme_classic() + - theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) -p2 <- ggplot(compare_dat, aes(method, lcl, fill = method)) + - geom_bar(stat="identity", position=position_dodge()) + - theme_classic() + - theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) - -ggarrange(p1, p2,common.legend = TRUE) -``` - -Given the similarity of upper and lower confidence intervals of the weighted bootstrap sample method compared to the potentially more theoretically correct, but computationally more intensive weighted mixture method (via *ssd_rmulti()*), we also compared the time taken to undertake bootstrapping across the methods. - -Using the default 1,000 bootstrap samples, the elapsed time to undertake bootstrapping for the mixture method was `r t2["elapsed"]` seconds, compared to `r t4["elapsed"]` seconds for the weighted bootstrap sample. This means that the weighted bootstrap method is ~ `r round(t2["elapsed"]/t4["elapsed"])` times faster, representing a considerable computational saving across many SSDs. For this reason, this method is currently set as the default method for confidence interval estimation in ssdtools. - - -```{r fig.width=7,fig.height=5} -p3 <- ggplot(compare_dat, aes(method, time, fill = method)) + - geom_bar(stat="identity", position=position_dodge()) + - ylab("Elapsed time (seconds)") + - theme_classic() + - theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) -p3 -``` - -## References - -
- -```{r, results = "asis", echo = FALSE} -cat(licensing_md()) -``` diff --git a/vignettes/confidence_intervals.RData b/vignettes/confidence_intervals.RData deleted file mode 100644 index 1290541d9282c4f6129743c91bfdfcd2e8d8c92c..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 764 zcmV&N7#LWXfE-2! z76wj`f{bKCASn!Fas#mdk`#MlS!z*YdMc2~2NJelITHkC!TBfV|2p9>m>MTxnoB}JLZFqIDd4ltDvo$uez z-K5JG26Z?iRGb0LZ>-5ViN(ce#Afp zCgtbDq*x1(IPArVxdk~O7lE9~^dIUnW|%Q0NjdQ;X)v3BJeV_65=#>G(m-L2B{+cT z!iem25pKk!wxI}-E;LU^C9QLX^QWm#3OJ$F2 zlOJ@E%KRUDnxtZ68&fN?PpwKlaa-C75!sqIZL5Uc;rv%jeRCX@3P&S*03+LkY9)T+ zx9!u^p2GQ)dN?QKxgkZilvC}6B|TJ#?DyZ#-4tur9*u0$BAclMRJ((Miq%1=rmZ=K{-~)OcVX zf|e|-P#3U3mGG9QW~OJ9q^2n3rY6D)2=Uy^ijvZzR0WN+%nG0sL`|_K%u!HjD3>2@ z1SIQ2%PtmJ282bwgKe=f5WoWwrViF_hP9JmRVZ5N$^vU|!`fG{su!(PW`ULBu&NtD u!%BCoae-REv4CR@qdY_|B|)Vq_IO7rv#=I}e;~2N@DBjMG~Kq#3;+PT5?Js6 From dad4ff6e86f43a938edace919be59ec430c4a8ee Mon Sep 17 00:00:00 2001 From: Joe Thorley Date: Thu, 2 May 2024 07:40:56 -0700 Subject: [PATCH 05/13] ordering vignettes --- _pkgdown.yml | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/_pkgdown.yml b/_pkgdown.yml index d914a68ae..898df5091 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -111,3 +111,11 @@ reference: - '`qlgumbel`' - '`rlgumbel`' +articles: +- title: All vignettes + contents: + - B_distributions + - A_model_averaging + - confidence-intervals + - D_embelishing-plots + - E_additional-technical-details From 096830783bf4d5a94f85ad0ba24ca264eee9033b Mon Sep 17 00:00:00 2001 From: Joe Thorley Date: Thu, 2 May 2024 07:50:01 -0700 Subject: [PATCH 06/13] vignette names in "" --- _pkgdown.yml | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/_pkgdown.yml b/_pkgdown.yml index 898df5091..8dc35bf69 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -114,8 +114,8 @@ reference: articles: - title: All vignettes contents: - - B_distributions - - A_model_averaging - - confidence-intervals - - D_embelishing-plots - - E_additional-technical-details + - "B_distributions" + - "A_model_averaging" + - "confidence-intervals" + - "D_embelishing-plots" + - "E_additional-technical-details" From 8dfcc46722f683462db0a354caf1e82cdfdc9aa2 Mon Sep 17 00:00:00 2001 From: Joe Thorley Date: Thu, 2 May 2024 07:55:01 -0700 Subject: [PATCH 07/13] trying vignette names in `` --- _pkgdown.yml | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/_pkgdown.yml b/_pkgdown.yml index 8dc35bf69..192f90146 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -114,8 +114,8 @@ reference: articles: - title: All vignettes contents: - - "B_distributions" - - "A_model_averaging" - - "confidence-intervals" - - "D_embelishing-plots" - - "E_additional-technical-details" + - `B_distributions` + - `A_model_averaging` + - `confidence-intervals` + - `D_embelishing-plots` + - `E_additional-technical-details` From c0ab541960df059f576e46ca302adcc745d5e74b Mon Sep 17 00:00:00 2001 From: Joe Thorley Date: Thu, 2 May 2024 08:01:02 -0700 Subject: [PATCH 08/13] c-i to c_i --- _pkgdown.yml | 10 +++++----- ...nfidence-intervals.Rmd => confidence_intervals.Rmd} | 0 2 files changed, 5 insertions(+), 5 deletions(-) rename vignettes/articles/{confidence-intervals.Rmd => confidence_intervals.Rmd} (100%) diff --git a/_pkgdown.yml b/_pkgdown.yml index 192f90146..d4edddfe6 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -114,8 +114,8 @@ reference: articles: - title: All vignettes contents: - - `B_distributions` - - `A_model_averaging` - - `confidence-intervals` - - 
`D_embelishing-plots` - - `E_additional-technical-details` + - B_distributions + - A_model_averaging + - confidence_intervals + - D_embelishing-plots + - E_additional-technical-details diff --git a/vignettes/articles/confidence-intervals.Rmd b/vignettes/articles/confidence_intervals.Rmd similarity index 100% rename from vignettes/articles/confidence-intervals.Rmd rename to vignettes/articles/confidence_intervals.Rmd From c8f140f7ded7561a255814abc17cc1ea190f65a5 Mon Sep 17 00:00:00 2001 From: Joe Thorley Date: Thu, 2 May 2024 08:07:35 -0700 Subject: [PATCH 09/13] article/ci --- _pkgdown.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_pkgdown.yml b/_pkgdown.yml index d4edddfe6..a0eff934f 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -116,6 +116,6 @@ articles: contents: - B_distributions - A_model_averaging - - confidence_intervals + - articles/confidence_intervals - D_embelishing-plots - E_additional-technical-details From 49e7cc688ce7ba467070ba26caf9462e976661d1 Mon Sep 17 00:00:00 2001 From: Joe Thorley Date: Thu, 2 May 2024 08:19:14 -0700 Subject: [PATCH 10/13] c-i --- _pkgdown.yml | 2 +- .../{confidence_intervals.Rmd => confidence-intervals.Rmd} | 0 2 files changed, 1 insertion(+), 1 deletion(-) rename vignettes/articles/{confidence_intervals.Rmd => confidence-intervals.Rmd} (100%) diff --git a/_pkgdown.yml b/_pkgdown.yml index a0eff934f..ce7890be8 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -116,6 +116,6 @@ articles: contents: - B_distributions - A_model_averaging - - articles/confidence_intervals + - articles/confidence-intervals - D_embelishing-plots - E_additional-technical-details diff --git a/vignettes/articles/confidence_intervals.Rmd b/vignettes/articles/confidence-intervals.Rmd similarity index 100% rename from vignettes/articles/confidence_intervals.Rmd rename to vignettes/articles/confidence-intervals.Rmd From 937928ce18dfdb19c58d8ccd4428fbcc418425c5 Mon Sep 17 00:00:00 2001 From: Joe Thorley Date: Thu, 2 May 2024 08:20:50 -0700 Subject: [PATCH 11/13] drop leading letters and replace _ with - --- ...nal-technical-details.Rmd => additional-technical-details.Rmd} | 0 vignettes/{B_distributions.Rmd => distributions.Rmd} | 0 vignettes/{D_embelishing-plots.Rmd => embelishing-plots.Rmd} | 0 vignettes/{A_model_averaging.Rmd => model-averaging.Rmd} | 0 4 files changed, 0 insertions(+), 0 deletions(-) rename vignettes/{E_additional-technical-details.Rmd => additional-technical-details.Rmd} (100%) rename vignettes/{B_distributions.Rmd => distributions.Rmd} (100%) rename vignettes/{D_embelishing-plots.Rmd => embelishing-plots.Rmd} (100%) rename vignettes/{A_model_averaging.Rmd => model-averaging.Rmd} (100%) diff --git a/vignettes/E_additional-technical-details.Rmd b/vignettes/additional-technical-details.Rmd similarity index 100% rename from vignettes/E_additional-technical-details.Rmd rename to vignettes/additional-technical-details.Rmd diff --git a/vignettes/B_distributions.Rmd b/vignettes/distributions.Rmd similarity index 100% rename from vignettes/B_distributions.Rmd rename to vignettes/distributions.Rmd diff --git a/vignettes/D_embelishing-plots.Rmd b/vignettes/embelishing-plots.Rmd similarity index 100% rename from vignettes/D_embelishing-plots.Rmd rename to vignettes/embelishing-plots.Rmd diff --git a/vignettes/A_model_averaging.Rmd b/vignettes/model-averaging.Rmd similarity index 100% rename from vignettes/A_model_averaging.Rmd rename to vignettes/model-averaging.Rmd From 1560c1cfb07aef125c8aeb1dfb53c6abe2c8cbb8 Mon Sep 17 00:00:00 2001 From: Joe 
Thorley Date: Thu, 2 May 2024 08:57:49 -0700 Subject: [PATCH 12/13] update yml --- _pkgdown.yml | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/_pkgdown.yml b/_pkgdown.yml index ce7890be8..c65275f09 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -114,8 +114,8 @@ reference: articles: - title: All vignettes contents: - - B_distributions - - A_model_averaging + - model-averaging + - distributions - articles/confidence-intervals - - D_embelishing-plots - - E_additional-technical-details + - embelishing-plots + - additional-technical-details From ea2992b9854478febc437d564b05f73d74e3786e Mon Sep 17 00:00:00 2001 From: Joe Thorley Date: Thu, 2 May 2024 13:40:50 -0700 Subject: [PATCH 13/13] contributing update copyright --- CONTRIBUTING.Rmd | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/CONTRIBUTING.Rmd b/CONTRIBUTING.Rmd index 9b74420a2..ad1892135 100644 --- a/CONTRIBUTING.Rmd +++ b/CONTRIBUTING.Rmd @@ -16,18 +16,19 @@ All contributors retain the original copyright to their stuff, but by contributi **To track copyright, please use the following:** -**New code file:** At the top of the file, please ensure copyright is attributed to collaborator and assign the Apache 2.0 license +**New code file:** At the top of the file, please ensure copyright and year is attributed to collaborator and assign the Apache 2.0 license -**Major addition to code file:** “Copyright Province of British Columbia” and Apache 2.0 remains as a header and either the second collaborator is added if there are changes throughout the code or copyright is listed for specific lines of code. So it could read: +**Major addition to code file:** “Copyright YYYY Province of British Columbia” and Apache 2.0 remains as a header and either the second collaborator is added if there are changes throughout the code or copyright is listed for specific lines of code. So it could read: -Copyright Province of British Columbia and -Copyright Collaborator +Copyright 2020-2022 Province of British Columbia +Copyright 2024 Collaborator Apache 2.0 License Or -Copyright Province of British Columbia and -Lines 200-500 Copyright Collaborator +Copyright 2020-2022 Province of British Columbia +Copyright 2024 Collaborator Lines 200-500 Apache 2.0 License -**Minor changes:** If there are small changes to the code throughout the file then it may be easiest to keep these files as Copyright Province of British Columbia. However, the contribution will be tracked through GitHub. +**Minor changes:** If there are small changes to the code throughout the file then it may be easiest to keep these files as Copyright YYYY Province of British Columbia. +However, the contribution will be tracked through GitHub.