Contains models and tools to produce short-term forecasts for both case and sequence notifications assuming circulation of either one or two variants. Tools are also provided to allow the evaluation of the use of sequence data for short-term forecasts in both real-world settings and in user generated scenarios.
Install the stable development version of the package with:
install.packages("forecast.vocs", repos = "https://epiforecasts.r-universe.dev")
Install the unstable development from GitHub using the following,
remotes::install_github("epiforecasts/forecast.vocs", dependencies = TRUE)
If you don’t already have CmdStan installed then, in addition to
installing forecast.vocs
, it is also necessary to install CmdStan
using CmdStanR’s install_cmdstan()
function to enable model fitting in
forecast.vocs
. A suitable C++ toolchain is also required. Instructions
are provided in the Getting started with
CmdStanR
vignette. See the CmdStanR
documentation for further details and
support.
cmdstanr::install_cmdstan()
This quick start uses data from Germany that includes COVID-19 notifications and sequences with sequences either being positive or negative for the Delta variant. It shows how to produce forecasts for both a one and two strain model for the 19th of June 2021 when the latest available data estimated that approximately 7% of COVID-19 were positive for the Delta variant. Note that estimated growth rates and reproduction numbers shown here have been rescaled using an assumed generation time of 5.5 days and a weakly informative prior centred around Delta being 50% more transmissible than non-Delta cases has been used.
library(forecast.vocs)
options(mc.cores = 4)
obs <- filter_by_availability(
germany_covid19_delta_obs,
date = as.Date("2021-06-19")
)
obs
#> date location_name location cases cases_available seq_total seq_voc
#> 1: 2021-03-20 Germany DE 87328 2021-03-20 NA NA
#> 2: 2021-03-27 Germany DE 109442 2021-03-27 NA NA
#> 3: 2021-04-03 Germany DE 117965 2021-04-03 NA NA
#> 4: 2021-04-10 Germany DE 107223 2021-04-10 NA NA
#> 5: 2021-04-17 Germany DE 142664 2021-04-17 4072 6
#> 6: 2021-04-24 Germany DE 145568 2021-04-24 4500 32
#> 7: 2021-05-01 Germany DE 131887 2021-05-01 3607 55
#> 8: 2021-05-08 Germany DE 107141 2021-05-08 4491 86
#> 9: 2021-05-15 Germany DE 77261 2021-05-15 3443 92
#> 10: 2021-05-22 Germany DE 57310 2021-05-22 3392 114
#> 11: 2021-05-29 Germany DE 33052 2021-05-29 2258 89
#> 12: 2021-06-05 Germany DE 22631 2021-06-05 1136 75
#> 13: 2021-06-12 Germany DE 15553 2021-06-12 NA NA
#> 14: 2021-06-19 Germany DE 7659 2021-06-19 NA NA
#> share_voc seq_available
#> 1: NA <NA>
#> 2: NA <NA>
#> 3: NA <NA>
#> 4: NA <NA>
#> 5: 0.001473477 2021-06-16
#> 6: 0.007111111 2021-06-16
#> 7: 0.015248129 2021-06-16
#> 8: 0.019149410 2021-06-16
#> 9: 0.026720883 2021-06-16
#> 10: 0.033608491 2021-06-16
#> 11: 0.039415412 2021-06-16
#> 12: 0.066021127 2021-06-16
#> 13: NA 2021-07-29
#> 14: NA 2021-07-29
current_obs <- latest_obs(germany_covid19_delta_obs)
Run a forecast for both one and two strain models (or optionally just
one of these) using the forecast()
function. This provides a wrapper
around other package tooling to initialise, fit, and summarise
forecasts. Multiple forecasts can be performed efficiently across dates
and scenarios using forecast_across_dates()
and
forecast_across_scenarios()
.
forecasts <- forecast(obs,
strains = c(1, 2), voc_scale = c(0.4, 0.2),
voc_label = "Delta", scale_r = 5.5 / 7,
adapt_delta = 0.99, max_treedepth = 15,
refresh = 0, show_messages = FALSE
)
forecasts
#> id forecast_date strains overdispersion variant_relationship r_init
#> 1: 0 2021-06-19 1 TRUE correlated 0, 0.25
#> 2: 0 2021-06-19 2 TRUE correlated 0, 0.25
#> voc_scale error fit data fit_args samples max_rhat
#> 1: 0.4, 0.2 <CmdStanMCMC[42]> <list[28]> <list[5]> 4000 1
#> 2: 0.4, 0.2 <CmdStanMCMC[42]> <list[28]> <list[5]> 4000 1
#> divergent_transitions per_divergent_transitions max_treedepth
#> 1: 7 0.00175 9
#> 2: 9 0.00225 10
#> no_at_max_treedepth per_at_max_treedepth time posterior
#> 1: 3693 0.92325 10.6 <fv_posterior[167x20]>
#> 2: 500 0.12500 38.6 <fv_posterior[474x20]>
#> forecast
#> 1: <fv_posterior[12x13]>
#> 2: <fv_posterior[60x13]>
Summarised posterior and forecast estimates can be directly extracted
from the output using the included summary()
method. Alternatively,
the complete summarised posterior can be extracted using the following
(which also has its own summary()
and plot()
options to make
exploring the results easier). See the documentation for further
details.
summary(forecasts, target = "posterior", type = "all")
Plot the posterior prediction for cases from the single strain model (“Overall”), from the two strain model (“Combined”), and the unobserved estimates for each strain.
plot(forecasts, current_obs)
Plot the posterior prediction for the fraction of cases that have the Delta variant from the two strain model.
plot(forecasts, current_obs, type = "voc_frac", voc_label = "Delta variant")
Plot the posterior estimate for the growth rate over the mean of the generation time for COVID-19 cases (here assumed to be 5.5 days).
plot(forecasts, type = "growth")
Plot the posterior estimate for the effective reproduction number of Delta and non-Delta cases.
plot(forecasts, type = "rt")
Plot the estimated transmission advantage for Delta vs non-Delta over time. Note that as here a model with a correlated variant growth rates has been used meaning that the difference between variants may vary over time (potentially suitable where imports or present or a variant has greater immune escape than the baseline). If it is likely that the growth advantage is constant over time consider the fixed scaling model. If the variants are likely to be circulating in independent populations consider the independent variant model. See the documentation and model definition vignette for details.
plot(forecasts, type = "voc_advantage")
Alternatively a list of all plots with sensible defaults can be produced using the following.
plot(forecasts, current_obs, type = "all", voc_label = "Delta variant")
A potentially useful optional integration with
scoringutils
package is also
provided to streamline scoring forecasts using proper scoring rules. See
the documentation of the scoringutils
package for more on this but as
a simple example below we calculate average scores both the single and
two strain models for this single Forecast
library(scoringutils)
library(knitr)
summary(forecasts, target = "forecast", type = "cases") |>
fv_score_forecast(current_obs) |>
summarise_scores(by = "strains") |>
kable()
strains | obs_overdispersion | obs_overdispersion | interval_score | dispersion | underprediction | overprediction | coverage_deviation | bias | ae_median |
---|---|---|---|---|---|---|---|---|---|
1 | 1 | 1 | 2056.375 | 277.750 | 1779.750 | 0 | -0.375 | NA | NA |
2 | 1 | 1 | 1118.750 | 507.125 | 610.875 | 0 | 0.000 | NA | NA |
Rather than using the all-in-one forecast()
function individual
package functions can be used to produce a forecast as follows.
dt <- fv_as_data_list(obs, horizon = 4)
model <- fv_model(strains = 2)
inits <- fv_inits(dt, strains = 2)
fit <- fv_sample(
data = dt, model = model, init = inits,
voc_scale = c(0.4, 0.2),
adapt_delta = 0.99, max_treedepth = 15,
refresh = 0, show_messages = FALSE
)
posterior <- fv_tidy_posterior(fit, voc_label = "Delta", scale_r = 5.5 / 7)
As for forecasts produced with the forecast()
function summary plots
and estimates can then be produced using summary()
, and plot()
methods. For example case forecasts can be returned using the following,
summary(posterior, type = "cases", forecast = TRUE)
If using forecast.vocs
in your work please consider citing it using
the following,
#>
#> To cite forecast.vocs in publications use:
#>
#> Sam Abbott (2021). forecast.vocs: Forecast case and sequence
#> notifications using variant of concern strain dynamics, DOI:
#> 10.5281/zenodo.5559016
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Article{,
#> title = {forecast.vocs: Forecast case and sequence notifications using variant of concern strain dynamics},
#> author = {Sam Abbott},
#> journal = {Zenodo},
#> year = {2021},
#> doi = {10.5281/zenodo.5559016},
#> }
Please briefly describe your problem and what output you expect in an issue. If you have a question, please don’t open an issue. Instead, ask on our Q and A page.
We welcome contributions and new contributors! We particularly appreciate help on priority problems in the issues. Please check and add to the issues, and/or add a pull request.
Please note that the forecast.vocs
project is released with a
Contributor Code of
Conduct. By
contributing to this project, you agree to abide by its terms.