This is the poster I presented at the Graduate School Poster Day 2016 at the Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, UK.
Here I present the work I have done on Ebola phylodynamics, which has been partially published in Dudas et al. (2016) and Diehl et al. (2016).
Click to see the poster.
One of the questions Dudas et al. (2016) posed was which climatic and socioeconomic factors were associated with outbreak size, i.e., the total number of cases, in a given location. To answer that, we employed a negative binomial GLM to model outbreak sizes as the dependent variable, using several attributes measured at the location level as covariates. To achieve a parsimonious model specification, we employed stochastic search variable selection (SSVS), much as suggested by George & McCulloch (1993) and later explored by Lemey et al. (2009). SSVS is convenient because it allows us to calculate Bayes factors (BF) analytically for each covariate/predictor and hence determine its "significance".
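As a rough illustration of how a Bayes factor can be read off an SSVS run, the sketch below compares the posterior odds of including a predictor with its prior odds, in the spirit of Lemey et al. (2009). The function name, its arguments and the 0/1 indicator samples are hypothetical, and this is not necessarily the exact computation used in the paper.

```python
def inclusion_bayes_factor(indicator_samples, prior_inclusion_prob):
    """Bayes factor for including a predictor, from sampled 0/1 indicators.

    indicator_samples: posterior draws of the inclusion indicator (hypothetical input).
    prior_inclusion_prob: prior probability that the predictor is included (assumed known).
    """
    n = len(indicator_samples)
    if n == 0:
        raise ValueError("need at least one posterior sample")
    if not 0.0 < prior_inclusion_prob < 1.0:
        raise ValueError("prior inclusion probability must lie strictly between 0 and 1")

    # Posterior inclusion probability: fraction of draws with the predictor switched on.
    posterior_prob = sum(indicator_samples) / n
    if posterior_prob == 1.0:
        # The predictor was included in every draw; the finite sample gives no upper bound.
        return float("inf")

    posterior_odds = posterior_prob / (1.0 - posterior_prob)
    prior_odds = prior_inclusion_prob / (1.0 - prior_inclusion_prob)
    return posterior_odds / prior_odds


if __name__ == "__main__":
    # Made-up indicator draws, for illustration only.
    draws = [1, 1, 0, 1]
    print(inclusion_bayes_factor(draws, prior_inclusion_prob=0.5))
```

Under this definition, a predictor is reported when its posterior odds of inclusion are more than three times its prior odds.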
Predictors of EVD outbreak size in West Africa.
Here I report predictors (covariates) with Bayes factor support greater than 3.
Notice that the estimated coefficients for temperature seasonality (TempSS) and travel times (tt50K and tt100K) are negative, suggesting that the more seasonal the climate and the farther a region is from urban centres, the fewer cases it tended to have. Taken together, these results suggest an important role of urbanicity in the epidemic potential of EVD.
Diehl et al. (2016) presented experimental evidence that a mutation from alanine to valine at position 82 of the glycoprotein conferred increased infectivity specific to human cells. The question was then whether the mutation also had an impact on disease severity.
To address this, we collected data for 236 patients regarding
- which genotype, A or V, the sampled virus had;
- the Ct value obtained during sequencing, which is inversely correlated with viral load;
- the outcome, i.e., whether the patient died or survived.
With this information in hand, we then fitted a binomial generalised linear model using outcome as the response variable and genotype and Ct as covariates. Details of data transformation, etc., can be found in the STAR methods section of Diehl et al. (2016).
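To make the model structure concrete, here is a minimal sketch of how a predicted fatality probability follows from a fitted binomial GLM, assuming the canonical logit link (the link function is not stated here). The function name and the coefficient values are placeholders invented for illustration; they are not estimates from Diehl et al. (2016).

```python
import math


def predicted_fatality(ct_transformed, genotype_is_v, intercept, beta_ct, beta_v):
    """Predicted probability of death under a logit-link binomial GLM.

    All coefficient arguments are hypothetical; a real analysis would take them
    from the fitted model.
    """
    # Linear predictor: intercept + effect of (transformed) Ct + effect of genotype V.
    eta = intercept + beta_ct * ct_transformed + (beta_v if genotype_is_v else 0.0)

    # Inverse logit, written so that math.exp never overflows.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    exp_eta = math.exp(eta)
    return exp_eta / (1.0 + exp_eta)


if __name__ == "__main__":
    # Placeholder coefficients, NOT values from the paper.
    coefs = dict(intercept=-1.0, beta_ct=0.3, beta_v=0.2)
    for ct in (0.0, 2.0, 4.0):
        p_a = predicted_fatality(ct, genotype_is_v=False, **coefs)
        p_v = predicted_fatality(ct, genotype_is_v=True, **coefs)
        print(f"transformed Ct={ct:.1f}  A: {p_a:.3f}  V: {p_v:.3f}")
```

Evaluating this function over a grid of transformed Ct values, once per genotype, gives one prediction curve per genotype.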
Predicted fatality rates per genotype.
Prediction curves from the fitted binomial GLM. We transformed Ct to reflect viral load. Notice the considerable uncertainty in the predictions, evidenced by the overlap between the 95% confidence bands for each genotype.
Here is a phylogeny onto which I mapped viral load and outcome. For viral load I used a Brownian motion continuous diffusion model (Lemey et al. 2010), and I modelled outcome as a discrete trait using a continuous-time Markov chain (CTMC) (Lemey et al. 2009).
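For a binary trait such as outcome, the CTMC has a simple closed form, sketched below. The function name, the state coding (0 and 1) and the rate values are assumptions made for illustration; the actual analysis was run in BEAST and may parameterise the chain differently.

```python
import math


def two_state_transition_probs(rate_01, rate_10, t):
    """Transition probability matrix P(t) of a two-state CTMC.

    rate_01: instantaneous rate of change from state 0 to state 1.
    rate_10: instantaneous rate of change from state 1 to state 0.
    t: elapsed time (e.g. a branch length), in units matching the rates.
    """
    total = rate_01 + rate_10
    if rate_01 < 0 or rate_10 < 0 or total <= 0:
        raise ValueError("rates must be non-negative and not both zero")
    if t < 0:
        raise ValueError("time must be non-negative")

    # Stationary probabilities of each state.
    pi0 = rate_10 / total
    pi1 = rate_01 / total

    # Probability of ending in the other state decays towards the stationary value.
    decay = math.exp(-total * t)
    p01 = pi1 * (1.0 - decay)
    p10 = pi0 * (1.0 - decay)
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


if __name__ == "__main__":
    # Illustrative rates only, not estimates from the analysis.
    for row in two_state_transition_probs(rate_01=0.5, rate_10=1.5, t=0.8):
        print([round(p, 4) for p in row])
```

Each row gives the probabilities of a lineage ending a branch in either state, given the state at its start.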
Annotated phylogeny of 236 EBOV sequences from Guinea
Maximum clade credibility (MCC) tree obtained from a posterior distribution approximated with BEAST. Clearly, branches with lower viral load (transformed Ct) have a higher probability of leading to a surviving tip.
More details and a cool animation showing the spread of GP-A82V in West Africa can be found in the public repository for the paper.