Within-host genetic diversity and large transmission bottlenecks confound phylodynamic inference of epidemiological dynamics. Conventional phylodynamic approaches assume that node times in a time-scaled pathogen phylogeny correspond closely to the times of transmission between hosts that are ancestral to the sample. However, when hosts harbour diverse pathogen populations, node times can substantially pre-date infection times. Imperfect bottlenecks can cause lineages sampled in different individuals to coalesce in unexpected patterns. To address realistic violations of standard phylodynamic assumptions, we developed a new inference approach based on a multi-scale coalescent model, accounting for nonlinear epidemiological dynamics, heterogeneous sampling through time, non-negligible genetic diversity of pathogens within hosts, and imperfect transmission bottlenecks. We apply this method to HIV-1 and Ebola virus outbreak sequence data, illustrating how and when conventional phylodynamic inference may give misleading results. Within-host diversity of HIV-1 causes substantial upwards bias in the estimated number of infected hosts using conventional coalescent models, but estimates using the multi-scale model have greater consistency with the reported number of diagnoses through time. In contrast, we find that within-host diversity of Ebola virus has little influence on estimated numbers of infected hosts or reproduction numbers, and estimates are highly consistent with the reported number of diagnoses through time. The multi-scale coalescent also enables estimation of within-host effective population size using single sequences from a random sample of patients. We find within-host population genetic diversity of HIV-1 p17 to be 2Nµ = 0.012 (95% CI: 0.0066-0.023), which is lower than estimates based on HIV envelope serial sequencing of individual patients.
Phylodynamic inference across epidemic scales
Erik M. Volz