Genome-Wide Evolutionary Analyses of G1P[8] Strains Isolated Before and After Rotavirus Vaccine Introduction

Rotaviruses are the most important etiological agent of acute gastroenteritis in young children worldwide. Among the first countries to introduce rotavirus vaccines into their national immunization programs were Belgium (November 2006) and Australia (July 2007). Surveillance programs in Belgium (since 1999) and Australia (since 1989) offer the opportunity to perform a detailed comparison of rotavirus strains circulating pre- and postvaccine introduction. G1P[8] rotaviruses are the most prominent genotype in humans, and a total of 157 G1P[8] rotaviruses isolated between 1999 and 2011 were selected from Belgium and Australia and their complete genomes were sequenced. Phylogenetic analysis showed evidence of frequent reassortment among Belgian and Australian G1P[8] rotaviruses. Although many different phylogenetic subclusters were present before and after vaccine introduction, some unique clusters were only identified after vaccine introduction, which could be due to natural fluctuation or the first signs of vaccine-driven evolution. The times to the most recent common ancestors for the Belgian and Australian G1P[8] rotaviruses ranged from 1846 to 1955 depending on the gene segment, with VP7 and NSP4 resulting in the most recent estimates. We found no evidence that rotavirus population size was affected after vaccine introduction and only six amino acid sites in VP2, VP3, VP7, and NSP1 were identified to be under positive selective pressure. Continued surveillance of G1P[8] strains is needed to determine long-term effects of vaccine introductions, particularly now rotavirus vaccines are implemented in the national immunization programs of an increasing number of countries worldwide.


Introduction
Rotavirus is the most common viral cause of acute gastroenteritis in infants and young children worldwide. Rotavirus infection results in 114 million episodes of gastroenteritis, 24 million clinic visits, and 2.4 million hospitalizations annually (Glass et al. 2006). Of the estimated 453,000 annual deaths, the majority occur in developing nations in Asia and sub-Saharan Africa (Tate et al. 2012). Rotavirus belongs to the Reoviridae virus family and has a double-stranded RNA genome composed of 11 gene segments. The genome encodes six structural viral proteins (VP1À4, VP6, VP7) and six nonstructural proteins (NSP1À5/6) (Estes and Greenberg 2013). Numerous mechanisms impact the dynamics of rotavirus diversity including genetic shift, genetic drift, recombination, and zoonotic transmission. The accumulation of spontaneous sequential point mutations (genetic drift) occurs due to the error-prone nature of the rotavirus RNA-dependent RNA polymerase (Estes and Greenberg 2013). The rate of mutations has been calculated for several VP7 genotypes, and a small number of other genes including VP4 and NSP2, resulting in the identification of varying mutation rates that may reflect the different selective pressures exerted on different genes and genotypes (Jenkins et al. 2002;Matthijnssens, Heylen, et al. 2010;Donker and Kirkwood 2012;Nagaoka et al. 2012;Trang et al. 2012).
Two live-oral vaccines are currently available on the global market, Rotarix (GlaxoSmithKline Vaccines, Belgium) and RotaTeq (Merck and Co., USA), and included in the routine vaccination programs of many countries including the United States, Brazil, Belgium, and Australia (Dennehy 2012). RotaTeq is a live-attenuated pentavalent vaccine that contains five genetically distinct human-bovine reassortant virus strains. Each reassortant strain contains a human rotavirus gene encoding one of the outer capsid proteins (VP7 encoding G1, G2, G3, or G4; or VP4 encoding P[8]) within a bovine WC3 strain backbone (G6P[5]) (Heaton et al. 2005;Matthijnssens, Joelsson, et al. 2010). RotaTeq is administered in a three dose schedule at 2, 4, and 6 months of age. Rotarix is a live-attenuated monovalent vaccine composed of a G1P [8] strain that is administered in a two dose schedule at 2 and 4 months of age (Vesikari et al. 2007).
In early 2006, Rotarix and RotaTeq became commercially available in Australia and were subsequently introduced into the Australian National Immunisation Program in July 2007. Each state and territory independently selected which vaccine to implement; Victoria, Queensland, Western Australia, and South Australia used RotaTeq, while New South Wales, the Northern Territory, Tasmania, and the Australian Capital Territory used Rotarix (Buttery et al. 2011). By December 2008, the estimated national vaccine coverage was 87% for at least one dose of vaccine received by 4 months of age, and 84% for a full vaccine course (either two or three doses) by 13 months of age (Macartney and Burgess 2009). Belgium was the first country in the European Union to introduce rotavirus vaccines into the routine childhood immunization schedule with Rotarix and RotaTeq introduced in November 2006 and June 2007, respectively (Braeckman et al. 2011). Vaccine coverage reached approximately 90% by the start of the 2007-2008 rotavirus season and the most frequently used vaccine in Belgium is Rotarix (Zeller et al. 2010;Braeckman et al. 2011). Several studies have shown that vaccine introduction has significantly decreased the burden of rotavirus disease in Australia and Belgium (Lambert et al. 2009;Zeller et al. 2010;Braeckman et al. 2011;Buttery et al. 2011). It is predicted that vaccine introduction will result in an increase in selective pressure exerted on wild-type rotavirus strains in the population, affecting the evolution of these strains.
The aim of this study was to genetically characterize circulating G1P[8] strains in Belgium (Rotarix) and Melbourne, Australia (RotaTeq), and to investigate viral evolution in the pre-and postvaccine era.

Sample Preparation
Stool samples were collected from children hospitalized under the age of five as part of the ongoing rotavirus surveillance studies in Belgium and Australia. Belgian samples were collected from the Gasthuisberg University Hospital, Leuven, and from other hospitals in Belgium between 1999 andG1P[8] rotaviruses were selected based on the phylogenetic diversity of VP7 and availability of stool sample. Australian stool samples were collected from the Royal Children's Hospital, Melbourne, Australia, between 2001 and 2011 and were randomly selected based on the availability of stool samples and the proportion of G1P[8] samples collected in a given year. No Group A rotavirus strains were included that were completely or partially vaccine derived.
For each stool sample, RNA extractions were performed using the QIAamp viral RNA mini kit (Qiagen, Hilden, Germany) and these were sent to the J. Craig Venter Institute (Rockville, MD) for sequencing as previously described (McDonald et al. 2009(McDonald et al. , 2012. The sequences generated in this study were deposited in GenBank under the accession numbers listed in supplementary table S1, Supplementary Material online.

Bayesian Evolutionary Inference and Phylogenetic Network Analysis
We estimated time-measured evolutionary histories for each rotavirus segment using Bayesian phylogenetic inference as implemented in BEAST (Drummond et al. 2012). The nucleotide substitution process was modeled by separately partitioning the codon positions into 1st + 2nd and 3rd positions (Shapiro et al. 2006) and applying a separate general timereversible substitution model with gamma-distributed rate heterogeneity and a proportion of invariant sites (GTR + I + gamma) (Tavaré 1986), under an uncorrelated lognormal relaxed molecular clock to account for variation in rates of evolution among lineages ). We specified a Bayesian Skygrid coalescent tree prior that allows the population size to be estimated through time from a single or multiple unlinked genetic loci (Gill et al. 2013).
Three independent Markov chain Monte Carlo chains were run for 100 million steps and sampled every 10,000th generation, with 10% of the generations discarded as chain burn-in. All analyses were performed using the BEAGLE library to enhance computation speed (Suchard and Rambaut 2009;Ayres et al. 2012). Convergence and mixing of the chains were inspected using Tracer version 1.5; all continuous parameters yielded effective sample sizes greater than 200. A maximum clade credibility tree was summarized using TreeAnnotator version 1.7.4 (Drummond and Rambaut 2007). In addition, a phylogenetic network was constructed of all 11 concatenated ORFs using the SplitsTree4 program (Huson and Bryant 2006). Based on the phylogenetic network, clusters were defined and associations among clusters, country of isolation, and period of sample collection (before or after vaccine introduction) were determined using Fisher's exact test.

Selection Pressure Analysis
To estimate site-specific nonsynonymous/synonymous substitution rate ratios (d N /d S ), we used the Renaissance counting methodology that combines estimates of the number of synonymous and nonsynonymous substitutions at each site with an empirical Bayes procedure to produce a posterior distribution of d N /d S ratios for all sites in the alignment (Lemey et al. 2012). In addition, d N /d S ratios were also computed using the Mixed Effects Model of Evolution (MEME), Single likelihood ancestor counting (SLAC), Fixed Effects Likelihood (FEL), and Fast Unconstrained Bayesian AppRoximation (FUBAR) algorithms as implemented in the Datamonkey webserver (Pond et al. 2005;Delport et al. 2010). As outcomes can differ considerably depending on the chosen method, we only considered sites that were under positive selection pressure according to three or more of the five abovementioned methods.

Results
The complete ORF of all 11 gene segments of 69 and 88 G1P[8] rotaviruses from Belgium and Australia, respectively, isolated before and after vaccine introduction were determined. In total, 78 G1P[8] strains were detected before vaccine introduction and 79 strains were detected after vaccine introduction. The majority of the Australian samples were collected after 2004, whereas Belgian samples were collected from 1999 until 2010 although no samples were collected in 2004 ( fig. 1).
The level of reassortment in the ancestral history of the Belgian and Australian G1P[8] rotaviruses was investigated by constructing a phylogenetic network from concatenated sequences of the 11 genes ( fig. 2A). The phylogenetic network showed that the G1P[8] strains were distributed among three main clusters, each containing multiple subclusters. The presence of a large number of incompatible splits in the network indicates the occurrence of frequent reassortment events between strains belonging to the same or different clusters. Belgian and Australian G1P[8] strains were found in each of the three clusters, although only six Australian strains were observed in cluster 2. Within every cluster, separate subclusters could be identified containing only strains from either Belgium or Australia. However, several subclusters (shaded green in fig. 2B), predominately in cluster 2 and 3, contained both Belgian and Australian strains reflecting rapid global dissemination of G1P[8] rotaviruses. No overall association was observed between the country of isolation and the period of sampling (pre-or postvaccine introduction) (table 1). Focusing on specific clusters, the country of isolation and the period of sampling were not significantly related to each other for cluster 2 and 3 (P = 0.37 and P = 0.33, respectively). For cluster 1, however, less strains were isolated in Belgium after rotavirus vaccine introduction (P = 0.01).
Each of the three main clusters and the majority of the subclusters contained closely related G1P[8] strains from both Belgium and Australia collected before and after rotavirus vaccine introduction ( fig. 2B). Subclusters were identified that contain unique strains detected before (shaded green) or after vaccine introduction (shaded red), although in most cases these clusters also coincided with a single sampling location and period ( fig. 2B). The majority of subclusters containing only strains identified after vaccine introduction were observed in cluster 3, and generally represented unique strains in either Belgium or Australia.
To further investigate the evolutionary dynamics for the individual gene segments, we constructed phylogenetic trees for all 11 gene segments using the Bayesian phylogenetic inference of time-measures trees as implemented in the BEAST package. The VP7 and VP4 genes were segregated in two and three lineages, respectively ( fig. 3). For VP7, both lineages were approximately of equal size and contained Belgian and Australian strains. Belgian and Australian strains were more intermingled in lineage 2, while separate clusters of Belgian or Australian strains were more frequently observed in lineage 1, indicating more localized epidemics in each country ( fig. 3A). For VP4, three lineages were observed, but the majority of the strains were found in lineage 1. Belgian and Australian G1P[8] strains were more intermingled within lineage 1 compared with lineage 2 or 3. Most strains that formed closely related clusters for VP7 were also closely related for VP4, although some of these clusters show evidence of reassortment as there is no complete congruency between VP7 and VP4 lineages. Within lineage 1 and 2 several clades containing Australian G1P[8] strains were detected. One Belgian strain was distinct from all other P[8] VP4 genes characterized in this study. It clustered separately from all other Belgian and Australian G1P[8] strains and was most closely related to OP354-like P[8] VP4 genes (data not shown).
The We also investigated the clustering patterns of Belgian and Australian G1P[8] strains with respect to their time of isolation (before or after vaccine introduction). For VP7 and VP4, G1P[8] strains isolated before and after vaccine introduction were present in the two major lineages ( fig. 3B). However, for VP7, 60% (45 of 75) of all lineage 2 strains were isolated after vaccine introduction, whereas for lineage 1 only 44% (36 of 82) of the strains were isolated after vaccine introduction (P = 0.06) ( fig. 2). For VP4, strains isolated before and after vaccine introduction were evenly distributed across the different lineages (51%; 24 of 47 and 56 of 109 for lineage 1 and 2, respectively; P = 1.00) ( fig. 3B), but strains isolated after vaccine introduction dominated in one particular clade within lineage 1. For the other nine gene segments, strains isolated before and after vaccine introduction were also observed in all major lineages and strains isolated before and after vaccine introduction were approximately evenly distributed across the different lineages. Gene segments from G1P[8] strains isolated before and after vaccine introduction were sometimes also found to be closely related, indicating a prolonged circulation of a gene variant for an extended period of time  1932-1975) and NSP4 (195495% HPD: 193895% HPD: -1971. Other gene segments with relatively young TMRCAs were VP1, VP2, VP3, NSP3, and NSP5. Remarkably older TMRCAs were observed for the VP4, VP6, NSP1, and NSP2 gene segments. For VP4, this was partially the result of a single outlier strain in lineage 3 (the TMRCA of P[8] lineage 1 and P[8] lineage 2 strains was estimated at 1889; 95% HPD: 1832-1942), while for VP6, NSP1, and NSP2 a comparatively large genetic diversity resulted in a relatively old TMRCA ( fig. 4A). A separate TMRCA analysis was conducted for the data sets from each country resulting in similar findings, except for VP4, NSP1, and NSP5. For these gene segments, the TMRCA of Australian strains was more recent than those of Belgian strains, due to a lower genetic diversity in the strains isolated in Australia (data not shown). We also calculated the evolutionary rate for each gene segment, which revealed only marginal differences between the different gene segments (fig. 4B). The evolutionary rates for each gene segment ranged from 6.05 Â 10 À4 to 1.01 Â 10 À3 . The highest mutation rates were observed for NSP1 (1.01 Â 10 À3 ; 95% HPD: 8.44 Â 10 À4 to 1.18 Â 10 À3 ) and NSP4 (1.01 Â 10 À3 ; 95% HPD: 7.39 Â 10 À4 to 1.33 Â 10 À3 ), respectively.
The population size estimates through time for the Belgian and Australian VP7, VP4, VP6, and NSP4 genes showed a relative stable profile with a few peaks and troughs during the sampling period (1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011) (fig. 4C). Comparison with the Skygrid plots of other gene segments also revealed similar patterns. There is one exception, NSP1, were the peaks and troughs were absent and the Skygrid plot is completely flat. However, as the peaks and troughs were found in almost all other gene segments and were observed well before the introduction of rotavirus vaccines, they were more likely the result of sampling biases rather than vaccine introduction (supplementary fig. S3, Supplementary Material online).
Each gene segment was tested for sites under positive selection using a Belgian data set, an Australian data set, and the combined data set. Sites under positive selection were identified in VP2 (aa 40, 42), VP3 (aa 326, 683), VP7 (aa 28), and NSP1 (aa 422) by at least three of the five used models (table 2). No sites under positive selection were identified in the other seven rotavirus gene segments. Nonconservative amino acid variation was identified at the majority of sites under positive selection. However, amino acid variation at sites under positive selection did not differentiate before and after vaccine introduction in G1P[8] strains identified in either Belgian or Australian strains.

Discussion
In this study, we investigated the impact of rotavirus vaccine introduction on the most common human rotavirus genotype G1P[8]. To our knowledge, this is the first large-scale genomic analysis of human rotaviruses collected before and after rotavirus vaccine introduction. Australia and Belgium were among the first countries worldwide where rotavirus vaccines were implemented in national immunization programs. In Belgium, the most commonly used vaccine is Rotarix, while in Australia different states are using different rotavirus vaccines (Zeller et al. 2010;Buttery et al. 2011). The samples in this study V P 4 V P 6 V P 1 V P 2 V P 3 N S P 1 N S P 2 N S P 3 N S P 4 N S P 5 Time (

Genome-Wide Evolutionary Analyses of G1P[8] Strains
were collected from Victoria where RotaTeq is the only vaccine used (Buttery et al. 2011). Vaccination coverage was high in both countries, which provides a unique opportunity to investigate the early effect of vaccine introduction on the genetic diversity of G1P[8] rotaviruses (Braeckman et al. 2014;Hull et al. 2014). Three different clusters could be identified in the phylogenetic network that was constructed using sequences derived from the concatenation of the 11 rotavirus genes ( fig. 2). Of particular interest is cluster 1, which is predominantly composed of Belgian G1P[8] strains from the prevaccine era. In Australia, no difference was found in the prevalence of G1P[8] strains in cluster 1 before and after vaccine introduction, which could possibly be the result of the different vaccines used in the national immunization programs of both countries. On the contrary, Australian G1P[8] strains belonging to cluster 2 were found more frequently after vaccine introduction, although this difference was not statistically significant as only 6.8% of all Australian G1P[8] strains were found in cluster 2. Lineage replacement has been suggested as an important evolutionary mechanism for rotaviruses to adapt to different immunological environments (McDonald et al. 2009;Zeller et al. 2012;Dennis et al. 2014;Zhang et al. 2014;Magagula et al. 2015) and the observed changes in lineage frequency could be an adaptation of the rotavirus population to evade immunological pressures of widespread vaccine use. However, when interpreting these data the different sample selection methods in Belgium and Australia should be kept in mind. In Belgium, samples were selected based on genetic diversity of VP7, whereas in Australia a random sample collection occurred. This may, for example, explain why Australian G1P[8] strains more frequently formed closely related subclusters in many gene segments as compared with Belgian strains.
For every gene segment at least two lineages were present within the Australian and Belgian G1P[8] strains, this is similar to that observed in the few other large-scale genomics studies investigating Wa-like rotaviruses (McDonald et al. 2009(McDonald et al. , 2012Roy et al. 2014;Zhang et al. 2014;da Silva et al. 2015). No data are currently available about rotavirus vaccine effectiveness against intragenotypic lineages, but it seems unlikely that the difference in vaccine effectiveness is larger between different intragenotypic lineages than between different genotypes. Vaccine efficacy against rotavirus diarrhea of any severity for different rotavirus genotypes has been well studied and ranges between 61% and 88% for Rotarix and 88% and 95% for RotaTeq depending on the genotype (Soares-Weiser et al. 2012). In a Belgian case-control study, the vaccine effectiveness of Rotarix against the most common human genotypes ranged from 85% to 95% with only small differences between the vaccine effectiveness against Wa-like genotype strains G1, G3, and G4 (Braeckman et al. 2012;Matthijnssens et al. 2014). The differences in vaccine effectiveness between different lineages of G1P[8] strains are expected to be low. This was also suggested by our findings as only subtle differences were detected in the genetic diversity of the different rotavirus gene segments before and after vaccine introduction. For example, the majority of strains clustering in VP7 lineage 2 were isolated after vaccine introduction, whereas for VP7 lineage 1 most strains were isolated before vaccine introduction. Despite changes in genetic diversity, the rotavirus population size remained stable throughout the sampling period. As only relatively few countries were using rotavirus vaccines between 2006 and 2010, a potential effect on the rotavirus population size in Belgium and Australia could have been diluted by rotavirus strains migrating from other countries without a universal rotavirus vaccination program.
Belgium and Australia are located on the northern and southern hemisphere, respectively, and therefore experience rotavirus seasons that do not overlap each other. Despite this, we found Belgian and Australian G1P[8] rotaviruses in subsequent rotavirus seasons that were very closely related. It is unknown whether this is the result of direct transmission between Belgium and Australia or that both viruses originate from a third country, for example, in the tropics where rotavirus infections occur year round (Levy et al. 2009). However, it is likely that G1P[8] strains detected in Belgian and Australia are part of a global pool of circulating G1P[8] strains. The phylogenetic analysis revealed several lineages that circulated in the pre-and postvaccine era. Several unique genetic clusters, based on whole genome concatenation or specific gene segments that represented unique subclusters/lineages, were only present in the postvaccine introduction samples and warrant further monitoring. More extensive sampling of larger genomic data sets collected over a longer time period in Belgium and Australia as well as samples collected in different locations in the world would help to elucidate global patterns of rotavirus transmission. Surprisingly, we found large variations in the TMRCA for different rotavirus gene segments, which could not be attributed to differences in evolutionary rates among segments ( fig. 4). Especially, for VP4, VP6, NSP1, and NSP2 the TMRCAs were approximately 100 years older than those of many other gene segments and displayed a large genetic diversity ( fig. 4 and  supplementary fig. S1, Supplementary Material online). In contrast, the VP7 and NSP4 gene segment histories have a relatively recent TMRCA. Human rotaviruses are subdivided into Wa-like and DS-1-like genotype constellations. For most gene segments generally only two genotypes (1 or 2) are frequently present in human rotaviruses, while for VP7 six genotypes are regularly found: G1-G4, G9, and G12 (Matthijnssens and Van Ranst 2012). The conserved Wa-like genetic backbone allows for frequent reassortment and circulates in combination with varying G genotypes. Therefore, each G genotype is more limited in circulation compared with genotypes of other genes, which results in a decreased ability of G1 genotypes to accumulate mutations. For this study, only rotaviruses bearing the G1 genotype were selected and a large part of VP7 genetic diversity was thus omitted a priori. The TMRCA for the NSP4 segment was estimated at 1954, which together with a high mutation rate and limited genetic diversity suggest that the NSP4 E1 genotypes may have gone through a genetic bottleneck. The wide array of functions that have been assigned to the NSP4 protein, in particular the interaction with other VPs and its function as an enterotoxin (Hu et al. 2012), probably exercises a strong purifying selection pressure.
Selection pressure analysis of Australian and Belgian G1P[8] rotavirus strains identified several sites under positive selection in VP2, VP3, VP7, and NSP1, several of which were found in regions containing known cytotoxic T lymphocyte epitopes (VP2, VP3, and VP7), or protein domains that interact with innate immune factors (NSP1) (Franco et al. 1993(Franco et al. , 1994Buesa et al. 1999;Newell et al. 2013). The site under positive selection in NSP1 (aa 422) is located in the highly variable C-terminal region, involved in downregulation of the innate immune response by binding interferon-regulatory factors (IRF3, IRF5, and IRF7) and inhibiting NFkB through degradation of b-TrCP (Barro and Patton 2005; Arnold and Patton 2011). These data suggest that sequence variation at sites under positive selection in multiple VPs may be involved in immune escape of circulating G1P[8] strains. However, given the extent of genetic diversity between different G1P[8] lineages, a change in G1P[8] lineage frequency is likely to be a more dominant evolutionary mechanism to respond to selection pressures such as the introduction of rotavirus vaccines.
To conclude, we found limited differences between the Belgian and Australian G1P[8] rotaviruses, even though different vaccines were used in both locations. Despite the fact that rotavirus vaccine introduction has decreased rotavirus hospitalizations in many countries, our data suggest that rotavirus vaccination may impact the evolution of G1P[8] rotaviruses even though one of the vaccines, Rotarix, consists of a G1P[8] rotavirus strain. However, in this study, we only looked at limited rotavirus seasons after vaccine introduction in a period where only a small number of countries had implemented universal rotavirus vaccination programs, and as still little is known about the long-term evolution of rotaviruses, we cannot rule out that our results are limited by this. Therefore, an extended surveillance of G1P[8] rotaviruses would be valuable to ascertain long-term effects of vaccine introduction.