Unbiased whole-genome deep sequencing of human and porcine stool samples reveals circulation of multiple groups of rotaviruses and a putative zoonotic infection

Abstract
Coordinated and synchronous surveillance for zoonotic viruses in both human clinical cases and animal reservoirs provides an opportunity to identify interspecies virus movement. Rotavirus (RV) is an important cause of viral gastroenteritis in humans and animals. In this study, we document the RV diversity within co-located humans and animals sampled from the Mekong delta region of Vietnam using a primer-independent, agnostic, deep sequencing approach. A total of 296 stool samples (146 from diarrhoeal human patients and 150 from pigs living in the same geographical region) were directly sequenced, generating the genomic sequences of sixty human rotaviruses (all group A) and thirty-one porcine rotaviruses (thirteen group A, seven group B, six group C, and five group H). Phylogenetic analyses showed the co-circulation of multiple distinct RV group A (RVA) genotypes/strains, many of which were divergent from the strain components of licensed RVA vaccines, as well as considerable virus diversity in pigs including full genomes of rotaviruses in groups B, C, and H, none of which have been previously reported in Vietnam. Furthermore, the detection of an atypical RVA genotype constellation (G4-P[6]-I1-R1-C1-M1-A8-N1-T7-E1-H1) in a human patient and a pig from the same region provides some evidence for a zoonotic event.


Introduction
Rotavirus (RV) infections are the leading cause of acute gastroenteritis globally, with a disproportionately greater morbidity and mortality in developing countries of Asia and sub-Saharan Africa (Tate et al. 2012, 2016). RV can infect humans and different animal species and is considered, in part, a zoonotic disease in humans (Estes et al. 2007). RV zoonotic infections and transmissions have been shown with animal strains moving into humans via direct contact with animals or exposure to environmental contamination (Cook et al. 2004; Martella et al. 2010; Ghosh and Kobayashi 2014), and present a challenge to infection control and management. RV is a non-enveloped double-stranded RNA virus forming the single genus Rotavirus in the Reoviridae family, with an 18.5-kb genome of eleven segments encoding six structural (VP1-4, VP6, and VP7) and five or six non-structural proteins (NSP1-NSP5/6) (Ramig et al. 2005; Estes et al. 2007). RVs are classified into eight established groups (A-H) and a new tentative group (I) based on the genetic and antigenic differences of VP6, with viruses of group A, B, C, and H known to infect both humans and other animals (Matthijnssens et al. 2012; Mihalov-Kovács et al. 2015).
Determining the sequence of the 18.5-kb segmented genome for rotaviruses by standard methods can be biased and cumbersome, requiring an initial PCR step to identify and select primers specific for the RV genogroup and/or strain, with the 11-segment genome providing an additional complication for primer design. Such primer-based sequencing strategies can be further complicated by reassortment possibilities not predicted by the initial PCR typing, leading to sequencing failure of atypical and un-typeable RV strains whose frequency can vary by location, season, and environment (Cook et al. 1990; Gentsch et al. 2005; Santos and Hoshino 2005; Pitzer et al. 2009, 2011; Atchison et al. 2010). Next-generation sequencing has been recently employed for whole-genome sequencing of RVA with initial 11 PCR amplifications (Jere et al. 2011; Magagula et al. 2015; Nyaga et al. 2015); however, a single robust platform for whole-genome deep sequencing of multiple RV genogroups without prior genotype information would be useful. Routine identification of circulating RVA can be performed using commercial enzyme immunoassay kits (based on inner capsid protein VP6) and RT-PCR diagnostic and genotyping assays (based on outer capsid proteins VP4 and VP7) (Estes et al. 2007; Desselberger 2014). By contrast, specific, rapid, and cost-effective assays are lacking for the detection of less common rotaviruses such as viruses in group B, C, and H (RVB, RVC, RVH), hindering our understanding of the molecular epidemiology of these viruses and challenging efforts of genomic sequencing, particularly in resource-limited countries (Estes et al. 2007; Desselberger 2014).
The RV vaccines (Rotarix and RotaTeq) have been available since 2006 (Ruiz-Palacios et al. 2006; Vesikari et al. 2006), and offer a variable degree of protective immunity against human RVA infections. Reduced RVA vaccine efficacy has been observed in resource-limited countries in comparison to developed countries (Armah et al. 2010; Zaman et al. 2010; Breiman et al. 2012; Glass et al. 2014). The mechanism responsible for reduced vaccine efficacy in these settings is unclear, but may in part be due to local circulation of genetically and antigenically divergent RVA or zoonotic strains in developing countries (Santos and Hoshino 2005). Similar to other segmented viruses (Nomikou et al. 2015), genetic reassortment has been observed in RVs yielding significant genetic diversity, including a number of cross-species reassortants (Cook et al. 2004; Estes et al. 2007; Desselberger 2014). Hence, assessment of all eleven genome segments through full virus genome sequencing is essential for monitoring the overall RV genomic diversity, complex evolutionary dynamics, and the emergence of novel and zoonotic reassortants that may compromise vaccine protection (Matthijnssens et al. 2008, 2011; Matthijnssens and Van Ranst 2012).
Vietnam is a low-to-middle income country located in Southeast Asia and is considered one of the global hot spots of emerging infectious diseases (Hay et al. 2013). Diarrhoea is the fourth most common cause of mortality in children <5 years of age, accounting for 12% of deaths in this age group in 2013 (World Health Organization 2015). Among all diarrhoeal pathogens, RVA is responsible for 44-67.4 percent of all childhood diarrhoea cases requiring hospitalisation (Nishio et al. 2000; Nguyen et al. 2001, 2004, 2007; Doan et al. 2003; Landaeta et al. 2003; Van Man et al. 2005; Anh et al. 2006; Ngo et al. 2009; Tamura et al. 2010; Tra My et al. 2011; Trang et al. 2012; Vu Tra My et al. 2014; Thompson et al. 2015). Despite this clinical and public health importance, vaccination against RVA is currently not part of the Extended Program on Immunisation for Vietnamese infants. In addition, diagnosis for rotaviruses in diarrhoeal cases in humans and animals is not routinely performed and systematic genomic surveillance of circulating human and animal rotaviruses is limited. As a result, there are relatively few data on the overall RV prevalence and diversity in human and animal populations, on the contribution of animal rotaviruses to human infections, and on their potential to compromise RVA vaccine protection. Given the tropical climate of Vietnam prone to flooding, the close living proximity of humans and animals, and the high prevalence of infectious diseases, we hypothesized that RV zoonosis may occur in the region but is under-investigated and under-characterized. There is no report on the overall prevalence of other RVs (non-RVA) in both humans and animals in this region. To address this knowledge gap, we used focused sampling within human healthcare and animal farming populations, combined with high-throughput primer-independent direct genome sequencing from clinical materials (Cotten et al. 2014) to document RV diversity and transmission within and between humans and animals in a region of Vietnam.

Study setting and design
Human and porcine faecal samples were collected as part of the Vietnam Initiative on Zoonotic Infections (VIZIONS) project from Dong Thap, a peri-urban province located in the south of Vietnam in the Mekong Delta region (see map, Supplementary Fig. S1). The human subjects were diarrhoeal patients (N = 146), regardless of age and gender, admitted to Dong Thap Provincial Hospital in the period from October 2012 to January 2014. Diarrhoea was defined as "at least three loose stools or one bloody stool within 24 hours", according to the WHO guidelines (World Health Organization 2005). A stool specimen was collected from each individual within 24 h of hospital admission to avoid confounding by nosocomial infections. A total of 150 porcine faecal samples were randomly selected from pig farm baseline surveillance samples collected across the same province from January 2012 to April 2013. For four pigs in farms where no faecal specimens were obtained, a boot swab was collected (pig ID 12087_38, 14152_6, 14150_53, and 14250_12). The collection dates and ages of human enrollees and pigs in this study are provided in Supplementary

Mapping of the patient residential and pig farm addresses
The residential district centroid was recorded for enrolled human patients to maintain participant anonymity, while the exact geographical location was recorded for the pig farms using an eTrex Legend GPS device (Garmin, UK). The decimal degrees of latitude and longitude were entered in a confidential database and kept separate from patient metadata so that patient identities could not be revealed based on the residence locations. These addresses were then validated in Google Earth Pro (https://www.google.com/earth/) and finally visualized in QGIS v2.2.0 (http://www.qgis.org/en/site/) overlaid with province-specific geographic data.

Sample preparation and nucleic acid extraction
Total nucleic acid extraction was performed as previously described (de Vries et al. 2011, 2012; Cotten et al. 2014). In brief, 110 µl of a 50 percent stool suspension in PBS was centrifuged for 10 min at 10,000 × g. Non-encapsidated DNA in the samples was degraded by addition of 20 U TURBO DNase (Ambion). Virion-protected nucleic acid was subsequently extracted using the Boom method (Boom et al. 1990). Reverse transcription was performed using non-ribosomal random hexamers (Endoh et al. 2005) that avoid transcription of rRNA, and second strand DNA synthesis was performed using 5 U of Klenow fragment 3′-5′ exo⁻ (New England Biolabs). Final purification of extracted nucleic acids was performed with phenol/chloroform and ethanol precipitation.

Library preparation and sequencing
Standard Illumina libraries were prepared for each sample. In short, nucleic acids in each sample were sheared to 400-500 nt in length, each sample's nucleic acid was separately indexed and samples were multiplexed at either seven samples per MiSeq run or ninety-six samples per HiSeq 2500 run, generating 2-3 million 149 nt (MiSeq) or 250 nt (HiSeq) paired-end reads per sample.

De novo assembly and identification of viral genomes
Raw sequencing reads were filtered to remove low-quality reads (retaining reads with Phred score >35) and trimmed to remove residual sequencing adapters using QUASR (Watson et al. 2013). The reads were assembled into contigs using de novo assembly with SPAdes (Bankevich et al. 2012) combined with sSpace (Boetzer et al. 2011). RV-encoding contigs and other mammalian virus contigs were identified with a modified SLIM algorithm (Cotten et al. 2014) combined with ublast (Edgar 2010). Coverage was determined for all contigs harvested to filter any process contamination sequences in each run, followed by additional filtering for minimum contig size cutoff (300 nt). Partial but overlapping contigs were joined into full-length sequences using Sequencher (Gene Codes Corporation, USA), and any ambiguities were resolved by consulting the original short reads. Final quality control of genomes included a comparison of the sequences, open reading frames (ORFs), and the encoded proteins with reference sequences retrieved from GenBank.
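The two contig-level filters described above (coverage and minimum size) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the `Contig` record, the function name, and in particular the coverage threshold are hypothetical, since the text states only the 300 nt size cutoff and gives no numeric coverage value.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Contig:
    """Hypothetical record for one assembled contig."""
    name: str
    sequence: str
    mean_coverage: float  # mean read depth across the contig


MIN_CONTIG_LENGTH = 300   # nt, the size cutoff stated in the text
MIN_MEAN_COVERAGE = 10.0  # assumed value for illustration; not given in the text


def filter_contigs(contigs: List[Contig],
                   min_length: int = MIN_CONTIG_LENGTH,
                   min_coverage: float = MIN_MEAN_COVERAGE) -> List[Contig]:
    """Keep contigs that pass both the length and the coverage threshold."""
    return [
        c for c in contigs
        if len(c.sequence) >= min_length and c.mean_coverage >= min_coverage
    ]
```

In practice the coverage threshold would be chosen per sequencing run, because its purpose is to separate genuine contigs from low-level carry-over between samples multiplexed in the same run.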

Genotyping and phylogenetic reconstruction
Assembled RVA sequences were genotyped using the online genotyping tool, RotaC v2.0 (http://rotac.regatools.be) (Maes et al. 2009), according to the guidelines for precise RVA classification using all eleven genomic segments (Matthijnssens et al. 2011). The resulting RVA, RVB, RVC, and RVH sequences were combined with additional full-length or nearly full-length sequences from previous Vietnamese studies (if available) and global representatives retrieved from GenBank. The complete genomes from the vaccine components of the monovalent vaccine Rotarix (Gautam et al. 2013) and the pentavalent vaccine RotaTeq (Matthijnssens et al. 2010) were retrieved from GenBank for phylogenetic reconstructions of all eleven RVA segments. Sequences were aligned using MUSCLE v3.8.31 (Edgar 2004) and manually checked in AliView (Larsson 2014); aligned sequences were trimmed to complete ORFs for subsequent analyses. Evolutionary model testing was implemented in IQ-TREE v3.10 (Nguyen et al. 2015) using the Akaike Information Criterion (AIC) to determine the best-fit models of nucleotide substitution for all genomic segments. Maximum likelihood (ML) phylogenetic trees were then inferred in IQ-TREE v3.10 with 500 bootstrap replicates under the best-fit model of evolution according to AIC (Supplementary Table 2 summarizes the models determined for all segments). Resulting trees were visualized and edited using FigTree v1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/). Genetic distances (uncorrected p-distances) were estimated using Geneious v9.0.4 (Biomatters Ltd).
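For readers unfamiliar with uncorrected p-distances, the quantity is simply the proportion of aligned sites at which two sequences differ. The sketch below is a minimal Python illustration, not a reproduction of the Geneious calculation: the function names are hypothetical, and the choice to skip any column containing a gap or an ambiguity code is an assumption.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance between two aligned nucleotide sequences.

    Assumption: columns where either sequence has a gap or an ambiguous
    base are excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


def percent_similarity(seq_a: str, seq_b: str) -> float:
    """Percent nucleotide similarity, i.e. 100 * (1 - p-distance)."""
    return 100.0 * (1.0 - p_distance(seq_a, seq_b))
```

Percent nucleotide similarity values of the kind reported for pairwise strain comparisons correspond to `percent_similarity` under these assumptions.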

Bayesian analysis for RVA NSP3 T7 genotype and for RVH VP6 gene sequences
Available sequences were retrieved from GenBank and new sequences obtained in this study were aligned using MUSCLE v3.8.31 (Edgar 2004), manually checked in AliView (Larsson 2014), and trimmed to complete ORF. An ML phylogenetic tree was constructed under the GTR + Γ4 model of substitution in IQ-TREE v3.10 (Nguyen et al. 2015). For RVH VP6 sequences, highly similar sequences were removed before running Bayesian analyses (strains BR59, BR60, BR61, BR62, BR63, NC7_64_3, OK5_68_10). The molecular clock model was assessed in TempEst v1.5 (Rambaut et al. 2016), assessing the linear regression between root-to-tip divergence and the date of sampling. For RVA NSP3 T7 analysis, year of strain collection was used as tip dates since data on day and month were not available for GenBank sequences; for RVH VP6 sequences, tip dates were defined as year, month, day of strain collection. A Bayesian Markov Chain Monte Carlo (MCMC) approach was then performed in BEAST v1.8.0 (Drummond and Rambaut 2007), using a relaxed lognormal molecular clock under the HKY85 + Γ4 substitution model. For RVA NSP3 T7 sequences, a Bayesian SkyGrid population process was employed and run in triplicate for 100 million generations with sampling performed every 10,000 generations. For RVH VP6 sequences, non-parametric Gaussian Markov Random Fields (GMRF) Bayesian Skyride population analyses were run in triplicate for 50 million generations with sampling performed every 5,000 generations. These triplicate runs were then combined using LogCombiner v1.8.0 (available within the BEAST package) with a removal of 10% burn-in, and analysed in Tracer v1.6 (http://tree.bio.ed.ac.uk/software/tracer/) to ensure all parameters had converged with effective sample size (ESS) values >200 and to estimate the mean evolutionary rates across branches. Maximum clade credibility trees were annotated using TreeAnnotator v1.8.0 (BEAST) and visualised in FigTree v1.4.2.
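The root-to-tip check described above amounts to an ordinary least-squares regression of each tip's genetic distance from the root on its sampling date. The following Python sketch illustrates that regression only; it is not TempEst's implementation (TempEst additionally searches for the best-fitting root position, which is omitted here), and the function name and input layout are hypothetical.

```python
from typing import Sequence, Tuple


def root_to_tip_regression(dates: Sequence[float],
                           distances: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of root-to-tip distance against sampling date.

    Returns (slope, intercept, r_squared). The slope is a crude estimate of
    the substitution rate (substitutions/site/year when dates are in years).
    """
    n = len(dates)
    if n != len(distances) or n < 2:
        raise ValueError("need two equally long series with at least 2 tips")
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    if sxx == 0:
        raise ValueError("all sampling dates are identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared
```

A positive slope with a reasonable correlation indicates enough temporal signal to justify a dated Bayesian analysis; the rates reported from BEAST come from the full relaxed-clock model, not from this regression.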

Overall diversity of rotaviruses in humans and pigs
Sequencing of human enteric samples from acute diarrhoeal patients admitted to Dong Thap Provincial Hospital from 2012 to 2014 yielded sixty de novo assembled RVA genome sequences from 146 samples (41.1 percent). No other RV genogroups were found in these human stool samples (Table 1). The same methods applied to 150 porcine faecal samples collected within the same geographic region (Supplementary Fig. S1) identified thirty-one rotaviruses from four different RV groups (A, B, C, and H) (20.7 percent). These de novo assembled sequences included thirteen RVA (41.9 percent), seven RVB (22.6 percent), six RVC (19.4 percent), and five RVH (16.1 percent) (Table 1). The length of each assembled sequence was determined and expressed as percentage length coverage (length of assembled sequence divided by expected full length of that segment) for the corresponding segment (Fig. 1). In samples where two distinct contigs were assembled for a segment (e.g. mixed infections), only the longer assembled contig was reported in the heatmap of segment coverage for the purpose of clarity (Fig. 1). The overall length coverage in human RVA sequences was higher than in porcine RVA, RVB, RVC, and RVH, possibly due to differential viral load or sample quality.
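The percentages in this paragraph follow from two simple ratios, sketched below in Python. The function names are hypothetical, and the segment lengths used in the usage note are illustrative values, not measurements from this study.

```python
def percent_length_coverage(assembled_length: int, expected_length: int) -> float:
    """Assembled length as a percentage of the expected full segment length."""
    if expected_length <= 0:
        raise ValueError("expected_length must be positive")
    return 100.0 * assembled_length / expected_length


def detection_percentage(positives: int, total: int) -> float:
    """Percentage of samples (or sequences) in a category, to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * positives / total, 1)
```

For example, `detection_percentage(60, 146)` gives 41.1 and `detection_percentage(31, 150)` gives 20.7, matching the human and porcine detection rates reported above, while a hypothetical 850 nt contig for a segment expected to be 1,000 nt long would have a length coverage of 85 percent.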
Mixed infections were identified in nine samples, seven in humans and two in pigs (Fig. 2), with mixed infection being defined as the detection of two assembled but genetically distinct contigs in at least one segment with sufficient contig coverage to exclude potential process contamination among samples in the same run. The two homologous contigs identified for a segment in a mixed infection can have different genotypes or the same genotype; for example, a mixed infection reported in an individual pig (sample ID 12070_4) contained two homologous contigs for each of the VP7, NSP1, NSP2, NSP3, NSP4, and NSP5 segments, bearing the constellation G1/G4-P[8]-I1-R1-C1-M1-A1/A8-N1/N1-T1/T1-E1/E1-H1/H1 (Fig. 2 and Supplementary Fig. S2). Another porcine sample was found with a mixture of G9/G11-P[13]/P[23]-I5/I5-R1/R1-C1-M1-A8/A8-N1/N1-T1/T1-E1/E1-H1 (sample ID 14150_53); however, it is important to note that this particular sample was a boot swab of faecal material in a cage-type pigsty, thus there is the possibility that the sample represents mixed environmental virus from more than one pig. Mixed human RVA infections typically contained genotypes 1 and 2 (Wa-like and DS-1 like, respectively) viruses.
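The slash notation used in these constellations can be read mechanically: positions follow the standard RVA order Gx-P[x]-Ix-Rx-Cx-Mx-Ax-Nx-Tx-Ex-Hx (VP7, VP4, VP6, VP1, VP2, VP3, NSP1 to NSP5), and a slash marks a segment for which two distinct contigs were assembled. The short Python sketch below illustrates this reading; the function names are hypothetical and the parser assumes the exact string format shown in the text.

```python
from typing import Dict, List

# Segment order of the RVA genotype constellation Gx-P[x]-Ix-Rx-Cx-Mx-Ax-Nx-Tx-Ex-Hx
SEGMENTS = ("VP7", "VP4", "VP6", "VP1", "VP2", "VP3",
            "NSP1", "NSP2", "NSP3", "NSP4", "NSP5")


def parse_constellation(constellation: str) -> Dict[str, List[str]]:
    """Map each segment to the list of genotype calls made for it."""
    parts = constellation.split("-")
    if len(parts) != len(SEGMENTS):
        raise ValueError("expected eleven hyphen-separated genotype calls")
    return {seg: part.split("/") for seg, part in zip(SEGMENTS, parts)}


def mixed_segments(constellation: str) -> List[str]:
    """Segments with two calls, i.e. two distinct contigs (same or different genotype)."""
    return [seg for seg, calls in parse_constellation(constellation).items()
            if len(calls) > 1]
```

Applied to the constellation of pig 12070_4, `mixed_segments` returns VP7, NSP1, NSP2, NSP3, NSP4, and NSP5, the six segments for which two homologous contigs were described.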

Phylogenetic diversity of local human and porcine RVA
Phylogenetic trees were inferred for each RVA segment (Fig. 3A and B and Supplementary Fig. S3A and B) from assembled sequences in this study along with full-length sequences from previous studies in Vietnam, reference sequences in GenBank and sequences from the RVA vaccine formulations (Rotarix and RotaTeq). The local sequences clustered primarily by genotype as expected (Fig. 3A and B and Supplementary Fig. S3A and B); for example, VP7 G1 sequences in this study clustered with G1 sequences from other regions and our G2 sequences clustered with other G2 sequences (Fig. 3A). Sequences within the G4 genotype fell into two sub-lineages, with the human strain (16020_7) and porcine strains from this study clustering into one common sub-lineage (Fig. 3A). The mixed infection in the pig (12070_4) described above comprised two distinct contigs for the VP7 segment (belonging to the G1 and G4 genotypes), with the G1 sequence clustering within the human G1 lineage and the G4 sequence falling into a lineage with other G4 porcine sequences from this study (Fig. 3A). Similar patterns were seen in the phylogenetic tree for VP4 sequences (Fig. 3B) and other gene segments (Supplementary Fig. S3A and B). It is also noteworthy that sequences from the vaccine strains (Rotarix and RotaTeq) were relatively distinct from the Vietnamese RVA sequences reported here, particularly for genotypes G5, G9, G11, P[6], P[13], P[23] of the two neutralizing antigens, VP7 and VP4 (Fig. 3A and B). Comparison of the amino acid sequences of VP7 and VP4 of the local strains to the Rotarix and RotaTeq vaccine sequences indicated a number of amino acid differences across the length of the proteins and particularly in the antigenic epitopes of VP7 and VP4 (data not shown). Taken together, multiple RVA genotypes co-circulate in humans and pigs in this location; many of these genotypes are genetically dissimilar to currently used vaccine components.

Putative zoonotic infection of human with a porcine-human RVA virus
An atypical RVA genotype constellation of G4-P[6]-I1-R1-C1-M1-A8-N1-T7-E1-H1 was found in both a human patient (16020_7) and a weaning pig (14250_9); the patient's residence and the pig farm were ~35 km apart (Fig. 4A and B). The genotype constellation of the core gene cassette (R1-C1-M1-A8-N1-T7-E1-H1) was also identified in four pig samples collected from two other farms (Fig. 4A); the farms that raised these pigs are also about 35 km away from the residential location of the 16020_7 case (Fig. 4B). RVA strains with the aforementioned genome constellation have been identified in paediatric diarrhoeal patients in Hungary (Papp et al. 2013
Phylogenetically, all eleven segments of the human 16020_7 and porcine 14250_9 viruses belonged to lineages comprising porcine and/or porcine-origin human sequences (Fig. 3A and B and Supplementary Fig. S3A and B). Genetic distance suggested that the strain 16020_7 was most similar to porcine strains: TMa (for VP1; 96 percent nt similarity), CMP45 (NSP3; 93 percent), and porcine-origin human strain 30378 (NSP2; 99 percent) (Fig. 5). The remaining segments were most similar to porcine RVA strains obtained from this study, including the porcine 14150_53 (NSP1; 98.7 percent and NSP4; 99.1 percent) and 14225_44 (VP2; 99.2 percent and NSP5; 99.1 percent) strains. The capsid proteins of 16020_7 were most similar to the VP7 sequences of porcine samples 12129_48, 12129_49 and 12070_4 (G4 type; 97.2 percent), to the VP4 of pig 14226_39 (98.7 percent) and VP6 of pig 14226_42 (99.5 percent) (Fig. 5). The porcine sample 14250_9, despite possessing the same genotype constellation as 16020_7, shared the highest nucleotide similarity in only two internal genes, VP3 (97.5 percent) and NSP5 (99.1 percent), to the corresponding segments of strain 16020_7.
Compared with the RVA vaccine Rotarix, 16020_7 was relatively dissimilar, sharing as low as 75.1 and 75.6 percent nt similarity for the VP4 and VP7 segments, respectively (Fig. 5).
In this unusual genotype constellation, the NSP3 T7 type is a rare genotype that was first identified in a cow in Great Britain in 1973 (Ward et al. 1984), then in a bovine-like human strain (Mukherjee et al. 2011), and later in pigs (Martel-Paradis et al. 2013), a porcine-bovine human reassortant (Wang et al. 2010) and porcine-like human strains (Bucardo et al. 2012; Zeller et al. 2012; Degiuseppe et al. 2013; Papp et al. 2013; Martinez et al. 2014) in various geographical locations. The inferred evolutionary rate of RVA NSP3 sequences bearing the T7 genotype was 1.3261 × 10⁻³ substitutions per site per year (95 percent highest posterior density (HPD): 8.624 × 10⁻⁴ to 1.793 × 10⁻³), which is slightly lower than the estimated evolutionary rates for the RVA VP7 capsid gene of 1.66 × 10⁻³ and 1.87 × 10⁻³ substitutions/site/year for G12 and G9 genotypes, respectively (Matthijnssens et al. 2010). The time-stamped MCC tree also indicates an interconnection among different host species, with several host jump events particularly between pig and human hosts, suggesting that viral zoonotic chatter may occur more frequently than hitherto reported (Fig. 4C).
Figure 3. ML phylogenetic trees inferred from the assembled nucleotide sequences for RVA VP7 and VP4 genes. (A) ML tree of the VP7 gene showing genetic relationships between sequences from this study and additional sequences of corresponding segments from GenBank. Strains are coloured according to the host species from which the strain was identified, and "VIZIONS" in the annotation refers to strains identified from this study. The tree is mid-point rooted for the purpose of clarity and bootstrap values of ≥75 percent are shown as asterisks. All horizontal branch lengths are drawn to the scale of nucleotide substitutions per site. (B) ML tree of the VP4 gene showing genetic relationships between sequences from this study and additional sequences of corresponding segments from GenBank. The pattern of tree visualization is consistent with the VP7 tree; see the description of Fig. 3A for more information.

RV group H (RVH), group B (RVB), and C (RVC)
RVH was identified in five Vietnamese pigs (3.33 percent; 5/150) at several time points and locations with no temporal or geographical associations (Fig. 6), suggesting that these infections were sporadic and not linked to a single local outbreak. Furthermore, phylogenetic trees were inferred for all RVH segments to investigate the genetic diversity, comparing the RVH strains identified in this study with RVH sequences retrieved from GenBank (Fig. 6 and Supplementary Fig. S6). In general, RVH sequences typically clustered according to the host species, that is, all porcine RVH sequences belonged to a lineage that is separated from human or cow RVH lineages. Within the porcine clade of the VP6 gene (Fig. 6), sequences fell into two lineages: one lineage comprising sequences from USA and Japan, and the other lineage of Brazilian and Vietnamese sequences. The evolutionary rate was estimated to be 5.195 × 10⁻³ substitutions/site/year for sequences in the porcine lineage of RVH VP6 (95 percent HPD: 1.865 × 10⁻³ to 8.976 × 10⁻³).
RVB was found in seven pigs (4.67 percent; 7/150) and RVC was identified in six pigs (4 percent; 6/150). Phylogenetic trees of all segments of RVB and RVC showed that the local porcine sequences belonged to lineages comprising porcine sequences from other geographical locations for RVB (Fig. 7 and Supplementary Fig. S4) and RVC (Fig. 7 and Supplementary Fig. S5). In both RVB and RVC groups, the porcine lineages were relatively distant from lineages comprising human sequences (Fig. 7 and Supplementary Figs. S4 and S5).

Discussion
This study represents the first unbiased genome-wide surveillance simultaneously targeting multiple groups of rotaviruses infecting humans and animals in the same geographical location. Prior to this study, there were only three subgenomic RVC sequences (<300 nt) and nine complete or nearly complete RVA genomes reported from Vietnam in GenBank with no data on RVB and RVH. Data from this study document genomic sequences from sixty human RVA and thirty-one porcine RV (groups A, B, C, and H), providing the largest available collection of genome sequences from humans and pigs from a single location in general and from Vietnam in particular. This is also the first report and the first genome characterisation of RVB, RVC and RVH from Vietnamese pigs.
Among the RVA, we identified a human (sample 16020_7) and a porcine sample (sample 14250_9) with the atypical RVA genotype constellation G4-P[6]-I1-R1-C1-M1-A8-N1-T7-E1-H1, detected for the first time in Vietnam and in Asia. This variant may have originated from a direct zoonotic transmission or from reassortment event(s) involving porcine and porcine-origin human strains. The human RVA strain (16020_7) was identified in a sample from a 54-year-old patient, admitted to the hospital due to acute diarrhoea. RV was the sole enteric pathogen identified from the stool sample and no other common viral and bacterial diarrhoeal pathogens were found by diagnostic testing (Dung et al. 2012; Thompson et al. 2015) for norovirus, astrovirus, sapovirus, adenovirus F, and Aichi virus, Shigella spp., Salmonella spp., and Campylobacter spp. (data not shown). Although adults can be infected with RVA, such infections in immuno-competent individuals are typically asymptomatic, self-limiting, or cause mild disease (Anderson and Weber 2004). The RV infection in this particular case required hospitalization, suggesting a moderate-severe end of the clinical spectrum of diarrhoeal disease. Although further studies are required to determine the significance and relevance of such strains in human and animal diseases, it is tempting to suggest that this atypical strain may be the cause of the moderate-severe diarrhoeal disease. The close proximity between humans and pigs and common use of river water (Mekong Delta River, Fig. 4B) for daily activities and farming might present an enhanced risk of transmitting water-borne infectious pathogens, in this case providing a plausible zoonotic route of atypical RV transmission.
Compared with RVA, human and animal infections with RVB, RVC, and RVH are not well understood and the detection rates for these groups of viruses are relatively low (Ghosh and Kobayashi 2011). This is probably because the majority of RV investigations have focused on RVA given its clinical and public health relevance, and because these groups are genetically distant from RVA and would therefore likely be missed by commonly used diagnostic assays. Recently, there have been increasing numbers of reports on RV groups B, C, and H in animals (Marthaler et al. 2012; Amimo, Vlasova, and Saif 2013; Marthaler et al. 2013; Lahon et al. 2014; Marthaler et al. 2014; Molinari, Alfieri, and Alfieri 2014; Molinari et al. 2014; Marton et al. 2015; Otto et al. 2015) and humans (Alam et al. 2013; Lahon et al. 2013; Zhirakovskaia et al. 2016), which possibly reflect improved molecular methods to detect these viruses rather than an actual increase in their prevalence. In Vietnam, the frequencies and relative role in human and animal disease of RVB, RVC, and RVH viruses are not yet known.
Our study does have limitations. First, the sample size of 146 human and 150 porcine samples is relatively small. Despite this size, we were able to identify a potential zoonotic infection. This provides a baseline frequency for zoonotic infections and suggests that they may be occurring at a higher rate than previously considered. Second, the disease status of the sampled pigs was not well defined and there was no follow-up beyond the sampling time point, so the clinical spectrum of diarrhoeal disease (e.g. mild, moderate, or severe) in the pigs is unknown. However, the primary objective of this study was not characterization of RV pathogenesis or causation of the diarrhoeal disease. The presence of RV material at sufficiently high titres to allow full genome sequencing is consistent with these animals being a common source of the virus for movement to other species. Our findings indicate that porcine faecal material is a source of novel and possibly zoonotic viruses.
It is likely that with the ubiquity and falling costs of sequencing, the unbiased virus sequencing described here will become an important component of infectious disease surveillance and rapid responses to outbreaks (Woolhouse, Rambaut, and Kellam 2015). The ideal sampling rate, sample numbers, and geographical relationship between humans and animals for genetic surveillance are still being defined but this study provides a good starting point for future efforts. Even within the relatively modest sample set of 296 human and animal enteric samples, considerable RV genetic diversity was observed, including a potential zoonosis. The integration of targeted sampling, sequencing, and phylogeography or phylogenetics in different places in the world, perhaps informed by other risk mapping (Hay et al. 2013; Rabaa et al. 2015), has the ability to inform surveillance and to monitor zoonotic pathogens in humans and animals.

Supplementary data
Supplementary data are available at Virus Evolution online.
Figure 7. ML phylogenetic trees inferred from the assembled nucleotide sequences for VP6 gene of RVB and RVC. ML trees of RVB and RVC VP6 segments showed genetic relationships between assembled sequences from this study and full-length reference sequences of corresponding segments retrieved from GenBank. Trees are mid-point rooted for the purpose of clarity and only bootstrap values of ≥75 percent are shown. Scale bars are in the unit of nucleotide substitutions per site. Strains were coloured according to the host species that the sequences were identified from, and "VIZIONS" in the annotation refers to strains identified from this study.