Molecular Epidemiology of Human Rhinovirus From 1-Year Surveillance Within a School Setting in Rural Coastal Kenya

Abstract Background Human rhinovirus (HRV) is the most common cause of the common cold but may also lead to more severe respiratory illness in vulnerable populations. The epidemiology and genetic diversity of HRV within a school setting have not been previously described. The objective of this study was to characterize HRV molecular epidemiology in a primary school in a rural location of Kenya. Methods Between May 2017 and April 2018, over 3 school terms, we collected 1859 nasopharyngeal swabs (NPS) from pupils and teachers with symptoms of acute respiratory infection in a public primary school in Kilifi County, coastal Kenya. The samples were tested for HRV using real-time reverse transcription polymerase chain reaction. HRV-positive samples were sequenced in the VP4/VP2 coding region for species and genotype classification. Results A total of 307 NPS (16.4%) from 164 individuals were HRV positive, and 253 (82.4%) were successfully sequenced. The proportion of HRV in the lower primary classes was higher (19.8%) than upper primary classes (12.2%; P < .001). HRV-A was the most common species (134/253; 53.0%), followed by HRV-C (73/253; 28.9%) and HRV-B (46/253; 18.2%). Phylogenetic analysis identified 47 HRV genotypes. The most common genotypes were A2 and B70. Numerous (up to 22 in 1 school term) genotypes circulated simultaneously, there was no individual re-infection with the same genotype, and no genotype was detected in all 3 school terms. Conclusions HRV was frequently detected among school-going children with mild acute respiratory illness symptoms, particularly in the younger age groups (<5-year-olds). Multiple HRV introductions were observed that were characterized by considerable genotype diversity.

Human rhinovirus (HRV) is a common viral respiratory pathogen [1] associated with the common cold [2,3], lower respiratory tract infections [4], and asthma [5]. Although the majority of HRV cases are mild and self-limiting [6], they contribute to substantial economic losses through missed school and workdays [6,7]. HRV is a common reason for prescribing antibiotics [8], potentially contributing to antibiotic resistance. The virus is transmitted via inhalation of contaminated aerosols, for example, during close contact with infected persons, or selfinoculation via touching contaminated surfaces or objects [6,9]. HRV has a median incubation period (range) of 2 (1-7) days and a symptom duration of 7-14 days [10]. Viral shedding can occur for up to 2 (range, 1-3) weeks [11]. Children, the elderly, and persons with preexisting respiratory conditions bear the highest HRV burden [12][13][14].
Children constitute a significant susceptible population that support the transmission and persistence of HRV in populations [20]. In Africa, little has been done to investigate the patterns and mechanisms of transmission of HRV in the school setting and to understand the extent to which school settings contribute to HRV transmission in the community. This is despite that school children have high contact rates compared with other age groups in the community [21][22][23]. Design of effective intervention strategies against HRV will be supported by improved knowledge of transmission dynamics of HRV in different social networks and population structures [6,24]. This study investigated HRV infections in a school setting in rural coastal Kenya by sequence analysis of the VP4/2 junction to describe the frequency, diversity, and temporal occurrence of HRV.

Study Area and Design
The study was conducted in a rural school located within the Kilifi Health and Demographic Surveillance System (KHDSS) in Kenya [25] to characterize the occurrence of respiratory viruses. The study design is described in detail elsewhere [26]. Briefly, the school offers both early childhood development education and primary school education. Pupils and teachers from all classes were enrolled in the study, which took place between May 2017 and April 2018. Pupils were divided into 2 main groups: the lower primary, comprised of day care, kindergarten (KG) levels 1-3, and grade 1 (n = 5 classes; age range, 3-12 years); and the upper primary, comprised of grades 2-8 (n = 7 classes; age range, 7-20 years).
Nasopharyngeal swabs were collected when a pupil or a teacher had at least 1 of the following acute respiratory illness (ARI) symptoms: cough, sore throat, or runny nose. Students documented the ARI symptoms they experienced in a daily diary. Class teachers recorded symptoms for the lower primary group. A maximum of 8 samples per class was collected from the lower primary group per week, while a maximum of 4 samples per class was collected from the upper primary group per week. A maximum of 3 samples was collected from the teachers per week. We collected more samples from the lower primary due to the perceived critical role of this age group in childhood infectious diseases and hence the need to reduce the level of uncertainty in the estimated risk in this age group. Samples were collected in viral transport media (VTM) and transported in cool boxes to the KEMRI-Wellcome Trust Research Programme laboratory where they were stored at -80°C before screening. Sampling was suspended during school holidays: August, November, and December 2017 and after April 6, 2018, which marked the end of the study.

Patient Consent Statement
This article reports on samples collected from 2 studies: a school [26] and a community ARI surveillance study [27]. An informed written parental consent for persons under the age of 18 years or individual consent for adults was obtained before sample collection. For the school cohort, consent was obtained at the beginning of the study, with new students and those not initially enrolled allowed to join in the second and third terms. In addition, children whose parents consented were asked for individual assent to participate. Ethical approval was provided by the KEMRI-Scientific Ethics Review Unit (KEMRI-SERU #3332 and #3103) and the University of Warwick Biomedical and Scientific Research Ethics Committee (BSREC #REGO_2016-1858 and #REGO-2015-6102).

VP4/2 Amplification and Sequencing
VP4/2 sequencing was used to assign species and genotypes. A genomic region of ~549 nucleotides and consisting of a hypervariable region of the 5'UTR, the complete VP4, and the partial VP2 gene region was amplified for all HRV-positive samples using a One-Step RT-PCR kit (QIAGEN) as previously described [31,32]. PCR products were purified using the MinElute PCR purification kit (Qiagen, Manchester, UK) and sequenced with the respective forward and reverse primers in a BigDye terminator, version 3.1 (Applied Biosystems, Foster City, California, USA), and analyzed in an ABI 3130xl genetic analyzer.
Intratype diversity for genotypes with ≥10 sequences was studied by visualization of the number of nucleotide substitutions to the index sequence for each type. The substitution rate of the VP4/2 coding region in HRV had previously been estimated as 7 × 10 -4 -4 × 10 -3 substitutions/site/year [37]. Using the upper evolutionary rate value translated to 1.68 nucleotide substitutions per year across the sequenced 420 nt segment. We therefore defined an intratype variant as a sequence with >2 nucleotide differences from the index sequence. An intratype variant had to be observed in at least 2 sequences to increase confidence that this was not the result of sequencing error.

Data Analysis
Data analysis was conducted using STATA, version 13 (STATA Corp, TX, USA), and R, version 3.6.1 (CRAN R Project). Categorical variables were summarized using frequencies and percentages. The HRV proportion for each class and respective 95% CIs were defined as the number of HRV-positive samples out of the total number of samples tested per class. The chisquare test for trend was used to check for linear trend in HRV proportion with an increasing hierarchy of classes in the school (from day care to teaching staff). Spearman's rank correlation coefficient was run to determine correlation between age and Ct value.

Definition of Terms
We defined "persistence" as the continued occurrence of the same genotype within the same school term. Detection of a genotype in a subsequent school term was considered a genotype recurrence. We defined "frequent" genotypes as those that occurred in ≥5 samples, from >2 individuals, and further investigated their temporal occurrence and persistence. We defined "individual HRV re-infection" as the acquisition of a new genotype or the detection of a previously acquired genotype in a subsequent school term. Individual detection with the same genotype in consecutive samples was considered a continuing infection.

Sequence Data Availability
Sequences generated by this study are available in GenBank under accession numbers MT177659-MT177911.  Table 1). The HRV positives were collectively from 164 individuals: 160 pupils and 4 teachers. The mean age of the HRV-positive pupils was 9.4 years, with a standard deviation of ±3.9. The most common ARI symptom recorded among the HRV-positive cases (single sampling events) was nasal discharge (n = 278; 90.6%), followed by cough (n = 226; 73.6%) and sore throat (n = 62; 20.2%). There was no notable difference between symptoms in HRVpositive and negative cases (P > .05). Only a small proportion of those detected with HRV could identify a household member with ARI-like symptoms (n = 43, 14.0%). Among the household members identified, 26/43 (60%) were school-going siblings, with 18 (69%) of these attending the same school and the other 8 (31%) a different school (Table 1).

Baseline Characteristics and HRV Detection
HRV circulation was detected throughout the 3 school terms under observation. Seasonal variations of HRV infections could not be identified due to breaks in sample collection during the school holidays. The lower primary had a higher HRV proportion compared with upper primary (19.8% vs 12.2%; P < . 001) (Figure 1). Spearman's rho indicated no correlation between Ct values and age (rho = .03; P = .615).

Temporal Occurrence and Spatial Clustering Patterns
Numerous genotypes circulated simultaneously in the school, with 22, 15, and 19 unique genotypes observed in term 1, term 2, and term 3, respectively. Nine genotypes recurred during the study period. Of the 22 that occurred in term 1, 2 re-occurred in term 2, and 4 in term 3, whereas of the 15 observed in term 2, 3 recurred in term 3. No genotype was observed across all 3 school terms. Four of the recurring genotypes (A13, A59, B48, C3) were detected in the same class ( Figure 3A).
Twelve genotypes were observed as singletons (ie, present in a single sample/individual). We observed that frequent genotypes (n = 21) circulated averagely for 28 days (median, 23 days). Five of the frequent genotypes recurred in a subsequent school term. The longest persisting genotype was B70 (n = 22 samples), seen in 81 days. Genotype A2 (n = 24 samples) was similarly frequent but persisted for only 16 days. Among the frequent genotypes, none was limited to the upper primary group, whereas A10, A28, A101, and C3 were observed only in the lower primary group. However, no frequent genotype was limited to only 1 school class (Figure 4). (24/109) and re-infections (85/109). The number of genotypes per person ranged from 1 to 5. About two-thirds of the individuals (98/144; 68.1%) had only 1 HRV genotype across the study period. Overall, the highest HRV diversity per person (5 genotypes) was observed in 3 pupils, all from the lower primary classes ( Table 2). Time to re-infection varied greatly (13-307 days), with a median of 77 days. However, no individual was re-infected with the same genotype across the study period ( Figure 3).

Intratype Genetic Diversity
Eight genotypes had ≥10 samples/sequences, and 5 of these occurred as a single variant throughout the study. The remaining 3, A28, B48, and B70, had >1 variant. A28 had 2 variants simultaneously observed. One variant was predominantly from KG1 (n = 4/5), while the second was heterogenous. For B48, the second variant was observed as a genotype recurrence in a subsequent school term. Genotype B70 had the highest diversity, with 3 variants observed within 1 school term. The first B70 variant (13 samples) occurred across lower and upper primary as well as teaching staff, the second (4 samples) was first observed 49 days after the genotype's overall index sequence (all samples from KG class 1), and the third (5 samples) was first observed 69 days after the genotype index sequence, with 4/5 samples coming from the lower primary. No individual had >1 variant of the same genotype ( Figure 5). Genotype-specific ML trees of these 3 genotypes showed variant-associated phylogenetic clustering, supported by bootstrap values >70%. Only 2 phylogenetic clusters were associated with grade: K1 (n = 4/5 samples from KG3) in A28 and K2 (n = 4 samples from KG1) in B70 (Supplementary Figure 1).

HRV Among Teaching Staff
Thirty-two samples were collected from teachers, of which 5 (15.6%) were HRV positive from 4 individuals. Three HRV genotypes were identified: A2 (n = 2), B70 (n = 2), and B99 (n = 1). The A2 and B70 genotypes were detected in teachers several days after their initial detection in the student population. The B99 sample from a teacher was the only case of this genotype identified during the entire study period, suggesting that they acquired the infection from outside the school setting, and no onward transmission was observed.

DISCUSSION
In this coastal Kenya school study, we found that HRV occurs year-round in line with studies in this location among symptomatic individuals within the KHDSS area (11%-23%) [31,32,38]. HRV was detected across all age groups, with the highest proportion in the <5-year-olds and the lowest proportion in older age groups (≥18 years), in agreement with previous findings [20,39,40]. A proportion of the HRV-positive children (14%) identified another household member as having ARI-like symptoms, suggesting transmission at the household level that might contribute to transmission at school (or vice versa). All HRV species were found in circulation throughout the year. HRV-A circulated widely (53%), more than -B (18%) and -C (29%), in contrast to previous reports where HRV-A and HRV-C co-dominated [31,32,38]. However, a similar occurrence of HRV species was reported in the first 2 years of aboriginal and nonaboriginal Australian children [41]. There was considerable HRV diversity, with almost one-third of all known HRV genotypes detected. HRV infections were comprised of single genotype occurrences observed in a single sample/individual, as well as frequent genotypes affecting numerous pupils across several classes. This pattern has been observed elsewhere [42], and it is not clear whether this is due to varying serotype infectivity [43] or other epidemiological factors. Are schools a hub of rhinovirus spread? • ofid • 9 Numerous genotypes co-circulated in every school term, implying that no particular genotype predominates at any given time period. Previous studies have shown that contemporary HRV infections in a given population are characterized by numerous genotype-specific "mini-epidemics" [44] and that up to 30 genotypes circulate simultaneously in a given geographical area [45]. No frequent genotype was limited to 1 class, suggesting heterogeneous mixing and transmission within the school. However, 4 frequent genotypes and 1 variant of B70 were observed only in the lower primary, an indication of social clustering. Genotype recurrence in a subsequent school term was observed in 9 genotypes. Although it is not clear whether the study design missed samples between 2 genotype occurrences, the infrequency of genotype recurrence is possibly a reflection of herd immunity to specific types within the school/local community or a reflection of random introductions into the school/ local community. Frequent genotypes in the school persisted for about a month on average. This is a shorter period than that observed across the KHDSS (a larger geographical scope) during an earlier outpatient surveillance study [38]. This is probably due to increased transmission (steered by high contact rates among school-going children), resulting in a shorter-duration epidemic.
The younger age groups exhibited high rhinovirus diversity, as they had more HRV re-infections. No individual was re-infected with the same genotype, further evidence of serotype-specific immunity to HRV lasting ≥1 year [46]. We demonstrate the occurrence of intragenotype variants and associated phylogenetic clusters, which were either separate rhinovirus introductions or diversification of a single variant after introduction, forming as a result different transmission clusters. This observation highlights the benefit of sequence data over serology to study viral transmission dynamics. The numerous HRV genotypes, sparse sampling of ARIs, and minimal resolution from partial short sequences obtained here did not allow for transmission inferences (due to insufficient within-type variation).
An outpatient health facility located at the same location as the school reported in this study was recruited into an ARI surveillance study from December 2015 to November 2016, 5 months before onset of the school surveillance [27]. This outpatient clinic is within 4 km from the school. A detailed analysis of molecular epidemiology of HRV for samples collected at this outpatient clinic has been reported elsewhere [38]. Although not a primary objective of this study, we compared the diversity of HRV infections between the 2 study periods. We observed 12 common genotypes in the 2 studies: A13, A20, A28, A31, A46, A54, A78, A101, B42, C6, C11, and C19. However, only 1 genotype was frequent in both periods: A101 (Supplementary Figure 2). Our comparison of HRV diversity between a school setting and clinical cases in a health facility within the same geographical location and 2 consecutive seasons showed only 1 frequent genotype present in both studies. This is an indication that HRV diversity within a community varies widely over time, as previously observed [42]. It is not definite what drives the exchange of common rhinovirus genotypes. The rapid turnover and coexistence of genotypes and variants might be determined by immunologically mediated selection processes or other nonselective epidemiological processes.
Our study had some limitations. First, the dichotomy in the number of samples collected weekly in lower vs upper primary posed a challenge when comparing the 2 groups. Second, weekly sampling of only symptomatic persons will likely have resulted in missed HRV infections, impairing the overview of HRV dynamics. In addition, the study failed to successfully amplify and sequence nearly 18% of HRV-positive samples. Failure was not correlated with viral load and may have been caused by variability in primer-annealing sites resulting in mismatches. This may have resulted in missed genotypes or subvariants.
This study provides improved knowledge of the diversity and temporal characteristics of HRV in a school setting, reinforcing the notion that schools are a focal point in understanding HRV transmission in the community. The effect of numerous individuals in close contact enabling HRV transmission is evident. In addition, we see that infections could be linked to transmission events occurring outside the school setting, that is, in the household setting. The contemporary inclusion of different population structures (eg, schools, households, health centers) in studying HRV dynamics will improve our understanding of HRV epidemiology in communities. Future studies should focus on whole-genome sequencing to fully elucidate transmission clusters.

Supplementary Data
Supplementary materials are available at Open Forum Infectious Diseases online. Consisting of data provided by the authors to benefit the reader, the posted materials are not copyedited and are the sole responsibility of the authors, so questions or comments should be addressed to the corresponding author.

Acknowledgments
We thank all the study participants, teachers, School Management, Parent Teacher Association, and County Education Management Committees for allowing us to conduct the study within their school. We are grateful to the field study team for participant recruitment and the laboratory staff of the KEMRI-Wellcome Trust Research Programme/Virus Epidemiology and Control research group. This paper is being published with the permission of the Director of KEMRI.
Financial support. This work was supported by the Wellcome Trust through a Wellcome Senior Investigator Award to the last author (#102975). The first author was supported by the DELTAS Africa Initiative (DEL-15-003). The DELTAS Africa Initiative is an independent funding scheme of the African Academy of Sciences (AAS) and is supported by the New Partnership for Africa's Development Planning and Coordinating Agency (NEPAD Agency) with funding from the Wellcome Trust (107769/Z/10/Z) and the UK government. Disclaimer. The views expressed in this publication are those of the authors and not necessarily those of the African Academy of Sciences, NEPAD Agency, Wellcome Trust, or the UK government.
Potential conflicts of interest. The authors declare no competing interests. All authors: no reported conflicts of interest. All authors have submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. Conflicts that the editors consider relevant to the content of the manuscript have been disclosed.
Data availability. The replication data and analysis scripts for this manuscript shall be made available from the Harvard Dataverse: https:// dataverse.harvard.edu/dataverse/vec. . Nucleotide substitutions are demonstrated by a colored bar. A substitution to an "A" is indicated by green, "T" by red, "G" by indigo, and "C" by blue bars. The sequences are labeled by grade of individual and ordered by date of sample collection with the genotype's index sequence at the bottom (acting as a reference). Abbreviation: HRV, human rhinovirus.