Comparison of dkgB-linked intergenic sequence ribotyping to DNA microarray hybridization for assigning serotype to Salmonella enterica

Two DNA-based methods were compared for the ability to assign serotype to 139 isolates of Salmonella enterica ssp. I. Intergenic sequence ribotyping (ISR) evaluated single nucleotide polymorphisms occurring in a 5S ribosomal gene region and flanking sequences bordering the gene dkgB. A DNA microarray hybridization method that assessed the presence and the absence of sets of genes was the second method. Serotype was assigned for 128 (92.1%) of submissions by the two DNA methods. ISR detected mixtures of serotypes within single colonies and it cost substantially less than Kauffmann–White serotyping and DNA microarray hybridization. Decreasing the cost of serotyping S. enterica while maintaining reliability may encourage routine testing and research.


Introduction
Serotyping of Salmonella enterica ssp. I is the basis of national and international surveillance and communications; it facilitates determining associations between the pathogen and its sources, and it gives some guidance on preventing transmission (P. Fields, pers. commun.) (Foodnet, 2011). The historical method used to serotype S. enterica is the antibody-based Kauffmann-White (KW) scheme (Kauffman & Edwards, 1952; Brenner et al., 2000). Positive results generate an antigenic formula based on structural details of the H-antigen of flagella and the O-antigen of lipopolysaccharide (H- and O-antigens, respectively) (Bopp et al., 1999; Popoff & Le Minor, 2001).
A major advantage of DNA analysis is that it is not affected by variable expression of cell-surface antigens, as are antibody-based agglutination assays like the KW scheme. Major obstacles to genome typing of S. enterica becoming broadly available include expense, the need for highly specialized equipment, and in some cases, sophisticated bioinformatics (Wattiau et al., 2011). A DNA-based method for assigning serotype to S. enterica at comparatively low cost and with readily accessible laboratory equipment commonly used for culturing and conducting the polymerase chain reaction (PCR) would be beneficial. A discrete region within S. enterica ssp. I was previously shown to differentiate closely related serotypes (Morales et al., 2006). The region of interest spans from the end of a 23S ribosomal gene, across a 5S gene and includes the last base pair preceding a tRNA aspU ribosomal gene neighboring dkgB (previously yafB) (Fig. 1). We asked whether dkgB-linked intergenic sequence ribotyping (ISR) would assign serotype similarly to an AOAC International certified DNA microarray hybridization method (DNAhyb) (Check & Trace by Checkpoints, Certificate 121001) (Malorny et al., 2003; Wattiau et al., 2008; Madajczak & Szych, 2010). The set of isolates examined was previously assigned a KW serotype and this historical information is included.

Salmonella enterica submissions included for analysis
The investigators providing the submissions listed in Table 1 reported serotype. In this laboratory, cultures were streaked on brilliant green (BG) agar (Acumedia; Neogen Corporation, Lansing, MI) and incubated for 24-48 h at 37°C to obtain well-separated large colonies. One colony was then transferred to brain heart infusion (BHI) broth (Acumedia) and incubated for 16 h at 37°C with shaking. For submissions that later appeared to have mixed cultures and for those with disagreement between methods, agglutination reactions for single colonies using commercially available absorbed antisera (Difco, BD, Franklin Lakes, NJ) were carried out, and in some cases, isolates were submitted for serotyping (Silliker, South Holland, IL). Thus, single colonies were processed by both ISR and DNAhyb (Check & Trace, Check Points, Wageningen, the Netherlands). In cases of disagreement between methods, a maximum of 10 well-isolated colonies were selected from agar plates and then transferred to BHI broth for individual analysis.

Determination of ISR
The locations where primers hybridize to the reference genome of S. enterica ssp. I Enteritidis strain P125109 are shown in Fig. 1. Forward (ISR-F1) and reverse (ISR-R1) primers incorporate the rRNA-23S ribosomal ribonucleic acid (RNA) region neighboring dkgB (previously known as yafB). Reference genomes and primers used in these analyses are listed in Tables 2 and 3. Primers ISR-F1 and ISR-R1 replaced previously published primers ISRH-1 and ISRH-2 (Morales et al., 2006).
For primer amplification, DNA was extracted from 1 mL of pelleted cells using the PureLink Genomic DNA Mini Kit (Invitrogen Life Technologies, Grand Island, NY). One microliter of DNA was added to 2× GeneAmp Fast PCR Master Mix (Applied Biosystems, Foster City, CA) and 200 nM forward (ISR-F1) and reverse (ISR-R1) primers in a final volume of 30 µL. The PCR was performed on a Veriti 96-well Fast Thermal Cycler (Applied Biosystems) as follows: 95°C for 10 s, 35 cycles at 94°C, 40 s at 64°C, and 10 s at 72°C. After confirmation of the predicted amplicon of approximately 1400 bp by gel electrophoresis, PCR products were purified using the QIAquick PCR purification kit (Qiagen, Valencia, CA). DNA concentrations were measured (NanoDrop, Wilmington, DE) prior to submitting PCR products for Sanger sequencing (Retrogen Inc., San Diego, CA) on an Applied Biosystems, Incorporated (ABI) Prism® 3730 DNA Analyzer using primers ISRHfs and ISRHfr.
Analysis and naming of ISR sequences
ISR sequences were aligned using SeqMan Pro of the Lasergene 8 software (DNASTAR, Madison, WI). Parameters were set to 100% minimum match percentage and a match size of at least 50 bp. Primers were designed to assure linkage to dkgB, which is required to assure that the correct region is under investigation. Serotype names were assigned to an ISR sequence when a 100% match was made to a reference sequence or when DNAhyb and KW serotyping agreed. Otherwise, ISR sequences are identified as 'UN' followed by four-digit numbers. The initial (5′ ATGTTTTGGCG 3′) and final (5′ CGGTGGAGCGG 3′) eleven nucleotides should be similar for ISRs, with the exception that the first nucleotide in the ISR sequence can sometimes be a cytosine rather than an adenine nucleotide.
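The naming rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not the SeqMan Pro workflow used in the study: the function names, the `references` dictionary and the `un_counter` argument are hypothetical, the flanking 11-mers are checked by exact match (the text says only that they "should be similar"), and the alternative route to a name (agreement between DNAhyb and KW serotyping) is not modeled.

```python
# Flanking 11-mers quoted in the text (5' to 3').
ISR_START = "ATGTTTTGGCG"
ISR_END = "CGGTGGAGCGG"


def has_expected_ends(isr_seq: str) -> bool:
    """Check that a sequence starts and ends with the expected 11-mers.

    Assumption: exact matching is used as a simplification. The first
    base may be C instead of A, as the text allows.
    """
    seq = isr_seq.strip().upper()
    if len(seq) < len(ISR_START) + len(ISR_END):
        return False
    start_ok = seq[0] in "AC" and seq[1:11] == ISR_START[1:]
    end_ok = seq.endswith(ISR_END)
    return start_ok and end_ok


def assign_name(isr_seq: str, references: dict, un_counter: int) -> str:
    """Return a serotype name on a 100% match to a reference ISR,
    otherwise a provisional 'UN' label with a four-digit number.

    `references` (serotype name -> reference ISR sequence) and
    `un_counter` are hypothetical inputs for illustration.
    """
    seq = isr_seq.strip().upper()
    for name, ref in references.items():
        if seq == ref.strip().upper():
            return name
    return f"UN{un_counter:04d}"
```

In this sketch a sequence that fails `has_expected_ends` would be flagged for review before any name is assigned, since the end motifs indicate that the correct dkgB-linked region was sequenced.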

DNAhyb assay
The DNAhyb protocol was performed as directed (Check & Trace) on single Salmonella colonies grown for 24-48 h on BG agar at 37°C. Large colonies are recommended to reach the recommended DNA concentration. Images of products were obtained on a single-channel ATR03 reader and processed by the Salmonella Check-Points software which indicates a serotype name, or alternatively, a genovar number. Images of spot patterns were discussed with the manufacturer in unusual situations, such as finding genetic variants of Salmonella Kentucky. In this case, three isolates were submitted to the source of the kit for independent verification that new variants were being identified and that all positive and negative controls worked.

Fig. 1. Description of the ISR region within the genome of Salmonella enterica serovar Enteritidis strain P125109 (GenBank AM933172). The nucleotide sequence of each ISR is serotype specific and size ranges from 257 to 530 bp. Sequence is required to assign serotype to a submission. An ISR region includes sequence from the end of the rrlH gene (rRNA-23S ribosomal gene) and the start of the aspU (tRNA-Asp) gene that is adjacent to the dkgB gene. [Diagram labels: ISR region (499 bp); genes rrlH, rrfH, aspU, dkgB; primers ISRHfs, ISRHrs, ISR-F1, ISR-R1, ISRH-1, ISRH-2; genome coordinates 299,000 to 300,500.]

Results
ISR and DNAhyb assign serotype to S. enterica similarly
Details from analysis of S. enterica by ISR and DNAhyb are shown in Table 1. Of the 139 submissions, 115 (82.7%) had substantial agreement between DNAhyb and ISR, as well as the reported KW serotype (Table 1a). Some genetic variation was noted in ISRs in this grouping, but serotype association was maintained in comparison with both DNAhyb and the KW scheme and thus these were counted as agreements. Further analysis of ISR variation showed that 15 named serotypes had at least three independent submissions. Of this group, 10 (66.7%) had uniform ISRs with no variation. Five serotypes had multiple ISRs, namely Salmonella Infantis (two variants), Salmonella Typhimurium (two variants), S. Kentucky (two variants), Salmonella Newport (four variants), and Salmonella Montevideo (three variants). For the purpose of determining specificity and sensitivity in this study, results with disagreement about how much variation is accounted for by DNAhyb vs. ISR were counted as true negatives (TN), because the ISR method produced information somewhat different, but not necessarily in disagreement, to DNAhyb. In summary, 128 of the 139 isolates (92.1%) had agreement between DNAhyb and ISR (Table 1a + 1b). Detail about ISR variation within a single named serotype will be discussed in following text. For 10 submissions, the forward/reverse (F/R) sequences did not match. Individually processing multiple colonies for nine of these submissions recovered a second serotype. The frequency with which a second serotype accounted for disagreement between genomic methods and KW serotype suggests that minority subpopulations are common. In addition, techniques appear to vary in their ability to detect multiple serotypes.
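As a quick arithmetic check, the agreement percentages reported above can be recomputed from the stated counts. The short Python sketch below uses only numbers given in the text; the variable names and the helper `pct` are ours, and the count of submissions without agreement is derived by subtraction rather than reported directly.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


total_submissions = 139
agree_all_three = 115      # DNAhyb, ISR and KW agree (Table 1a)
agree_dna_methods = 128    # DNAhyb and ISR agree (Table 1a + 1b)
serotypes_3plus = 15       # named serotypes with at least three submissions
uniform_isr = 10           # of those, serotypes with a single ISR

# Derived, not stated in this paragraph: submissions without DNAhyb/ISR agreement.
disagree_dna_methods = total_submissions - agree_dna_methods

print(pct(agree_all_three, total_submissions))    # 82.7
print(pct(agree_dna_methods, total_submissions))  # 92.1
print(pct(uniform_isr, serotypes_3plus))          # 66.7
print(disagree_dna_methods)                       # 11
```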
Seven other submissions (25011, 25048, 25040, 25041, 25035, 26026, and 100304.53) had KW serotypes with O-antigens that did not match what was received for analysis but there was no disagreement in F/R sequences. In these cases, misinterpretation of the second cell-surface molecule flagella could have contributed to misinterpretation of serotype. Alternatively, these submissions could have also been mixed when initially examined for KW serotype and undergone separation of serotypes prior to analysis during the current study. For submissions 101116-10 and 101116-12, which were classified as Salmonella Fresno and ISR UN0019, O-antigen D2 epitopes would be expected to cross-react with factor 9 antisera used to detect D1 epitopes (Curd et al., 1998). Table 1c shows paired samples with disagreement between DNAhyb and ISR. Four of these submissions yielded mixed serotypes, namely 25043-1, 100723.01-1, 100723.02-1, and 100723-14.1. They are included in Table 1c to reflect the incidence with which disagreement was encountered. However, disagreements between DNAhyb and ISR and KW were resolved for these four isolates. The other seven submissions in Table 1c had a unique (UN) ISR sequence that was identified by DNAhyb as a genovar with an available reference sequence. However, slight differences in the ISR suggested genetic variants could be present as observed for the first group in Table 1a. These submissions were grouped in Table 1c because the serovar they might be associated with could not be further verified at this time. Acquisition of more isolates is needed to resolve the relationship between KW serotype, DNAhyb and ISR sequence in these cases.
ISR appears more sensitive than DNAhyb for the detection of genetic variation within serotype
ISR appears to give a more sensitive assessment of serotype than DNAhyb. Specific examples are as follows: (1) ISR sometimes detected two types of S. Typhimurium, namely Typhimurium and Typhimurium 5-, whereas DNAhyb did not (Table 1, accession numbers 100304.57, 100304.74, 100304.32, 100304.62-2, 100616.9, 99163, 99164 and 99172). Thus, DNAhyb is not currently capable of detecting the 5- variant. ISR indicated that UN0014 was linked to the Typhimurium 5- KW serotype, but this correlation was not observed for all examples (see 100304.62-2). Thus, variation in expression of cell-surface epitopes may account for the 5- variant in addition to genetic variation as observed by ISR.
(2) Only one publicly available reference sequence out of 27 disagreed with assignment of serotyping by the three methods. The ISRs for the reference sequences of S. Schwarzengrund NC_011094 and NZ_ABEJ00000000 were 267 bp and no SNP differences were present. The strains of S. Schwarzengrund analyzed here had ISRs of 257 bp regardless of method used to assign serotype. Further analysis is required to explain this discrepancy, which could be due to strain variation or some discrepancy with annotation in the reference strain.
(3) Five submissions of S. Newport had different ISR sequences and four DNAhyb patterns despite having a single serotype assigned by the KW scheme. Alignment of the sequence of five ISRs from submissions that serotyped as S. Newport showed that ISR UN0017 had a deletion within the intergenic region between the end of the 23S and 5S rRNA genes (Fig. 2, Top).
(4) O-antigen group D submissions had an even more complex ISR outcome than did S. Newport (Fig. 2, Bottom). Alignment showed that, in comparison with S. Enteritidis, ISR UN0002 had a deletion somewhat similar to that seen in Salmonella Pullorum in the intergenic region between the 5S rRNA and tRNA aspU. In this same region, ISR UN0008 had an insertion that was 100% identical to a region from S. Newport found in a different ribosomal region. The insert accounted for the exceptional length of the ISR. Specifically, there was a 146 bp insert in ISR UN0008 that was the same as base pairs 4165975-4166120 in the genome of S. Newport strain SL254 (NC_011080). Other serotypes that had ISR variants within a serotype included S. Kentucky, S. Montevideo, and S. Infantis.
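The insert length quoted in example (4) follows from the genome coordinates when both end positions are counted. A minimal Python sketch, assuming 1-based inclusive coordinates (the usual GenBank convention); the helper names are hypothetical and the toy string stands in for a real genome sequence.

```python
def interval_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate interval."""
    return end - start + 1


def extract_region(genome: str, start: int, end: int) -> str:
    """Return the bases at 1-based inclusive positions start..end."""
    return genome[start - 1:end]


# Coordinates reported for S. Newport strain SL254 (NC_011080).
print(interval_length(4165975, 4166120))  # 146

# Toy example only; not real Salmonella sequence.
print(extract_region("ACGTACGT", 3, 5))   # GTA
```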
ISR and DNAhyb are limited to assignment of serotype to S. enterica
The limit of detection of ISR was found by analysis of S. Enteritidis submissions 22079, 21027, and 21046, which were included as control strains because they were previously characterized by whole-genome analysis (Guard et al., 2011). Strains 21027 and 21046 are clonally related and are within the same phage type lineage 13a/8, whereas 22079 is phage type 4. Despite belonging to the same phage type, strains 21027 and 21046 are phenotypically distinct and are known to have 16 genes with altered open reading frames as well as other SNPs. All three subpopulations of S. Enteritidis had the same KW serotype, DNAhyb genovar, and ISR. Thus, the two DNA methods were equivalent with regard to determining serotype only.
Fig. 2. Alignment of ISR sequences to evaluate variation within serotype. (Top) Alignment of ISRs from Salmonella enterica serovar Newport. The shortest ISR shown is UN0017, which had a deletion occurring before the 5S ribosomal gene. The 5S ribosomal gene of Salmonella Enteritidis was included for reference purposes. (Bottom) Alignment of ISRs from the O-antigen group D serotypes of S. enterica. The submission with ISR UN0002 was S. Enteritidis by the KW scheme and Genovar 6660 by DNAhyb. ISR UN0008 was Salmonella Pullorum by the KW scheme and Genovar Gallinarum Pullorum 2978H by DNAhyb. Alignment of ISR sequences from Group D serotypes indicated that UN0002 is more like S. Pullorum. ISR UN0008 is more like S. Gallinarum or S. Enteritidis, except that it has a 146 bp insertion also found in S. Newport.

Determination of sensitivity and specificity of ISR in comparison with DNAhyb
Table 1 includes the category of each ISR for the purposes of determining sensitivity and specificity in comparison with DNAhyb. True positives (TP) were defined as submissions assigned a serotype by ISR in complete agreement with DNAhyb, TN indicated ISR was different for the serotype assigned by DNAhyb due to the presence of genetic variation (including mixtures of serotypes), false positives (FP) were assigned a serotype by ISR but not by DNAhyb, and false negatives (FN) should have returned an ISR sequence but did not. As all submissions were S. enterica ssp. I and produced an ISR and a DNAhyb genovar, the FN value is 0. Calculating sensitivity as TP/(TP + FN) = 124/(124 + 0) = 1.0 suggests unity (similar performance) of the two methods. Calculating specificity as TN/(FP + TN) = 8/(7 + 8) = 0.53 suggests that DNAhyb is more specific, or in other words, it detects less genetic variation than does ISR. Removing submissions with mixtures did not change the finding that ISR appears to give more specific information than DNAhyb.
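The sensitivity and specificity values can be reproduced with the standard definitions. The Python sketch below uses the counts given in the text (TP = 124, TN = 8, FP = 7, FN = 0); the function names are ours, and the category definitions are those of this study rather than a conventional gold-standard comparison.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Sensitivity = TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Specificity = TN / (FP + TN)."""
    return tn / (fp + tn)


# Counts as reported for ISR in comparison with DNAhyb.
TP, TN, FP, FN = 124, 8, 7, 0

# The four categories should account for all 139 submissions.
assert TP + TN + FP + FN == 139

print(sensitivity(TP, FN))            # 1.0
print(round(specificity(TN, FP), 2))  # 0.53
```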
Given that detection of new serotypes is a continuous process for S. enterica ssp. I, application of ISR has the potential to expand knowledge about diversity of serotypes. In these analyses, we used S. enterica serovar Enteritidis to provide a crucial control that shows ISR does not provide fine-scale differentiation achieved with whole-genome sequencing.

Discussion
A limitation of ISR is that it targets a single region of the bacterial chromosome. Homologous recombination and other genomic events that mobilize DNA could generate a hybrid strain with potential to alter the correlation between an ISR region and the rest of the chromosome (Porwollik & McClelland, 2003). Methods that target multiple regions around the bacterial chromosome, such as DNAhyb and whole-genome sequencing, will thus still be required for critical stages of analysis. The primary use proposed for ISR is to facilitate routine and inexpensive serotyping of S. enterica. The method has been applied to processing DNA samples from South America in cooperation with the United States, and further development of software that incorporates a validated database will streamline analysis for users (Pulido-Landinez et al., 2012). SNP analysis by ISR complements methods such as DNAhyb that evaluate the whole genome, and each genome method can be used to check the quality of results from the other.
Disagreement between the KW scheme and genotyping by either DNA method could be attributed to at least four causes with a biological or molecular explanation.
(1) Flagella H-antigen immunoreactivity may contribute disproportionately to interpretive differences between investigators.
(2) A genetic variant may have a unique ISR or DNAhyb genovar that, in consensus with previous knowledge, is a genetic variant of an existing serotype.
(3) Mixtures of serotypes could be present within cultures, which can be detected by some methods but not others.
(4) The most troublesome group was new variants with undefined relationships to named serotypes. ISR UN0002 (ISR 360 bp) and UN0008 (ISR 530 bp) were associated with submissions serotyped by the KW scheme as S. Enteritidis and S. Pullorum, respectively, despite their unique ISR sequences. Classifying them by the KW scheme as S. Enteritidis or S. Pullorum could have unintended consequences, because the biological impact of these strains on susceptible hosts is not known. For example, S. Pullorum on-farm can initiate depopulation of chickens in order to protect poultry health (http://www.aphis.usda.gov/animal_health/animal_dis_spec/poultry/), whereas S. Enteritidis in people and foods can initiate control measures to protect human health (http://www.fda.gov/Food/FoodSafety/Product-SpecificInformation/EggSafety/EggSafetyActionPlan/ucm170746.htm). No information is available on the comparative virulence properties of UN0002 (ISR 360 bp) and UN0008 (530 bp) to S. Pullorum (ISR 361 bp) or S. Enteritidis (ISR 499 bp). Further research using biological assays is needed to characterize the virulence of strains identified by ISR as being potentially new strains of concern to either human or animal health.
Assay costs were from $10 to $12 per sample for ISR, $35 to $185 for KW serotyping and $50 for the method of DNAhyb used here. The point of comparison begins when a colony is identified on agar that is suspected of being Salmonella. The low cost and simplicity of conducting ISR make it a method that supports public health laboratories and food producers with in-house laboratories in their efforts to monitor S. enterica. Other efficiencies such as submission of DNA to centralized facilities and applying robotics for sample preparation may lower the cost of conducting ISR further. If a simple method for serotyping S. enterica is available, farm management and plant processors may test samples and monitor environments more frequently. The ability of ISR to detect a mixture of serotypes, its independence of cell-surface epitopes, cost, and simple software requirements are relative strengths. We suggest that it will be a useful addition for assigning serotype at minimal cost rather than being another typing method with no clear advantage (Achtman, 1996; Achtman et al., 2012).