Development and characterization of a synthetic DNA, NUversa, to be used as a standard in quantitative polymerase chain reactions for molecular pneumococcal serotyping

Abstract Identification of Streptococcus pneumoniae and its more than 90 serotypes is routinely conducted by culture and Quellung reactions. Quantitative polymerase chain reactions (qPCRs) have been developed for molecular detection, including a pan-pneumococcus lytA assay, and assays targeting 79 serotypes. Reactions require genomic DNA from every target to prepare standards, which can be time-consuming. In this study, we have developed a synthetic DNA molecule as a surrogate for genomic DNA and present new single-plex qPCR reactions to increase molecular detection to 94 pneumococcal serotypes. Specificity of these new reactions was confirmed, and the limit of detection was between 2 and 20 genome equivalents/reaction. A synthetic DNA (NUversa, ∼8.2 kb) was then engineered to contain all available qPCR targets for serotyping and lytA. NUversa was cloned into pUC57-Amp-modified to generate pNUversa (∼10.9 kb). Standards prepared from pNUversa and NUversa were compared against standards made out of genomic DNA. Linearity [NUversa (R² > 0.982); pNUversa (R² > 0.991)] and efficiency of qPCR reactions were similar to those utilizing chromosomal DNA (R² > 0.981). Quantification with plasmid pNUversa was affected, however, whereas quantification with synthetic NUversa was comparable to that of genomic DNA. Therefore, NUversa may be utilized as a DNA standard in single-plex assays of the currently known 94 pneumococcal serotypes.


INTRODUCTION
Streptococcus pneumoniae (the pneumococcus) often causes life-threatening infections, such as pneumonia, septicemia and meningitis (Klugman, Madhi and Albrich 2008; O'Brien et al. 2009; van der Poll and Opal 2009). Pneumococcal disease (PD) kills ∼800 000 people, mostly children, every year worldwide (O'Brien et al. 2009). To reduce the burden of PD, a 7-valent pneumococcal conjugate vaccine (PCV) was introduced in the USA in 2000 and was replaced by a 13-valent pneumococcal conjugate vaccine (PCV13) in 2010. Moreover, PCV has been introduced in many parts of the world including European countries. The introduction of these vaccines has reduced the burden of PD caused by vaccine serotypes on a global scale and has also decreased nasopharyngeal carriage of pneumococcal vaccine types in vaccinated populations (Simonsen et al. 2014). There has been a modest increase of PD caused by non-vaccine types (NVT) since the introduction of PCV. In a phenomenon called serotype replacement, these strains have replaced vaccine-type (VT) strains in the nasopharynx, resulting in pneumococcal carriage rates similar to those observed prior to the introduction of vaccines (Singleton et al. 2007; Weinberger, Malley and Lipsitch 2011; Feikin et al. 2013).
The phenomenon of serotype replacement, and therefore the increase in prevalence of strains not included in current pneumococcal vaccines, might be of concern. NVT strains include more than 80 different serotypes. Monitoring the distribution of pneumococcal vaccine and non-vaccine serotypes is important for predicting the effectiveness of current vaccines, and might also be necessary for determining future vaccine formulations.
The Quellung reaction is the gold standard method for pneumococcal serotyping. Reactions utilize specific antibodies produced against the capsular polysaccharide (cps). In a positive reaction, antibodies produce 'swelling' of the pneumococcal capsule that can be observed under the microscope. While the Quellung reaction is proven, a number of molecular methods have been developed during the last few years for molecular serotyping (Azzari et al. 2010; Turner et al. 2011; Azzari et al. 2012; Pimenta et al. 2013; Sakai et al. 2015; Messaoudi et al. 2016). Molecular methods are faster than Quellung reactions, highly sensitive and specific, and adaptable to high-throughput platforms such as microarrays or TaqMan array cards (TACs). Satzke et al. recently evaluated methods for molecular serotyping in a comprehensive, multi-center, comparative study. Real-time quantitative polymerase chain reaction (qPCR) proved to be highly specific and sensitive, with limits of detection (LOD) of reactions between ∼2 and ∼20 genome equivalents per reaction. qPCR reactions allow for the quantification of the bacterial load and can be utilized directly with DNA purified from nasopharyngeal specimens. The downside of qPCR reactions is that each reaction requires a specific DNA standard to construct the regression curve for the translation of real-time PCR results to genome equivalents, and the exact whole genome size of many serotypes is frequently unavailable.
We, as well as others including the Centers for Disease Control and Prevention (CDC), have previously reported a series of serotype-specific single-plex and multiplex qPCR assays (Azzari et al. 2010; Azzari et al. 2012; Pimenta et al. 2013; Sakai et al. 2015; Messaoudi et al. 2016; Pholwat et al. 2016). Altogether, the reactions detect all 13 PCV types and 66 NVT strains, demonstrating high sensitivity and providing the absolute densities of individual serotypes in each sample. Although there has been great progress with serotype coverage of qPCR reactions, there are still a number of serotypes for which molecular quantitative reactions are not available. In this study, we developed eleven novel single-plex qPCR assays for the detection and quantification of NVT strains. Furthermore, we engineered a synthetic DNA fragment containing sequences for lytA and for 94 pneumococcal serotypes, including targets for all qPCR assays developed thus far for pneumococcal serotyping. The synthetic DNA was cloned into a plasmid, hereafter called pNUversa, and was further characterized to be used as a standard for all qPCR reactions. Using pNUversa DNA as a standard allowed detection, but the plasmid's conformational structure affected quantification. A linearized pNUversa (a PCR product generated using pNUversa DNA) restored efficiency of quantitative reactions to levels similar to those achieved when using chromosomal DNA as a standard.

Purification of DNA and preparation of DNA standards for validating qPCR assays
Strains were grown overnight on blood agar plates, and DNA was extracted using the QIAamp DNA Mini Kit (Qiagen, Hilden, Germany) as follows. Pneumococci were resuspended in tris-ethylenediaminetetraacetic acid (TE) buffer (10 mM tris-HCl, 1 mM ethylenediaminetetraacetic acid, pH 8.0) containing 40 mg/ml lysozyme and 75 U/ml mutanolysin (Sigma-Aldrich Co., Saint Louis, Missouri, USA). This suspension was incubated at 37 °C for 1 h, after which 20 μl of proteinase K was added and the suspension was incubated for 30 min at 56 °C. Then, 4 μl of RNase A (100 mg/ml) (Qiagen, Hilden, Germany) was added to the suspension, which was incubated for 5 min at room temperature. The following steps were conducted as outlined by the manufacturer. The eluted DNA was quantified using the Nanodrop system (Nanodrop Technologies, Wilmington, Delaware, USA). DNA was diluted with TE buffer to obtain DNA standards of the following amounts per reaction: 1 ng, 100 pg, 10 pg, 1 pg, 100 fg, 50 fg and 5 fg. Considering the genome size of the reference strain TIGR4, 2.16 Mb (Tettelin et al. 2001), the approximate genome equivalent for each DNA standard was: 4.29 × 10⁵, 4.29 × 10⁴, 4.29 × 10³, 4.29 × 10², 4.29 × 10¹, 2.14 × 10¹ and 2.14 genome equivalents, respectively. The efficiency (i.e. regression curves) of reactions using the above standards was evaluated in each run. Primers and probes were optimized for each quantitative reaction. The LOD was determined from the lowest concentration standards that achieved a positive reaction in each assay. The Ct-value cut-off to distinguish a positive reaction from a negative reaction was 40.
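The conversion from DNA mass to genome equivalents described above can be illustrated with a short calculation. The sketch below is not taken from the study: it assumes an average mass of 650 g/mol per double-stranded base pair (a common approximation) and uses a hypothetical helper named `genome_equivalents`. With the TIGR4 genome size of 2.16 Mb, it gives values close to those listed for the seven standards.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
BP_MASS_G_PER_MOL = 650.0     # ASSUMPTION: average mass of one double-stranded base pair
TIGR4_GENOME_BP = 2.16e6      # genome size given in the text (Tettelin et al. 2001)


def genome_equivalents(mass_g, genome_size_bp):
    """Return the number of genome copies in mass_g grams of double-stranded DNA."""
    grams_per_genome = genome_size_bp * BP_MASS_G_PER_MOL / AVOGADRO
    return mass_g / grams_per_genome


# Standard amounts per reaction listed in the text, converted to grams.
STANDARDS_G = {
    "1 ng": 1e-9,
    "100 pg": 1e-10,
    "10 pg": 1e-11,
    "1 pg": 1e-12,
    "100 fg": 1e-13,
    "50 fg": 5e-14,
    "5 fg": 5e-15,
}

if __name__ == "__main__":
    for label, mass in STANDARDS_G.items():
        copies = genome_equivalents(mass, TIGR4_GENOME_BP)
        print(f"{label:>6}: {copies:.3g} genome equivalents")
```

The same function applies to any template of known length, so a different sequence length can be passed in place of the TIGR4 genome size.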

Streptococcus pneumoniae
Novel qPCR assays for serotypes 10CF, 11BC, 16A, 17A, 17F, 19C, 24BF, 28AF, 32AF, 33C and 48 were developed. Systematic design of the new qPCR assays was essentially done as detailed in our previous study (Sakai et al. 2015). Briefly, the pneumococcal cps locus sequences were obtained from the GenBank database. Accession numbers are listed in Table 1. Specificity in silico was analyzed using the National Center for Biotechnology Information Basic Local Alignment Search Tool. Once specific sequences were selected within each locus, assays were designed utilizing software from Integrated DNA Technologies (http://www.idtdna.com/site). All qPCR primers and probes were synthesized at Sigma-Aldrich Co. The assays were performed utilizing Platinum Quantitative PCR Super Mix-UDG (Life Technologies, Carlsbad, California, USA) and 2.5 μl of pure DNA as template in a CFX96 real-time PCR-detection system (Bio-Rad, Hercules, California, USA).

Specificity of newly developed qPCR assays
Specificity was investigated as previously described (Sakai et al. 2015). In short, two DNA libraries were prepared: one containing DNA from all known pneumococcal serotypes (N = 94) and a second library containing DNA from related bacterial species that inhabit the upper respiratory airways (N = 20). Purified DNA preparations were adjusted to 40 pg/μl, and 150 μl of each DNA was transferred to a 96-well microtiter plate and stored at -80 °C until use. Specificity testing utilized 2.5 μl of DNA, equivalent to 100 pg (i.e. 4.29 × 10⁴ genome equivalents), which was transferred from the libraries to each reaction plate. Reactions were run in a CFX96 real-time PCR-detection system.

Cloning and extraction of synthetic vector containing all qPCR target sequences for serotyping
The qPCR target sequences of all serotype-specific single-plex assays (N = 66), as well as lytA, were sequentially assembled into a single sequence using the DNASTAR Lasergene software version 11.2.1 (DNASTAR Inc., Madison, Wisconsin, USA). The final nucleotide sequence (∼8.2 kb) was synthesized and then subcloned within pUC57-Amp-modified (GenScript, Piscataway, New Jersey, USA). The synthetic vector (∼10.9 kb), named pNUversa (Fig. 1), was transformed into TOP10 competent Escherichia coli (Invitrogen, Carlsbad, California, USA) according to the manufacturer's instructions. pNUversa was purified from E. coli stocks using the QIAprep Spin Miniprep Kit (Qiagen, Hilden, Germany), eluted in EB buffer and quantified using the Nanodrop system. pNUversa was utilized as template in PCR reactions to produce the synthetic product, hereafter called NUversa.

Long PCR and purification of the synthetic DNA (NUversa)
To produce the synthetic DNA fragment NUversa, long PCR was conducted with plasmid DNA from pNUversa as the template, Platinum Taq DNA Polymerase High Fidelity (Invitrogen, Carlsbad, California, USA) and the following primers, whose complementary sequences are located upstream and downstream of NUversa: lytA re (5′-TCGTGCGTTTTAATTCCAGCT-3′; 200 nM) and Sero48 re (5′-ATCGGCCAAAAGTTATCATTAGC-3′; 200 nM). Reaction conditions were 30 s at 94 °C for initial denaturation, followed by 35 cycles of 15 s at 94 °C, 30 s at 60 °C and 9 min at 68 °C for amplification. The PCR product was confirmed by electrophoresis in a 1% agarose gel (Standard Low -m r Agarose, Bio-Rad Laboratories Inc., Hercules, California, USA) with SYBR Safe staining (Invitrogen, Carlsbad, California, USA). The obtained NUversa product was purified with the QIAquick PCR Purification Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions, eluted with EB buffer, evaluated for quality and quantified using the Nanodrop system. The purified NUversa product was stored at -80 °C until use.

Validation and optimization of quantitative PCR assays
Eleven newly designed qPCR assays (Table 1) were validated for their specificity using DNA from 94 pneumococcal serotypes and DNA from 20 non-pneumococcal species, most of which are inhabitants of the human upper airways, including several streptococcal species. Reactions were 100% specific as they only detected DNA in wells where the target pneumococcal serotype was added (not shown). Ct values of a typical positive reaction were between 20 and 30 cycles. Table 1 shows the optimal concentration that yielded a recommended reaction efficiency of between 90% and 110% and a correlation coefficient of nearly 1.000.
Table 1 footnote: Probes were labeled at 5′ with FAM (6-carboxyfluorescein) and at 3′ with BHQ1 (Black Hole Quencher-1).
The LOD was also evaluated for each assay using serial dilutions of the standards. Quantitative assays demonstrated an LOD of 5-50 fg per reaction, or ∼2 to ∼20 genome equivalents per reaction. Altogether, these new qPCR assays demonstrated high specificity, reaction efficiency sufficient for quantification, and detection of very low numbers of genome equivalents.
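As an illustration of how an LOD can be read from a dilution series, the sketch below picks the lowest standard amount that still gives a positive reaction. It is a minimal example, not the authors' procedure: the function name `limit_of_detection` is hypothetical, the Ct values are invented for illustration, and treating a reaction as positive when Ct is strictly below the cut-off of 40 is an assumption, since the text does not say how a Ct of exactly 40 is handled.

```python
CT_CUTOFF = 40.0  # Ct cut-off stated in the Methods


def limit_of_detection(ct_by_standard, cutoff=CT_CUTOFF):
    """Return the smallest standard amount with a positive reaction, or None.

    ct_by_standard maps the standard amount (here fg per reaction) to the
    observed Ct; None means no amplification was detected.
    ASSUMPTION: a reaction counts as positive when Ct < cutoff.
    """
    positives = [
        amount
        for amount, ct in ct_by_standard.items()
        if ct is not None and ct < cutoff
    ]
    return min(positives) if positives else None


# HYPOTHETICAL Ct values for the seven standards (1 ng down to 5 fg, in fg).
example_run = {
    1e6: 18.2,
    1e5: 21.6,
    1e4: 25.0,
    1e3: 28.5,
    100: 31.9,
    50: 33.1,
    5: None,
}

if __name__ == "__main__":
    print("LOD (fg per reaction):", limit_of_detection(example_run))
```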

Efficiency of quantitative reactions using NUversa is similar to those obtained using genomic DNA
We engineered a synthetic DNA molecule containing targets for most, if not all, qPCR reactions available in the literature (N = 66) including the lytA target and those sequences presented in this study that together target 94 pneumococcal serotypes. The synthetic DNA molecule was cloned into plasmid pUC57-Amp-modified and this new plasmid was named pNUversa (Fig. 1). The sequence of pNUversa was deposited in GenBank with accession number MF540153. Purified pNUversa was then utilized as a template in PCR reactions to generate linear DNA, called NUversa, which was used as a standard in lytA-based qPCR reactions. We first evaluated the quantitative nature of qPCR reactions targeting the lytA gene using NUversa DNA standards, and compared these results with those obtained utilizing the traditional standard, genomic DNA from TIGR4. Efficiency of reactions from four independent experiments (each performed in duplicate) with NUversa (94.1 ± 1.3) or genomic DNA (94.1 ± 3.2) fell within the recommended efficiency of >90% and <110% (Table 2). Regression curves showed a similar R², 0.9993-1.000 when using TIGR4 genomic DNA and 0.991-0.997 when NUversa PCR product was utilized (Fig. 2).
Figure 2. Linearity of qPCR reactions utilizing NUversa or genomic DNA. NUversa (orange) and genomic DNA (blue) were serially diluted to obtain seven standards (detailed in Methods) spanning 5 through 1.13 × 10⁶ and 2 through 4.29 × 10⁵ genome equivalents, respectively. Genome equivalent standards were utilized as template in qPCR reactions targeting the lytA gene. Plots represent the mean cycle threshold (Ct) values. Standard errors were calculated from four different replicates and each replicate included duplicate reactions. Regression equations (y), coefficient of determination (R²) and reaction efficiency are shown in the insets.
Standard deviations of Ct values obtained with the different standards ranged from 0.2555 to 0.3621 when utilizing TIGR4 genomic DNA, whereas those utilizing NUversa ranged from 0.4904 to 1.4602. Standard error of the means was 0.1278-0.1811 with TIGR4 genomic DNA and 0.2452-1.0325 with NUversa PCR product (Table 2). Further statistical analysis showed no significant difference when R², the slope or the y-intercept were compared by two-tailed t-test (P > 0.05, all cases).
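The regression parameters discussed above (slope, y-intercept, R² and reaction efficiency) can be derived from a dilution series as sketched below. This is an illustrative least-squares fit of Ct against log10 of the genome equivalents, not the software used in the study. The efficiency formula, E = (10^(-1/slope) - 1) × 100, is the conventional one and is assumed here rather than quoted from the paper; the function name and the example Ct values are hypothetical.

```python
import math


def fit_standard_curve(genome_equivalents, ct_values):
    """Fit Ct = slope * log10(quantity) + intercept by ordinary least squares.

    Returns (slope, intercept, r_squared, efficiency_percent).
    ASSUMPTION: efficiency is computed with the conventional formula
    (10 ** (-1 / slope) - 1) * 100, so a slope near -3.32 gives about 100%.
    """
    x = [math.log10(q) for q in genome_equivalents]
    y = list(ct_values)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    efficiency_percent = (10 ** (-1.0 / slope) - 1.0) * 100.0
    return slope, intercept, r_squared, efficiency_percent


if __name__ == "__main__":
    # HYPOTHETICAL mean Ct values for a ten-fold dilution series.
    quantities = [4.29e5, 4.29e4, 4.29e3, 4.29e2, 4.29e1]
    cts = [19.1, 22.6, 26.0, 29.5, 33.1]
    slope, intercept, r2, eff = fit_standard_curve(quantities, cts)
    print(f"slope={slope:.3f} intercept={intercept:.2f} R2={r2:.4f} efficiency={eff:.1f}%")
```

An efficiency between 90% and 110% corresponds to a slope of roughly -3.59 to -3.10 under this formula.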

The synthetic DNA (NUversa), but not the plasmid, can be utilized as standards for serotype-specific qPCR reactions
Given that current real-time platforms are utilizing plasmid DNA as standards, we next investigated whether NUversa and pNUversa (i.e. the plasmid encoding NUversa) could be utilized as DNA standards for serotype-specific single-plex qPCR reactions. Serial dilutions of genomic DNA, pNUversa or NUversa were made and utilized as template in serotype-specific qPCR reactions targeting PCV13 serotypes (i.e. 1, 3, 4, 5, 6A, 6B, 7F, 9V, 14, 18C, 19A, 19F and 23F). Because most strains from which the genomic DNA was extracted have not been genome sequenced, the approximate genome equivalent was estimated using the whole genome size of serotype 4 strain TIGR4 (Tettelin et al. 2001). As shown in Table 3, Fig. 3 and Supplementary Fig. S1, reaction efficiency of most qPCR assays using any of the standards fell within the recommended efficiency of >90% and <110%. The y-intercept, however, was only similar in reactions utilizing genomic DNA and NUversa, whereas reactions that utilized the plasmid pNUversa were statistically different from those with genomic DNA and those with NUversa (Table 3).
Table 3 footnotes: (a) Serotypes of strains for chromosomal DNA preps were 1, 3, 4, 5, 6A, 7F, 9V, 14, 18C, 19F, 19A and 23F. (b) Sets of parameter values from NUversa and pNUversa reactions were statistically compared to values from chromosomal DNA reactions with an unpaired, two-tailed t-test.
Figure 3. Linearity of qPCR reactions utilizing genomic DNA, NUversa or plasmid pNUversa. Genomic DNA (blue), NUversa (orange) or pNUversa (green) were serially diluted to obtain seven (genomic DNA and NUversa) or eight (pNUversa) standards (detailed in Methods) spanning 2 through 4.29 × 10⁵, 5 through 1.13 × 10⁶ and 4 through 8.52 × 10⁶ genome equivalents, respectively. Genome equivalent standards were utilized as template in serotype-specific qPCR reactions targeting (A) serotype 1, (B) serotype 3, (C) serotype 4 or (D) serotype 5. Plots represent the mean cycle threshold values obtained from duplicate reactions. Regression equations, coefficient of determination (R²) and reaction efficiency are shown in the insets.
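To see why a shifted y-intercept matters even when the slope (and hence the efficiency) is unchanged, consider how a sample Ct is converted back to a quantity from the standard curve. The sketch below uses invented curve parameters and a hypothetical function name, `quantity_from_ct`; it only illustrates the arithmetic and does not reproduce values from Table 3.

```python
def quantity_from_ct(ct, slope, intercept):
    """Invert Ct = slope * log10(quantity) + intercept to obtain the quantity."""
    return 10 ** ((ct - intercept) / slope)


if __name__ == "__main__":
    # HYPOTHETICAL standard curves with identical slopes but different intercepts.
    slope = -3.4
    intercept_linear = 38.0   # e.g. a curve built from a linear standard
    intercept_plasmid = 40.0  # e.g. a curve whose standards amplify later

    sample_ct = 30.0
    q_linear = quantity_from_ct(sample_ct, slope, intercept_linear)
    q_plasmid = quantity_from_ct(sample_ct, slope, intercept_plasmid)

    print(f"estimate from linear-standard curve:  {q_linear:.0f} copies")
    print(f"estimate from plasmid-standard curve: {q_plasmid:.0f} copies")
    print(f"fold difference: {q_plasmid / q_linear:.2f}")
```

With equal slopes, the fold difference depends only on the intercept shift: it equals 10 raised to the power (intercept difference divided by the absolute slope), so a higher intercept assigns a larger quantity to the same sample Ct.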

DISCUSSION
In this study, NUversa, a synthetic DNA molecule, was thoroughly characterized for its use as a standard in quantitative single-plex reactions for PCV13 pneumococcal serotypes including the pan-pneumococcus lytA assay. NUversa has the potential to be utilized in reactions targeting most pneumococcal serotypes. We also developed, tested and optimized eleven new real-time qPCR assays for the detection and quantification of 16 pneumococcal serotypes/serogroups belonging to NVT. While these NVT strains are less likely to be isolated from cases of PD, the emergence of NVT due to serotype replacement after vaccine introduction warrants development of assays for the detection of potential emergent strains and their use in epidemiological surveillance programs (Wyres et al. 2013; Mosser et al. 2014; van der Linden et al. 2015).
To the best of our knowledge, many of the molecular assays (i.e. conventional PCR, qPCR, etc.) for the detection of target pneumococcal serotypes that were presented in this study have not previously been available. These new reactions, along with other qPCR reactions published by other groups, including the CDC, complete the coverage of the 94 pneumococcal serotypes for which an antiserum is available. We acknowledge that single-plex reactions for pneumococcal serotyping have a limited use in epidemiological studies because of the associated cost, need for increased amounts of DNA sample and increased time required to perform single-plex assays. However, validation studies conducted in this manuscript will allow qPCR assays, primers and probes described here to be incorporated into high-throughput technology such as multiplex qPCR schemes (Pimenta et al. 2013) or a recently developed TAC for pneumococcal serotyping (Pholwat et al. 2016). The TAC, for example, detects 78 different pneumococcal serotypes. It utilizes either DNA from Streptococcus pneumoniae isolates or DNA purified from clinical specimens as a template. This method yields fast and comprehensive serotyping in comparison to the standard method of Quellung reactions, but currently misses detection of >10 pneumococcal serotypes. The new reactions developed and validated here can be added to the TAC method, which will increase the number of molecular targets to 94 pneumococcal serotypes.
Another contribution of the studies presented within this manuscript was the characterization of NUversa and pNUversa. NUversa includes sequences from 67 qPCR targets that together detect 94 serotypes and performs as well as genomic DNA when utilized as a DNA standard in qPCR lytA-based reactions. This synthetic DNA can be utilized as a positive control or as a DNA standard for quantitative reactions in any laboratory where single-plex qPCR is performed. Moreover, with further optimization, NUversa could also be used as a control in multiplex reactions. Supporting this, we tested NUversa in two different multiplex reactions published by the CDC (Pimenta et al. 2013), and NUversa was detected by all reactions within these multiplex reactions (not shown).
Using NUversa will allow a faster turnaround of data and findings, and it will be particularly helpful in resource-limited settings, where maintaining a strain library required for current methods is prohibitively expensive and time-consuming. To calculate genome equivalents per mass unit (i.e. in silico), the genome size of the strain utilized to prepare DNA standards is required. A reference, genome-sequenced, TIGR4 strain has been classically utilized to prepare standards for quantitative reactions targeting the pan-pneumococcus lytA qPCR assay (Tettelin et al. 2001; Carvalho Mda et al. 2007). This strain has been widely distributed among research laboratories, and it is accessible to clinical laboratories. However, obtaining widely accessible pneumococcal strains belonging to all known serotypes, to be utilized as standards for serotype-specific single-plex qPCR reactions, represents a challenge. Given that NUversa also contains the target for the pan-pneumococcus lytA assay, quantification using NUversa will be standardized in all studies, avoiding the use of subjective quantitative numbers obtained from DNA of poorly characterized strains as a DNA standard for constructing the standard curve. pNUversa, the plasmid encoding the NUversa synthetic sequence, was designed in a manner that allows for more sequences to be cloned into it if more serotypes are discovered in the years to come, or if new improved reactions, i.e. targets, become available for the current serotypes.
Different multiplex molecular platforms for the detection of human pathogens have been developed during the last few years (Kodani and Winchell 2012;Pholwat et al. 2015). These platforms commonly utilize a plasmid containing targets for reactions detected by the systems. According to Hou et al., however, a circular plasmid is not recommended as a quantification standard for real-time PCR assays because it frequently forms supercoiled structures; this unknown conformational structure may restrict polymerase reaction and cause delayed Ct of corresponding standards (Hou et al. 2010).
In the current study, detection of target sequences in a plasmid (pNUversa) did not pose major issues, but quantification with this system proved to be inaccurate and required a synthetic linear DNA molecule. Absolute quantitative assays for PCV13 serotypes were conducted to compare the reactivity of chromosomal DNA and two types of synthetic DNA: a circular plasmid and the NUversa PCR product. Whereas the regression curves generated from chromosomal DNA were highly similar to those from NUversa PCR product, a significant discrepancy was found between the y-intercepts of the chromosomal DNA reactions and those of the circular pNUversa reactions. These high y-intercept values with pNUversa caused apparent overestimation of quantification values when compared to both chromosomal DNA and NUversa. Using a plasmid as a standard for quantification purposes, at least in this case, overestimates the quantity corresponding to a given Ct value. Together, these data indicate that synthetic NUversa is as efficient as genomic DNA to quantify genome equivalents in serotype-specific qPCR reactions. We did not, however, perform qPCR assays for the remaining NVT, which is a limitation of our study. Another limitation is the diversity of normal flora strains utilized for the validation studies. Whereas specificity was investigated using DNA from all 94 serotypes and from 20 different normal flora streptococci and other respiratory bacteria, we did not include all potential bacterial species that can colonize the human upper airways.
In this study, the whole genome size of serotype reference strains was estimated utilizing the genome size of TIGR4. However, this estimation may cause inaccurate quantification since the whole genome size might vary between serotypes. For further assessment of this synthetic DNA, lytA qPCR assays with TIGR4 chromosomal DNA and NUversa PCR product were compared. Since the genome sizes of both DNA preparations were available, specific genome equivalents for each standard could be estimated. The obtained regression curves from these two sources of standards were highly similar and no significant (i.e. statistically) discrepancy was observed. This result emphasized the significance of linearity and accurate gene size for absolute quantification, and demonstrated that NUversa PCR product can be utilized instead of chromosomal DNA in the assay (Hou et al. 2010).
In summary, the synthetic NUversa DNA and the new single-plex reactions included in this study will be useful to many in the field conducting surveillance of the pneumococcus and pneumococcal serotypes by qPCR. Integrating NUversa and these qPCR assays with other previously reported high-throughput systems, such as the TAC system, will improve investigations of disease trends in this age of serotype replacement post-vaccination. These advancements will greatly contribute to the ongoing monitoring and evaluation of pneumococcal immunization programs around the world.