Emilie Lefèvre, Courtney M Gardner, Claudia K Gunsch, A novel PCR-clamping assay reducing plant host DNA amplification significantly improves prokaryotic endo-microbiome community characterization, FEMS Microbiology Ecology, Volume 96, Issue 7, July 2020, fiaa110, https://doi.org/10.1093/femsec/fiaa110
ABSTRACT
Due to the sequence homology between the bacterial 16S rRNA gene and plant chloroplast and mitochondrial DNA, the taxonomic characterization of plant microbiomes using amplicon-based high-throughput sequencing often results in the overwhelming presence of plant-affiliated reads, preventing the thorough description of plant-associated microbial communities. In this work we developed a PCR blocking primer assay targeting the taxonomically informative V5–V6 region of the 16S rRNA gene in order to reduce plant DNA co-amplification and increase diversity coverage of associated prokaryotic communities. Applied to the characterization of the prokaryotic endophytic communities of Zea mays, Pinus taeda and Spartina alterniflora leaves, our assay significantly reduced the proportion of plant reads, yielded 20 times more prokaryotic reads and tripled the number of detected OTUs compared to a commonly used V5–V6 PCR protocol. To expand the application of our PCR-clamping assay across a wider taxonomic spectrum of plant hosts, we additionally provide an alignment of chloroplast and mitochondrial DNA sequences encompassing more than 200 terrestrial plant families as a supporting tool for customizing our blocking primers.
INTRODUCTION
Over the last decade, the increasing amount of surveys applying next-generation sequencing technologies to the study of microbial communities has unquestionably improved our appreciation of the extent of the microbial taxonomic and functional diversity, as well as the importance of the microbial world for ecosystem functioning (Delgado-Baquerizo et al. 2016). Plant-associated microbial communities are no exception to this trend. Surveys describing the structure and diversity of plant-associated fungal and prokaryotic communities across various host species and habitats have led to the realization that they are a source of incredible biological diversity and are essential to key ecosystem processes (Moyes et al. 2016; Laforest-Lapointe et al. 2017; Fitzpatrick et al. 2018a; Christian, Herre and Clay 2019). However, the taxonomic characterization of plant microbiome using direct amplicon sequencing can pose important methodological challenges. Due to the sequence homology between bacterial 16S rRNA gene and host chloroplast and mitochondrial DNA, universal prokaryotic PCR primers are unable to discriminate well between plant and prokaryotic DNA, which often leads to the overwhelming presence of plant-affiliated reads in the resulting DNA sequence data sets. Indeed, chloroplasts and mitochondria derive from the acquisition of photosynthetic prokaryotic endosymbionts and α-proteobacteria, respectively, by other unicellular organisms (Howe et al. 2008; Morley and Nielsen 2017). Therefore, the presence of reads affiliated to cyanobacteria and Rickettsiales in sequence data sets is likely indicative of chloroplast and mitochondrial DNA contamination. The removal of such reads, though possible during posterior in silico sequence analyses, unfortunately often leaves an insufficient number of sequences to allow proper coverage of the actual microbial diversity.
While co-amplification of plant DNA has been reported in many plant tissue and species, e.g. Phloem of stone fruit trees (Eichmeier et al. 2019), potato tuber (Kõiv et al. 2015), latex of Euphorbia spp. (Gunawardana et al. 2015), Sugarcane stalks (de Souza et al. 2016), grapevine, broccoli and Medicago truncatula roots (Faist et al. 2016; Yaish et al. 2016; Gadhave et al. 2018), barley seeds (Yang et al. 2017), the problem is clearly more critical in plant above-ground green tissue (Arenz et al. 2015; Müller et al. 2015; de Souza et al. 2016; Jakuschkin et al. 2016), where organellar DNA is usually present in higher quantity. Indeed, it has been estimated that mesophyll cells contain an average of 50 chloroplasts, each containing up to 1000 genome copies (Cole 2016). Mesophyll cells can also harbor between ∼200 to 600 mitochondria (Cole 2016), each containing from 2 to 24 copies of their genome (Oldenburg, Kumar and Bendich 2013). Thus, each single leaf cell can contain up to 50 000 and 14 400 copies of chloroplast and mitochondrial genome, respectively. In a survey examining the microbial communities associated with sugarcane, De Souza et al. (2016) observed indeed that the level of plant-DNA contamination was clearly more important in leaves and upper stalks than in roots. Moreover, though co-amplification of plant DNA can also occur in epiphytic and rhizospheric microbial community studies (Kembel et al. 2014; Fitzpatrick et al. 2018a), it is usually much more problematic for the study of endophytic communities (de Souza et al. 2016). Additionally, given that the bacterial load in plant tissue is usually not as high as in other habitats (e.g. two orders of magnitude lower than in soil, Friesen et al. 2011; Pershina et al. 2015), the ratio ‘plant DNA: bacterial DNA’ can be extremely high, resulting in the preferential amplification of plant DNA relative to prokaryotic DNA.
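The upper bounds quoted above follow from multiplying organelle counts by genome copies per organelle. A minimal sketch of that arithmetic, using only the figures cited in the text (the variable and function names are illustrative):

```python
# Upper-bound organellar genome copies per mesophyll cell,
# computed from the figures quoted in the paragraph above.
CHLOROPLASTS_PER_CELL = 50            # average chloroplasts per mesophyll cell
COPIES_PER_CHLOROPLAST_MAX = 1000     # up to 1000 genome copies each
MITOCHONDRIA_PER_CELL_MAX = 600       # upper end of the ~200-600 range
COPIES_PER_MITOCHONDRION_MAX = 24     # upper end of the 2-24 range


def max_genome_copies(organelles: int, copies_each: int) -> int:
    """Return the maximum number of genome copies contributed by one organelle type."""
    return organelles * copies_each


chloroplast_copies = max_genome_copies(CHLOROPLASTS_PER_CELL, COPIES_PER_CHLOROPLAST_MAX)
mitochondrial_copies = max_genome_copies(MITOCHONDRIA_PER_CELL_MAX, COPIES_PER_MITOCHONDRION_MAX)

print(f"chloroplast genome copies per cell (max): {chloroplast_copies}")
print(f"mitochondrial genome copies per cell (max): {mitochondrial_copies}")
```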
The issue of contaminant DNA co-amplification is not exclusive to plant microbiome studies though, and techniques aiming to reduce co-amplification of non-target DNA have been developed in other research areas. Among the most frequently mentioned is the PCR-clamping approach, which uses modified nucleic acid oligomers to block the amplification of non-target DNA. This approach has been successfully applied to diet studies, to exclude predator DNA (Vestheim and Jarman 2008; Terahara et al. 2011; Leray et al. 2013; Piñol et al. 2015; Robeson et al. 2018; Su et al. 2018), in biological anthropology, to exclude modern human DNA (Gigli et al. 2009; Boessenkool et al. 2012), and in ecological studies, to exclude non-target organisms (Von Wintzingerode et al. 2000; Powell et al. 2012; Wilcox et al. 2014; Tan and Liu 2018; Banos et al. 2018; Clerissi et al. 2018). Using this approach, blocking non-target DNA is achieved either through the addition of blocking primers, which are similar to conventional PCR primers, but to which a C3 spacer is incorporated at the 3' end (Vestheim and Jarman 2008), or PNA clamps, which are uncharged nucleic acid analogs able to form stable and strong duplexes with DNA (Karkare and Bhatnagar 2006). Because neither is recognized by the Taq DNA polymerase, their specific annealing to the non-targeted DNA prevents its amplification (Fig. 1). Although to a much lesser extent, the approach has also been proposed for plant microbiome studies (Lundberg et al. 2013; Arenz et al. 2015). Lundberg et al. (2013), who designed a PCR PNA clamping assay for the suppression of plant chloroplast and mitochondrial DNA, were able to reduce the proportion of plant-affiliated reads from ∼95 to 25% in Arabidopsis thaliana leaves, and its application to sugarcane microbiome also resulted in a substantial reduction of plant reads (de Souza et al. 2016).
However, applied to the microbiome of other plant taxa, the effectiveness of the assay was not as pronounced, suggesting PNA clamps must be customized to host genomes. Lundberg et al. (2013) had to discard over 95% of plant-affiliated reads obtained from Oryza sativa leaves despite the addition of the PNA clamps, which left an absolute number of bacterial reads per sample below 1000. Liu et al. (2018), and Carrell, Carper and Frank (2016) used the same protocol in apple tree and conifers, respectively, and obtained an even lower sequencing depth (720 ± 684 and 42–493 reads per sample, respectively). In fact, Fitzpatrick et al. (2018b) recently confirmed that the blocking efficiency of the chloroplast-specific PNA clamp originally designed by Lundberg et al. (2013), highly varied across plant taxa (ranging from 6% to 100%). Similarly, corn mitochondrial and chloroplast DNA blocking primers designed by Moronta-Barrios et al. (2018) although effective when the hosts presented a high endophytic infection load presented a weak blocking efficiency when endophytic infection was low, indicating that despite the above-mentioned efforts, there is still an important need for the development of tools improving the study of plant-microbiome communities.
Mode of action of competitive PCR clamping using blocking primers and PNA clamps. In this illustration, blocking primers and PNA clamps specific to the plant DNA overlap the binding site of the universal prokaryotic primer, preventing its annealing to the plant DNA. Because the Taq polymerase does not recognize either the C3 spacer at the 3’ end of the blocking primer or the PNA-DNA duplex, it cannot extend from it, hence preventing the amplification of plant DNA. (A) Co-amplification of plant and endophytic DNA when neither blocking primers nor PNA clamps are added to the PCR mix. (B) Preferential amplification of endophytic DNA when blocking primers (B1) or PNA clamps (B2) specific to the plant DNA are added to the mix.
In this context, the objective of our work was to develop a PCR-clamping assay to further improve the characterization of plant-associated prokaryotic communities. Zea mays, Pinus taeda and Spartina alterniflora were used to validate the PCR-clamping protocol designed here. As co-amplification of plant DNA has been shown to be more problematic within plant green tissue (de Souza et al. 2016), we chose to validate our assay on the endophytic microbial community inhabiting the phyllosphere of these three hosts. To this aim, we first evaluated the potential of three commonly targeted regions of the 16S rRNA gene (i.e. V1–V2, V3–V4 and V5–V6) for the design of our PCR-clamping assay. Second, within the selected region, we identified a suitable locus for the design of blocking primers and PNA clamps specific to plant chloroplast and mitochondrial DNA. Third, we determined which type of blocking oligomers (i.e. blocking primers or PNA clamps) performed best in reducing plant DNA co-amplification. Finally, we provided an alignment of chloroplast and mitochondrial DNA sequences encompassing 203 terrestrial plant families as a supporting tool for the customization of our PCR-clamping assay, allowing its application across a wider taxonomic spectrum of plant hosts.
MATERIALS AND METHODS
Sampling and DNA extraction
Leaf tissue from corn (Zea mays, F. poaceae), loblolly pine (Pinus taeda, F. pinaceae) and saltmarsh cordgrass (Spartina alterniflora, F. poaceae) was collected for the purpose of this study. Corn leaves were collected in May 2015 from 56-day-old individuals growing in a greenhouse on Duke University campus (Durham NC, US). Pine needles were collected in August 2016 from trees growing on Duke University campus. Cordgrass leaf material was collected in May 2016 in Bay Jimmy (Louisiana, US) from a site affected by the 2010 Deepwater Horizon oil spill (Lumibao et al. 2018). To remove epiphytic communities from leaf surfaces, plant material was surface-sterilized in successive baths of 95% ethanol for 10 sec, 0.5% sodium hypochlorite for 2 min, and 70% ethanol for 2 min and air-dried under a laminar flow hood for at least 15 min. About 50 mg of Z. mays leaf material was stored at −80°C. P. taeda needles were cut into fragments of approximately 2 mm2 and 96 of these fragments were stored at −80°C. Finally, S. alterniflora leaf material was also cut into fragments of approximately 2 mm2 and 96 of these fragments were stored in 1 ml of CTAB solution (2% CTAB, 0.02 M EDTA, 0.1 M Tris, 1.4 M NaCl final concentrations) at −20°C. For the DNA extraction procedure, plant material was transferred into sterile aluminum dishes (for S. alterniflora, the CTAB solution was previously removed by pipetting) and dried in a laboratory oven for an hour at 55°C. Dry plant material was then transferred into a sterile porcelain mortar, ground first in liquid nitrogen for 1 min using a pestle, and a second time with additional liquid nitrogen for 30 sec. From this step on, the ground material was processed for microbial genomic DNA extraction using the MoBio PowerSoil DNA isolation kit (MoBio, Carlsbad, CA, US) following manufacturer's instructions. Isolated DNA was re-suspended in 50 μl of molecular water and quantified using a Qubit 2.0 fluorometer. 
In order to obtain highest DNA yields, tissue storage and cell disruption methods were optimized for each host considered in this study and though these methods differed between hosts, the objective of the present study was not to compare host species, but compare the efficiency of the PCR-clamping approach, for each host.
MiSeq Library preparation for the evaluation of regions assays
The four bacterial universal primer sets, 27F-338R, 338F-533R, Pro341F-Pro805R and 799F-1115R targeting the V1–V2, V3, V3–V4 and V5–V6 regions of the 16S rRNA gene, respectively, were used to amplify the endophytic communities inhabiting the leaves of Z. mays, P. taeda and S. alterniflora (Fig. 2). Each primer was modified with the Illumina overhang adapter sequences (in bold, 5´-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-forward_primer-3´ and 5´-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-reverse_primer-3´). For each of the 12 samples (4 primer sets × 3 hosts), PCR reactions were performed in triplicate (36 PCR reactions total, Fig. S1a, Supporting Information). Each 25 μl-PCR reaction contained 25 ng of template DNA, 0.2 μM of each primer, 12.5 μl of 2X KAPA HiFi MasterMix (Kapa Biosystems, Wilmington, MA, US), and was performed on a BioRad T100 thermocycler (BioRad, Carlsbad, CA, US). For the 27F-338R and 338F-533R assays, PCR thermal conditions identical to those used in Manter et al. (2010) and Su et al. (2016), respectively, were applied as they resulted in DNA amplification. For the Pro341F-Pro805R and 799F-1115R assays, however, because PCR thermal conditions used in previous studies (Kembel et al. 2014; Akinsanya et al. 2015, respectively) did not result in any amplification, they had to be re-optimized (Table S1, Supporting Information). For each of the 12 samples, the triplicate PCR reactions were pooled, and 25 μl were purified using AMPure XP beads (Beckman Coulter, Brea, CA, US) according to the manufacturer's protocol. Four replicates of 5 μl of each purified PCR product (48 PCR reactions total, Fig. S1a, Supporting Information) were dual-indexed in a second PCR using the Nextera XT N7XX and N5XX index sequences according to the Illumina ‘16S Metagenomic Sequencing Library Preparation’ workflow.
Products were then purified using AMPure XP beads, quantified on a Qubit 2.0 fluorometer, normalized to the same concentration, and sequenced on the 250 bp paired-end Illumina MiSeq platform using the V3 sequencing technology at the Duke University sequencing facility.
Position and corresponding sequences of the primers, and blocking oligomers (in bold) used and designed in this study. References of studies in which the primers were used are indicated. M = A, C; N = G, A, T, C; B = G, T, C; S = G, C; V = G, A, C; K = G, T; W = T, A; R = G, A. [1] Manter et al. 2010, [2] Su et al. 2016, [3] Akinsanya et al. 2015, [4] Kembel et al. 2014, [5] Durand et al. 2018.
PCR-clamping oligomers design
Because the V5–V6 region of the 16S rRNA gene generated the highest number of taxonomically informative reads and allowed the detection of the highest number of reads corresponding to archaeal endophytes, this region was selected for the design of blocking oligomers. Thus, in the V5–V6 region, blocking oligomers (i.e. blocking primers and PNA clamps) specific to the chloroplast and mitochondrial 16S rRNA gene of Z. mays, P. taeda, and S. alterniflora were designed using an alignment of 264 DNA sequences. The alignment consisted of the chloroplast and mitochondrial DNA sequences of the three hosts considered here, as well as 16S rRNA gene sequences obtained in previous studies from (i) bacterial endophytes isolated from the tissue of Z. mays and S. alterniflora (Figueiredo et al. 2009; Montañez et al. 2009, 2012; Pereira et al. 2011; Glaeser et al. 2014; Kandalepas, Blum and Van Bael 2015; Kämpfer et al. 2016), (ii) a 16S rRNA gene environmental cloning survey on the endophytic communities of Z. mays (Chelius and Triplett 2001) and (iii) a trial 16S rRNA gene environmental cloning survey we realized in our laboratory on the endophytic communities of S. alterniflora using the 799F-1115R primer set (Genbank accession no. MK920987, MK922281 and MK922282). Details of the sequences included in the alignment and their mismatches with the PCR primers, and blocking oligomers used and designed in this study for the V5–V6 region can be found in Table S2 (Supporting Information). The sequences were aligned using MAFFT (Katoh and Standley 2013), and blocking oligomers were manually designed in Mesquite (http://www.mesquiteproject.org). Because it has been previously shown that PCR clamping using the competitive approach was more efficient than that using the elongation arrest approach (Von Wintzingerode et al. 2000; Vestheim and Jarman 2008), the former strategy was selected for this study. 
However, because competitive PCR clamping requires that the blocking oligomers overlap one of the universal PCR primers, we had to redesign a universal forward primer (819F: 5´-GTCCACVCCSTAAACGWTG-3´) to be used instead of the 799F primer. Blocking primers specific to chloroplast and mitochondrial DNA, which overlapped the 3´ end of the 819F primer, could then be designed (namely 819F_Blocking_Chloro_C3, 819F_Blocking_Mito_C3, and 819F_Blocking_Chloro_PNA, 819F_Blocking_Mito_PNA, Fig. 2; Table S2, Supporting Information). PNA clamps were designed in the same position as the blocking primers but their sequences were shortened in order to meet the conditions required for efficient blocking, that is a length between 13 and 17 bp, an annealing temperature higher than that used for the PCR primers, and a melting temperature above that of the extension cycle (Lundberg et al. 2013; Fig. 2). Although our initial objective was to design blocking oligomers highly specific to all three hosts considered here, this was not possible for the pine, with which our blocking oligomers presented between 1 and 2 mismatches (Table S2, Supporting Information). The 1115R primer (AGGGTTGCGCTCGTTG) was also slightly modified (to 1115Rmod: 5´-AGGGTTGCGCTCGTTRC-3´), in order for its sequence to better match the endophytes represented in our alignment, especially endophytes within the orders Pseudomonadales and Oceanospirillales (Table S2, Supporting Information).
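The degenerate positions in 819F (V, S, W) and 1115Rmod (R) use the IUPAC codes listed in the primer figure legend. As an illustration only, the sketch below expands a degenerate primer into the plain-base sequences it encodes and counts mismatches against a binding site. The function names are hypothetical, and the comparison assumes the site is written in the same 5´ to 3´ orientation as the primer; it is not the procedure used to build Table S2.

```python
from itertools import product

# IUPAC codes from the primer figure legend, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "K": "GT",
    "B": "CGT", "V": "ACG", "N": "ACGT",
}

PRIMER_819F = "GTCCACVCCSTAAACGWTG"
PRIMER_1115RMOD = "AGGGTTGCGCTCGTTRC"


def expand(primer: str) -> list[str]:
    """Return every plain-base sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[code] for code in primer))]


def mismatches(primer: str, site: str) -> int:
    """Count positions where the site base is not allowed by the primer code."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return sum(base not in IUPAC[code] for code, base in zip(primer, site))


variants_819f = expand(PRIMER_819F)
print(len(variants_819f), "variants encoded by 819F")
print(len(expand(PRIMER_1115RMOD)), "variants encoded by 1115Rmod")

# Hypothetical binding site: the first 819F variant with its first base changed.
hypothetical_site = "T" + variants_819f[0][1:]
print(mismatches(PRIMER_819F, hypothetical_site), "mismatch with the hypothetical site")
```

Counting a mismatch only when the site base falls outside the set allowed by the degenerate code means that every expanded variant scores zero against its own primer.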
MiSeq library preparation for the evaluation of PCR-clamping assays
For the blocking primer assays, amplification was also performed on a BioRad T100 thermocycler and the 25 μl-PCR master mix contained 25 ng of template DNA, 0.2 μM of each PCR primer, 12.5 μl of 2X KAPA HiFi MasterMix and 2 μM of each blocking primers. A concentration of each blocking primer 10 times above that of the universal PCR primers was used as it has been shown in previous studies to be optimal (Vestheim and Jarman 2008; Wilcox et al. 2014). In order to obtain a visible PCR product on a 1.5% agarose gel after electrophoresis, the thermal annealing conditions were optimized to 50°C for 30 sec, and the cycle number to 25 (Table S1, Supporting Information). For the PNA clamping assays, the amplification was also performed on a BioRad T100 thermocycler, and the 25 μl-PCR master mix contained 25 ng of template DNA, 0.2 μM of each PCR primer, 12.5 μl of 2X KAPA HiFi MasterMix and 1 μM of each PNA clamp. A concentration of each PNA clamp five times above that of the universal PCR primers was used as recommended by the PNA oligomers supplier (PNABio, Newbury Park, CA, US). In order to obtain a visible PCR product on a 1.5% agarose gel after electrophoresis, the PCR primer annealing step was optimized to 55°C for 60 sec, and the number of cycles to 25. The primer annealing step was preceded by an additional step of 67°C for 60 sec for the annealing of the PNA clamps (Table S1, Supporting Information). The PNA clamp annealing temperature was determined based on the PNA oligomer supplier recommendation, that is 5–10°C lower than the Tm of the PNA oligomers (Fig. 2). The addition of blocking primers and PNA clamps in the PCR mix resulted in the amplification of a broad size range of PCR amplicons. Changes in the PCR protocol to more stringent conditions (e.g. increasing temperature annealing, decreasing annealing time) did not reduce the number or size range of the PCR amplicons. 
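The oligomer excesses stated above (10-fold for blocking primers, 5-fold for PNA clamps) follow directly from the final concentrations in the 25 μl reaction. The sketch below recomputes them and, using C1V1 = C2V2, the stock volume needed per reaction. The stock concentrations are hypothetical placeholders, since the source does not report them.

```python
import math

REACTION_VOLUME_UL = 25.0

# Final concentrations (μM) stated in the text.
FINAL_UM = {"PCR primer": 0.2, "blocking primer": 2.0, "PNA clamp": 1.0}

# Hypothetical stock concentrations (μM); NOT from the source.
STOCK_UM = {"PCR primer": 10.0, "blocking primer": 100.0, "PNA clamp": 100.0}


def fold_excess(oligo: str) -> float:
    """Ratio of an oligomer's final concentration to that of the PCR primers."""
    return FINAL_UM[oligo] / FINAL_UM["PCR primer"]


def stock_volume_ul(oligo: str) -> float:
    """Volume of stock to pipette per reaction, from C1 * V1 = C2 * V2."""
    return FINAL_UM[oligo] * REACTION_VOLUME_UL / STOCK_UM[oligo]


for oligo in FINAL_UM:
    print(f"{oligo}: {fold_excess(oligo):.0f}x primer concentration, "
          f"{stock_volume_ul(oligo):.2f} μl of stock per reaction")
```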
Therefore, we performed, for each PCR-clamping assay, an additional test (henceforth annotated ´cut´) including a step which consisted in excising amplicons having a length between 300 and 500 bp from a 2.3% NuSieve 3:1 agarose gel (Lonza, Basel, CH) after electrophoresis, and purifying the excised products using the NucleoSpin Gel and PCR clean-up kit (Macherey-Nagel, Bethlehem, PA, US, Fig. S1b, Supporting Information). The selected amplicon size range was calculated based on the theoretical length of the expected PCR products, including Illumina adapters, and accounting for interspecies sequence length variation. Therefore, for each PCR-clamping assay (i.e. blocking primer, blocking primer cut, PNA clamp, and PNA clamp cut) and each host, 3 replicate reactions were performed for the first PCR (36 PCR reactions total, Fig. S1b, Supporting Information). The three PCR products were then pooled together, 25 μl was purified using AMPure XP beads, and 50 μl was used for gel electrophoresis separation, excision and purification as described above. Four replicates of 5 μl of each purified PCR product (48 PCR reactions total, Fig. S1b, Supporting Information) were then dual-indexed in a second PCR using the Nextera XT N7XX and N5XX index sequences according to the Illumina ‘16S Metagenomic Sequencing Library Preparation’ workflow. Products were finally purified using AMPure XP beads and quantified on a Qubit 2.0 fluorometer. Each of the 48 PCR reactions was then normalized to the same concentration, pooled together and run on a 250 bp paired-end MiSeq platform using the V3 sequencing technology at the Duke University sequencing facility.
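The reaction counts given above can be checked by enumerating the design. This is a small bookkeeping sketch: the assay and host labels come from the text, while the tuple layout is an arbitrary choice.

```python
from itertools import product

ASSAYS = ["blocking primer", "blocking primer cut", "PNA clamp", "PNA clamp cut"]
HOSTS = ["Z. mays", "P. taeda", "S. alterniflora"]

# First PCR: three replicate reactions per assay and host.
first_pcr = list(product(ASSAYS, HOSTS, range(1, 4)))

# Indexing PCR: four replicates of each pooled, purified product.
index_pcr = list(product(ASSAYS, HOSTS, range(1, 5)))

print(len(first_pcr), "first-round PCR reactions")
print(len(index_pcr), "indexing PCR reactions")
```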
Illumina MiSeq read analyses
QIIME (Caporaso et al. 2010) was used to analyze the reads generated by the Illumina MiSeq sequencing (Fig. S2, Supporting Information). Briefly, after joining forward and reverse reads (Step 2 Fig. S2, Supporting Information), filtering the assembled reads based on quality score and length (Step 3 Fig. S2, Supporting Information) and removing chimeric sequences using DECIPHER (Wright, Yilmaz and Noguera 2012; Step 4 Fig. S2, Supporting Information), a total of 3 424 311 and 1 584 313 clean sequences were obtained for the evaluation of the 16S rRNA gene region assays and the PCR-clamping assays, respectively (Table S3, Supporting Information). Sequences were clustered together in operational taxonomic units (OTUs) using Uclust with a similarity cutoff of 97% (Step 6 Fig. S2, Supporting Information). OTU taxonomic assignment was performed using the RDP classifier against the Greengenes taxonomy reference released in August 2013, with a confidence threshold of 70% (Steps 7–8 Fig. S2, Supporting Information). OTUs classified as ´p_mitochondria´, ´p_cyanobacteria´, ´c_chloroplast´ and ´Unclassified´ were removed from the OTU table using the ´filter_taxa_from_otu_table.py´ QIIME script (Step 9 Fig. S2, Supporting Information), leaving a total of 24 299 and 1 082 594 bacterial sequences for the 16S rRNA gene region assays and the PCR-clamping assays, respectively (Table S3, Supporting Information). To detect significant differences (P ≤ 0.05) in the proportion of reads and OTUs obtained between the PCR-clamping assays performed for each host tested, a series of t-tests was performed. Files containing raw reads for each individual sample have been deposited in the NCBI Sequence Read Archive under the accession number PRJNA544094.
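The taxonomy-based filtering step can be pictured with a toy example. The sketch below is plain Python, not the QIIME script itself. The OTU identifiers, lineage strings and counts are hypothetical, and only the four excluded labels are taken from the text.

```python
# Labels excluded in the text (compared case-insensitively here).
EXCLUDED = {"p_mitochondria", "p_cyanobacteria", "c_chloroplast", "unclassified"}

# Hypothetical OTU table: OTU id -> (lineage string, read counts per sample).
otu_table = {
    "OTU_1": ("k_bacteria; p_cyanobacteria; c_chloroplast", [900, 850]),
    "OTU_2": ("k_bacteria; p_proteobacteria; c_gammaproteobacteria", [12, 30]),
    "OTU_3": ("Unclassified", [5, 0]),
    "OTU_4": ("k_bacteria; p_mitochondria", [400, 380]),
}


def is_excluded(lineage: str) -> bool:
    """True if any rank in the lineage carries one of the excluded labels."""
    ranks = (rank.strip().lower() for rank in lineage.split(";"))
    return any(rank in EXCLUDED for rank in ranks)


def filter_otus(table: dict) -> dict:
    """Keep only OTUs whose lineage carries none of the excluded labels."""
    return {otu: rec for otu, rec in table.items() if not is_excluded(rec[0])}


kept = filter_otus(otu_table)
retained_reads = sum(sum(counts) for _, counts in kept.values())
print(sorted(kept), retained_reads)
```

In this toy table most reads sit in the organellar OTUs, so few reads survive filtering, which is the pattern the read totals above describe.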
RESULTS
16S rRNA gene region assays
The 338F-533R assay generated the highest number of clean reads for the three hosts combined (1 311 491), followed by the 799F-1115R (860 001), the Pro341F-Pro805R (689 765) and the 27F-338R assays (563 054, Table S3, Supporting Information). As expected, based on previous similar studies on plant leaf tissue (Arenz et al. 2015; Müller et al. 2015; de Souza et al. 2016; Jakuschkin et al. 2016), the large majority (>96%) of the reads obtained were affiliated to plant DNA (Fig. 3, Table S4, Supporting Information). Over 99% of the plant-affiliated reads generated by the 27F-338R, 338F-533R, and Pro341F-Pro805R assays were of chloroplast rather than mitochondrial origin. Indeed, plant leaves may contain up to an order of magnitude more copies of chloroplast genomes than mitochondrial genomes (Ma and Li 2015), making chloroplast DNA more likely to be amplified than mitochondrial DNA. An additional likely explanation is that the primers used in each of these assays present overall more mismatches with the hosts' mitochondrial DNA than with their chloroplast DNA (Fig. S3, Supporting Information). Interestingly though, for the corn and cordgrass, but not for the pine, the large majority (>99%) of the plant-affiliated reads generated by the 799F-1115R assay were of mitochondrial origin (Fig. 3). Indeed, the 799F primer, which was originally designed to reduce the co-amplification of plant chloroplast DNA in Z. mays (Chelius and Triplett 2001), presents 5 mismatches with the chloroplast DNA of this host (Fig. S3, Supporting Information) and none with its mitochondrial DNA. Although neither chloroplast nor mitochondrial sequences for S. alterniflora are currently available, the efficiency of the 799F primer in avoiding chloroplast DNA amplification likely extends to other members of the Poaceae, such as S. alterniflora.
This is supported by our alignments of the chloroplast and mitochondrial DNA of 1426 and 128 plant species, respectively (Tables S5, S6, Supporting Information), which show that the 799F primer display the same mismatch configuration with all other members of the poaceae family represented as it does with Z. mays. Interestingly, despite the fact that the 799F primer displayed with the chloroplast and mitochondrial DNA of P. taeda only one less and one more mismatch than it does with Z. mays, respectively, the 799F-1115R assay still resulted in the preferential co-amplification of chloroplast DNA over mitochondrial DNA for P. taeda. Given that the 799F primer presents the same mismatch configuration with most plant species represented in our alignments as it does with P. taeda, (Tables S5, S6, Supporting Information), the efficiency of the 799F primer as designed by Chelius and Triplett (2001) in reducing the amplification of plant chloroplast DNA might likely be restricted to species within the poaceae family. Nevertheless, regardless of the effectiveness of the 799F primer in minimizing the co-amplification of chloroplast DNA, the co-amplification of mitochondrial DNA still remains an issue. As previously demonstrated (Chelius and Triplett 2001; Pugh, Jackson and Pasco 2013), used in combination with the 1492R primer, the 799F primer presents the advantage of generating a shorter bacterial PCR product (ca. 700 bp), that can easily be separated by agarose gel electrophoresis from the longer mitochondrial amplicon (ca. 1000 bp) and excised from the gel, hence allowing the removal of most plant mitochondrial DNA in the process. However, used in combination with the 1115R primer, for the generation of amplicons of suitable length for paired-end Illumina MiSeq sequencing, the resulting 16S rRNA gene bacterial and host mitochondrial amplicon generated are of similar size (333–339 bp for bacterial endophytes and 345 and 349 bp for Z. mays and P. 
taeda, respectively), which prevents their efficient separation by electrophoresis (Fig. S4, Supporting Information). As a result, the removal of plant reads can only be performed a posteriori in silico in this case, and usually leaves a number of sequences insufficient to allow proper characterization of endophytic communities. As a matter of fact, after the removal of plant-affiliated reads, the number of bacterial endophytic reads generated by all assays ranged from 53 ± 36 to 2323 ± 1290 per sample (Fig. 3). While the 338F-533R assay yielded by far the highest number of bacterial reads for the pine and cordgrass (2323 ± 1290 and 1049 ± 987 per sample, respectively, Fig. 3, Table S4, Supporting Information), the large majority (∼97%) could not be affiliated further than to the bacterial kingdom, greatly limiting the taxonomic information needed to thoroughly characterize the microbial community (Fig. S5, Supporting Information). While for the corn and cordgrass, the assay yielding significantly the highest number of informative bacterial reads was the 799F-1115R assay, for the pine, no significant differences could be detected between the 799F-1115R and the Pro341F-Pro805R assays (Fig. 3). These two assays also allowed for a higher proportion of OTUs to be identified to a lower taxonomic level than the other tested assays. Indeed, while on average only 38% and 14% of the reads generated by the 27F-338R and 338F-533R assays could be affiliated down to the genus level, over 50% and 60% of the reads generated by the Pro341F-Pro805R and 799F-1115R assays, respectively, were (Fig. S6, Supporting Information), making the V3–V4 and V5–V6 regions the most informative for the taxonomic characterization of plant-associated microbial communities. However, although the Pro341F-Pro805R primer set was originally designed to allow the amplification of archaeal 16S rRNA gene (Takahashi et al.
2014), the only assay enabling the detection of archaeal endophytes in all hosts was the 799F-1115R assay (Fig. 4; Table S7, Supporting Information). For all the above-mentioned reasons, the V5–V6 region targeted by the 799F-1115R assay was considered the most promising for characterizing plant-associated microbial communities. Nevertheless, even after combining the reads of the four replicates performed for each sample, the number of OTUs detected by the 799F-1115R assay ranged from 65 for the pine to 313 for the corn (Fig. S7, Supporting Information), which is in the range of what some previous studies found in plant green tissue, such as leaves (Jackson et al. 2013; Akinsanya et al. 2015; Carrell, Carper and Frank 2016; Ding and Melcher 2016; Rúa et al. 2016), shoots (Liu et al. 2017a,b) and stems (Campisano et al. 2014; Miyambo et al. 2016), but still likely does not cover the extent of the endophytic diversity present. As a matter of fact, none of the accumulation curves generated for the 799F-1115R assay were close to reaching an asymptote (Fig. S7, Supporting Information), indicating the need to develop a more effective method to reduce the co-amplification of plant DNA. To this end, we designed a PCR-clamping assay based on the 799F-1115R assay, and targeting the V5–V6 region of the 16S rRNA gene.
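An accumulation curve of the kind referred to above tracks how many distinct OTUs have been seen as reads are drawn in random order. A curve that is still climbing at the final read suggests undersampled diversity. The sketch below is illustrative only: the read list is hypothetical, and the function is not the tool used to produce Fig. S7.

```python
import random


def accumulation_curve(read_otus, step=10, seed=0):
    """Return (reads sampled, distinct OTUs seen) points for one random read order."""
    rng = random.Random(seed)      # fixed seed so the example is reproducible
    reads = list(read_otus)
    rng.shuffle(reads)
    seen = set()
    curve = []
    for i, otu in enumerate(reads, start=1):
        seen.add(otu)
        if i % step == 0 or i == len(reads):
            curve.append((i, len(seen)))
    return curve


# Hypothetical sample: one OTU label per read, dominated by a few abundant OTUs.
toy_reads = ["OTU_A"] * 50 + ["OTU_B"] * 20 + ["OTU_C"] * 5 + ["OTU_D", "OTU_E"]

curve = accumulation_curve(toy_reads, step=10)
for n_reads, n_otus in curve:
    print(n_reads, n_otus)
```

In practice the curve would be averaged over many random orderings. A single ordering is used here to keep the sketch short.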
Distribution of clean Illumina Miseq reads obtained for each host and each 16S rDNA region PCR assay performed in this study. Proportion of unclassified reads and reads affiliated to plant and bacterial DNA are indicated on the left panel. Number of unclassified and classified bacterial reads are indicated on the right panel. Error bars represent the standard deviation from the mean of the four replicates performed for each sample (for the 338F-533R assay on pine, only three replicates were considered as the fourth did not generate any reads). Horizontal brackets indicate significant differences between assays, according to a t-test (P ≤ 0.05).
Figure 4. Number of bacterial OTUs detected for each host and each 16S rDNA region PCR assay performed in this study. Error bars represent the standard deviation from the mean of the four replicates performed for each sample (for the 338F-533R assay on pine, only three replicates were considered as the fourth did not generate any reads). Horizontal brackets indicate significant differences between assays, according to a t-test (P ≤ 0.05). An asterisk indicates the detection of archaeal OTUs.
PCR-clamping assays
To further optimize the characterization of endophyte diversity, two blocking primers and two PNA clamps specific to the chloroplast and mitochondrial 16S rRNA gene of the three hosts considered were thus designed in the V5–V6 region of the 16S rRNA gene. Because the competitive PCR-clamping approach, which was selected here for its previously demonstrated higher efficiency in blocking co-amplification of host DNA (Von Wintzingerode et al. 2000; Vestheim and Jarman 2008), requires the blocking oligomers to overlap one of the universal PCR primers, a forward primer, namely, 819F (5´-GTCCACVCCSTAAACGWTG-3´), was re-designed 20 nucleotides ahead of the 799F primer (Table S2, Supporting Information), thus conserving nearly the full length of the targeted region and its associated taxonomic information. The number of mismatches that the blocking primers and the shorter PNA clamps had with the bacterial sequences represented in our alignment ranged from 4 to 11, and 3 to 10, respectively (Table S2, Supporting Information), indicating that they would likely not co-block the amplification of most bacterial endophytes.
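As an illustration of how mismatch counts such as those reported above can be tallied, the following is a minimal sketch in Python. It assumes the oligomer and the target site are already aligned to the same length, and it treats a degenerate position as matching any base covered by its IUPAC code. The function name and the two target sequences are hypothetical and were invented for the example; only the 819F sequence is taken from the text. This is not necessarily the procedure used to build Table S2.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}


def count_mismatches(oligo: str, target: str) -> int:
    """Count positions where the target base is not allowed by the oligo's IUPAC code."""
    if len(oligo) != len(target):
        raise ValueError("oligo and target must be aligned to the same length")
    return sum(t not in IUPAC[o] for o, t in zip(oligo.upper(), target.upper()))


PRIMER_819F = "GTCCACVCCSTAAACGWTG"  # 819F sequence as given in the text

# Hypothetical target sites, invented for illustration only.
matching_site = "GTCCACGCCGTAAACGATG"   # every base is covered by the primer's codes
divergent_site = "GTCCATGCCGTAAACGCTG"  # differs at two positions

print(count_mismatches(PRIMER_819F, matching_site))
print(count_mismatches(PRIMER_819F, divergent_site))
```

The same function could be applied to a blocking oligomer and each bacterial sequence of an alignment to check that the oligomer is unlikely to co-block non-target taxa.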
The proportion of plant-affiliated reads was significantly reduced with the addition of blocking oligomers (Fig. 5, Table S8, Supporting Information). Although the PNA assays produced the highest proportion of bacterial reads for all hosts (Fig. 5), in terms of absolute number, this translated to the lowest number of reads and OTUs (Figs 5 and 6). Previous studies using the PNA PCR-clamping assay originally developed by Lundberg et al. (2013) also obtained a noticeably low number of bacterial reads (Carrell, Carper and Frank 2016; Liu et al. 2018). In that sense, it is interesting that the PNA-clamping assays developed in this study, although they target a different region (V5–V6) from that targeted by Lundberg et al. (2013; V4), also generated a very low number of reads (Fig. 5, Table S8, Supporting Information), possibly indicating that the nature of the PNA oligomers themselves and the mechanism behind their annealing, rather than their nucleotide sequence, reduce overall PCR efficiency. To further exclude non-prokaryotic DNA, for each of the two types of PCR-clamping assays performed (i.e. PNA and blocking primer clamping assays), an additional test consisting of excising amplicons between 300 and 500 bp was performed. This additional step, however, did not result in any significant increase in the proportion of prokaryotic reads. On the contrary, the number of bacterial reads was either significantly reduced (in the case of the blocking primer clamping assay on the cordgrass; P ≤ 0.05) or not significantly different (in all other cases; Fig. 5). This suggests that endophytic bacterial DNA is likely lost during band excision, and that this extra step should not be performed.
Figure 5. Distribution of clean Illumina MiSeq reads obtained for each host and each PCR-clamping assay performed in this study. Proportions of unclassified reads and reads affiliated to plant and bacterial DNA are indicated on the left panel. Numbers of unclassified and classified bacterial reads are indicated on the right panel. Error bars represent the standard deviation from the mean of the four replicates performed for each sample. Horizontal brackets indicate significant differences between assays, according to a t-test (P ≤ 0.05).
Figure 6. Number of bacterial OTUs detected for each host and each PCR-clamping assay performed in this study. Error bars represent the standard deviation from the mean of the four replicates performed for each sample. Horizontal brackets indicate significant differences between assays, according to a t-test (P ≤ 0.05). An asterisk indicates the detection of archaeal OTUs.
The assay that clearly performed best was the blocking primer assay that did not include the band excision step (Fig. 5). For all hosts, the number of reads generated by this assay was on average 20-fold that generated by the 799F-1115R assay, and the number of bacterial OTUs retrieved was nearly tripled (Fig. 6). As a result, for all hosts, individual rarefaction curves generated for the blocking primer assay were practically asymptotic (Fig. S8, Supporting Information), indicating a substantially improved coverage of the endophytic diversity present in the samples. The blocking primer assay also allowed the detection of more major taxonomic groups than any other assay (29, 17 and 26 for the corn, pine and cordgrass, respectively; Table S9, Supporting Information), and was the only PCR-clamping assay allowing for the detection of Poribacteria, Parvarchaeota, Deferribacteres, Elusimicrobia, GN02 and ε-proteobacteria (Table S9, Supporting Information). In addition, although archaeal OTUs were detected by all PCR-clamping assays evaluated in this study, the blocking primer assay was the only one allowing the detection of Archaea in all replicate samples for all hosts (Table S7, Supporting Information), indicating its broader taxonomic detection of prokaryotes. Moreover, as previously reported by Tan and Liu (2018), the addition of blocking oligomers did not drastically change the taxonomic distribution of the bacterial endophytes detected, as the taxonomic groups found to be the most represented using the 799F-1115R assay, that is Actinobacteria, Firmicutes, α-, β- and γ-proteobacteria, were also found to be dominant in the PCR-clamping assays (Fig. S9, Supporting Information). Finally, the addition of blocking oligomers further decreased the proportion of unclassified bacteria (Fig. S9, Supporting Information). In particular for P. taeda, where this aspect was the most problematic, the fraction of unclassified bacteria dropped from nearly 20% (799F-1115R assay) to below 4% with the blocking primer assay (Fig. S9, Supporting Information).
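To make the notion of an asymptotic rarefaction curve more concrete, the sketch below computes the expected number of OTUs observed in a random subsample of reads, using the classical hypergeometric rarefaction formula. It is an illustration only: the function name and the OTU read counts are hypothetical, and the curves in Figs S7 and S8 may have been generated with different software and settings.

```python
from math import comb


def expected_otus(otu_counts, n_reads):
    """Expected number of OTUs in a random subsample of n_reads reads drawn without replacement."""
    total = sum(otu_counts)
    if not 0 <= n_reads <= total:
        raise ValueError("n_reads must be between 0 and the total number of reads")
    all_draws = comb(total, n_reads)
    # An OTU with c reads is missed when all n_reads come from the other (total - c) reads.
    return sum(1 - comb(total - c, n_reads) / all_draws for c in otu_counts)


# Hypothetical read counts per OTU for one sample (100 reads, 7 OTUs).
counts = [50, 30, 10, 5, 3, 1, 1]

for depth in (1, 10, 50, 100):
    print(depth, round(expected_otus(counts, depth), 2))
```

A curve whose expected OTU count still rises steeply at the full sequencing depth suggests that additional reads would reveal more OTUs, whereas a curve that flattens indicates that most of the detectable diversity has been sampled.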
DISCUSSION
The objective of our work was to develop a PCR-clamping assay reducing plant DNA co-amplification to effectively improve the characterization of plant-associated prokaryotic communities, both in terms of taxonomic identification and richness. First, we empirically assessed which region of the 16S rRNA gene was the most informative for the characterization of plant-associated microbial communities and hence the most appropriate for the design of our PCR-clamping assay. Second, we evaluated whether blocking primers or PNA clamps performed best in blocking plant DNA co-amplification, as well as whether the addition of an amplicon gel extraction step helped reduce the detection of potential unspecific or chimeric amplicons. The empirical comparison of 16S rRNA gene regions, conducted on Z. mays, P. taeda and S. alterniflora, identified the V5–V6 region as the most promising candidate. In line with these results, Yang, Wang and Qian (2016) previously showed that the V4–V6 regions best reflected the taxonomic information contained in the full-length sequence of the 16S rRNA gene, and Wang et al. (2018) determined that the V5–V7 region was the most accurate, not only for the taxonomic characterization of microbial communities, but also for inferring microbial functions.
Therefore, within the V5–V6 region we identified a locus allowing for the design of a new universal prokaryotic forward primer (i.e. 819F) and two overlapping chloroplast- and mitochondrial-specific blocking oligomers. As reverse primer, we used a slightly modified version of the 1115R primer (i.e. 1115Rmod), which we re-designed to improve its coverage of endophytic prokaryotes, especially within the orders Pseudomonadales and Oceanospirillales. It is worth noting that the amplicons produced by the PCR-clamping assay designed in this study are nearly identical in length to those produced by the 799F-1115R PCR protocol, which has been an assay of choice for plant-microbiome high throughput V5–V6 sequencing surveys (Redford et al. 2010; Kembel et al. 2014; Aleklett et al. 2015; Mashiane et al. 2017; Liu et al. 2017b; Durand et al. 2018). Therefore, the application of our clamping assay should provide equivalent taxonomic information, though with an improved coverage. Optimization of the assay showed that the competitive PCR-clamping assay using blocking primers, without the additional gel extraction step, clearly performed best. Coincidentally, this also presents an economic advantage, as PNA clamps are nearly 10 times more expensive than C3 blocking primers. The PCR-clamping assay developed here has several other competitive advantages: (i) it does not require a gel extraction step, which reduces preparation time and extra manipulation of the sample and (ii) the number of PCR cycles is kept to a minimum (25 cycles for the amplicon PCR and 8 cycles for the Index PCR), limiting the well-known methodological biases introduced by PCR-based metagenomic approaches.
However, although our PCR-clamping assay clearly improved the characterization of the endophytic microbial communities associated with each of the three hosts considered in this study, plant reads still constituted a substantial portion of the sequences obtained by Illumina MiSeq sequencing. This was especially true for the pine, for which plant reads still represented ∼99% of all the reads generated. Although it is not recommended to compare results between different hosts given that the ratio ‘plant DNA: prokaryotic DNA’ is likely to vary not only with plant species, but also with plant developmental stage and surrounding environmental conditions (Li et al. 2006; Ma and Li 2015), there is a possible explanation for the lower blocking performance of our PCR-clamping assay on pine. Indeed, while the 819F_Blocking_Chloro_C3 and 819F_Blocking_Mito_C3 blocking primers present a perfect match with the corn and cordgrass chloroplast and mitochondrial DNA sequences, they have 1 and 2 mismatches with pine chloroplast and mitochondrial DNA sequences, respectively. Although it seems at first unlikely that 1 or 2 mismatches in 20 and 23 bp long oligomers could have such a drastic effect on blocking performance, Fitzpatrick et al. (2018b), who modified the sequence of the chloroplast-specific PNA clamp originally designed by Lundberg et al. (2013) by only one nucleotide, decreased plant DNA co-amplification from 65% to 23% in six Asteraceae species. This suggests that a single mismatch in the blocking oligomer may indeed substantially reduce the blocking performance of the assay.
A closer look at our alignments of chloroplast and mitochondrial DNA sequences (Tables S5 and S6, Supporting Information) indicates that the 819F_Blocking_Chloro_C3 and 819F_Blocking_Mito_C3 display at least one mismatch with most species represented in our alignments, with the exception of the Poaceae family and a few other plant species. Based on our results for pine, this suggests that our PCR-clamping assay, if used as is, is unlikely to perform optimally with most plant species outside of the Poaceae family. Therefore, we highly recommend customizing the sequences of the 819F_Blocking_Chloro_C3 and 819F_Blocking_Mito_C3 blocking primers for the host to be surveyed. To this end, we provided alignments of 1429 chloroplast (Table S5, Supporting Information) and 130 mitochondrial (Table S6, Supporting Information) DNA sequences, encompassing 203 terrestrial plant families, as a supporting tool for the customization of the blocking oligomers designed in this study. Even if the chloroplast and mitochondrial DNA sequences of the plant species under study are not available, the alignments we provide make it possible to infer blocking oligomer sequences that are likely to match the plant species. For instance, for a survey aiming to characterize the microbial communities associated with plant species within the Asteraceae, a modification of the 819F_Blocking_Chloro_C3 sequence from 5'-AAACGATGGATACTAGGTGCTGT-3' to 5'-AAACGATGGATACTAGGCGCTGT-3' (Table S6, Supporting Information) is likely to increase the blocking efficiency of 819F_Blocking_Chloro_C3, as all Asteraceae species represented in our alignment have the same mismatch configuration with the blocking primers. 
However, as chloroplast and mitochondrial sequences are far from being known for all plant species, and as many plant families are only represented by one species in our alignment, it is recommended, when possible, to obtain the chloroplast and mitochondrial sequences of the plant host to be studied in order to fully customize the 819F_blocking oligomers. Similar to what was done in this study for S. alterniflora, a preliminary 799F-1115R cloning survey could be performed prior to conducting the plant-associated microbial community survey in order to obtain the partial chloroplast and/or mitochondrial sequence of the plant in question.
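The customization step described above can be sketched in Python as follows. The example locates where a blocking oligomer differs from a host sequence at the binding site and derives a column-wise consensus from several aligned host sequences. The two 23-nucleotide sequences are those quoted in the text for 819F_Blocking_Chloro_C3 and its proposed Asteraceae variant; the function names and the small alignment are hypothetical, and the sketch assumes gap-free sequences of equal length.

```python
from collections import Counter


def mismatch_positions(oligo, site):
    """Return (1-based position, oligo base, site base) for every difference."""
    if len(oligo) != len(site):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(oligo, site), start=1) if a != b]


def column_consensus(aligned_sites):
    """Most frequent base in each alignment column (ties go to the base seen first)."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned_sites))


ORIGINAL_CHLORO = "AAACGATGGATACTAGGTGCTGT"    # current blocking primer sequence (from the text)
ASTERACEAE_CHLORO = "AAACGATGGATACTAGGCGCTGT"  # proposed Asteraceae variant (from the text)

print(mismatch_positions(ORIGINAL_CHLORO, ASTERACEAE_CHLORO))

# Hypothetical alignment of the binding site in three species of one family.
family_sites = [ASTERACEAE_CHLORO, ASTERACEAE_CHLORO, ORIGINAL_CHLORO]
print(column_consensus(family_sites))
```

For the two sequences quoted in the text, the only difference is a T to C change at position 18 of the oligomer. A majority consensus is only one possible design rule; when a family is represented by few species, sequencing the host organellar 16S rRNA gene remains the safer option.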
Among the plant microbiome surveys that used PNA clamps or blocking primers (Lundberg et al. 2013; Arenz et al. 2015; de Souza et al. 2016; Fitzpatrick et al. 2018a,b; Liu et al. 2018; Moronta-Barrios et al. 2018), none were able to completely suppress the co-amplification of plant DNA. Interestingly, Moronta-Barrios et al. (2018), who studied the temporal dynamics of microbial communities in rice seedlings experimentally infected with bacterial endophytes, noted that the blocking primers they designed effectively suppressed the co-amplification of plant DNA only when the bacterial endophytic populations within the plant tissue were high. Although using too high a concentration of blocking primers could increase the probability of co-blocking target species (Piñol et al. 2015; Tan and Liu 2018), increasing the ratio ‘blocking primer: universal primers’ has been shown to further improve the suppression of non-target DNA co-amplification (Wilcox et al. 2014; Su et al. 2018). Given the likely high ratio ‘plant DNA: bacterial DNA’ in leaf material, higher blocking primer concentrations might have reduced plant DNA co-amplification even further in this study. A preliminary optimization of the ratio ‘blocking primers: PCR universal primers’, bringing it closer to the actual ratio ‘non-target DNA: target DNA’, might therefore have improved the blocking performance of our assay. In a similar manner, PCR annealing temperature and cycle numbers, though not found to be as determinant as primer mismatches or blocking primer concentration for blocking performance (Piñol et al. 2015), could also be further optimized.
Based on the results presented here, we believe that the PCR-clamping assay developed in this study will effectively improve the characterization of plant-associated prokaryotic communities, both in terms of taxonomic identification and diversity coverage. The fact that the PCR-clamping assay designed here performed well for the characterization of prokaryotic communities in the endosphere of plant green tissue suggests that its application to other plant-associated prokaryotic communities (e.g. exophytic, rhizophytic) and to other plant tissues known to contain a lower level of organellar DNA is likely to be even more successful at preventing the amplification of host DNA. We also provide an alignment of chloroplast and mitochondrial DNA sequences encompassing more than 200 terrestrial plant families as a supporting tool for customizing the blocking primers designed here, allowing the application of the assay across a wider taxonomic spectrum of plant hosts.
ACKNOWLEDGEMENTS
The authors would like to thank the Van Bael Lab at Tulane University, Louisiana, for providing the Spartina alterniflora leaf material and Erich Pinzón-Fuchs for his assistance in the writing process of this manuscript.
FUNDING
This research was made possible by a grant from The Gulf of Mexico Research Initiative. Data are publicly available through the Gulf of Mexico Research Initiative Information & Data Cooperative (GRIIDC) at https://data.gulfresearchinitiative.org (doi: 10.7266/K0VT4GMJ).
Conflicts of Interest
None declared.

Figure 2. Position and corresponding sequences of the primers, and blocking oligomers (in bold) used and designed in this study. References of studies in which the primers were used are indicated. M = A, C; N = G, A, T, C; B = G, T, C; S = G, C; V = G, A, C; K = G, T; W = T, A; R = G, A. [1] Manter et al. 2010, [2] Su et al. 2016, [3] Akinsanya et al. 2015, [4] Kembel et al. 2014, [5] Durand et al. 2018.



