Isolation of bacterial extrachromosomal DNA from human dental plaque associated with periodontal disease, using transposon-aided capture (TRACA)

The human oral cavity is host to a complex microbial community estimated to comprise > 700 bacterial species, of which at least half are thought to be not yet cultivable in vitro. To investigate the plasmids present in this community, we used a transposon-aided capture system, which allowed the isolation of plasmids from human oral supra- and subgingival plaque samples. Thirty-two novel plasmids and a circular molecule that could be an integrase-generated circular intermediate were isolated.


Introduction
The human oral cavity contains a complex microbial community estimated to comprise > 700 bacterial species (Kazor et al., 2003; Paster et al., 2006; Dewhirst et al., 2010). Mobile genetic elements are a fundamental part of this community, disseminating genes that facilitate adaptation or enable the exploitation of certain environmental niches. Conjugative transposons have been shown to be responsible for the transfer of antibiotic resistance genes within the oral cavity (Warburton et al., 2007). However, data on the plasmid population within the oral metagenome are limited.
A major limitation of previous studies investigating the presence of plasmids in oral species is their reliance on bacterial culture, sometimes in conjunction with antibiotic selection. Because approximately 50% of oral bacteria cannot yet be cultured (Wade, 2002; Kazor et al., 2003), plasmids present in these organisms will be missed. Furthermore, plasmids lacking antibiotic resistance genes will also be missed in studies that use antibiotics for selection. Therefore, this study uses a transposon-aided capture (TRACA) method (Jones & Marchesi, 2007), which is independent of bacterial culture and phenotypic selection, to isolate plasmid DNA from dental plaque samples taken from patients with periodontal disease.

Collection of samples
Supragingival and subgingival plaque samples were collected from 50 patients presenting with periodontal disease ranging from gingivitis through to severe periodontitis, including aggressive periodontitis (UK Research Ethics Committee approval, reference number 06/MRE01/35). All patients were 18 years of age or above, had not taken antibiotics and had not had extensive dental treatment within the previous 3 months. The DNA was extracted from the plaque samples as described by Hunter et al. (2011) and pooled.

TRACA of plasmids
This technique was performed as described previously (Jones & Marchesi, 2007). Briefly, ~1 µg of metagenomic DNA was digested with Plasmid-Safe™ DNase (Epicentre) to remove sheared genomic DNA. The sample was then subjected to an in vitro transposition reaction using the EZ-Tn5 OriV/Kan2 transposon (Epicentre). The reaction was purified and concentrated to a final volume of ~10 µL using a YM-100 microconcentrator column (Millipore). The entire EZ-Tn5 reaction was electrotransformed into 100 µL Escherichia coli TransforMax EPI300-T1R cells (Epicentre) under the following conditions: 18 kV cm⁻¹, 200 Ω resistance, 25 µF capacitance (Biorad Pulser II) in a prechilled 0.1 cm³ electroporation cuvette. Immediately after electroporation, 900 µL of SOC medium was added and the cells were transferred to a 15 mL Falcon tube and incubated horizontally at 37 °C with shaking at 200 r.p.m. for 1 h. Transformants were selected on Luria-Bertani agar containing 50 µg mL⁻¹ kanamycin and incubated at 37 °C, aerobically, for up to 48 h.
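As a quick check on the pulse settings quoted above, the following minimal Python sketch computes the nominal RC time constant from the stated resistance and capacitance, and converts the stated field strength into a set voltage. The electrode gap of 0.1 cm is an assumption made for illustration only (it is a common cuvette size and is not stated as a gap width in the protocol), and the function names are hypothetical. The time constant actually delivered also depends on the sample, so the value here is nominal.

```python
# Pulse parameters quoted in the protocol.
FIELD_STRENGTH_KV_PER_CM = 18.0
RESISTANCE_OHM = 200.0
CAPACITANCE_F = 25e-6  # 25 microfarads


def time_constant_ms(resistance_ohm, capacitance_f):
    """Nominal RC time constant of an exponential-decay pulse, in ms."""
    return resistance_ohm * capacitance_f * 1e3


def set_voltage_kv(field_kv_per_cm, gap_cm):
    """Voltage needed to reach a given field strength across the gap."""
    return field_kv_per_cm * gap_cm


tau_ms = time_constant_ms(RESISTANCE_OHM, CAPACITANCE_F)
# ASSUMPTION: 0.1 cm electrode gap, chosen for illustration only.
voltage_kv = set_voltage_kv(FIELD_STRENGTH_KV_PER_CM, gap_cm=0.1)

print(f"nominal time constant: {tau_ms:.1f} ms")
print(f"set voltage for assumed 0.1 cm gap: {voltage_kv:.1f} kV")
```

Under these assumptions the settings correspond to a nominal time constant of 5 ms and a set voltage of 1.8 kV.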

DNA sequence analysis
The initial sequence data from each plasmid were obtained using primers FP-1 and RP-1, located at the ends of EZ-Tn5 (Epicentre). The complete sequence of each plasmid was obtained using a primer walking strategy. Given the number of plasmids isolated, a minimum of double sequence coverage was obtained. ORFs were defined as nucleotide sequences with the potential to encode proteins > 39 amino acids and preceded by a Shine-Dalgarno sequence at an appropriate distance. Plasmid schematics were constructed using Vector NTI (Invitrogen).
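The ORF definition above can be illustrated with a minimal Python sketch. It is not the procedure used in the study: it scans only the forward strand of a linear sequence, accepts only ATG as a start codon, and uses a crude check for a Shine-Dalgarno-like motif. The motif `GGAGG`, the 20-nt upstream window and all function names are assumptions chosen for illustration; the text specifies only "an appropriate distance". A real analysis would also scan the reverse strand and account for the circular topology of the plasmid.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_CODONS = 40  # "> 39 amino acids", counting the initiating Met


def find_orfs(seq, min_codons=MIN_CODONS):
    """Return (start, end) pairs for forward-strand ORFs.

    start is the index of the ATG; end is exclusive and includes the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3  # sense codons, stop excluded
                if n_codons >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


def has_shine_dalgarno(seq, orf_start, motif="GGAGG", window=20):
    """Crude test: is the (assumed) motif within `window` nt upstream?"""
    upstream = seq.upper()[max(0, orf_start - window):orf_start]
    return motif in upstream


# Toy sequence: SD-like motif, spacer, then Met + 40 Ala codons + stop.
example = "AGGAGG" + "AAAAAA" + "ATG" + "GCT" * 40 + "TAA"
example_orfs = find_orfs(example)
example_sd = [has_shine_dalgarno(example, s) for s, _ in example_orfs]
```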
A number of putative ORFs were identified on each of the plasmids (Fig. 1). The closest matches to the predicted amino acid sequences of these ORFs, identified by BLASTP analysis, are listed in Table 1. Some of these ORFs are predicted to encode polypeptides with homology to proteins of known function, such as replication, mobilization or plasmid stability. Others encode hypothetical proteins, many of which show no significant homology to sequences in either the NCBI protein or the nucleotide databases, indicating a potential reservoir of genes encoding as yet uncharacterized functions.
A putative replication (Rep) protein was identified in all except one (pTRACA61) of the plasmids isolated in this study (Table 1). The Rep from pTRACA45 shares 71% amino acid identity with that of pJD1, a 4.2-kb cryptic plasmid from Neisseria gonorrhoeae (Korch et al., 1985). Furthermore, the two additional ORFs on pTRACA45 are also closely related to those on pJD1 (Table 1), while its G+C content (50.6%) is similar to that of pJD1 (51.5%) and the genomes of Neisseria spp. (~51%), indicating that it is of neisserial origin.
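The G+C comparison used here to infer a likely origin is simple to reproduce. The Python sketch below computes G+C content from a sequence string and reports the difference between the two percentages quoted above for pTRACA45 and pJD1. The function name is hypothetical, and no cut-off for "similar" is implied, because the text does not define one.

```python
def gc_content(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


# Values quoted in the text (percent G+C).
PTRACA45_GC = 50.6
PJD1_GC = 51.5

difference = abs(PTRACA45_GC - PJD1_GC)
print(f"G+C difference, pTRACA45 vs pJD1: {difference:.1f} percentage points")

# Toy usage on a short sequence (not a real plasmid sequence).
toy_gc = gc_content("ATGCGC")
```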
The Rep proteins of the other 32 plasmids are more distantly related (25-43% amino acid identity) to those of plasmids found in bacteria belonging to either the Firmicutes or the Proteobacteria phyla. The pTRACA42 group comprises the majority of the plasmids isolated, 23 in total. This suggests that this group of plasmids is more abundant in the oral metagenomic DNA, more stable in the E. coli host, or both. The plasmids within this group differ in length (1467-1482 bp) and share > 92% nucleotide identity. One plasmid, pTRACA42, was selected for further study. Its putative Rep protein is most closely related to that of the small, cryptic plasmid pCL2.1 from Lactococcus lactis (Chang et al., 1995) (Table 1). However, the G+C content of the pTRACA42 group of plasmids (~52%) is considerably higher than that of pCL2.1 (34%) and of L. lactis genomes (~35%), suggesting that they are not of lactococcal origin. The other ORFs on these plasmids have no [...] Interestingly, nucleotide sequences with over 80% identity to pTRACA42 were identified in one of the two human lung viral metagenomes (project ID: 28439; Dinsdale et al., 2008). The majority of the sequences in this metagenome were from phage.
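Percentage identity figures such as those quoted for the pTRACA42 group are computed over an alignment. As a minimal sketch only, the Python function below calculates identity between two sequences that have already been aligned to equal length, with `-` marking gaps. Counting a gap opposite a base as a mismatch is an assumption; alignment tools differ in how they treat gaps, so values from this sketch need not match those reported by BLAST. The function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pairwise alignment.

    Columns where both sequences have a gap are skipped; a gap opposite
    a base counts as a mismatch (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy alignments (not real plasmid sequences).
identity_no_gap = percent_identity("ACGT", "ACGA")
identity_with_gap = percent_identity("AC-GT", "ACTGT")
```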
The Rep protein from pTRACA66 is also most closely related to that from an L. lactis plasmid, specifically pKL001 (Table 1). However, the G+C content of pTRACA66 (45%) is higher than that of pKL001 (32.9%) and the L. lactis genomes (~35%), suggesting that it is not of lactococcal origin. This plasmid contains an ORF with the potential to encode an integrase and three other ORFs that have no significant homology to anything in the protein or the nucleotide databases (Table 1).
The Rep proteins associated with the pTRACA63 group of plasmids are most closely related to that from pAB49, an Acinetobacter baumannii plasmid (Table 1). However, the G+C contents of pAB49 (38.8%) and the genomes of Acinetobacter spp. (38-42%) are much lower than that of pTRACA63 (50.4%), suggesting that pTRACA63 originates from a different bacterial genus. This plasmid also contains an ORF with the potential to encode an integrase.
[...] plasmids isolated from the gut metagenome using TRACA (Jones & Marchesi, 2007) (Table 1). Interestingly, the Rep proteins from the pTRACA41 group and pTRACA73 are related to that of pTS1 (24% and 43% amino acid identity, respectively), a cryptic plasmid from the oral bacterium Treponema denticola (Chauhan & Kuramitsu, 2004). In addition to the rep gene, a number of other putative ORFs were identified on each plasmid (Fig. 1). The identities of the top hits identified by BLASTP analysis are listed in Table 1. Some of the ORFs on pTRACA73 are predicted to encode polypeptides that share functions, such as mobilization or plasmid stability, with those encoded by pTRACA22. However, based on sequence analysis, they are only distantly related, and the G+C content of pTRACA73 (30.7%) is much lower than that of pTRACA22 (51.4%). In contrast, the genes encoding polypeptides on the pTRACA41 group of plasmids share no homology with those present on pTRACA20; however, the G+C content of pTRACA41 (49.5%) is similar to that of pTRACA20 (48.7%).
In contrast to the other 32 plasmids, pTRACA61 does not contain a rep gene homologue, but contains an integrase gene homologue sharing 34% amino acid identity with a tyrosine integrase family protein (accession number ZP_06402565). It is possible that this is a circular intermediate of a mobile element. Integrases are site-specific recombinases that frequently produce circular molecules by recombination between their target sites (for reviews, see Smith & Thorpe, 2002; Roberts & Mullany, 2009), providing the intriguing possibility that TRACA has the ability to isolate mobile genetic elements other than plasmids.
The 159 metagenomic data sets currently in the NCBI database were investigated for the presence of the plasmid DNA isolated in this study; however, except for pTRACA41 no homology was found.
This study has identified several novel plasmids, most of which encode hypothetical proteins of unknown function. This shows that there is a relatively unexplored genetic reservoir in the oral metagenome. Although previous studies have reported plasmids in oral streptococci (Dunny et al., 1973; Yagi et al., 1978; Caufield et al., 1982; Vandenbergh et al., 1982), none were captured by the TRACA system. This may be because the bacterial community found at periodontal disease sites is dominated by obligate anaerobes, and streptococci are typically associated with periodontally healthy sites (Paster et al., 2001). However, the rep genes associated with the pUA140 (Zou et al., 2001) and pLM7 plasmids, from Streptococcus mutans, could be detected in the sample by PCR amplification (data not shown). Similarly, plasmids previously isolated from gut bacteria were also not captured by Jones & Marchesi (2007), suggesting a limitation of the TRACA system. It is possible that such plasmids are unstable in E. coli, are refractory to transposon insertion or are not present in high enough copy number to enable capture. Furthermore, all the plasmids captured were < 8 kb, mirroring the majority of the previously reported plasmids from oral bacteria, although large plasmids have been reported from the oral cavity (LeBlanc et al., 1993). Whether the isolation of only small plasmids with TRACA results from their being numerically dominant in the oral cavity, and therefore preferentially captured, or reflects a limitation of the TRACA system is unknown. It is known that there is a logarithmic decrease in the transformation frequency of plasmids as their size increases; thus, larger plasmids will simply transform less easily into E. coli (Szostková & Horáková, 1998). Larger plasmids will also be present in lower copy number, making them harder to capture by TRACA. We are currently investigating whether the substitution of different origins of replication into Tn5 allows the capture of different plasmids.
It should also be borne in mind that TRACA is unlikely to capture linear plasmids, because the origin of replication used by the modified Tn5 cannot replicate their extreme termini; these require specialized enzymes (reviewed in Ravin, 2003).
The TRACA protocol has successfully captured novel plasmids from human oral plaque, many of which carry genes encoding as yet uncharacterized functions. TRACA has an advantage over other plasmid isolation techniques as it does not require the expression of plasmid-encoded genes in a surrogate host; thus, as illustrated by this study, novel plasmids and circular molecules can be isolated.