Characterization of the replicon of a 51-kb native plasmid from the gram-positive bacterium Leifsonia xyli subsp. cynodontis

The 4992-bp replicon of a large cryptic plasmid in the gram-positive bacterium Leifsonia xyli subsp. cynodontis was identified and sequenced. The replicon encodes two proteins essential for plasmid replication and stability. The putative replication protein (RepA) is homologous to those of the plasmids in the mycobacterial pLR7 family, while the putative ParA protein immediately downstream of RepA is significantly homologous to the Walker-type ATPases required for partition of plasmids and chromosomes in gram-positive bacteria. These two proteins and other ORFs are clustered with the putative promoters and other regulatory sequences, illustrating an efficient organization of the replicon for this novel plasmid.


Introduction
As an extrachromosomal genetic element, a plasmid contains a replicon that autonomously replicates the plasmid and controls its copy number. Several different mechanisms of plasmid survival and propagation have been reported. These include the multimer resolution system, which keeps the plasmid in monomeric form; the active partitioning system, which enables faithful segregation of the plasmid to the daughter cells; the post-segregational killing system, which results in the death of plasmid-free segregants; and the conjugative transfer and mobilization system, which spreads the plasmid between bacteria [1]. These molecular modules are usually clustered to form the plasmid survival kit, ensuring that plasmids replicate and spread efficiently [1]. Recent analysis of the growing body of data on plasmid survival mechanisms in different bacteria has identified families of phylogenetically related functions distributed across a range of disparate plasmids [2][3][4]. Supporting the phylogenetic analysis, the parAB genes from the chromosomes of Pseudomonas putida and Bacillus subtilis have been found to stabilize the plasmid mini-F in Escherichia coli [5,6]. This result suggests that the plasmid and chromosome partitioning systems, and the host components involved in plasmid partitioning, are highly conserved between gram-positive and gram-negative bacteria.
Several plasmids from gram-positive bacteria have been extensively characterized with regard to their mechanisms of replication and stability. For example, pAMβ1 and pLS32 are best known as theta-replicating plasmids [7,8]. Plasmids pCI2000, pSK1 and pAW63 may maintain their stability by active partitioning [9][10][11]. However, compared with those of gram-negative bacteria, the survival mechanisms of plasmids from gram-positive bacteria have been studied much less thoroughly [3,9,12]. In this study, we sought to understand the mechanisms of replication and propagation of pCXC100, a large cryptic plasmid from the gram-positive bacterium Leifsonia xyli subsp. cynodontis.
Leifsonia xyli subsp. cynodontis (Lxc), originally named Clavibacter xyli subsp. cynodontis subsp. nov., is a gram-positive, high G + C content, coryneform bacterium isolated from the xylem of bermudagrass (Cynodon dactylon (L.) Pers.). It colonizes many crop plants, including maize, rice, sorghum, oats, white millet and sudan grass, without causing wilting symptoms [13][14][15][16][17][18]. The broad host range and weak pathogenicity have inspired the idea of using this bacterium as a means of crop protection [18]. However, the extremely slow growth of this bacterium (4-6 h per generation) and its loss of vigor after prolonged growth in vitro significantly hinder the study of its genetics [19]. A cryptic plasmid about 51 kb in size, named pCXC100, is harbored by some, but not all, L. xyli subsp. cynodontis isolates [19,20]. Little is known about the mechanism of replication and stability of this plasmid [20]. In this study, the 5 kb replicon of pCXC100 in L. xyli subsp. cynodontis was cloned and sequenced. Analysis of the sequence revealed that pCXC100 utilizes a complete survival kit, with the genes encoding the replication and propagation proteins tightly clustered, providing a new system for understanding the survival mechanisms of large cryptic plasmids in gram-positive bacteria.

Growth and transformation of bacteria
All E. coli strains were grown in LB or 2YT medium at 37°C [21]. L. xyli subsp. cynodontis was grown on solid medium (DM agar) at 28°C as previously described [19].
Antibiotics for selection in E. coli were added to final concentrations of 50 µg/ml of ampicillin, 25 µg/ml of chloramphenicol, and 10 µg/ml of tetracycline. Chloramphenicol at 5 µg/ml or tetracycline at 2 µg/ml was used for selection in L. xyli subsp. cynodontis. E. coli strain DH5α (Gibco BRL, Gaithersburg, MD) was used as a host for all subclones of pCXC100. L. xyli subsp. cynodontis strain #3, which lacks the 51 kb native plasmid, was transformed with pCXC100 derivatives by electroporation [19].

Plasmid isolation from L. xyli subsp. cynodontis and plasmid construction
Miniprep plasmid DNA was prepared from bacterial cells with a modified alkaline extraction procedure as previously described [19] and was then used to determine the presence of recombinant plasmids transformed into L. xyli subsp. cynodontis. Large-scale preparations of plasmid DNA for mapping and subcloning were made using CsCl gradient centrifugation [21]. Five plates of culture were harvested in 20 ml of saline and resuspended in 5 ml of 10 mg/ml lysozyme solution; the subsequent steps were the same as those of the miniprep, scaled up accordingly. The plasmid band was collected from the CsCl gradient after centrifugation at 44,000 rpm for 36 h.
Construction of plasmids containing various regions of the pCXC100 DNA is described in Table 1 as well as in some figures and text. All regular DNA manipulations were carried out following methods described by Sambrook et al. [21].

Plasmid stability
Because L. xyli subsp. cynodontis is difficult to grow in liquid medium and loses vigor after prolonged growth on solid DM agar, plasmid stability cannot be followed for as many generations as it can in other bacteria such as E. coli. We therefore tested plasmid stability on solid DM agar for a limited number of generations. Plasmid-containing L. xyli subsp. cynodontis cells were grown on solid DM agar in the absence of antibiotics for 5-7 days at 28°C to reach 24 generations (calculated from the number of divided cells). Subsequently, the culture was diluted and plated on DM agar in the presence or absence of tetracycline. The number of colonies grown on each plate was counted, and 50 or 100 colonies were picked from the plates lacking antibiotic and streaked on plates containing antibiotic to further confirm the result. The average rate of plasmid loss per generation was calculated from these data.
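The text does not state how the per-generation loss was derived from the colony counts. The following is a minimal sketch of one common way to do it, assuming a constant loss rate per generation and equal growth rates for plasmid-bearing and plasmid-free cells; the function names and the model itself are illustrative assumptions, not the procedure used in this study.

```python
# Sketch only: constant-rate plasmid loss model (an assumption, not the
# authors' stated method). If a fraction L of plasmid-bearing cells loses
# the plasmid at each generation, the fraction F still carrying it after
# n generations is F = (1 - L) ** n, hence L = 1 - F ** (1 / n).

def loss_per_generation(resistant_colonies, total_colonies, generations):
    """Estimate the per-generation plasmid loss rate from colony counts.

    resistant_colonies: colonies that grew on antibiotic (plasmid retained)
    total_colonies:     colonies tested from plates without antibiotic
    generations:        generations grown without selection (24 in the text)
    """
    if total_colonies <= 0 or generations <= 0:
        raise ValueError("total_colonies and generations must be positive")
    if not 0 <= resistant_colonies <= total_colonies:
        raise ValueError("resistant_colonies must lie in [0, total_colonies]")
    fraction_retained = resistant_colonies / total_colonies
    return 1.0 - fraction_retained ** (1.0 / generations)


def expected_fraction_retained(loss_rate, generations):
    """Fraction of cells expected to retain the plasmid under the same model."""
    return (1.0 - loss_rate) ** generations


if __name__ == "__main__":
    # Hypothetical counts: 79 of 100 picked colonies still tetracycline resistant.
    print(loss_per_generation(79, 100, 24))
    print(expected_fraction_retained(0.14, 24))
```

Under this model, a loss rate of 1% per generation over 24 generations leaves roughly 79% of cells carrying the plasmid, whereas 14% per generation leaves roughly 3%, which is why picking 50 or 100 colonies is enough to distinguish the two cases.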

DNA sequence analysis
DNA sequence determination was performed on an Applied Biosystems 373A automated DNA sequencer (Applied Biosystems). The sequence is available under GenBank Accession No. AY380839. Searches of the GenBank database were performed with the FASTA [22] and BLASTN [23] programs. Sequence alignments were performed by the Clustal method of the MEGALIGN program of the DNASTAR software package.

Determination of the replicon of pCXC100
The primary restriction map of pCXC100 was generated in this study by restriction digestion of high-purity pCXC100 DNA isolated from wild-type L. xyli subsp. cynodontis (Fig. 1). To locate the plasmid replicon, high-purity pCXC100 DNA was digested with restriction enzymes that generated DNA fragments covering the full length of pCXC100, with 3-8 kb overlaps between adjacent fragments. These fragments were cloned into pBR325, resulting in the recombinant plasmids pLXC101 to pLXC105 (Fig. 1), which were transformed into a plasmid-free L. xyli subsp. cynodontis isolate. Because ColE1-based plasmids cannot yield L. xyli subsp. cynodontis transformants [19], any viable clone must host a recombinant plasmid containing the replicon of pCXC100. We found that pLXC101 and pLXC105 yielded viable and stable L. xyli subsp. cynodontis transformants. Thus the replicon of pCXC100, responsible for both plasmid replication and stability, is located within the overlap of the pCXC100 sequences inserted in pLXC101 and pLXC105, which is the 6.5 kb NcoI-BglII fragment (Fig. 1). A series of subclones (pLXC106-pLXC111) was then constructed to further locate the replicon (Fig. 2). All plasmids containing the 3.0 kb NcoI-AvaI DNA fragment yielded viable transformants, while those lacking this region or containing only part of it did not (Fig. 2). Note that plasmid DNA prepared from L. xyli subsp. cynodontis cells was readily digested by restriction enzymes in vitro, suggesting that no DNA modification occurs in this organism to maintain plasmid integrity. We therefore conclude that the failure of these plasmids to yield transformants reflects the loss of replication ability rather than a restriction barrier that limits transformation. On the other hand, the plasmid containing the 3.0 kb NcoI-AvaI DNA fragment alone had poor stability, with nearly 14% plasmid loss per generation.
Extending the insert by about 1 kb downstream, from the AvaI site to the adjacent XhoI site, decreased the plasmid loss to 1% per generation. Full plasmid stability was ensured by the presence of one more kilobase of DNA, from the XhoI site to the adjacent EcoRI site (Fig. 2). Therefore, we concluded that the replication function of pCXC100 is located within the 3.0 kb NcoI-AvaI DNA region, that the major stability function is encoded by the 1.0 kb AvaI-XhoI DNA region, and that the adjacent 1.0 kb XhoI-EcoRI DNA region may encode an additional stability function.

Organization of the replicon of pCXC100
Fig. 2. The location and organization of the pCXC100 replicon. The 17 kb HindIII fragment of pCXC100 is drawn to scale with the corresponding restriction sites indicated: A, AvaI; B, BglII; H, HindIII; N, NcoI; P, PstI; R, EcoRI; S, SmaI; X, XhoI. All the plasmids listed were generated as described in Table 1. The restriction fragments of pCXC100 cloned into pBR325 (pLXC101, 106-111) are indicated. The restriction enzymes used to generate each DNA fragment are listed at their ends. The ability of each plasmid to yield viable L. xyli subsp. cynodontis transformants and its stability in the bacterium are indicated on the right side. A "+" indicates that the plasmid was capable of yielding viable clones or was stable when propagating in L. xyli subsp. cynodontis, and a "−" indicates that the plasmid was incapable of yielding viable clones or was not stable when propagating in the bacterium. One asterisk indicates plasmid loss at 1% per generation, and a double asterisk indicates plasmid loss at 14% per generation. The organization of the encoded proteins and cis-acting elements in the 4992-nt replicon is shown in the lower panel. The sequences of several cis-acting elements are listed at the bottom. IR indicates an inverted repeat sequence, and DR indicates a direct repeat sequence.
The nucleotide sequence of the 5.0 kb replicon, the NcoI-EcoRI DNA fragment encoding the replication and stability functions, was then determined (GenBank Accession No. AY380839). The identified replicon contains 4992 nucleotides with an overall G + C content of 65%. The high G + C content of the replicon sequence is similar to that of the ribosomal RNA gene [24]. Proteins potentially encoded in the replicon were identified by searching the database.
The search criteria required an open reading frame (ORF) to consist of at least 50 codons and to be preceded by a potential Shine-Dalgarno sequence at an appropriate distance (6-15 bp) from one of the commonly used initiation codons (AUG, UUG, and GUG). The Shine-Dalgarno sequence was identified by complementarity with the 3′ sequence of the 16S rRNA of L. xyli subsp. cynodontis (5′-GGCUGGAUCACCUCCUUUCU-3′) [24]. The search revealed a putative replication protein in the NcoI-AvaI DNA region and two putative stability proteins in the AvaI-EcoRI region, which is consistent with the experimental results described above. These putative ORFs, together with a number of putative cis-acting elements, depict the primary structure of the plasmid survival kit of pCXC100 (Fig. 2).
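The search criteria above can be expressed as a short scan. The following is a minimal sketch, not the software used in this study: it scans the forward strand only, reports every qualifying start codon, treats any exact 4-nt match to the reverse complement of the quoted 16S rRNA 3′ end as a candidate Shine-Dalgarno site, and measures spacing from the end of that match to the start codon. The 4-nt threshold, the spacing definition and all names are assumptions made for illustration.

```python
# Sketch only: ORF scan following the stated criteria (>= 50 codons, an
# AUG/UUG/GUG start, a Shine-Dalgarno-like site 6-15 bp upstream).

ANTI_SD_RNA = "GGCUGGAUCACCUCCUUUCU"   # 3' end of 16S rRNA quoted in the text
START_CODONS = {"ATG", "TTG", "GTG"}   # DNA equivalents of AUG, UUG, GUG
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_CODONS = 50                        # minimum ORF length from the text
SPACINGS = range(6, 16)                # 6-15 bp between SD site and start codon
MIN_SD_MATCH = 4                       # assumption: shortest accepted SD match


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# mRNA sequences that pair with the 16S tail are substrings of this template.
SD_TEMPLATE = reverse_complement(ANTI_SD_RNA.replace("U", "T"))


def has_shine_dalgarno(dna, start):
    """True if a 4-nt SD-like match ends 6-15 bp upstream of `start`."""
    for spacing in SPACINGS:
        sd_end = start - spacing
        sd_start = sd_end - MIN_SD_MATCH
        if sd_start < 0:
            continue
        if dna[sd_start:sd_end] in SD_TEMPLATE:
            return True
    return False


def find_orfs(dna):
    """Return (start, end, codons) for forward-strand ORFs meeting the criteria.

    `start` and `end` are 0-based, end-exclusive, and `end` includes the stop
    codon; `codons` counts the start codon but not the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for start in range(len(dna) - 2):
        if dna[start:start + 3] not in START_CODONS:
            continue
        # Walk in frame until the first stop codon.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                codons = (pos - start) // 3
                if codons >= MIN_CODONS and has_shine_dalgarno(dna, start):
                    orfs.append((start, pos + 3, codons))
                break
    return orfs
```

A fuller implementation would also scan the reverse strand and score imperfect pairing with the 16S rRNA tail; exact substring matching is used here only to keep the sketch short.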

The large replication protein RepA and regulatory elements
Consistent with our finding that the NcoI-AvaI region is responsible for plasmid replication in L. xyli subsp. cynodontis, the largest ORF in this region encodes an arginine-rich protein that shares homology with the pLR7 family of replication proteins, and was therefore designated RepA (Fig. 3) [25][26][27][28][29]. The sequence of the putative RepA is much longer than those of the pLR7 family, with the extra sequence dispersed between the conserved regions. Nevertheless, the putative RepA of pCXC100 shares the most conserved regions with the Rep proteins of the pLR7 family [27]. The conservation in the N-terminal region is much more extensive than that in the C-terminal region (Fig. 3 and data not shown). As in the replicon of the plasmid pLR7 [25], a number of inverted repeats are located in the putative promoter region of RepA (Fig. 2). These inverted repeats may serve as cis-acting elements that regulate RepA expression. The putative RepA also contains three helix-turn-helix DNA-binding motifs (184-205 aa, 195-216 aa, 337-358 aa), which are characteristic of Rep proteins [2]. Deletion of the N-terminus and the putative promoter region of RepA resulted in the loss of plasmid replication ability in L. xyli subsp. cynodontis, but partial deletion of the C-terminus had no effect on plasmid replication (see pLXC110 and pLXC111 in Fig. 2), confirming the importance of the promoter and N-terminal region. On the other hand, the 51 aa in the C-terminal region of RepA may not be essential for plasmid replication, as demonstrated by the replication capability of the deletion mutant pLXC100.
A conserved DNA region upstream of the rep gene has been found in the pLR7-related plasmids [27,28]; however, no such homology was found upstream of the repA gene of pCXC100. Nonetheless, two 21-nt repeat sequences, repeated 11 and 7 times, respectively, were evident within the coding sequence of RepA (DR1 located at 2455-2685 bp, DR2 located at 2691-2837 bp; Fig. 2). These repeat sequences may represent a potential regulation site similar to the 54 bp iteron within the coding sequence of RepA of the plasmid pCI2000 from the gram-positive bacterium Lactococcus lactis [9]. An AT-rich region is located before the putative promoter of RepA (Fig. 2), representing another potential regulation site for plasmid replication [2,29]. Upstream of the AT-rich region is ORF1, which potentially encodes a protein of 152 amino acids. A database search showed that a large portion of the ORF1 protein is homologous to the putative type I restriction-modification system methylase of Corynebacterium efficiens (gi:25028882). It is not clear whether ORF1 encodes any function related to the methylation of plasmid DNA.
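The reported coordinates are consistent with the stated repeat counts: DR1 spans 2685 - 2455 + 1 = 231 bp, which is 11 units of 21 nt, and DR2 spans 2837 - 2691 + 1 = 147 bp, which is 7 units of 21 nt. The following minimal sketch reproduces this check and counts exact tandem copies in a sequence; the function names are hypothetical, and exact matching is an assumption, since the text does not say whether the repeats are perfect.

```python
# Sketch only: consistency check for the DR1/DR2 coordinates and a simple
# exact tandem-repeat counter (real repeats may contain mismatches).

def copies_in_span(start, end, unit_len):
    """Return (whole copies, leftover bases) for an inclusive 1-based span."""
    length = end - start + 1
    return divmod(length, unit_len)


def count_tandem_copies(seq, unit_len, offset=0):
    """Count consecutive exact copies of the unit that begins at `offset`."""
    unit = seq[offset:offset + unit_len]
    if len(unit) < unit_len:
        return 0
    copies = 0
    pos = offset
    while seq[pos:pos + unit_len] == unit:
        copies += 1
        pos += unit_len
    return copies


if __name__ == "__main__":
    print(copies_in_span(2455, 2685, 21))  # DR1: (11, 0)
    print(copies_in_span(2691, 2837, 21))  # DR2: (7, 0)
```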

The partition protein ParA and par locus
Two ORFs were identified a few hundred nucleotides downstream of the repA gene, in the AvaI-EcoRI region encoding the plasmid stability function. The putative protein encoded by the upstream ORF (317 amino acids) is significantly homologous to the ParA family of Walker-type ATPases that are involved in active partition, and was thus designated ParA [3] (Fig. 4). This newly identified ParA contains two highly conserved ATP-binding motifs (motifs I and III) and two other conserved motifs (motifs II and IV) (Fig. 4) that characterize the ParA family of ATPases [9][30][31][32]. The presence of the N-terminal 177 amino acids, comprising all four conserved motifs, dramatically increased plasmid stability (Fig. 2), indicating that the conserved region of the parA gene plays a major role in plasmid stability.
There are nine purine-rich direct repeat sequences (DR3) located within the putative promoter region of parA (Fig. 2). DR3 might be the cis-acting centromere-like site, namely parS [3].
ORF4 (encoding 139 amino acids) is located downstream of the parA gene, with a small overlap. No homologous protein was found in the protein database. As demonstrated above, the region containing both the C-terminal region of the parA gene and ORF4 further stabilized the plasmid, raising plasmid retention from 99% to 100% per generation in our analysis (Fig. 2). However, it is unclear which of the encoded proteins is responsible for conferring this increased stability.
Almost all known plasmid-encoded par loci consist of three components: a cis-acting centromere-like site (parS) and two trans-acting proteins (ParA and ParB) that form a partition complex at parS. The upstream gene generally encodes an ATPase and the downstream gene encodes a protein that binds to parS [3,32]. Type I par loci encode a Walker-type ATPase, whereas type II loci encode an ATPase that belongs to the actin/hsp70 superfamily. Two subgroups of type I partition loci have been described. Type Ia loci encode large ParA (251-420 aa) and ParB (182-336 aa) homologues, and the parS site is located downstream of parB. In contrast, the type Ib loci encode smaller ParA (182-336 aa) and ParB (46-113 aa) proteins, and the parS site is located in the promoter region upstream of the parAB operon [3]. The sizes of ParA and the Orf4 protein, and the location of the potential parS site (DR3), suggest that the par locus of the pCXC100 replicon belongs to the type Ib loci, in which case ORF4 would encode the ParB protein. This is similar to the par loci of plasmids pAW63 and pCI2000, isolated from the gram-positive bacteria Bacillus thuringiensis and Lactococcus lactis, respectively [3,9,11]. However, a motif search of the Orf4 protein found no typical helix-turn-helix DNA-binding motif, weakening the possibility that the Orf4 protein serves as a DNA-binding protein that regulates the expression and activity of the ParA protein [3]. The detailed function of ORF4 remains to be elucidated.