Transfer activation of SXT/R391 integrative and conjugative elements: unraveling the SetCD regulon

Integrative and conjugative elements (ICEs) of the SXT/R391 family have been recognized as key drivers of antibiotic resistance dissemination in the seventh-pandemic lineage of Vibrio cholerae. SXT/R391 ICEs propagate by conjugation and integrate site-specifically into the chromosome of a wide range of environmental and clinical Gammaproteobacteria. SXT/R391 ICEs bear setC and setD, two conserved genes coding for a transcriptional activator complex that is essential for activation of conjugative transfer. We used chromatin immunoprecipitation coupled with exonuclease digestion (ChIP-exo) and RNA sequencing (RNA-seq) to characterize the SetCD regulon of three representative members of the SXT/R391 family. We also identified the DNA sequences bound by SetCD in MGIVflInd1, a mobilizable genomic island phylogenetically unrelated to SXT/R391 ICEs that hijacks the conjugative machinery of these ICEs to drive its own transfer. SetCD was found to bind a 19-bp sequence that is consistently located near the promoter −35 element of SetCD-activated genes, a position typical of class II transcriptional activators. Furthermore, we refined our understanding of the regulation of excision from and integration into the chromosome for SXT/R391 ICEs and demonstrated that de novo expression of SetCD is crucial to allow integration of the incoming ICE DNA into a naive host following conjugative transfer.


INTRODUCTION
Integrative and conjugative elements (ICEs) have recently been shown to be the most abundant conjugative elements in practically all prokaryotic clades (1,2). As such, ICEs are a major driving force of bacterial genome evolution allowing rapid acquisition of a variety of new traits and adaptive functions such as virulence, metabolic pathways and resistance to antimicrobial compounds, heavy metals or bacteriophage infection (3–5). For instance, ICEs of the SXT/R391 family largely contribute to the spread of antibiotic resistance genes in the seventh-pandemic lineage of Vibrio cholerae, the etiologic agent of cholera, which remains a major cause of mortality and morbidity on a global scale (6–9). Most SXT/R391 ICEs found in V. cholerae clinical isolates confer resistance to sulfamethoxazole and trimethoprim, two antibiotics commonly used for the treatment of cholera (10,11). Since the early 1990s, SXT/R391 ICEs have become widespread in environmental and clinical V. cholerae isolates from Asia and Africa (7,8). SXT/R391 ICEs are also present in all isolates recovered from cholera patients in Haiti (12–15), occur naturally in many other Gammaproteobacteria (6,16–18) and are easily transferred to Escherichia coli in the laboratory (19). The SXT/R391 ICEs are grouped together because they share a common set of 52 highly conserved genes, among which ∼25 are important for their maintenance, dissemination by conjugation, and regulation (6,20). Highly conserved genes in SXT/R391 ICEs are distributed in seven distinct clusters separated by variable cargo DNA (HS1 to 5 and VRI to IV) (Figure 1A). These conserved clusters consist of int (integration/excision), mob1-2 (DNA processing), mpf1-3 (mating pair formation modules 1, 2 and 3) and reg (regulation) (Figure 1A) (6).
Mobilizable genomic islands (MGIs) are small (<33 kb) genomic islands found in several species of marine Gammaproteobacteria (21). SXT/R391 ICEs can mediate the transfer in trans of a particular class of MGIs at high frequency by an unusual mechanism. ICE-encoded relaxosome proteins recognize a cis-acting locus in MGIs that mimics the origin of transfer (oriT) of SXT/R391 ICEs (22). Unlike conjugative plasmids, ICEs and MGIs are not stably maintained by extrachromosomal replication and must integrate into the host cell's chromosome to be vertically inherited (3,22,23). The gene pairs int/xis and intMGI/rdfM are key components for the maintenance of SXT/R391 ICEs and MGIs, respectively. int and intMGI code for two distinct and unrelated integrases, which mediate the integration of SXT/R391 ICEs into the 5′ end of prfC (peptide chain release factor RF3), and the integration of MGIs into the 3′ end of yicC (protein of unknown function), respectively (22,24). xis and rdfM encode recombination directionality factors (RDFs), which facilitate the integrase-mediated excision from the chromosome of SXT/R391 ICEs and MGIs as circular molecules that serve as substrates for the conjugative transfer machinery (23,25).
Figure 1. [...] (6). Genes are represented by arrows and are color coded as follows: blue, integration and excision; yellow, DNA processing; orange, mating pair formation; purple, RecA-independent homologous recombination and Umu-like mutagenic repair; green, transcriptional activator; red, transcriptional repressor; grey, other or hypothetical functions. Variable cargo DNA inserted in the conserved core of SXT/R391 ICEs is marked by arrowheads (HS1 to 5 and VRI to IV). The left and right chromosomal attachment sites attL and attR are also shown. (B) SetCD ChIP-exo analysis for R391, ICEVflInd1 and SXT. For each ICE, four tracks are shown. First track: number of ChIP-exo reads mapped on ICE DNA sequence (pink dots at the top of black bars indicate a signal beyond the represented y-axis maximal value). Roman numerals indicate ChIP-exo peaks conserved between the three ICEs. Second track: genes from conserved core (same color code as in panel A), variable DNA regions (white) and antibiotic/heavy-metal resistance genes (pink). Third and fourth tracks: position of ChIP-exo enrichment peaks and the position of SetCD-binding motifs, respectively. SetCD motifs were identified for each ICE using the corresponding logo shown in panel D and are represented by green and red tick marks on positive and negative strands, respectively. (C) RNA-seq experiments on wild-type SXT in the presence or absence of mitomycin C, and SXT ΔsetCD in the presence of mitomycin C. For each condition, the upper track shows the reads per kilobase of transcript per million mapped reads value (RPKM) for each gene (black boxes) and the lower track shows the genome-wide 5′-RACE signals (positive strand in green, negative strand in red). Pink dots are as in B. (D) Logo sequences recognized by SetCD in R391, ICEVflInd1, SXT, as well as the consensus logo of all three ICEs.
Nucleic Acids Research, 2015, Vol. 43, No. 4 2047
The conjugative transfer of SXT/R391 ICEs is regulated by three conserved genes located near the attR attachment site (Figure 1A) (20). setR encodes a cI-related transcriptional repressor, which prevents the expression of setC and setD (26,27). Agents that damage DNA and induce the host SOS response (UV light, mitomycin C, ciprofloxacin) are thought to induce the RecA*-stimulated autoproteolysis and inactivation of SetR, thereby alleviating the repression of setCD and allowing excision and transfer of SXT/R391 ICEs (26) (Supplementary Figure S1A). The proteins SetC and SetD are thought to assemble as a heteromeric complex that activates the expression of SXT/R391 genes important for conjugative transfer (20,25). SetCD has also been reported to activate the expression of the mosAT toxin-antitoxin system carried by SXT (28) as well as the expression of intMGI and rdfM of MGIs, thereby triggering the excision of MGIs from the chromosome (22,23). The number of promoters activated by SetCD in SXT/R391 ICEs and the nature of the SetCD operator sites are currently unknown.
In this study we characterized the SetCD regulon and DNA motifs bound by SetCD in three SXT/R391 ICEs and one MGI originating from three different pathogens using chromatin immunoprecipitation coupled to exonuclease digestion (ChIP-exo) and RNA sequencing (RNA-seq). From this analysis, we identified and validated sequences of the SetCD operators. Finally, we investigated the dynamics of integration and excision of SXT to address the regulation of expression of xis and int in both donor and recipient cells. We demonstrated that SetCD must be expressed de novo in the recipient cells to allow the establishment of SXT in a new host.

MATERIALS AND METHODS
Bacterial strains and bacterial conjugation assays
The E. coli strains used in this study, all derivatives of CAG18439 or BW25113 (29,30), are described in Table 1. The strains were routinely grown in Luria-Bertani (LB) broth at 37 °C in an orbital shaker/incubator and were maintained at −80 °C in LB broth containing 15% (vol/vol) glycerol. Antibiotics were used as described in Text S1. Conjugation assays were performed as described elsewhere (31). To induce expression of Int from pInt33 and SetCD from pGG2B in complementation assays, mating experiments were carried out on LB agar plates supplemented with 0.02% arabinose (32).

Molecular biology methods
Genomic and plasmid DNA preparation, PCR product amplification and purification, electro-transformation of E. coli, gene expression analysis by quantitative real-time PCR (qRT-PCR) and β-galactosidase assays, Southern blotting, contour-clamped homogeneous electric field pulsed field gel electrophoresis (CHEF-PFGE) and sequencing were performed using standard molecular biology techniques. Details are provided in Text S1.

Plasmid and strain constructions
Plasmids and primers used in this study are described in Table 1 and Supplementary Table S2, respectively. Mutants of SXT, R391 and ICEVflInd1 were constructed using the one-step chromosomal gene inactivation technique using pKD3, pKD13 and pVI36 as templates (33,34). Constructions of reporter and expression vectors were done using conventional molecular methods. Detailed methodology is described in Text S1.

ChIP-exo experiments and RNA sequencing
The ChIP-exo, RNA-seq and genome-wide 5′ rapid amplification of cDNA ends (5′-RACE) experiments were conducted as described in Carraro et al. (35). Additional details are provided in Text S1. Sequenced libraries are described in Supplementary Table S3.

Data availability
Fastq files for each experiment were deposited at the NCBI Sequence Read Archive under accession numbers SRX708080 and SRR1583172 for SXT ChIP-exo, SRX708425 and SRR1583516 for R391 ChIP-exo, SRX708426 and SRR1583532 for ICEVflInd1/MGIVflInd1 ChIP-exo, SRX708086 and SRR1583199 for SXT RNA-seq as well as SRX708115 and SRR1583202 for SXT 5′-RACE. Complete data from aligned reads for ChIP-exo, RNA-seq and 5′-RACE can also be visualized using the UCSC genome browser at http://bioinfo.ccs.usherbrooke.ca/setCD.html.

RESULTS
Characterization of the SetCD regulon in three SXT/R391 ICEs
The exact target genes and sequence motif recognized by the SetCD complex, which plays an essential role in ICE conjugative transfer activation, have yet to be determined. Using RNA-seq and ChIP-exo (35,36), we undertook an exhaustive characterization of the SetCD regulon in three ICEs found in clinical isolates of three different pathogens: the prototypical ICEs SXT from V. cholerae O139 (19) and R391 from Providencia rettgeri (37) as well as ICEVflInd1 from Vibrio fluvialis (38). E. coli strains DPL492, DPL491 and DPL493 (Table 1) bearing derivatives of SXT, R391 and ICEVflInd1, each expressing a native SetD subunit along with a SetC subunit C-terminally fused to the 3xFLAG tag (SetC3xFLAG), were used in these experiments (Supplementary Figure S1A). The 3xFLAG tag did not affect the function of the SetC orthologs, as each tagged ICE transferred at a frequency similar to that of its wild-type counterpart (Supplementary Figure S1B).
The ChIP-exo and RNA-seq assays were carried out after induction of the cell cultures using mitomycin C to trigger expression of SetCD from its native promoter. ChIP-exo data analyses revealed nine major SetCD enrichment peaks located upstream of the same genes and operons, most of which play a key role in conjugative transfer, in the conserved core sequence shared by the three ICEs (Figure 1A, B and Supplementary Table S1). Four of these peaks are located upstream of genes that are predicted to be involved in the formation of the mating pore: traL (conjugal transfer pilus assembly protein, peak V), traV (outer membrane lipoprotein, peak VI), dsbC (conjugative disulfide bond isomerase, peak VII) and traF (conjugal pilus assembly protein, peak IX). One peak was also present inside the 3′ end of the predicted relaxase gene traI, upstream of the gene traD (type IV coupling protein, peak IV). Two additional peaks were observed, one upstream of xis (peak I) and one (peak II) in the intergenic region between mobI (auxiliary component of the relaxosome) and s003, which is part of the operon containing int. The strongest peak (peak VIII) was detected in the intergenic region between the two divergent genes s063 and s089. s089 is the first gene of a large operon coding for a RecA-independent homologous recombination system (39,40). The last statistically significant peak (peak III) is located in the intergenic region between rumA (UmuD-like protein) and s024. Transcriptional activity measured by RNA-seq in E. coli HW220 (wild-type SXT) and DPL3 (SXT ΔsetCD) correlates with the presence of a SetCD-binding site, as the expression of 29 out of 52 core genes in SXT, including genes located downstream of SetCD-binding sites, is significantly increased upon mitomycin C induction compared to a ΔsetCD mutant under the same conditions (Figure 1C and Dataset S1).
Most genes that are not significantly affected by the expression of SetCD in SXT appear to be either inactive or constitutively expressed, and are mainly found in variable cargo DNA. Their functions are unknown or not directly tied to conjugative transfer, and include the antibiotic resistance genes, transposase genes, the s027-s040 gene cluster and the diguanylate cyclase gene dgcL, among others (Dataset S1).
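As an illustration of the expression unit used in Figure 1C, the following is a minimal sketch of how an RPKM value is conventionally computed from a read count. The function name and the example numbers are hypothetical and are not taken from the study's pipeline, which is described in Text S1.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = reads / (gene length in kb * mapped reads in millions)
         = reads * 1e9 / (gene length in bp * total mapped reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 1000-bp gene,
# in a library of 2 million mapped reads.
example_value = rpkm(500, 1000, 2_000_000)
print(example_value)
```

Normalizing by both gene length and library size is what makes values comparable between genes and between the induced, non-induced and ΔsetCD libraries.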

Characterization of SetCD-dependent promoters
We carried out de novo motif discovery of DNA sequences bound by SetCD for each independent ICE ChIP-exo dataset, thereby generating three highly similar logo sequences (Figure 1D) in which subtle differences between the extracted motifs reflect ICE-specific polymorphisms in promoter regions. For each ICE, we next determined the exact location of proposed SetCD-binding sites within the ChIP-exo peaks (Figure 1B) and observed a footprint likely corresponding to the SetCD and RNA polymerase holoenzyme complexes bound to the corresponding promoters (Figure 2A-F, and Supplementary Table S1) (35,41). In some instances, we were able to identify two occurrences of a SetCD motif within the same peak. For example, the intergenic region s063-s089 contains back-to-back SetCD-binding motifs, thereby revealing the presence of two SetCD-activated divergent promoters (Figure 2E). 5′-RACE and primer extension analyses (Figure 1C and Supplementary Figure S2) allowed us to determine transcription start sites (TSS) in SXT, revealing that SetCD-binding motifs are located immediately upstream of the −35 promoter box. This promoter structure was observed for all TSS located between a SetCD-binding motif and a gene in the same orientation (Figure 2G). This organization is reminiscent of class II activation, in which the activator binds to a sequence that overlaps the promoter −35 element and usually contacts the RNA polymerase σ subunit (41).
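To make the idea of locating motif occurrences on both strands concrete, here is a simplified log-odds position weight matrix scan. It is only a sketch and not the MEME-suite tools used in the study; the 4-bp count matrix, the test sequence and the threshold are hypothetical toy values, not the 19-bp SetCD motif.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2-odds scores (uniform background)."""
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, strand, score) hits; start is 0-based on the forward strand."""
    seq = sequence.upper()
    width = len(matrix)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if set(window) - set(BASES):
            continue  # skip windows containing ambiguous bases such as N
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            score = sum(col[base] for col, base in zip(matrix, site))
            if score >= threshold:
                hits.append((start, strand, score))
    return hits


# Hypothetical toy motif "AACG" built from 10 identical sites.
toy_counts = [{"A": 10}, {"A": 10}, {"C": 10}, {"G": 10}]
toy_matrix = log_odds_matrix(toy_counts)
toy_hits = scan("TTAACGTTCGTTAA", toy_matrix, threshold=6.0)
print([(start, strand) for start, strand, _ in toy_hits])
```

Reporting the strand of each hit is what allows back-to-back motifs, such as those described for the s063-s089 intergenic region, to be assigned to divergent promoters.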

Validation of the SetCD operator sequences
We validated that the proposed SetCD-binding motifs promote the observed binding of SetCD and transcriptional activation by fusing the lacZ reporter gene to the promoters Ps003 (DPL453) and Pxis (DPL393) that are responsible for the expression of int and xis, respectively. For each promoter, two mutants were also generated. A first variant, -53, lacked the sequence immediately upstream of the promoter-proximal SetCD motif (DPL400 and DPL394) while another variant, -36, lacked the entire region upstream of the −35 promoter box, thus removing the first 17 bp of the proximal SetCD box (DPL490 and DPL489) (Figure 3A and B). β-galactosidase assays were then carried out upon setCD expression from the arabinose-inducible PBAD promoter. Addition of the predicted SetCD boxes of Ps003 and Pxis upstream of lacZ boosted the β-galactosidase activity by ∼520- and 2 200-fold, respectively, while the presence of additional upstream sequence made no difference (Figure 3B, compare WT with -53). When the putative SetCD boxes were missing, the β-galactosidase activity dropped below the detection limit of our assay (Figure 3B, compare -53 with -36).
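The fold-induction values above are ratios of β-galactosidase activities. A minimal sketch of such a calculation follows, assuming the classical Miller formula and a ratio of replicate means; the exact assay protocol is given in Text S1, and all readings below are hypothetical.

```python
from statistics import mean, stdev


def miller_units(od420, od550, od600, minutes, volume_ml):
    """Classical Miller formula: 1000 * (OD420 - 1.75 * OD550) / (t * v * OD600)."""
    return 1000.0 * (od420 - 1.75 * od550) / (minutes * volume_ml * od600)


def induction_ratio(induced, uninduced):
    """Ratio of mean activities (induced over non-induced); an assumed convention."""
    return mean(induced) / mean(uninduced)


# Hypothetical single reading: 10-min reaction on 0.1 ml of culture.
single_reading = miller_units(od420=0.9, od550=0.02, od600=0.5,
                              minutes=10, volume_ml=0.1)

# Hypothetical triplicates, in Miller units.
induced_replicates = [1000.0, 1100.0, 900.0]
uninduced_replicates = [2.0, 2.0, 2.0]
fold_induction = induction_ratio(induced_replicates, uninduced_replicates)
print(single_reading, fold_induction, stdev(induced_replicates))
```

Whether the ratio is taken between means or per replicate changes how the standard deviation propagates, so the convention should be checked against the original methods before reuse.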
To confirm that this motif alone, and not a hypothetical factor acting in cis, was sufficient to confer SetCD-dependent induction of gene expression, we replaced the native binding site of the cAMP receptor protein (CRP) of the Plac promoter upstream of lacZYA in E. coli MG1655 with the SetCD box of Ps003. Two chimeric promoters containing the operator mutations lacZo CD1 and lacZo CD2 were constructed (Figure 3C). The lacZo CD1 mutant (DPL494) retained the −35 element of Plac, whereas it was substituted by the −35 element of Ps003 in the lacZo CD2 mutant (DPL501). The three operator sites o1, o2 and o3 of the LacI repressor remained unaffected in both constructs. setCD was expressed under the control of PBAD from pGG2B in the strains containing the constructions and in wild-type MG1655. The absence of significant β-galactosidase activity observed on M9 glycerol medium supplemented with glucose for both the lacZo CD1 and lacZo CD2 mutations confirmed the inability of the hybrid promoters to respond to the activation by CRP bound to cAMP regardless of the alleviation of LacI repression by IPTG and of the presence of repressed setCD (Figure 3D). In contrast, when glucose was replaced by arabinose, expression of setCD triggered a strong expression from the promoter containing lacZo CD1, producing dark blue colonies, and weak expression from the one containing lacZo CD2 (Figure 3D and E). Strong expression from the latter was observed only upon concomitant alleviation of LacI repression by addition of IPTG (Figure 3D and E). These results indicate that expression of the lacZYA operon became SetCD-dependent when the CRP operator site of Plac was replaced by either lacZo CD1 or lacZo CD2. Interestingly, the variant lacZo CD2 seemed to remain strongly repressed by LacI upon setCD overexpression as shown by the lack of induction in the absence of IPTG, while the variant lacZo CD1 was not (Figure 3E).
This phenotypical difference observed between the two mutants can be attributed to their respective −35 sequence. The −35 of lacZo CD2 (CACCGC) is very distant from the σ70 consensus, while lacZo CD1 harbors the more canonical −35 of Plac (TTTACA). Taken together, these experiments confirm that the ChIP-exo derived SetCD motif alone is sufficient to confer SetCD-dependent activation of gene expression.

ChIP-exo assays reveal SetCD-regulated conserved genes in MGIs
The strain DPL493 used for ChIP-exo assays also contains MGIVflInd1, an MGI originally detected in the same V. fluvialis strain that contains ICEVflInd1 (Table 1), allowing us to monitor on this MGI the binding of SetCD provided in trans by ICEVflInd1. Two major peaks were detected on MGIVflInd1, both mapping upstream of two of the four conserved core genes (Figure 4A-C and Supplementary Table S1). The first one was detected upstream of rdfM and the second one was found upstream of cds4, a gene of unknown function. To test whether cds4 is regulated by SetCD, we measured its expression by qRT-PCR in E. coli containing SXT (AD72), SXT ΔsetCD (AD133) or pGG2B, the plasmid expressing SetCD from PBAD (AD132). While expression of cds4 was induced by mitomycin C in the strain containing wild-type SXT, it was nearly abolished in the absence of setCD regardless of the presence of mitomycin C (Figure 4D). Overexpression of SetCD alone also dramatically increased the level of cds4 transcript (∼30 000-fold induction, Figure 4E

Establishment of SXT into a naive host requires de novo expression of setCD
Although it is known, and confirmed by our results, that SetCD activates the expression of SXT int in the donor strain (20,25), the importance of SetCD for the expression of int in recipient cells and subsequent integration of SXT in the recipient's chromosome is not clearly established, as conflicting observations have been reported. Results from Beaber et al. (20) suggest that setCD is expendable in the recipient cells, as setC and setD mutants of SXT have been shown to transfer and establish in recipient cells when the deletions were complemented in the donor cells exclusively. If SetC and SetD are not necessary for integration of SXT in the chromosome of the recipient cells, then expression of int likely occurs at a low constitutive level. Alternatively, SetCD could be produced in the donor cell and translocated through the mating pore into the recipient cell during conjugative transfer to stimulate int expression in the recipient. However, Burrus and Waldor reported that a suicide vector containing attP and int driven by its native promoter (Ps003) is unable to integrate in recipient cells lacking setCD (25), thereby suggesting that SetCD is required for de novo expression of int in recipient cells.
To clearly address the role of SetCD in the establishment of SXT in the recipient cells, we conducted mating assays using combinations of E. coli donor and recipient strains harboring SXT or its ΔsetCD mutant with or without plasmids expressing either int or setCD under control of PBAD (Figure 5A). Overexpression of int in recipient cells did not enhance the transfer of SXT (Figure 5A a and b), indicating that SXT integration into the recipient's chromosome is not a rate-limiting step. This conclusion is also supported by setCD overexpression in the donor cell, which resulted in a ∼3-log increase of transfer (Figure 5A a and c). Consequently, mating pore assembly or DNA translocation is likely the rate-limiting step of SXT conjugative transfer. As expected, transfer of the ΔsetCD mutant was abolished, even upon expression of int in recipient cells (Figure 5A d and f).
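Transfer results such as the ∼3-log increase above are comparisons of exconjugant-per-recipient frequencies. The following is a minimal sketch of that arithmetic; the colony counts are hypothetical and the function names are illustrative only.

```python
import math


def transfer_frequency(exconjugant_cfu_per_ml: float, recipient_cfu_per_ml: float) -> float:
    """Frequency of transfer expressed as exconjugants per recipient."""
    if recipient_cfu_per_ml <= 0:
        raise ValueError("recipient count must be positive")
    return exconjugant_cfu_per_ml / recipient_cfu_per_ml


def log10_change(frequency: float, reference: float) -> float:
    """Difference between two transfer frequencies, in log10 units."""
    return math.log10(frequency / reference)


# Hypothetical counts chosen to give a basal frequency of about 5e-4.
basal = transfer_frequency(5e4, 1e8)
# Hypothetical frequency after overexpression of the activator in donors.
induced = transfer_frequency(5e7, 1e8)
print(basal, log10_change(induced, basal))
```

Expressing differences in log10 units keeps comparisons readable when frequencies span several orders of magnitude, as they do between matings in Figure 5A.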
Intriguingly, transfer of SXT ΔsetCD was only partially restored when setCD was overexpressed in donor cells, at a rate of only one-fifth of wild-type SXT and ∼4 logs lower than wild-type SXT upon setCD overexpression (Figure 5A, compare a versus e and c versus e). Because in such a context, expression of the transfer genes is not compromised in donor cells, the low rate of transfer of mating e suggests that integration of SXT ΔsetCD into the recipient's chromosome became the rate-limiting step of transfer. Assuming that this phenotype was attributable to weak or lack of expression of int in the recipient cells, we can rule out that the SetCD protein complex is translocated, at least in significant amounts, from the donor to recipient cells during SXT transfer. Indeed, stimulation of int expression mediated by translocated SetCD should have enabled normal integration of SXT ΔsetCD at a wild-type rate despite the lack of setCD genes in the recipient cells. In fact, transfer of SXT ΔsetCD was fully restored to wild-type level only upon concomitant overexpression of setCD in the donor and int in the recipient (Figure 5A g and c versus e). Altogether, these results suggest that the SetCD complex is not translocated from the donor to recipient cells during conjugation. Instead, setCD is expressed de novo upon entry of SXT in the recipient cells, allowing int expression to mediate SXT integration.
Figure 3. [...] Table 1). The right panel reports β-galactosidase activities expressed as the ratio between the Miller units in the arabinose-induced versus non-induced conditions. Results are the means and standard deviations of at least three independent biological replicates. The P-values from a two-way ANOVA with Tukey's multiple comparison test comparing the log of the means of the Ps003 and Pxis WT or -53 variants relative to the corresponding -36 mutant are indicated. (C) Organization of two SetCD-dependent mutants of Plac. In both mutants the CRP-binding site was replaced by the SetCD operator of Ps003. lacZo CD1 carries the −35 of Plac, while lacZo CD2 has the −35 of Ps003. (D) Wild-type E. coli MG1655 (WT), lacZo CD1 mutant DPL494 (CD1) and lacZo CD2 mutant DPL501 (CD2) grown on M9 glycerol minimal medium supplemented with glucose or arabinose, with or without IPTG. The strains were carrying either pBAD30 or its setCD-expressing derivative pGG2B. (E) β-galactosidase activity measured for E. coli MG1655, DPL494 (lacZo CD1 mutant) and DPL501 (lacZo CD2 mutant) containing pGG2B grown in M9 glycerol minimal medium with glucose or arabinose, with or without IPTG. Ratios between the Miller units in the arabinose-induced versus glucose conditions are shown. Results are the means and standard deviations of four independent biological replicates. The P-values from a two-way ANOVA with Tukey's multiple comparison test comparing the log of the means of the lacZo CD1 and lacZo CD2 mutants relative to the wild-type in the corresponding conditions are indicated.

A setCD-null mutant of SXT maintains atypically in exconjugant colonies
The setCD mutation seemed to hinder the expression of int in the recipient cells, eventually leading to loss of the incoming SXT. Suboptimal int expression could reduce integration or promote maintenance of the ICE by other means. We submitted a sample of seven ΔsetCD exconjugants randomly picked from mating e (Figure 5A) to profiling by Southern blot hybridization and PFGE analyses. Southern blot probing of the genomic DNA of these exconjugants with a fragment overlapping attP revealed atypical restriction patterns compared to the control donor strain containing SXT integrated as a single copy into prfC (HW220) (Figure 5B). All exhibited a signal for attP but lacked the characteristic attL and attR fragments normally present after correct SXT integration, suggesting that SXT ΔsetCD failed to integrate site-specifically. This result was also supported by PCR amplification of attB, which confirmed that the 5′ end of prfC was intact in all seven exconjugants (Supplementary Figure S3A and B). At least two possible mechanisms could explain the formation of such anomalous exconjugants. First, SXT ΔsetCD could have integrated by homologous recombination or transposition, potentially through one of the insertion sequences present in the variable region VRIII (Figure 1A). Alternatively, SXT ΔsetCD could be maintained as a circular replicative molecule. We tested both hypotheses by subjecting the genomic DNA of the same exconjugants to SpeI restriction, which does not cut SXT, and PFGE. None of the exconjugants exhibited the expected 166-kb SpeI fragment containing SXT as seen in the control donor strain HW220 (Figure 5C). Instead, distinct restriction patterns were observed. In the most frequent pattern (five out of seven exconjugants), a new large fragment of ca. 230 kb was observed. This is inconsistent with the possibility of tandem integration of multiple copies of SXT at prfC since two copies of SXT would result in a larger 257-kb SpeI fragment (31), suggesting that SXT ΔsetCD integrated into different chromosomal loci. None of these patterns exhibited a 99-kb band compatible with a replicative form of SXT. Altogether these results confirmed that SXT ΔsetCD is unable to integrate into prfC in a site-specific fashion upon entry into a naive host, supporting the view that int expression was compromised or abolished in ΔsetCD mutants.
Figure 4. [...] Figure 1D. Sequences are organized as in Figure 2G. The intMGI SetCD-binding motif was found by FIMO while cds4 and rdfM motifs were found by MAST. The TSS of rdfM was determined by primer extension (Supplementary Figure S2) and is located at position 18 284-18 285 on the negative DNA strand.

Expression of int and xis requires activation by SetCD
The defective integration phenotype of SXT ΔsetCD suggests that expression of int strictly depends upon activation by SetCD. To test this hypothesis, we monitored the expression of int by qRT-PCR on cDNA derived from E. coli strains containing SXT (HW220) or its ΔsetCD derivative (DPL3). We also monitored the expression of xis in the same conditions as it is expected to be expressed only upon activation of SXT transfer to promote the site-specific excision reaction over integration. The DNA-damaging agent mitomycin C increased int and xis mRNA transcript levels by 31- and 39-fold, respectively, in a setCD-dependent fashion (Figure 5D), thereby supporting our RNA-seq results showing that their expression is controlled by SetCD. Furthermore, both int and xis transcript levels were slightly above the limit of detection for wild-type SXT in non-inducing conditions (Figure 5D). Spontaneous induction of the SOS response in a subpopulation of the cell culture (42) likely accounts for this low basal level of expression and for the basal level of SXT transfer (∼5 × 10^−4 exconjugant/recipient in Figure 5A a). In the same conditions, expression of both int and xis dropped below the detection level for SXT ΔsetCD regardless of the presence or absence of mitomycin C (Figure 5D). Together with the failure of the setCD-null mutant of SXT to integrate into prfC of recipient cells, these results demonstrate that de novo expression of setCD is required to trigger the expression of int in recipient cells and allow stable maintenance and inheritance of SXT/R391 ICEs.
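Fold changes such as those reported for int and xis are commonly derived from qRT-PCR threshold cycles. The sketch below assumes the widely used 2^−ΔΔCt relative quantification with a reference gene and perfect amplification efficiency; the study's actual quantification procedure is described in Text S1 and may differ, and the Ct values are hypothetical.

```python
def fold_change_ddct(ct_target_treated: float, ct_ref_treated: float,
                     ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression by the 2^-ddCt method (assumes 100% PCR efficiency)."""
    delta_treated = ct_target_treated - ct_ref_treated    # target normalized to reference
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


# Hypothetical Ct values: the target crosses threshold 5 cycles earlier
# after induction, while the reference gene is unchanged.
example_fold = fold_change_ddct(ct_target_treated=20.0, ct_ref_treated=15.0,
                                ct_target_control=25.0, ct_ref_control=15.0)
print(example_fold)
```

Each cycle of difference corresponds to a twofold change under this assumption, so a transcript near the detection limit in the uninduced sample makes the computed fold change sensitive to small Ct errors.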

DISCUSSION
In many archetypical conjugative elements such as the IncFI F conjugative plasmid or the ICE Tn916, genes coding for the conjugative apparatus are organized as a single polycistronic operon (43,44). In contrast, although the tra genes of SXT/R391 ICEs are syntenic with the tra genes of F-like conjugative plasmids (20,43), they are distributed in five distinct gene clusters, separated by variable DNA clusters (Figure 1A) (6). Such an organization requires an adaptable activation system that allows coherent expression of the diverse components essential for ICE conjugation. This includes activation of the site-specific recombination system mediating excision from the chromosome as a circular molecule (xis and int), assembly of the mating apparatus (traLEKBVA, dsbC/traC/trhF/traWUN and traFHG), initiation of ICE DNA transfer (mobI/oriT and traIDJ) and finally integration into the recipient cell's chromosome (int) (Figure 6A).
In the present study we have established using ChIP-exo and RNA-seq experiments that SetCD coordinates the expression of many genes and operons by binding upstream of the TSS of xis, s003, mobI, traI, traD, traL, traV, dsbC, s063, s089 and traF (Figure 6A). This transcriptional regulation profile is similar to the one recently reported for the conjugative transfer functions encoded by the IncA/C conjugative plasmids that are activated by AcaCD, an activator complex distantly related to SetCD (34% and 23% identity between the C and D subunits, respectively) (35). In fact, activation of gene expression by SetCD and AcaCD seems to be reminiscent of the activation of transcription by the transcriptional master regulator of bacterial flagellum biogenesis FlhCD. The complex FlhC2D2 binds 30 bp upstream of the σ70-dependent TSS and activates the expression of class II flagellar operons, which encode components of the flagella basal body and export machinery (45). O'Halloran et al. (46) previously reported a predicted potential SetCD-binding site upstream of xis in R391 based on analogies with 'FlhD2C2 box' arms. However, no such binding motif exists upstream of xis in ICEs of the SXT/R391 family, as the SetCD DNA recognition motif obtained by combining the SetCD-bound promoters of SXT, R391 and ICEVflInd1 drastically differs from the FlhD2C2 (47–50) and AcaCD (35) binding sites. This is not surprising given the low similarity between SetCD, FlhCD and AcaCD (20,35). The −35 and −10 elements of all SetCD-dependent promoters studied here are poorly conserved relative to the σ70 canonical −35 and −10 promoter boxes (51,52). In fact, the −35 region lacks a recognizable motif of the canonical −35 signal (TTGACA). In all the SetCD-dependent promoters, we found that the SetCD box overlaps or is located immediately upstream of the −35 element (Figures 2G and 4F).
The close proximity of the SetCD box to the sequence usually recognized by σ factors suggests that binding of SetCD compensates for the lack of a recognizable −35 element, allowing recruitment of the RNA polymerase holoenzyme, in a manner similar to the class II CRP-, FNR- and FlhD₂C₂-dependent promoters (53)(54)(55). Biochemical characterization of SetCD is needed to clarify whether SetCD operates like FlhD₂C₂ by interacting with the RNA polymerase α subunit C-terminal domain (56).
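As an illustration of what "poorly conserved relative to the canonical boxes" means in practice, the following minimal Python sketch counts mismatches between a candidate hexamer and the canonical σ70 −35 (TTGACA) and −10 (TATAAT) consensus sequences, and locates the closest-matching window in a promoter region. It is not the analysis performed in this study; the function names `mismatches` and `best_match` are hypothetical, and any sequence passed to them in the usage note is a made-up placeholder rather than an SXT/R391 promoter.

```python
# Canonical sigma70 promoter hexamers (E. coli consensus).
CONSENSUS_35 = "TTGACA"
CONSENSUS_10 = "TATAAT"


def mismatches(hexamer: str, consensus: str) -> int:
    """Count the positions at which a hexamer differs from the consensus."""
    if len(hexamer) != len(consensus):
        raise ValueError("hexamer and consensus must have the same length")
    return sum(a != b for a, b in zip(hexamer.upper(), consensus))


def best_match(region: str, consensus: str) -> tuple[int, str, int]:
    """Slide the consensus along a region and return the closest window.

    Returns (offset, window, mismatch_count); the leftmost window wins ties.
    """
    region = region.upper()
    k = len(consensus)
    if len(region) < k:
        raise ValueError("region is shorter than the consensus")
    scored = [
        (i, region[i:i + k], mismatches(region[i:i + k], consensus))
        for i in range(len(region) - k + 1)
    ]
    # min() keeps the first minimal element, i.e. the leftmost best window.
    return min(scored, key=lambda hit: hit[2])
```

For example, `best_match("GGTTGACAGG", CONSENSUS_35)` reports a perfect match at offset 2, whereas a placeholder hexamer such as `"CCGTTT"` differs from TTGACA at five of six positions. A region whose best window still carries many mismatches would be described as lacking a recognizable −35 element.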
Our data allowed us to deepen our understanding of the regulation of ICEs of the SXT/R391 family. To date, traIDJ and traLEKBVA were each presumed to be a single polycistronic operon regulated by a unique promoter located upstream of traI and traL, respectively (6,20). To our surprise, our data indicate that expression of these two gene clusters is much more complex. First, the promoter of traI differs between ICEs because the −10 element is part of the conserved core whereas the −35 region is brought by variable DNA found in the variable region HS5 (Figure 1A and Supplementary Figure S4B). Yet traI expression is clearly dependent on SetCD in SXT, as shown by our RNA-seq data (Dataset S1), and close examination of the ChIP-exo signal in the region upstream of traI suggests the presence of a potential weaker promoter footprint as well as a degenerate SetCD-binding motif for all three ICEs (Supplementary Figure S4A and Table S1). This raises interesting questions about the selective pressure operating on sequences inserting near the −35 promoter element of traI for the conservation of a functional SetCD operator. Second, traD expression is driven from a SetCD-dependent promoter located within traI (Figure 2C and G). Finally, the traLEKBVA gene cluster likely corresponds to two independent operons, traLEKB and traVA, although we cannot rule out the existence of mRNA transcripts containing traL to traA (Figures 1A and C, 2D and G). ChIP-exo data revealed a strong peak and a well-conserved SetCD motif within the 5′ end of the coding sequence of traV in all three ICEs. Examination of the annotation of traV revealed that a much better suited ribosome binding site with a nearly canonical Shine-Dalgarno (SD) sequence is located 75 bp downstream of the original traV annotation. As a consequence, we propose to redefine the start codon of traV at this new location (Figure 2D and G).
Wozniak and Waldor (28) reported that mosAT, which is located in HS2 and encodes a toxin-antitoxin system promoting the maintenance of SXT, was induced by SetCD. Clearly, our results show that there is no SetCD-binding site upstream of mosA, and that mosAT is not differentially expressed upon induction of SetCD by mitomycin C (Supplementary Table S1 and Dataset S1). This suggests that the reported increased expression of mosAT likely resulted from read-through of the mRNA transcript initiated by SetCD at the promoter PtraV.
The upstream region of the mutagenic DNA repair system rumAB of R391 was reported to contain a strong match to known LexA-binding sites at positions −44 to −25 relative to the rumA initiation codon (57). Although we have found a SetCD-binding site 530 bp upstream of rumA in the correct orientation to drive rumA expression in R391 only (Figure 1B), our RNA-seq data in SXT clearly show that upon mitomycin C induction, rumA is not differentially expressed in a wild-type SXT compared to the corresponding setCD-null mutant (Dataset S1). Our data rather support the idea that rumAB is part of the LexA regulon, not of the SetCD regulon (57).
The flexibility provided by SetCD has likely helped shape the complex genetic structure and remarkable plasticity of SXT/R391 ICEs. Ironically, SetCD has also become a beacon signaling the presence of SXT/R391 ICEs to parasitic genomic islands that hijack the ICE transfer machinery. MGIs mimic SetCD-binding sites to activate their own excision in response to the presence of an ICE of the SXT/R391 family in the same cell. We have found here that SetCD binds upstream of rdfM, a gene that is known to be activated by SetCD (23), and upstream of cds4, a conserved gene of unknown function (Figure 4). Although intMGI has been shown to be activated by SetCD (22,23), ChIP-exo failed to identify statistically significant binding of SetCD upstream of this gene in MGIVflInd1, but a more degenerate SetCD motif can be found using the Find Individual Motif Occurrences tool (FIMO) (Figure 4F). This is consistent with our previous report that intMGI is induced only 300-fold by SetCD overexpression whereas rdfM is induced 2000-fold in identical conditions (23). Nevertheless, intMGI was also shown to be constitutively expressed at a low level in the absence of SetCD, thereby allowing integration of the MGI into the chromosome of the recipient cell independently of the cotransfer of an SXT/R391 ICE (23) (Figure 6B). This strategy likely favors the 'survival' and dissemination of MGIs, as <2% of recipient cells receiving an MGI have been shown to simultaneously receive a copy of the helper ICE (22). Our data confirm that MGI excision is strictly regulated by the activation of rdfM expression, not by intMGI overexpression, and are consistent with the inability of an rdfM-null mutant to excise and transfer (23).
In contrast to the situation in MGIs, we demonstrated here that expression of SXT int and xis requires the presence of SetCD in both donor and recipient cells. Our results indicate that an SXT setCD mutant complemented in trans by SetCD only in the donor cells is incapable of integrating site-specifically into the 5′ end of prfC. This observation led us to conclude that neither SetCD nor Int is translocated into the recipient strain during conjugation. Instead, setCD is expressed de novo in the recipient, allowing int expression to mediate SXT integration into prfC in a site-specific fashion (Figure 6B). This requirement contrasts with the regulation of MGI integration, which is independent of SetCD. As both int and setCD are physically linked on SXT/R391 ICEs, these mobile elements do not need to rely on dual-promoter regulation of their integrase gene to ensure their survival and dissemination.
Upon entry of an ICE of the SXT/R391 family into a naive host, the repressor SetR is initially absent, and concomitant expression of setCD and setR likely allows buildup of the SetCD and SetR pools. Entry by conjugation of the ICE as single-stranded DNA is known to activate the SOS response (58), which could transitorily contribute to maintaining a low pool of SetR protein, thereby favoring the buildup of SetCD (Figure 6B). However, we showed that the transcript levels are similar for both xis and int in the presence of SetCD whereas the presence of the RDF Xis in the recipient cells reduces SXT transfer, likely by interfering with recombination between attP and attB (25).
Therefore, an as yet uncharacterized mechanism likely maintains a low level of Xis protein or delays its expression in the recipient cell to favor chromosomal ICE integration. In the conjugative transposon Tn916, the RDF xis and integrase int genes are part of a single long tetracycline-inducible mRNA transcript reading through the attP attachment site and extending to all of the transfer genes (44). Unlike in many mobile integrating elements, xis and int in SXT/R391 ICEs are not organized as an operon. Instead, they are two convergent genes, suggesting that other factors besides SetCD regulate their expression.
In summary, our study establishes the SetCD-dependent regulation of excision, transfer and integration of the ICEs of the SXT/R391 family as well as the genomic islands they mobilize, and highlights the importance of SetCD in triggering the spread of antibiotic resistance-conferring mobile elements in V. cholerae populations and in other Gammaproteobacteria.

SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.