Transformation Asymmetry and the Evolution of the Bacterial Accessory Genome

Abstract
Bacterial transformation can insert or delete genomic islands (GIs), depending on the donor and recipient genotypes, if a homologous recombination spans the GI’s integration site and includes sufficiently long flanking homologous arms. Combining mathematical models of recombination with experiments using pneumococci showed that GI insertion rates declined geometrically with the GI’s size. The decrease in acquisition frequency with length (1.08×10⁻³ bp⁻¹) was higher than a previous estimate of the analogous rate at which core genome recombinations terminated. Although most efficient for shorter GIs, transformation-mediated deletion frequencies did not vary consistently with GI length, with removal of 10-kb GIs ∼50% as efficient as acquisition of base substitutions. Fragments of 2 kb, typical of transformation event sizes, could drive all these deletions independent of island length. The strong asymmetry of transformation, and its capacity to efficiently remove GIs, suggests nonmobile accessory loci will decline in frequency without preservation by selection.


Introduction
Acquisition of genomic islands (GIs) by bacteria can result in increased virulence (Groisman and Ochman 1996), antibiotic resistance (Dobrindt et al. 2004), or evasion of vaccine-induced immunity. Such additions may be driven by the GIs themselves if they are mobile genetic elements (MGEs). Consequently, evolutionary models of the bacterial accessory genome have tended to focus on the gain of novel loci, which either add into the existing genome (Baumdicker et al. 2012; Collins and Higgs 2012) or displace recipient genes (Haegeman and Weitz 2012; Lobkovsky et al. 2013). To maintain stable genome sizes (Mira et al. 2001; Dagan and Martin 2007), some models impose a fitness cost on this expansion (Marttinen et al. 2015), attributed to selection against the energetic costs of DNA replication (Hogg et al. 2007; Baumdicker et al. 2012). Typically, gene loss is modeled as spontaneous deletion (Vogan and Higgs 2011; Baumdicker et al. 2012; Collins and Higgs 2012), reflecting the mutational bias (Mira et al. 2001) that deletions appear to be both larger and more frequent than insertions.
The acquisition of non-MGE GIs typically requires homologous recombination between similar sequences, shared by the donor and recipient, flanking the GI. This can occur through natural transformation, the import of exogenous DNA by the competence machinery. Although the evolutionary advantage of transformation remains controversial, such import of novel genes has been proposed as a possible benefit of the competence machinery (Hogg et al. 2007; Johnston et al. 2013). Transformation also has the potential to remove GIs by replacing them with DNA from a donor that lacks the island. Despite rarely featuring in evolutionary models, this process may be advantageous if it removes deleterious GIs, such as parasitic MGEs (Croucher et al. 2016). Such RecA-mediated recombination is expected to seamlessly stitch the flanking regions together (Rosselli and Stasiak 1991; Johnston et al. 2013), without the costs associated with spontaneous deletion, such as damaging surrounding regions or leaving behind nonfunctional GI fragments.
Two criteria must be met for deletion of GIs, particularly MGEs, by transformation to be biologically relevant. First, transformation must exhibit a pronounced asymmetry toward deleting, rather than inserting, heterologous sequences. Second, deletions of single genes and 10- to 30-kb GIs must occur with similar efficiency. Preferential deletion, rather than import, of sequence by transformation has been previously observed in Streptococcus pneumoniae (Claverys et al. 1980; Lefevre et al. 1989) and Bacillus subtilis (Adams 1972), although more recently contradictory results have been recorded (Pasta and Sicard 1996). However, the mutations transferred in these studies were small, else their size was not precisely established, hence their relevance to typical GIs is uncertain (Croucher et al. 2014). Here, we quantify the asymmetry and efficiency with which transformation eliminates GIs from chromosomes.

Letter
© The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Results
Assaying the Properties of Transformation
Figure 1A describes the components of a model of GI exchange through homologous recombination. A GI, j, of length L_D in the donor DNA and L_R in the recipient cell is exchanged between cells, e{j}, if a recombination initiating at i (a distance d from j) spans not just j but also the minimum lengths for homologous arms, H_5′ and H_3′, on both sides. Assuming homologous recombinations have a geometric length distribution (Croucher et al. 2012), the probability of GI transfer relative to the exchange of a single-nucleotide polymorphism (SNP) S, e{S}, can be quantified as (supplementary text S1, Supplementary Material online):

$$\frac{p(e\{j\})}{p(e\{S\})} = s_I (1 - k_I)^{L}$$

where k_I is the per-base pair rate at which recombinations terminate in heterologous regions, and the factor s_I accounts for length-independent differences between the efficiency of GI and SNP transformation. The rate of GI exchange depends on a length, L; yet how L relates to L_D and L_R results in four models with distinct evolutionary implications (supplementary text S1, Supplementary Material online). In two models, transformation is symmetrical: if any heterology between the donor and recipient DNA inhibits recombination ("heterology limited" model; L = L_D + L_R), then large GI insertion and deletion will be slow, whereas if GI movement is limited only by homologous arm dynamics, all sizes will exchange at the same rate ("annealing limited" model; L = 0). In the other two models, transformation is asymmetrical: GI insertion may be more efficient than deletion if transformation is limited by the GI's size in the recipient genome ("deletion limited" model; L = L_R), else if exchange is limited by its size in the donor DNA, deletion may be more efficient than insertion ("insertion limited" model; L = L_D).
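The four models differ only in how the limiting length L is built from L_D and L_R, which a short Python sketch can make concrete. This is an illustration rather than the authors' implementation: it assumes the relative exchange probability takes the geometric form s_I (1 − k_I)^L implied by the definitions above, the function names are hypothetical, and the parameter values are placeholders rather than estimates.

```python
MODELS = ("heterology limited", "annealing limited",
          "deletion limited", "insertion limited")


def effective_length(model, l_donor, l_recipient):
    """Return the length L (bp) that limits exchange under the named model."""
    if model == "heterology limited":
        return l_donor + l_recipient
    if model == "annealing limited":
        return 0
    if model == "deletion limited":
        return l_recipient
    if model == "insertion limited":
        return l_donor
    raise ValueError(f"unknown model: {model}")


def relative_transfer(model, l_donor, l_recipient, s_i, k_i):
    """GI exchange probability relative to a SNP, assumed to be s_I * (1 - k_I)^L."""
    length = effective_length(model, l_donor, l_recipient)
    return s_i * (1.0 - k_i) ** length


# Placeholder values chosen for illustration only (not fitted estimates).
s_i, k_i, gi_bp = 1.0, 1e-3, 5000
for model in MODELS:
    # Insertion: the donor carries the GI and the recipient lacks it.
    insertion = relative_transfer(model, gi_bp, 0, s_i, k_i)
    # Deletion: the recipient carries the GI and the donor lacks it.
    deletion = relative_transfer(model, 0, gi_bp, s_i, k_i)
    print(f"{model:20s} insertion={insertion:.3g} deletion={deletion:.3g}")
```

Under these assumptions the two symmetrical models give equal insertion and deletion values, while the "insertion limited" and "deletion limited" models penalise only one direction of exchange.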
To test these models, an experimental system was generated to measure GI exchange (e{j}) relative to L_R and L_D. The unencapsulated strain Streptococcus pneumoniae R6x (Tiraby and Fox 1973) was modified through a streptomycin resistance mutation (rpsL*), insertion of the integrative and conjugative element ICESp23FST81 at att rplL (Croucher et al. 2009), and deletion of the phase variable ivr restriction-modification locus (Croucher et al. 2014). [...] divergence between orthologous sequences (Majewski et al. 2000) or restriction endonucleases (Johnston et al. 2013). To assay the relative rates at which GIs of length L were inserted (e_I,L{j}) and deleted (e_D,L{j}) through transformation, four genotypes were constructed for each of nine tested L values (fig. 1). The first, R6I-Li, had L kb of sequence within ICESp23FST81 separating the 5′ half of the tetM tetracycline resistance gene, immediately upstream of an introduced aph3′ aminoglycoside resistance gene, from the 3′ half of tetM, immediately downstream of an introduced ermB erythromycin resistance gene (fig. 1B). The exception was L = 1 kb, where the tetM gene halves were separated by only ermB (fig. 1C). The second, R6I-Ld, had an intact tetM gene, the intervening L kb of sequence having been removed, including the aph3′ and ermB genes. PCR assays verified these genotypes had undergone the expected changes (

Transformation Is Asymmetric and Insertion Limited
The analysis encompassed 183 biological replicates, with at least six per recipient genotype, each of which was estimated to generate at least 250 rifampicin-resistant transformants. The data showed clear variation in e_L{S} between the constructed genotypes (supplementary fig. S11, Supplementary Material online), which could not be attributed to differences in growth rates (supplementary fig. S12, Supplementary Material online), and therefore an altered model was jointly fitted across all experimental results through maximum likelihood (supplementary text S1, Supplementary Material online):

$$p(e_L\{j\}) = s_I\, s_g\, (1 - k_I)^{L}, \qquad p(e_L\{S\}) = s_g$$

where s_g represented a genotype-specific transformation rate, whereas s_I (the length-independent relative GI transformation rate) and k_I (the per-base pair rate of recombination termination in j) were fixed across all genotypes (supplementary fig. S11, Supplementary Material online). The experiments measuring e_L{j} found a geometric decline with L, consistent with the "insertion limited" and "heterology limited" models (L ∝ L_D). Rare insertions were observed at L = 20 kb only with an elevated concentration of donor DNA. Using bootstrapping to calculate the confidence intervals, s_I was estimated as 3.49 (full bootstrap range: 0.98-6.41), and k_I was estimated as 1.08×10⁻³ bp⁻¹ (bootstrap range: 6.81×10⁻⁴-1.31×10⁻³ bp⁻¹). Transformation is therefore inefficient at inserting long GIs.
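To give a feel for how steep this geometric decline is, the sketch below converts the reported k_I estimate and its bootstrap range into the GI length over which the insertion frequency halves. It assumes only that the length-dependent term is (1 − k_I)^L; the function name is hypothetical, and the calculation is an illustration rather than part of the original analysis.

```python
import math

# Point estimate and bootstrap range for k_I reported above (per bp).
K_I = 1.08e-3
K_I_RANGE = (6.81e-4, 1.31e-3)


def halving_length(k):
    """Length (bp) over which the geometric term (1 - k)^L falls by half."""
    return math.log(0.5) / math.log1p(-k)


for label, k in (("estimate", K_I),
                 ("range, lower k_I", K_I_RANGE[0]),
                 ("range, upper k_I", K_I_RANGE[1])):
    print(f"{label}: k_I = {k:.2e} bp^-1, "
          f"halving length ~ {halving_length(k):.0f} bp")
```

On this reading, the insertion frequency halves over a few hundred base pairs of GI length, which is why insertions of multi-kilobase islands become rare.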
To distinguish between the "insertion limited" and "heterology limited" models, the effect of L_R on the rate of transformation-mediated deletion was measured for each L. Consistent with the latter model, the deletion frequency e_D,L{j} was highest for L ≤ 2 kb (fig. 2B). Fitting the geometric decline model estimated s_I as 3.51 (bootstrap range: 1.70-6.44), and k_I as 4.18×10⁻⁴ bp⁻¹ (bootstrap range: 2.69×10⁻⁴-6.05×10⁻⁴ bp⁻¹). However, for L ≥ 3 kb, e_D,L{j} varied by recipient genotype rather than L, more consistent with the "insertion limited" model. Even at L = 10 kb, e_D,L{j} was ∼50% of e_L{S}. Hence transformation-mediated deletion of GIs is substantially more efficient than their insertion.
The asymmetry statistic u_L, quantifying the relative insertion and deletion rates for a GI of length L, was calculated as: [...] The relationship between u_L and L suggested the model: [...] A maximum likelihood fit estimated u_0, the asymmetry associated with a minimally sized GI, as 0.413 (bootstrap range: 0.355-0.470), and the parameter determining the rate of change with L, k_u, as 3.47×10⁻⁴ bp⁻¹ (bootstrap range: 2.99×10⁻⁴-3.92×10⁻⁴ bp⁻¹). Hence transformation is highly asymmetric, favoring deletions across all L.

Homologous Arm Lengths Unaffected by Size of Deletion
The assay was modified to test whether the variation in deletion efficiency reflected length differences in the associated homologous arms. Each of the R6I-Li genotypes was simultaneously transformed with a tetM fragment of length f, which symmetrically spanned j, to measure e_D [...] The f/M ratio adjusts for the use of a fixed concentration of donor DNA, meaning the number of molecules available for transformation varies with fragment length. A reproducible increase in y_f with f was observed (fig. 3B), with 500-bp fragments rarely causing deletions at a measurable rate. However, the consistency of the results between genotypes could not explain the irregular pattern of results in figure 2B.
Four different approaches were used to model the observed pattern of deletions (supplementary text S2 and fig. S14, Supplementary Material online), represented by the lines in figure 3B. The balanced models assumed homologous arms of at least H were necessary on each side of j, whereas the unbalanced models assumed the two homologous arms had to total 2H, which could be unevenly spread across j. The other distinction related to whether the model required successful termination of recombination (terminating models), either through a randomly positioned nick in the donor DNA or other biochemical process, or assumed fragments were imported intact, and any recombination extending to the fragment's end resolved there (nonterminating models). The four models estimated H as between 469 and 499 bp (supplementary table S1, Supplementary Material online), although it was difficult to identify the closest-fitting formulation. To distinguish between the hypotheses, this experiment was repeated with genotype R6I-10d and DNA fragments that asymmetrically spanned j, with one homologous arm constant and the other varying between 250 and 1,500 bp (fig. 3C). Only the unbalanced models estimated parameters similar to those from the first experiments, as deletions were consistently detectable when the variable homologous arm was just 250 bp (supplementary text S3, Supplementary Material online and fig. 3D). This demonstrates deletions can occur even with one foreshortened homologous arm, although the imperfect model fits suggest there are nevertheless some constraints on both homologous arm lengths. Deletion was relatively efficient even with small DNA fragments, with little increase in y_f as f rose from 2 to 2.5 kb.
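The difference between the balanced and unbalanced arm requirements can be shown with a deliberately simplified Python sketch. It treats H as a hard threshold, whereas the fitted models in the supplementary text are statistical; the function names are hypothetical, and H = 480 bp is an illustrative value chosen from within the reported 469-499 bp range.

```python
H_BP = 480  # illustrative value within the reported 469-499 bp range


def balanced_ok(arm_5prime_bp, arm_3prime_bp, h=H_BP):
    """Balanced requirement: each homologous arm must be at least H."""
    return arm_5prime_bp >= h and arm_3prime_bp >= h


def unbalanced_ok(arm_5prime_bp, arm_3prime_bp, h=H_BP):
    """Unbalanced requirement: the two arms must total at least 2H."""
    return arm_5prime_bp + arm_3prime_bp >= 2 * h


# Asymmetric fragments: a constant 1-kb arm and a variable second arm.
for variable_arm in (250, 500, 1000, 1500):
    print(f"arms 1000 + {variable_arm} bp: "
          f"balanced={balanced_ok(1000, variable_arm)}, "
          f"unbalanced={unbalanced_ok(1000, variable_arm)}")
```

Under this simplification, a fragment with a 1-kb arm and a 250-bp arm fails the balanced requirement but satisfies the unbalanced one, which mirrors the observation that deletions remained detectable with a 250-bp variable arm.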

Discussion
Transformation asymmetry is likely to have a strong impact on the evolution of the accessory genome, as recombinations of the mean size observed in the pneumococcus (∼2.3 kb) (Croucher et al. 2012) are able to efficiently delete 10- to 20-kb stretches of heterologous DNA, consistent with the size of pneumococcal GIs (Croucher et al. 2014). This assay should be conservative in estimating GI deletion efficiency, as mismatch repair would further inhibit the exchange of SNPs, but not GIs (Tiraby and Fox 1973); restriction-modification systems should inhibit GI acquisition, but not deletion (Johnston et al. 2013); and the deletions in this assay formed potentially deleterious artificial junctions and, in the case of R6I-20i, necessitated the loss of a putative toxin-antitoxin system (SPN23F12920-12930) (Dy et al. 2014). Although measured in a highly transformable laboratory-adapted strain, the estimate of k_I from the decline of e{j} with L for insertions (1.08×10⁻³ bp⁻¹) was higher than a previous estimate of k_R, governing the exponential length distribution of core genome transformation events in a distinct clinical isolate (4.40×10⁻⁴ bp⁻¹) (Croucher et al. 2012). Hence the length of homologous recombinations and the spanning of heterologous regions may be limited by different mechanisms, such as RecA properties (Rosselli and Stasiak 1991) or donor DNA hydrolysis (Morrison and Guild 1972), else exhibit differing sensitivities to the same constraining process. Although these k_I and k_R estimates may be specific to pneumococcal transformation, similar principles will likely apply to all sequence exchange through RecA-mediated recombination, whether DNA is cut prior to packaging in a transducing phage or gene transfer agent (Lang et al. 2012), or imported from any potentially hydrolytic environment.
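The comparison between k_I and k_R can be restated as a comparison of length scales, since a per-base pair termination rate k implies a mean tract length of 1/k under a geometric (or exponential) length distribution. The following Python sketch performs that conversion for the two estimates quoted above; the function name is hypothetical, and the calculation is offered as an illustration rather than as part of the original analysis.

```python
K_R = 4.40e-4  # per-bp termination rate, core genome recombinations
K_I = 1.08e-3  # per-bp termination rate within heterologous GI sequence


def mean_length_bp(k):
    """Mean length implied by a per-bp termination rate k (mean = 1/k)."""
    return 1.0 / k


for name, k in (("k_R (core genome)", K_R), ("k_I (GI insertion)", K_I)):
    print(f"{name}: {k:.2e} bp^-1 -> mean length ~ {mean_length_bp(k):.0f} bp")
```

The value of 1/k_R is about 2.3 kb, in agreement with the mean recombination size cited above, whereas 1/k_I is under 1 kb, which expresses the same contrast between spanning homologous and heterologous sequence.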
Therefore, the primary benefit of transformation seems more likely to be removal of deleterious GIs (Croucher et al. 2016), potentially counteracting MGE insertion through integrase-mediated recombination, than adaptation by GI acquisition (Hogg et al. 2007). Simulations of a recombining population using the asymmetry estimates suggest transformation would be effective at removing large MGEs, and even IS element insertions (Rocha 2016), given the results for L = 1 kb (supplementary fig. S15, Supplementary Material online). Alongside the observed mutational bias toward deletion (Mira et al. 2001), this asymmetrical transfer of GIs suggests neutrally they should decline in frequency, congruent with the decay of genomes under relaxed selection (Novichkov et al. 2009). Hence GIs surviving in transformable bacteria must either be advantageous to subpopulations through diversifying, frequency-dependent, or niche-specific selection, else evade elimination through elevated intercellular transmission.

Fig. 3. [...] Hence across all L, the mean relative efficiency is one at f = 2 kb. Shorter fragments drove deletions less efficiently, hence were associated with y_f < 1. Three biological replicates are shown for each genotype at each f, coloured according to the genotype of the recipient, which is of the type R6I-Li. The points for each value of f are distributed over a small fraction of the horizontal axis for display purposes. The horizontal gray line at the bottom represents the threshold to which all zero values were adjusted for plotting, and the curves show the fit of four models (see key). (C) Deletion of a region of heterology in R6I-10d by tetM fragments matching different lengths downstream of the heterologous locus (supplementary text S3, Supplementary Material online). Each fragment was identical to the 1 kb of tetM upstream of the heterologous locus, with different lengths matching the downstream region. (D) Efficiency of deletion with unbalanced homologous arms. Data are plotted as in panel (B), but only for R6I-10d. The y_f metric is calculated in the same way, except the maximum f in this experiment is 2.5 kb, hence this is the point at which the mean y_f is one.

doi:10.1093/molbev/msx309

Transformation Rate Assay
Generation of the DNA constructs and bacterial genotypes used in these experiments is described in supplementary text S4, Supplementary Material online. Each transformation assay used 1 ml of S. pneumoniae grown statically at 35 °C in Todd-Hewitt broth with 0.5% yeast extract (THY; Thermo Fisher Scientific) to an OD600 of 0.2-0.3. Five microliters of 500 mM calcium chloride (Sigma-Aldrich), 5 µl of 5 ng ml⁻¹ competence stimulating peptide 1, and 5 µl of water containing 500 or 5,000 ng of genomic DNA, or 300 ng of PCR amplicon, were added. Transformants were selected on appropriately supplemented THY agar media (4 µg ml⁻¹ rifampicin, 1 µg ml⁻¹ erythromycin, or 10 µg ml⁻¹ tetracycline) after 3 h of further incubation. Colonies were counted manually after 48 h.

Statistical Analyses
The statistical models described in supplementary text S2, Supplementary Material online, were fitted to the data in figure 2 using maximum likelihood optimization with the Brent method in R (R Core Team 2017). Owing to the irregular outputs of the statistical models in supplementary text S3 and S4, Supplementary Material online, they were fitted to the data in figure 3 through least squares using simulated annealing in the "maxLik" package (Henningsen and Toomet 2011).