Diverse regulatory pathways modulate bet hedging of competence induction in epigenetically-differentiated phase variants of Streptococcus pneumoniae

Abstract Despite enabling Streptococcus pneumoniae to acquire antibiotic resistance and evade vaccine-induced immunity, transformation occurs at variable rates across pneumococci. Phase variants of isolate RMV7, distinguished by altered methylation patterns driven by the translocating variable restriction-modification (tvr) locus, differed significantly in their transformation efficiencies and biofilm thicknesses. These differences were replicated when the corresponding tvr alleles were introduced into an RMV7 derivative lacking the locus. RNA-seq identified differential expression of the type 1 pilus, causing the variation in biofilm formation, and inhibition of competence induction in the less transformable variant, RMV7domi. This was partly attributable to RMV7domi’s lower expression of ManLMN, which promoted competence induction through importing N-acetylglucosamine. This effect was potentiated by analogues of some proteobacterial competence regulatory machinery. Additionally, one of RMV7domi’s phage-related chromosomal islands was relatively active, which inhibited transformation by increasing expression of the stress response proteins ClpP and HrcA. However, HrcA increased competence induction in the other variant, with its effects depending on Ca2+ supplementation and heat shock. Hence the heterogeneity in transformation efficiency likely reflects the diverse signalling pathways by which it is affected. This regulatory complexity will modulate population-wide responses to synchronising quorum sensing signals to produce co-ordinated yet stochastic bet hedging behaviour.


INTRODUCTION
Competence for natural transformation was first identified in Streptococcus pneumoniae (the pneumococcus) in the early 20th century (1). Cells can be 'transformed' to express a new phenotype through the acquisition of exogenous DNA, which is integrated into their genome through homologous recombination following its import through the specialised cell-encoded competence machinery (2). Transformation has played a key role in the emergence of antibiotic-resistant S. pneumoniae, both through generating 'mosaic' alleles of core loci (3-5) and through the acquisition of specialised resistance genes (6). It has also enabled vaccine evasion through recombinations affecting the capsule polysaccharide synthesis (cps) locus, which alter surface-exposed antigens (7, 8).
Despite the ability of transformation to accelerate such adaptive evolution in S. pneumoniae, considerable variation in the rate of diversification of strains through this mechanism persists across the species (9). Epidemiological studies have found that the r/m ratio of base substitutions introduced through homologous recombination, relative to those occurring through point mutation, varies from well over 10 (7, 9) to below 0.1 (9, 10) across the species. Similarly, in vitro assays have identified >100-fold differences in the transformation efficiency of S. pneumoniae genotypes, with substantial variation even between isolates of the same serotype or strain (11-14). Many isolates are routinely found not to be transformable under standard conditions (11). This is often the consequence of integrative mobile genetic elements (MGEs) disrupting genes necessary for transformation (6, 15-17), selfishly preventing themselves from being eliminated from the chromosome (17). Yet in other non-transformable isolates, the highly-conserved competence machinery is intact (11, 18). This suggests the variation in transformation rates also reflects differences in regulation of the competence system.
The best-characterised stimulus inducing transformation in S. pneumoniae is the competence stimulating peptide (CSP) pheromone, which acts as a quorum-sensing signal that is recognised by the ComDE two-component regulator (19). This activates early competence genes after about ten minutes (20). These include comX, encoding an alternative sigma factor (21). ComX enables the RNA polymerase to recognise late competence genes (20), which feature a 'combox' signal in their promoters (22, 23). This results in pneumococci entering a transient competent state around 20 minutes post-CSP induction, after which the relevant machinery is degraded (24), and the cells become temporarily refractory to induction (21).
Transformation efficiency is known to vary between isogenic pneumococci through phase variation in capsule production. Transparent colony variants produce less capsule than opaque colony variants, and are consequently less virulent and more transformable (25). This short-term variation has been linked to rapid changes at the phase-variable inverting variable restriction (ivr) locus, encoding the conserved Type I SpnIII restriction-modification system (RMS), and the IvrR recombinase that drives sequence inversions within the locus (26-31). These rearrangements switch the target recognition domains (TRDs) within the active HsdS specificity protein, which determines the DNA motif that is targeted by both the methylase and endonuclease activities of the system. Consequently, changes at this single locus can have pleiotropic effects through altering genome-wide methylation patterns (27). These phase-variable RMSs can thereby maintain phenotypic heterogeneity within a genetically near-homogeneous population (32), resulting in bet hedging that can increase the chances of a species surviving a changing environment (33, 34).
The second pneumococcal phase-variable Type I restriction-modification system (named SpnIV) (27), encoded by the translocating variable restriction (tvr) locus (28), varies through excision-reintegration mediated by the recombinase TvrR (35). This locus is active in almost all pneumococci, but the complement of TRDs varies between isolates, increasing the diversity of HsdS proteins, and possible methylation patterns, across the species (28, 35). However, the locus is inactive in the R6 laboratory isolate that is typically used to study pneumococcal competence (28, 35, 36). Here, we characterised clinical isolates in which the tvr locus is intact, to understand how phase variation in SpnIV activity might contribute to phenotypic heterogeneity in clonally-derived populations.

Cell culture
Genotypes used in this study are described in Supplementary Table S1. Unless otherwise stated, encapsulated S. pneumoniae were cultured statically at 35 °C in 5% CO2. Culturing on solid media used Todd-Hewitt broth supplemented with 0.5% yeast extract and 1.5% agar (Sigma-Aldrich). Media were supplemented with antibiotics for selection of mutated genotypes: rifampicin (Fisher Scientific) at 4 µg ml⁻¹; kanamycin (Sigma-Aldrich) at 400 µg ml⁻¹; or chloramphenicol (Sigma-Aldrich) at 4 µg ml⁻¹. Phase contrast microscopy of colonies used a Leica DFC3000 G microscope.

Growth curves and phenotypic assays
To measure growth curves, 2 × 10⁴ cells from titrated frozen stocks were grown in mixed liquid media in 96-well microtiter plates at 35 °C in 5% CO2 for 20 h. Measurements of the optical density at 600 nm (OD600) were taken at 30 min intervals over 16 hours using a FLUOstar Omega microplate reader (BMG LABTECH). Three replicate wells were assayed for each tested genotype in each experiment. The R package growthcurver was used for the inference of carrying capacity, K, and replication rate, r (37). For measuring adhesion to abiotic surfaces, at the end of the growth curve incubation, the microtiter plate was submerged in water and dried for 10 min. Then 125 µl of a freshly-diluted 0.1% crystal violet solution (Scientific Laboratory Supplies) was added to each well, followed by incubation for 30 min at room temperature. Each well was then washed by repeatedly submerging the plate in water to remove excess crystal violet. The plate was incubated at room temperature in an inverted position for four hours. Subsequently, 125 µl of 30% acetic acid (Honeywell) was added to each well. Adherence was quantified as OD550 across replicate wells, measured by a FLUOstar Omega plate reader. The quantification of 3',5'-cAMP production is detailed in Text S1.
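The two growth parameters named above can be illustrated with a short sketch. growthcurver is documented as fitting the standard logistic equation, N(t) = K / (1 + ((K - N0) / N0) e^(-rt)), to optical density readings. The Python below is only an illustration of that model, not the analysis used in this study: the readings are synthetic, the parameter values are hypothetical, and the crude grid search over r (with K and N0 held fixed) stands in for the package's joint least-squares fit.

```python
import math

def logistic_od(t, k, n0, r):
    """Logistic growth curve: N(t) = K / (1 + ((K - N0) / N0) * exp(-r * t))."""
    return k / (1.0 + ((k - n0) / n0) * math.exp(-r * t))

def doubling_time(r):
    """Time for the population to double at its maximum rate r."""
    return math.log(2) / r

def fit_r(times, ods, k, n0, r_grid):
    """Pick the r on the grid that minimises squared error (K and N0 assumed known)."""
    def sse(r):
        return sum((logistic_od(t, k, n0, r) - od) ** 2 for t, od in zip(times, ods))
    return min(r_grid, key=sse)

# Synthetic readings every 30 min over 16 h (hypothetical parameter values).
times = [0.5 * i for i in range(33)]
true_k, true_n0, true_r = 0.8, 0.01, 0.9
ods = [logistic_od(t, true_k, true_n0, true_r) for t in times]

r_grid = [0.1 * i for i in range(1, 21)]  # candidate rates, 0.1 to 2.0 per hour
best_r = fit_r(times, ods, true_k, true_n0, r_grid)
print(best_r, doubling_time(best_r))
```

In this model K is the plateau the optical density approaches and r sets how quickly it gets there, so two genotypes can share a carrying capacity while differing in replication rate.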

Mutagenesis and transformation assays
To assay transformation efficiency, one milliliter of bacterial culture was collected at an OD600 between 0.15 and 0.25. Cells were then incubated for 2 hours at 35 °C with 5 µl of 500 mM CaCl2 (Sigma-Aldrich), 250 ng of competence stimulating peptide 1 (CSP-1; Cambridge Bioscience Ltd) and 100 ng of a purified PCR amplicon from the rpoB gene, containing a base substitution that conferred resistance to rifampicin (38). In experiments using carbon source supplements, these were added at a final concentration of 33 mM. After two hours of incubation at 35 °C, a volume of between 1 and 200 µl of the transformed culture was spread on an agar plate supplemented with rifampicin. Experiments screening for variation in transformation frequencies between large numbers of samples quantified the transformation efficiency as the number of resistant colony-forming units per 10 µl of sampled culture. Any differences identified by such assays were validated through more precise quantification of transformation frequencies. This involved estimation of the overall cell population through spreading 1 µl of a 10³-fold dilution of the same culture on a non-selective plate in parallel. Colonies were counted after 24 h of incubation at 35 °C in 5% CO2. This enabled estimation of transformation efficiency as the number of transformants per 10⁴ colony-forming units. Statistical analyses of transformation assays are detailed in Text S2.
Transformation was also employed to produce mutants using constructs containing selectable and counter-selectable markers, which were generated with the oligonucleotides listed in Supplementary Table S2, as detailed in Text S3.
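The arithmetic behind the transformants-per-10⁴-CFU estimate can be sketched as follows. This is a minimal illustration assuming both plates sample the same culture and that colony counts scale linearly with the plated volume and dilution. The function names and the example counts are hypothetical, and the statistical treatment in Text S2 is not reproduced.

```python
def cfu_per_ul(colonies, volume_plated_ul, dilution_factor=1.0):
    """Colony-forming units per microlitre of undiluted culture."""
    return colonies * dilution_factor / volume_plated_ul

def transformants_per_1e4_cfu(resistant_colonies, selective_volume_ul,
                              total_colonies, nonselective_volume_ul=1.0,
                              nonselective_dilution=1e3):
    """Transformation efficiency as transformants per 10^4 CFU.

    Defaults mirror the plating described in the text: 1 ul of a
    10^3-fold dilution spread on the non-selective plate.
    """
    transformants = cfu_per_ul(resistant_colonies, selective_volume_ul)
    total = cfu_per_ul(total_colonies, nonselective_volume_ul, nonselective_dilution)
    return 1e4 * transformants / total

# Hypothetical counts: 50 rifampicin-resistant colonies from 10 ul of culture,
# and 200 colonies from 1 ul of the 10^3-fold dilution.
efficiency = transformants_per_1e4_cfu(50, 10.0, 200)
print(efficiency)
```

Normalising to the total viable count separates a genuine change in transformability from a change in culture density, which the cruder per-10 µl screen cannot do.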

Preparation of RNA samples and quantitative PCR
Three replicate cultures of RMV7 tvr domi::Janus and RMV7 tvr rare::Janus were grown in 25 ml of mixed liquid media until they reached an OD600 of 0.15. A 5 ml sample of bacterial cells was collected and 50 µl of 100 ng ml⁻¹ CSP-1 was added to the remaining culture. Further 5 ml samples were taken from each culture 10 and 20 min post-CSP addition. Each sample was immediately treated with 10 ml RNAprotect (Qiagen) and incubated at room temperature for 5 min. Cells were then pelleted by centrifugation at 3,220 × g for 10 min. RNA was extracted from the washed pellets using the SV Total RNA Isolation System (Promega) according to the manufacturer's instructions. The extracted RNA was used for RNA sequencing or qRT-PCR.
All qRT-PCR experiments were conducted as described previously ( 35 ).Details of these experiments and analyses are provided in Text S4.

Generation and analysis of RNA-seq data
RNA samples were analysed using an Agilent Bioanalyser RNA Nano Chip (Agilent Technologies), and treated with the RiboZero® rRNA Removal Kit for Bacteria (Illumina) to deplete rRNA. The samples were then cleaned with Agencourt RNAClean Beads (Beckman Coulter). Sequencing libraries were generated with the NEBNext® Ultra II Directional Library Prep Kit for Illumina (New England BioLabs), modified to use oligonucleotide sequences appropriate for the sequencing pipelines of the Wellcome Sanger Institute. The library was amplified through nine PCR cycles using the Kapa HiFi HotStart Ready Mix (Roche) to generate sufficient material for sequencing. All eighteen samples were sequenced as a multiplexed library on a single lane of a HiSeq 4000 sequencing system (Illumina), generating 200 nt paired-end reads.
The set of genes used for expression analysis comprised the 2088 protein coding sequences annotated on the S. pneumoniae RMV7 domi genome (accession code OV904788), and the 81 non-coding RNAs predicted by infernal version 1.1.2 (39) using the Rfam database (40). RNA-seq reads were aligned to these sequences using kallisto version 0.46.2 (41), using default settings and 100 bootstraps. Differential gene expression analysis used sleuth version 0.30 (42). Wald tests were conducted to compare the pre-CSP samples for RMV7 tvr domi::Janus and RMV7 tvr rare::Janus, and to compare the 10 min and 20 min post-CSP samples for each genotype to the corresponding pre-CSP samples. Visualisation and plotting of data used the genoPlotR (43), circlize (44), cowplot (45), ggpubr (46) and tidyverse (47) packages. Subsequent bioinformatic and statistical analyses are detailed in Text S5.
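The sampling design and the set of comparisons described above can be enumerated in a few lines. The sketch below is illustrative bookkeeping only: the labels are hypothetical shorthand for the genotypes and time points, and it does not call kallisto or sleuth.

```python
from itertools import product

# Hypothetical shorthand labels for the two knock-in genotypes and three time points.
genotypes = ["tvr_domi::Janus", "tvr_rare::Janus"]
timepoints = ["pre", "10min", "20min"]
replicates = [1, 2, 3]

# One RNA sample per genotype, time point and biological replicate.
samples = [
    {"genotype": g, "time": t, "replicate": r}
    for g, t, r in product(genotypes, timepoints, replicates)
]

# Each comparison is a pair of (genotype, time) groups tested against each other.
comparisons = [((genotypes[0], "pre"), (genotypes[1], "pre"))]
for g in genotypes:
    for t in ("10min", "20min"):
        comparisons.append(((g, "pre"), (g, t)))

print(len(samples), len(comparisons))
for baseline, test in comparisons:
    print(baseline, "vs", test)
```

The design gives 18 samples and five pairwise tests: one between genotypes before CSP exposure, and two post-CSP contrasts within each genotype.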

Phase variants of S. pneumoniae RMV7 differ in their transformation efficiency
A previous screen for variation at the tvr locus identified a diverse panel of restriction-modification variants (RMVs) (35). Four RMV isolates underwent sufficiently rapid phase variation in culture to enable the isolation of pairs of genotypes that differed in the motif targeted by their SpnIV systems, which is determined by the hsdS gene nearest the 5' end of the tvr locus (35), but were isogenic across the rest of their genomes. The differences in the arrangement of the tvr loci were 'locked' in each variant through tvrR being knocked out, or disrupted, by a selectable and counter-selectable Janus cassette marker (35, 48). These pairs of otherwise-isogenic locked phase variants were screened for differences in their transformation efficiency (Figure 1A). In the RMV6 and RMV7 pairs, the variant in which the active tvr locus HsdS comprised the TRDs III-i (directing the SpnIV system to target the bipartite motif TGAN7TATC) was found to have a significantly higher transformation efficiency, following induction by exogenous CSP, than its counterpart. These genotypes both originate from the multidrug-resistant GPSC1 strain (49). The RMV6 variants were found to be distinguished by a mutation in dltD, which commonly occurs during in vitro culturing (50) and affects cell wall biology. By contrast, the serotype 19F RMV7 variants exhibited a ∼100-fold difference in their transformation rate (Figure 1A), despite having identical dlt operons, and were therefore characterised in greater detail.
This substantial difference reflected the RMV7 variant carrying the alternative form of the tvr locus, with an active HsdS comprising the TRDs III-iii (which directs the SpnIV system to target the motif TGAN7TCC; Figure 1B and Supplementary Figure S1), having an almost-undetectable transformation efficiency. Culturing the wild-type isolate (RMV7 wt) over successive days identified large changes in the relative frequency of these tvr variants, although the less transformable variant (RMV7 tvrR::Janus; henceforth, RMV7 domi, carrying tvr domi) was typically dominant in prevalence relative to the rarer, more transformable variant (RMV7 tvrR; henceforth, RMV7 rare, carrying tvr rare; Figure 1C). The unmodified RMV7 wt had a transformation efficiency similar to RMV7 domi, concordant with the relative proportions of the variants observed in vitro, whereas that of RMV7 rare was confirmed to be ∼50-fold higher (Figure 1D). This difference could not be explained by high spontaneous mutation rates, nor by changes in the speed at which competence for transformation was activated (Supplementary Figure S2). Therefore, RMV7 domi and RMV7 rare exhibited distinctive phenotypes that correlated with their tvr locus arrangements. RMV7 rare was also significantly more adhesive to an abiotic surface (Figure 1E), which can be considered a proxy for biofilm formation (51). These differences in both adhesion and transformation could be explained by RMV7 rare being enriched for transparent phase variants. However, microscopy found no clear difference in colony morphology between the variants (Supplementary Figure S3). An alternative explanation for the phenotypic differences could be mutations that occurred during genetic manipulation of the isolates (52). Alignment of the two variants' assemblies found them to be distinguished by seven non-synonymous single nucleotide polymorphisms and two premature stop codons outside the tvr locus, none of which were within genes known to directly affect the competence machinery (Supplementary Table S3). Nevertheless, we tested how transformation was affected by mutations in RMV7 rare that were absent from both the RMV7 domi and RMV7 wt sequences. Neither a non-synonymous change in phoB, encoding a phosphate-sensitive response regulator (53), nor disruption of the transporter gene pstS, affected by a premature stop codon in RMV7 rare, explained the contrasting transformation efficiencies of the variants (Supplementary Figure S4). Hence the differences between RMV7 variants could not be explained by point mutations or alterations in encapsulation.
To test whether the phenotypic differences were causatively associated with variation in the SpnIV RMS, the tvr locus of RMV7 wt was replaced with a chloramphenicol acetyltransferase (cat) resistance marker to generate RMV7 wt tvr::cat. The tvr loci of RMV7 domi and RMV7 tvr rare tvrR::Janus (both modified by a Janus cassette within tvrR; see Text S3) were separately introduced into RMV7 wt tvr::cat, generating the otherwise isogenic knock-in recombinants RMV7 tvr domi::Janus and RMV7 tvr rare::Janus, carrying the two different locked tvr loci (Supplementary Figure S5). An initial characterisation of these genotypes demonstrated RMV7 wt tvr::cat was substantially more transformable than RMV7 wt, suggesting methylation at tvr domi motifs caused the repression of competence induction (Supplementary Figure S5). The tvr domi::Janus and tvr rare::Janus mutants reproduced the phenotypic divergence between RMV7 domi and RMV7 rare in both biofilm formation and transformation (Supplementary Figure S5), providing further evidence that these differences were driven by epigenetic variation.
The ∼17-fold difference in transformation efficiency between the tvr domi::Janus and tvr rare::Janus variants was nevertheless smaller than that measured between RMV7 domi and RMV7 rare. Hence the transformability of these genotypes was assayed over five independent six-day passages, to test whether the difference between them would rise, in case any consequences of DNA methylation were slow to emerge. However, the observed disparity in the transformation efficiency of the tvr domi::Janus and tvr rare::Janus genotypes was stable (Figure 1F). This smaller difference may represent changes in the expression of the introduced tvr loci or the effect of mutations outside the tvr locus in RMV7 domi or RMV7 rare, or else suggest that the effect of methylation on gene expression may be indirectly mediated via effects on nucleoid organisation. Nevertheless, both naturally-isolated and genetically-engineered pairs of RMV7 tvr rare and tvr domi variants replicated a significant and reproducible difference in biofilm formation and transformation efficiency.

Epigenetic effects on the induction of competence genes
To understand how the tvr loci caused a difference in transformation, RNA-seq was used to quantify patterns of transcription in the recombinants RMV7 tvr domi::Janus and RMV7 tvr rare::Janus. Samples were taken pre-CSP, 10 min post-CSP, and 20 min post-CSP from each of three biological replicates (Figure 2). The 18 RNA samples were sequenced as 200 nt paired-end multiplexed libraries on a single Illumina HiSeq 4000 lane. After alignment to the RMV7 domi genome, analysis of the RNA-seq data found the fragment length distributions (Supplementary Figure S6) and gene expression densities (Supplementary Figure S7) were consistent across samples (Supplementary Table S4). Q-Q plots suggested a Benjamini-Hochberg corrected q value of 10⁻³ was an appropriate threshold for identifying significant transcriptional variation (Supplementary Figures S8 and S9). This identified 154 genes that significantly differed in their expression between the two genotypes prior to CSP exposure, or between the pre- and post-CSP samples from the same genotype (Figure 2; Supplementary Table S5).
Comparing the overall transcriptional patterns among datasets found the biggest separation distinguished the post-CSP RMV7 tvr rare::Janus transcriptomes from the others (Supplementary Figure S10), suggesting a major difference in the induction of the competence system between the variants. In RMV7 tvr rare::Janus, the early competence genes showed elevated expression 10 min post-CSP, including a >20-fold induction of comCDE (Supplementary Figure S11). The late competence genes exhibited more variable patterns of transcription (Supplementary Figure S12). Multiple nucleotide metabolism and transporter genes (purA, tadA, dut, ribF, adeQ; Supplementary Figure S13) were upregulated, and transcription of the competence-induced biofilm formation signal gene briC (54) rose >10-fold (Supplementary Figure S11). Transcription of a three-gene cluster encoding another transporter, induced by CSP (55) and antimicrobial peptides (56), rose >3-fold, and the cluster was consequently named pieABC (peptide-induced exporter; CDSs IONPJBJN 01324-6 in RMV7 domi, corresponding to SP 0785-787 in TIGR4; Supplementary Figure S13, Supplementary Table S5). By contrast, CSP significantly upregulated just five genes in RMV7 tvr domi::Janus, with 5.7-fold and 6.5-fold increases in the transcription of the quorum sensing genes comCDE and briC respectively (Supplementary Figure S11), and a 2.9-fold increase in transcription of the stress response gene csbD. There was no sign of late competence genes being activated, which requires the alternative sigma factor ComX. However, expression of the comX gene itself can be difficult to determine through RNA-seq (Supplementary Figure S11), owing to the presence of two near-identical paralogues in pneumococcal genomes (57).
As an independent test of these transcriptional differences, qRT-PCR experiments were undertaken on the original RMV7 domi and RMV7 rare variants, and the control genotypes RMV7 wt and RMV7 wt tvr::cat. The qRT-PCR data showed genotypes of both lower (RMV7 wt and RMV7 domi) and higher (RMV7 rare and RMV7 wt tvr::cat) transformation efficiency up-regulated the early competence genes comD and comX in response to CSP (Supplementary Figure S14). However, the late competence genes comEA and comYC were only significantly upregulated in the more transformable genotypes. Hence the difference in transformability between the RMV7 variants was a consequence of late competence gene activation being blocked in RMV7 domi through effects of tvr domi expression.
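The Benjamini-Hochberg correction behind the q value threshold can be sketched in a few lines. This is a generic implementation of the standard step-up procedure, offered as an illustration rather than as the code path used by sleuth. The p values in the example are invented.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values (q values), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonic q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues

# Invented p values for four genes.
pvals = [0.0001, 0.0004, 0.03, 0.5]
qvals = benjamini_hochberg(pvals)
significant = [q < 1e-3 for q in qvals]
print(qvals, significant)
```

Each q value is the smallest false discovery rate at which that gene would still be called significant, so a stringent cut-off such as 10⁻³ keeps the expected share of false positives among the reported genes very low.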

Pre-CSP expression differences associate elevated intracellular stress with tvr domi
This difference in competence induction was likely caused by the 53 genes that significantly differed in their expression between RMV7 tvr domi::Janus and tvr rare::Janus prior to CSP exposure (Figure 2; Supplementary Table S5). These did not include any cps locus genes, which were non-significantly more highly expressed in RMV7 tvr rare::Janus (Supplementary Figure S15), confirming that the elevated transformation efficiency of this genotype did not reflect an enrichment of transparent phase variants (27, 58). An analysis of the distances from protein coding sequence start codons to the nearest upstream methylation site found no general relationship between differential expression and proximal methylation for either tvr domi or tvr rare motifs (Supplementary Figures S1 and S16). Of the four cases of significantly differentially-expressed genes being close to variable methylation sites, the modification was only likely to modulate transcription initiation at the piuABCD operon (Supplementary Figure S17). This encodes an iron transporter, and was >5-fold more highly expressed in RMV7 rare (Figure 2B; Supplementary Figure S13). However, knocking out piuA did not reduce the transformation efficiency of RMV7 rare, suggesting this change was independent of those affecting competence (Supplementary Figure S18). Hence the differences in the pre-CSP transcriptomes are unlikely to be attributable to a small number of promoters that are strongly affected by direct modification, consistent with other genome-wide studies (27, 59). Instead, the differences likely represent the consequences of chromosome-level changes in DNA conformation or nucleoid interactions at particularly sensitive promoters (60). One notable aspect of the overall distribution of the SpnIV target motifs was that the tvr rare motifs were uniformly distributed, whereas the tvr domi motifs were enriched in one segment of the chromosome (Supplementary Figure S19). Such an uneven distribution could accentuate the effects of shifting patterns of modification.
Correspondingly, three transcriptional patterns suggested the tvr domi methylation pattern was associated with dysregulation and increased intracellular stress. The first was the activation of heat shock responses. Transcription of the chaperone gene prsA and the chaperone regulator gene hrcA was 3.4-fold and 4.1-fold higher in tvr domi::Janus, respectively. Correspondingly, the groES-groEL and grpE-dnaK-IONPJBJN 02152-dnaJ operons of the hrcA regulon were also more highly expressed in tvr domi::Janus, although the difference was only significant for IONPJBJN 02152 (Supplementary Figure S21; Supplementary Table S5). The accBC-yqhY-amaP-csbD cell wall stress operon (61) (IONPJBJN 01032-6) was also upregulated in RMV7 tvr domi::Janus (Supplementary Figure S20). These genes are regulated by MgrA, a transcription factor that was non-significantly more highly expressed in RMV7 tvr domi::Janus (Supplementary Figure S20). MgrA represses expression of the rlrA islet (62), encoding the type 1 pilus. Correspondingly, rlrA mRNA levels were approximately five-fold lower in tvr domi::Janus.
The second indicator of stress in RMV7 tvr domi::Janus was the 3.7-fold higher transcription of ciaRH, encoding a two-component system that enables cells to survive lysis-inducing conditions (63), and is known to inhibit competence (64). The third was the ∼1.5-fold higher transcription of a phage-related chromosomal island (PRCI; also known as a phage-inducible chromosomal island), integrated adjacent to dnaN (PRCI dnaN; Supplementary Figure S22; Supplementary Table S5). This is one of two PRCIs associated with GPSC1 (7, 28), the other being integrated near uvrA (Figure 2). The regulatory mechanisms of these elements are not thoroughly characterised (65), and in the absence of a helper prophage, it is unclear exactly what signal may have triggered this increased activity. Given integrative MGEs are likely under selection to reduce host cell transformability (17), the increased activity of PRCI dnaN could drive inhibition of the competence machinery. Hence PRCI dnaN and ciaRH were the primary candidates for causing the observed difference in transformability between the RMV7 variants.

ManLMN links competence induction to carbon source metabolism
The higher expression of the ciaRH genes in the poorly-transformable genotypes RMV7 domi and RMV7 wt, relative to the more transformable genotype RMV7 rare, was confirmed by qRT-PCR (Supplementary Figure S14). CiaRH binds at least 15 promoter sequences, five of which drive the expression of cia-dependent small RNAs (csRNAs) that suppress the induction of competence by inhibiting CSP production (66). Their expression is difficult to determine using conventional RNA-seq approaches due to their size (67), but they are unlikely to affect competence induced by exogenous CSP (68). The remaining ten drive the expression of protein coding sequences (CDSs) (66), which were more highly expressed in RMV7 tvr domi::Janus compared to RMV7 tvr rare::Janus pre-CSP (Supplementary Figure S23). These included the extracytoplasmic chaperone and serine protease HtrA (66, 69, 70), which can block competence induction at low CSP concentrations through degrading extracellular CSP (69, 71). In agreement with some previous studies, elimination of htrA further increased the transformability of RMV7 rare (69), but the same mutation had no significant effect in RMV7 wt (Supplementary Figure S24). This suggests HtrA inhibits the induction of competence, but is unlikely to explain much of the difference between these variants. Similarly, knock out of ciaRH itself had little effect in either genotype (Supplementary Figure S24). This concurred with ciaRH being expressed at similar levels in RMV7 domi and tvr::cat (Supplementary Figure S14), despite the substantial differences between the transformation efficiencies of these two genotypes. These results suggested the expression of ciaRH was correlated with, rather than causative of, the difference in transformation efficiency between the variants.
Among the CiaRH regulon, the most significant difference in expression between RMV7 tvr rare::Janus and RMV7 tvr domi::Janus was the ManLMN carbon source importer operon (Supplementary Table S5). The manLMN genes were more highly expressed in RMV7 rare, as CiaRH binds promoter sequences within the operon (36, 66, 72, 73) and acts as a repressor (66, 74). Disrupting the manLMN locus reduced the transformation efficiency of RMV7 rare by >5-fold, while having little effect in RMV7 wt (Figure 3A, B, Supplementary Figure S25). To test whether the observed transformation differences reflected a growth defect, manLMN::Janus mutants of both RMV7 wt and RMV7 rare were cultured in the same rich media (Supplementary Figure S26; Supplementary Table S6). RMV7 wt grew more slowly than RMV7 rare, consistent with the former suffering greater intracellular stress. However, removal of manLMN had little effect on growth in either variant, suggesting the transporter's effect on transformation was through regulation rather than proliferation. Hence the reduced transformation efficiency of RMV7 domi relative to RMV7 rare is likely a consequence of the repression of manLMN in the former.
ManLMN is a phosphotransferase system (PTS) transporter that can facilitate the import of glucose, mannose, galactose, fructose, aminoglucose and N-acetylglucosamine (GlcNAc) (75). Supplementation of liquid media with these carbon sources did not substantially affect growth profiles (Supplementary Figure S26), apart from the addition of glucose causing small ManLMN-dependent increases in the replication rate of both phase variants (Supplementary Figure S27). However, synchronising GlcNAc supplementation with competence induction increased transformation efficiency ∼10-fold in RMV7 wt (Figure 3B) and ∼2-fold in RMV7 rare (Figure 3A). In both variants, this effect was dependent upon manLMN, as was confirmed through disrupting, and then restoring, manL (Supplementary Figure S28). This potentiation of the induction of competence by GlcNAc was also observed in the other RMV isolates (Figure 3E). Therefore, ManLMN links carbon source availability to pneumococcal transformability.

N -acetylglucosamine activates competence through TfoX and YjbK
Competence in Vibrio cholerae is induced GlcNAc disaccharides ( 76 ), thought to be generated from degradation of chitin ( 77 ).This is mediated through the Transformation Factor X (TfoX) protein ( 78 , 79 ).An orthologue of this protein (TfoX Hflu , also called Sxy) is also central to induction of competence in Haemophilus influenzae ( 80 ) by 3',5'-cyclic AMP (cAMP) signalling ( 81 , 82 ).Intracellular concentrations of 3',5'-cAMP rise in many Proteobacteria when the primary glucose PTS transporter is inacti v e, as the accumulation of phosphorylated EIIA PTS subunits stimulates adenylate cy clase acti vity ( 83 ).A search was undertaken for analogues of either of these pathways in S. pneumoniae .
An orthologue of the N-terminal domain of TfoX Hflu was annotated, but not described, in S. pneumoniae ATCC 700669 (84). In RMV7, this gene (tfoX Spn; IONPJBJN 02097) is conserved in the same genomic location as in ATCC 700669, two genes upstream of the comEA-comEC competence operon (Supplementary Figure S29). The TfoX Spn amino acid sequence was predicted to form a four-strand beta sheet flanked by alpha helices (Supplementary Figure S30), resembling the N-terminal domain of gram-negative orthologues (Supplementary Figure S31). Disruption of tfoX Spn in RMV7 rare both decreased the pneumococcus' basal transformation efficiency in the absence of supplements (Supplementary Figure S25), and eliminated the GlcNAc-induced elevation in this rate (Figure 3A). Restoring tfoX rescued RMV7 rare's responsiveness to GlcNAc.
A gene encoding a candidate adenylate cyclase, yjbK, was also identified in RMV7 (IONPJBJN 01639). This protein is predicted to have a β-barrel structure (Supplementary Figure S32), as observed for orthologous enzymes synthesising 3',5'-cAMP (Supplementary Figure S33). The yjbK gene could be both disrupted, and restored, in RMV7 rare. Transformation assays with these genotypes demonstrated the loss of YjbK reduced the transformation efficiency of RMV7 rare both in the absence of supplements (Supplementary Figure S25), and following the addition of GlcNAc (Figure 3A). However, no 3',5'-cAMP signalling pathway is known in Firmicutes (85). Correspondingly, an ELISA demonstrated 3',5'-cAMP levels in S. pneumoniae were close to the lower detection threshold, far below those of Escherichia coli, and unaffected by yjbK disruption (Supplementary Figure S34). Additionally, exogenous 3',5'-cAMP had no effect on transformation efficiencies in any RMV7 genotypes (Supplementary Figure S35). Therefore, it is unlikely that YjbK's regulatory role is mediated through 3',5'-cAMP production.
To test whether the effects of GlcNAc were the consequence of it acting as a signal, or as a metabolic substrate, nagA was disrupted in RMV7 wt and RMV7 rare (Supplementary Figures S36, S37). The import of GlcNAc by ManLMN generates intracellular GlcNAc-6-phosphate, which can be used in cell wall synthesis, or converted to glucosamine-6-phosphate by NagA. Glucosamine-6-phosphate is also generated by the import of aminoglucose by ManLMN, and must be converted to the glycolytic substrate fructose-6-phosphate, as both GlcNAc-6-phosphate and glucosamine-6-phosphate are cytotoxic (86).
The nagA::Janus mutant in the faster-growing RMV7 rare variant exhibited a growth defect that was exacerbated by GlcNAc supplementation, demonstrating NagA was necessary for processing imported GlcNAc, thereby producing glucosamine-6-phosphate. Similarly, a nagA::Janus mutant in the more GlcNAc-responsive RMV7 wt variant could not be transformed in the presence of GlcNAc (Supplementary Figure S37). However, exogenous aminoglucose did not increase growth or transformation in these nagA− mutants, nor in any nagA+ genotypes (Figure 3). This confirmed the effect of GlcNAc on transformation was not caused by an increased intracellular concentration of glucosamine-6-phosphate resulting in faster growth or glycolysis. By contrast, RMV7 rare tfoX::Janus and yjbK::Janus mutants grew faster than RMV7 rare (Supplementary Figure S36), and this difference was enhanced when media were supplemented with GlcNAc (Supplementary Figure S37), demonstrating their reduced transformation efficiencies did not reflect a growth defect. This heightened replication rate occurred despite the disruption of tfoX and yjbK not affecting nagA transcription, and decreasing the expression of manL slightly. Therefore TfoX and YjbK appear to be regulators, rather than metabolic enzymes. Hence the differential effects of GlcNAc and aminoglucose on cells are likely to reflect signalling effects specific to GlcNAc involving TfoX, YjbK, and other proteins.

GlcNAc and ManLMN regulate pneumococcal physiology through multiple pathways
Disruption of manLMN, tfoX and yjbK in the laboratory genotype R6 was also found to reduce transformation efficiency, although detecting these effects required culturing in a low-sugar chemically-defined medium (see Methods; Supplementary Figure S38). However, disrupting yjbK and tfoX in RMV7 wt did not affect the GlcNAc-associated increase in transformation efficiency, although transformation was notably reduced in the yjbK::Janus mutant (Figure 3B). These results suggested ManLMN affected competence through at least two pathways, at least one of which was dependent upon TfoX and YjbK.
To test this proposed arrangement of pathways, genotypes were constructed by combining pairs of mutations in manLMN, tfoX and yjbK. Transformation assays demonstrated the double mutants in which manLMN was disrupted behaved similarly to the manLMN::Janus single mutant (Figure 3D). Disruption of both tfoX and yjbK did not cause a substantially greater effect than observed in either of the corresponding single mutants. This suggested TfoX and YjbK operated intracellularly within the same pathway (Figure 3D), and their activity depended on the import of molecules by ManLMN. This is consistent with the tfoX::Janus and yjbK::Janus mutants exhibiting similar growth phenotypes (Figure 3 and Supplementary Figures S36, S37), whereas ManLMN has a broader effect on both RMV7 variants as a key GlcNAc-responsive pleiotropic regulator.
To test whether differences in ManLMN activity may also explain the difference in biofilm formation between the RMV7 variants, adhesion of RMV7 rare and derived manLMN::Janus mutants to an abiotic surface was quantified in the presence of different carbon sources (Figure 3C). Although disruption of manLMN resulted in changes to biofilm thicknesses in response to different carbon source supplements, there was little difference in unsupplemented media. Instead, we hypothesised that the difference was caused by the type 1 pilus, as the rlrA pilus islet was more highly expressed in RMV7 tvr rare::Janus (Figure 2G), which replicated the thicker biofilm phenotype of RMV7 rare (Figure 1E). Disruption of the pilus structural genes (rrgABC), or their activator gene (rlrA), reduced the surface adherence of RMV7 rare to that of RMV7 wt. However, the same mutation in the RMV7 wt background had little effect (Supplementary Figure S39). Restoring the pilus in RMV7 rare rescued the biofilm thickness phenotype (Figure 3C). Hence multiple regulatory pathways underlie the phenotypic differences between the phase variants.

Mobile element activation represses transformation by increasing intracellular stress
As additional regulatory pathways were likely to affect competence induction, we tested the hypothesis that the increased activity of PRCI dnaN in RMV7 tvr domi::Janus may also inhibit the competence of the host cell, in order to prevent the MGE being deleted through homologous recombination (17). The entire element was removed, either with (RMV7 wt PRCI dnaN+att::Janus) or without (RMV7 wt PRCI dnaN::Janus) the flanking att sites. In both cases, a ∼5-fold increase in transformation rates was observed (Figure 4A). This implied the PRCI inhibited the activation of the competence system. To test whether this was caused by a specific locus within the MGE, four mutations were generated affecting the PRCI: one removing the regulatory genes; one removing the regulatory genes and IONPJBJN 00496, a gene that encoded a protein similar to DNA damage-inducible protein D (DinD), which inhibits RecA activity in E. coli (87); one removing the replication genes; and one removing the integration, regulatory and replication genes. However, none of these mutations had as large an effect on transformation rates as the elimination of the entire element (Figure 4A-B). This suggested the inhibition of competence induction was not the consequence of a single gene product, but instead the activity of the MGE itself.
A qRT-PCR assay was employed to test whether the deletion of PRCI dnaN could affect transformation through disrupting the expression of other loci. Neither ciaR nor manL expression was altered when the PRCI was removed (Figure 4C), suggesting an alternative pathway was involved. As PRCI dnaN caused a growth defect in RMV7 wt (Supplementary Figure S40), it was hypothesised that the element's activity might drive the higher expression of the stress response proteins in tvr domi::Janus (Supplementary Figure S21). Correspondingly, expression of multiple chaperone genes decreased after the deletion of PRCI dnaN+att. Transcript levels of the chaperone regulator HrcA approximately halved after removal of the PRCI, mirroring its approximately four-fold lower pre-CSP expression in RMV7 tvr rare::Janus. The mRNA levels of other genes within the HrcA regulon (e.g. dnaK, grpE, groEL, groES) also generally fell, as quantified by modelling of the qRT-PCR data (Figure 4D), but these reductions were less consistent than that of hrcA. This again replicated the lower pre-CSP expression of these genes in RMV7 tvr rare::Janus, albeit the variation in transcript levels meant the differences were generally not significant (Supplementary Figure S21). This likely reflects the HrcA-regulated locus being only one of multiple promoters driving transcription of these genes (36, 88). Both ClpE and ClpP showed similarly elevated but variable patterns of expression in RMV7 tvr domi::Janus in the RNA-seq data (Supplementary Figure S21), despite not being part of the HrcA regulon, and were confirmed to be more highly expressed in RMV7 wt than RMV7 rare by qRT-PCR (Supplementary Figure S41). While the removal of PRCI dnaN+att did not affect clpE expression, clpP transcript levels fell by a similar amount to those of hrcA following PRCI dnaN+att deletion (Figure 4D). Hence HrcA and ClpP appeared to be key mediators of the stress response to intracellular MGE activity.
The ClpP protease is known to inhibit the induction of competence (89), suggesting the increased expression of this protein may link PRCI activation with reducing transformation. Correspondingly, disrupting clpP significantly increased the transformation efficiency of RMV7 wt (Figure 4B). Removing PRCI dnaN in this clpP− background did not elevate the transformation efficiency further, consistent with the effects of mobile element activation being mediated through this protease. Disruption of hrcA also caused a significant rise in the transformation efficiency of RMV7 wt, and this elevation was slightly larger in the hrcA::cat PRCI dnaN+att::Janus double mutant (Figure 4B). This suggests the inhibition of transformation driven by PRCI dnaN primarily reflected its stimulation of increased ClpP activity, with HrcA independently regulating the competent state. This is consistent with the hrcA::Janus clpP::cat double mutant exhibiting a stronger growth defect than either single mutant, indicating the proteins have non-redundant functions (Supplementary Figure S41). Hence the increased intracellular stress driven by MGE activity represses competence through at least one chaperone-mediated pathway.

Activation and repression of competence by the chaperone regulator HrcA
The lower activity of PRCI dnaN in RMV7 rare suggested that ClpP would be less active in this variant. Correspondingly, disruption of clpP caused a smaller growth defect in RMV7 rare (Supplementary Figure S42), and only a three-fold increase in transformation efficiency, as compared to the >100-fold increase observed in RMV7 wt (Figure 5A). Yet transformation efficiency decreased in RMV7 rare following the disruption of hrcA, contrasting with the same mutation reproducibly causing increased transformation efficiency in RMV7 wt (Figures 4B, 5A). Furthermore, the RMV7 rare hrcA::Janus clpP::cat double mutant also exhibited a reduced transformation efficiency, suggesting this effect of HrcA dominated that of ClpP in this phase variant.
HrcA is unusual in having two conformations (90, 91), only one of which binds the CIRCE DNA motif, enabling autoregulation through repressing the chaperone-encoding gene cluster that includes hrcA (92). In S. pneumoniae, HrcA binding of CIRCE is reduced at elevated temperatures, relieving its repression of the dnaK and groEL operons, enabling a heat shock response (93). By contrast, Ca2+ ions facilitate HrcA-CIRCE motif binding, inhibiting chaperone expression (91). Both low Ca2+ concentrations and extreme temperatures inhibit the induction of competence (94). Hence the divergent effects in the phase variants could reflect the two conformations of HrcA having different effects on the regulation of transformation.
The DNA-binding conformation of HrcA was active in RMV7 rare, as dnaK and groEL expression was increased in the hrcA::Janus genotype (Supplementary Figure S43). There was no change in transcription of clpP, which is outside the HrcA regulon (Supplementary Figure S41). Increasing the proportion of the DNA-binding conformation through supplementation with CaCl2 increased the transformation efficiency of RMV7 rare in a dose-dependent manner at the standard culturing temperature of 35 °C (Figure 5B). This response was lost in an hrcA− genotype, and regained when hrcA was restored (Figure 5B). These effects could be reproduced following gene disruption and restoration in S. pneumoniae R6 (Supplementary Figure S44). Hence HrcA aids the activation of transformation when adopting the DNA-binding conformation facilitated by Ca2+ ions.
As Ca2+ increases HrcA's DNA binding ability, the regulation of transformation seemed likely to occur through altering transcription. Quantifying the expression of hrcA post-CSP with and without CaCl2 supplementation confirmed the ion concentrations added altered HrcA's autorepressive activity (Figure 5C). Expression of the regulon representative dnaK also decreased, albeit only after a 30-60 min lag (Supplementary Figure S45). This is consistent with dnaK expression not being entirely controlled by Ca2+-sensitive regulation by HrcA (88), corresponding with the effects of PRCI dnaN on the HrcA regulon (Figure 4C). In contrast to growth in unsupplemented media, the Ca2+-induced post-CSP reduction in hrcA transcription was associated with slightly increased clpP expression, as a potential compensatory mechanism (Supplementary Figure S41). However, neither the early competence gene comX, nor the late competence gene comEA, showed a decreased response to CSP in the hrcA::Janus mutant (Figure 5D). This mirrors their insensitivity to Ca2+ supplementation (Supplementary Figure S45). In contrast, strong suppression of comEA transcription was evident in the manLMN::Janus mutant. This suggests HrcA does not affect the induction of competence through the same mechanism as ManLMN.
As both ClpP and HrcA have roles in thermotolerance, the transformation efficiencies of RMV7 rare and RMV7 wt hrcA− and clpP− mutants were compared at 40 °C to test whether these proteins caused the reduction in transformation efficiency associated with elevated temperatures (Figure 5A). This heat shock caused a growth defect (Supplementary Figure S42), and decreased transformation efficiency, in both RMV7 rare and RMV7 wt. The disruption of clpP had no significant effect on transformation efficiency in either variant under these conditions, whereas the disruption of hrcA significantly increased transformation efficiency in both phase variants after the heat shock (Figure 5A). Restoration of hrcA reversed the phenotypic change caused by the gene disruption (Supplementary Figure S46). This is consistent with HrcA repressing the induction of competence when intracellular stresses shift the protein away from its DNA-binding conformation. This could also explain the phenotypes of the hrcA− clpP− double mutants. The more stressed RMV7 wt hrcA::Janus clpP::cat genotype was not transformable, as it exhibited a severe growth defect. By contrast, the RMV7 rare hrcA::Janus clpP::cat double mutant transformed and grew at similar rates to the hrcA::Janus single mutant, suggesting a lower dependence on ClpP in the absence of a highly-active PRCI (Supplementary Figure S41). Hence HrcA is a pleiotropic regulator that can activate competence in healthy cells, but represses it in response to stress.

Independent pathways limit the competent cell subpopulation
To test whether ManLMN and HrcA separately contributed to the difference between the variants, the transformation efficiency of the RMV7 rare hrcA::Janus manLMN::cat double mutant was compared with that of the progenitor genotype, and the corresponding single mutants (Figure 5E). This demonstrated an approximately five-fold decrease in transformation for each single mutant, and a ∼25-fold reduction for the double mutant. This is consistent with HrcA and ManLMN both regulating competence independently.
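The arithmetic behind this inference is that independent effects are expected to multiply. The following minimal Python sketch illustrates the comparison using only the approximate fold changes quoted above; the function name and the two-fold tolerance are illustrative assumptions, and this is not the statistical analysis used in the study.

```python
import math


def expected_double_fold(fold_a: float, fold_b: float) -> float:
    """Fold reduction expected for a double mutant if the two
    disruptions act independently (i.e. multiplicatively)."""
    return fold_a * fold_b


# Approximate fold reductions quoted in the text (not raw data).
fold_hrcA = 5.0
fold_manLMN = 5.0
observed_double = 25.0

expected = expected_double_fold(fold_hrcA, fold_manLMN)

# Compare observed and expected on a log scale; the two-fold tolerance
# is an arbitrary illustrative choice, not a formal test.
log2_ratio = math.log2(observed_double / expected)
consistent_with_independence = abs(log2_ratio) < 1.0

print(expected, consistent_with_independence)
```

A double-mutant effect much smaller than the product would instead suggest the two proteins act within a shared pathway, as inferred above for TfoX and YjbK.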
Experiments with two unlinked resistance markers were used to test whether the difference in competence between variants reflected a uniform reduction in DNA import across cells, or an alteration in the fraction of cells in which competence was induced. The excess of double mutants, relative to the expected frequency calculated from the single mutants (Supplementary Figure S47), demonstrated the latter explanation accounted for the distinct behaviours of the variants. Most bacteria remained recalcitrant to CSP in both variants: with GlcNAc supplementation, it was estimated that 1-2% of the RMV7 rare population became competent for transformation, whereas 0.5% of the RMV7 wt population did under the same conditions (Supplementary Figure S47). Hence HrcA and ManLMN independently changed the probability of an individual cell entering the competent state.
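The logic of the two-marker experiment can be illustrated with a simple model, sketched below in Python. It assumes a fraction f of cells is competent, and that competent cells acquire each unlinked marker independently; under these assumptions the excess of double transformants over the product of the single-marker frequencies equals 1/f. The function name and all numerical values are hypothetical, chosen only to illustrate the calculation, and the estimation procedure actually used in the study may differ.

```python
import math


def estimate_competent_fraction(freq_a: float, freq_b: float,
                                freq_ab: float) -> float:
    """Estimate the competent fraction f from population-wide frequencies
    of single (freq_a, freq_b) and double (freq_ab) transformants.

    Model (an assumption of this sketch): freq_a = f*a, freq_b = f*b and
    freq_ab = f*a*b, where a and b are per-competent-cell acquisition
    probabilities. Hence f = freq_a * freq_b / freq_ab.
    """
    if freq_ab <= 0:
        raise ValueError("double transformant frequency must be positive")
    return freq_a * freq_b / freq_ab


# Hypothetical frequencies (transformants per cfu), not data from the study.
freq_a, freq_b, freq_ab = 2e-4, 3e-4, 6e-6

# Expected double frequency if all cells were equally competent (f = 1).
expected_if_uniform = freq_a * freq_b
excess = freq_ab / expected_if_uniform
f_hat = estimate_competent_fraction(freq_a, freq_b, freq_ab)

print(f"excess of doubles: {excess:.0f}-fold; competent fraction: {f_hat:.3f}")
```

If competence were induced uniformly across the population, the observed double transformant frequency would match the product of the single frequencies and the estimate would approach one.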

DISCUSSION
This analysis highlights important challenges in the functional genomic characterisation of clinical pathogen isolates. The RMV7 domi and RMV7 rare phase variants differed by few polymorphisms, and have the same gene content, with the key genetic differences corresponding to reversible DNA excision-reintegration dynamics and alterations to methylation. Yet they exhibited distinct phenotypes that affected the interpretation of multiple experiments. These differences were attributable to the methylation patterns driven by the SpnIV system, as the transformation efficiency of RMV7 wt increased when the tvr locus was removed (Supplementary Figure S5), and fell when the tvr domi::Janus locus was reinstated (Figure 1F, Supplementary Figure S5). Correspondingly, the phenotype could not be associated with mutations outside of the tvr locus. Yet the mechanism linking the epigenetic cause with the phenotypic consequences was difficult to establish, as the effects of DNA methylation were not primarily manifested at genes with modified bases in their promoters. This is consistent with the effects of SpnIII methylation at the cps locus, despite the lack of nearby modification sites (27, 30), and genome-wide analyses of the effects of methylation in other species (59). Instead, it is likely that affected genes have an intrinsic sensitivity to intracellular perturbations. For instance, the contribution of the type 1 pilus to biofilm formation was only detectable in RMV7 rare (Figure 3C, Supplementary Figure S39), while a previous study found the effect of the pilus on the same phenotype differed between a wild-type bacterium and an unencapsulated derivative (95). Hence the substantial impact of methylation variation on competence induction suggests a sensitivity to small genetic, epigenetic and physiological changes that likely underlies its heterogeneity across populations (9, 10, 96), and over the history of individual strains (8, 97).
This susceptibility to variation is likely symptomatic of the many pathways that regulate this phenotype (Figure 6). One of the important signals identified in this analysis was the availability of GlcNAc, likely the most abundant non-glucose carbon source in the nasopharyngeal mucosa, reaching concentrations similar to the supplements in this work (98). GlcNAc can be liberated from host mucins by pneumococcal glycosylases (99), and used as a carbon and nitrogen source either for growth or metabolism, making it a highly informative signal of nutrient availability and cell physiology (100). Hence GlcNAc-6-phosphate is a regulatory molecule recognised by proteins in some species (101), including V. cholerae (102), in which GlcNAc also modulates the induction of competence by a quorum sensing signal (Figure 6). This analysis identified similarities with components of the V. cholerae GlcNAc-signalling pathways in pneumococci, including TfoX, found in many bacterial phyla (Supplementary Figure S48), and YjbK, which belongs to a recently-defined subset of CYTH proteins (103, 104) of unknown function in gram-positive bacteria (85, 105). Despite their highly-conserved nature, the corresponding genes were not substantially upregulated by CSP (Supplementary Figure S49). Hence these proteins are likely to regulate multiple aspects of pneumococcal physiology, rather than being specific regulators of transformation.
The effects of TfoX and YjbK were dependent on the primary glucose transporter and central metabolic regulator ManLMN (106). As ManLMN is the only effective route by which GlcNAc can be imported (75), signalling by this molecule is limited by the competence regulator CiaRH, but not subject to carbon catabolite repression. Orthologues of ManLMN serve as the main glucose transporter across many Firmicutes, including other streptococci, Lactococcus lactis and Listeria monocytogenes (75), and the transporter has been associated with regulation of biofilm formation and transformability in Streptococcus mutans (107), suggesting its signalling role is likely to be common among Firmicutes. Hence the heterogeneity of pneumococcal competence induction partly reflects highly-conserved metabolic signalling networks intervening in competence-specific pathways.
Figure 6. [...] (129). In each, competence is regulated by a quorum-sensing system: CSP in S. pneumoniae, and the cholera autoinducer 1 (CAI-1) in V. cholerae. The production of CSP is known to be inhibited by the non-coding csRNAs, and the HtrA protease degrades the signal. Similarly, the srn206 non-coding RNA represses the ComD receptor of CSP (130). CAI-1 operates through inhibiting LuxO, thereby activating the HapR protein, which indicates a high-cell density environment. HapR activates competence through the Quorum-Sensing and TfoX-dependent Regulator (QstR). This regulator also senses the activation of TfoX in response to GlcNAc being detected by the transmembrane regulator TfoS, via the TfoR small RNA. TfoX activity is also regulated by the Catabolite Regulatory Protein (CRP), which is activated by 3',5'-cAMP, generated by the CyaA adenylate cyclase under carbon source starvation conditions. Hence there are parallels with the TfoX orthologue, and adenylate cyclase-like protein YjbK, responding to GlcNAc in S. pneumoniae RMV7. GlcNAc also appears to promote competence through a TfoX/YjbK-independent route, based on the behaviour of RMV7 wt. In the pneumococcus, competence is also regulated by the chaperones ClpP and HrcA. ClpP represses the induction of competence in response to stresses, such as MGE replication, and has been previously shown to degrade ComX (131). One of HrcA's two conformations appears to have a similar effect, repressing competence in response to heat shock, albeit likely through a different mechanism. The other active conformation of HrcA, denoted HrcA*, appears to activate competence in response to Ca2+, again through an unknown mechanism.
HrcA is another widely-conserved key regulator of pneumococcal physiology that this analysis found to regulate competence induction. This protein is sensitive to physiologically-relevant concentrations of CaCl2 (108, 109), the most common ionic compound in the nasopharyngeal mucosa (98). Hence rather than Ca2+ aiding the translocation of DNA molecules across the plasma membrane, as suggested previously (110), HrcA appears to mediate a rare example of Ca2+ signalling in bacteria (91, 111). Another unusual aspect of HrcA's activity is that its two conformations appear to have opposing effects on competence activation, enabling the protein to modulate competence induction through integrating information on intracellular stress and extracellular ion concentrations. In RMV7, this resulted in the regulator's effect depending on the epigenetic context of the cell. The mechanism by which this was achieved is not clear. The lack of a detectable change in competence gene expression in an hrcA− genotype (Figure 5D) contrasted with the independent effects of ManLMN, which limited the induction of late competence genes. Hence general regulators of cell biology affect multiple steps of the competence regulatory cascade (Figure 6).
The structure of this regulatory network can help explain the paradox of two common, but contrasting, aspects of competence regulation (112): quorum sensing, which drives coordinated responses, and bet hedging, which underlies population-wide heterogeneity. In isolated cells, bet hedging could result from intrinsic noise, the variation in gene activity reflecting the inherent stochasticity of transcription and translation (113). Yet competent pneumococci increase their production of CSP, propagating induction to neighbouring cells (114), tending to homogenise the population-wide response. Maintaining heterogeneity therefore requires that cells within a clonally-related population, adapted to the stable niche of the nasopharynx, stochastically differ in their susceptibility to the quorum sensing signal.
The examples of HrcA and ManLMN demonstrate how this is achieved by pneumococci. Firstly, both leverage variation in cell clusters' microenvironments and intracellular physiology (113) as sources of extrinsic noise (115). Secondly, both mechanisms act on steps of the regulatory cascade that do not affect CSP production, meaning their effects occur independently within cells, and are not propagated intercellularly. Thirdly, both act on independent steps of the cascade, rather than being integrated into a common mechanism, enabling them to have uncorrelated, independent effects on competence induction. Fourthly, although temperature, and the availability of Ca2+ and GlcNAc, substantially altered the probability of competence induction, none alone had a large enough effect to risk the entire population becoming competent, maintaining heterogeneity in the population. Hence bet hedging can emerge as an apparently random output of combining multiple noisy signals, with the phase-variable SpnIII and SpnIV restriction-modification systems potentiating such variation (27, 68). The function of CSP is therefore not to homogenise the population, but to coordinate the induction of this transient state in a subset of pneumococci.
This complex regulation in S. pneumoniae is similar to that in Bacillus subtilis (116), as well as Vibrio cholerae (117) and some other gram-negative bacteria (118). These distantly-related species all have strongly-inducing quorum sensing signals that rapidly induce a transient competence state in a subset of bacteria, with coordinated release of DNA from conspecific cells through fratricide or cannibalism (119–121). In each case, the intercellular signalling is modulated by multiple extracellular stimuli, although the signals themselves vary between the bacteria, likely driven by their divergent ecologies. Hence V. cholerae responds to chitin (77), while B. subtilis induces competence under starvation conditions (120), whereas this work demonstrates that pneumococcal competence is favoured in healthy bacteria, replete with host-derived nitrogen, carbon and ion sources. Instead, it is the complex structure of the regulatory network that modulates responses to quorum sensing that is shared. Hence some naturally transformable bacteria appear to have convergently evolved noisy regulatory systems that stochastically segregate populations into donors and recipients, thereby enabling the efficient transfer of DNA during a transient period of competence.
Other naturally transformable bacteria are either constitutively competent, in the case of some Neisseria species (122), or do not employ quorum sensing, as appears to be the case for H. influenzae (123). However, the import of DNA is constrained to molecules containing DNA uptake sequences (DUSs) in these bacteria (122, 124). Therefore it has been proposed that the purpose of the transient nature of competence induction by quorum sensing is to synchronise the release and acquisition of DNA from conspecific bacteria, thereby serving as an alternative mechanism to DUSs for ensuring imported genetic material comes from close relatives (125). Hence it is highly unlikely the competence system primarily functions to acquire nucleic acids as a source of nutrients (126–128), as both common types of competence system regulation limit the import of DNA to sequences that are sufficiently closely-related to be integrated through homologous recombination. Rather, in species regulating competence using quorum sensing, multiple signals are likely used to ensure both the coordination of induction, and the emergence of population-level heterogeneity. Hence the variable nature of species-wide transformation efficiency represents the delicate balance between chaos and order necessary for the synchronised bet hedging that characterises competence in many bacterial species.

DATA AVAILABILITY
The genome sequence and annotation of S. pneumoniae RMV7 domi is available from the European Nucleotide Archive (ENA) with the accession code OV904788. The RNA-seq data are available from the ENA with the accession codes listed in Supplementary Table S4. The expression values and statistical tests calculated for all analysed genes in the RNA-seq analysis are available from FigShare, alongside the raw gel images, micrographs, and the results of qRT-PCR and microbiological experiments (https://figshare.com/projects/Diverse_regulatory_pathways_modulate_bet_hedging_of_competence_induction_in_epigenetically-differentiated_phase_variants_of_Streptococcus_pneumoniae/171060), as detailed in Supplementary Table S7.

SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.

Figure 1. Phenotypic differences between locked tvr variants. (A) Violin plot showing the transformation efficiency of four pairs of tvr locus variants constructed from isolates RMV5, RMV6, RMV7 and RMV8. Each individual point represents an independent transformation experiment. The horizontal line within each violin shows the median for each genotype. The brackets indicate the statistical significance of the comparison between variants from the same isolate background. (B) Schematic of the tvr loci from RMV7wt, RMV7domi and RMV7rare to show the genes encoding the methylase (hsdM), endonuclease (hsdR), regulatory system (tvrAT) and recombinase (tvrR). The variants differ in their active hsdS genes, upstream of tvrATR. The black arrows indicate the binding sites of a forward primer, in hsdM, and reverse primers, in hsdS fragments. (C) Line graph showing the ratio of RMV7domi to RMV7rare loci in eight-day passages of RMV7wt. (D) Violin plot comparing the transformation efficiencies of RMV7rare, RMV7domi and RMV7wt. Each point represents an independent experiment in which the number of transformants, and number of overall colony-forming units (cfu), were calculated. The horizontal line within each violin shows the median for each genotype. Both mutants were compared with RMV7wt; the horizontal bracket shows a significant difference. (E) Violin plot showing the adhesion of variants to an abiotic surface, as quantified by the optical density at 550 nm. (F) Violin plots showing the transformation efficiency of knock-in mutants during a passage experiment. The tvrdomi::Janus and RMV7 tvrrare::Janus loci were each introduced into an RMV7 tvr::cat background (Supplementary Figure S5). This pair were separately passaged in liquid cultures over six days in five independent replicates. The number of transformants observed from three transformation assays, conducted each day for both variants, is shown by the individual points' shapes and colours. The violin plots summarise the median and distribution of these values. The brackets indicate the statistical significance of the comparison between variants from the same day of the passage. Across all panels, significance was assessed using two-tailed Wilcoxon rank sum tests, and was coded as: P < 0.05, *; P < 0.01, **; P < 10⁻³, ***; P < 10⁻⁴, ****. All P values were subject to a Holm-Bonferroni correction within each panel.
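The significance coding used across the panels can be illustrated with a short Python sketch. It is not the authors' analysis code: the function names and the example P values are hypothetical, and it assumes that the star thresholds are applied to the Holm-Bonferroni adjusted P values from the tests within a single panel.

```python
def holm_bonferroni(p_values):
    """Return Holm-Bonferroni adjusted P values, in the input order.

    The smallest raw P value is multiplied by m, the next by m - 1,
    and so on; a running maximum keeps the adjusted values monotone,
    and each value is capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def significance_code(p):
    """Map an adjusted P value to the star coding used in the legends."""
    if p < 1e-4:
        return "****"
    if p < 1e-3:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# Hypothetical raw P values from three comparisons in one panel.
raw = [0.01, 0.04, 0.03]
adjusted = holm_bonferroni(raw)
codes = [significance_code(p) for p in adjusted]
```

In this invented example only the first comparison remains significant after correction, which shows why a bracket can be absent from a panel even when the uncorrected P value is below 0.05.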

Figure 2. Genes exhibiting significant differences in transcription between RNA-seq samples. (A) Design of the RNA-seq experiment. (B-F) Volcano plots showing the variation in gene expression between RNA-seq samples. The horizontal axis shows the natural logarithm of the fold difference in transcript levels between the genotypes, β. The vertical axis shows the negative base 10 logarithm of the q value, corresponding to a Benjamini-Hochberg corrected P value. Points are coloured red where this value exceeds that corresponding to the false discovery rate threshold of 10⁻³. The panels correspond to (B) the differences between the pre-CSP samples of tvrdomi::Janus and tvrrare::Janus; (C-D) the differences between the pre-CSP and 10 minute post-CSP samples for (C) tvrdomi::Janus and (D) tvrrare::Janus; and (E-F) the differences between the pre-CSP and 20 minute post-CSP samples for (E) tvrdomi::Janus and (F) tvrrare::Janus. The significant difference in tvrR expression between the tvrdomi::Janus and tvrrare::Janus mutants is an artefact of the different constructs used to disrupt tvrR in these two genotypes (see Text S3). (G) Chromosomal distribution of differentially-expressed genes. The outer ring shows the annotation of RMV7domi (accession code OV904788). Protein coding sequences are represented as black boxes, with the vertical positioning within the ring indicating the strand of the genome on which they are encoded. The next ring inwards shows significant pre-CSP differences in transcription: green genes were more highly expressed in RMV7domi, and blue genes were more highly expressed in RMV7rare. The next ring inwards shows significant changes in gene expression 10 min post-CSP in RMV7domi: pink genes were upregulated, and purple genes were downregulated. The third ring inwards shows significant changes in gene expression 10 min post-CSP in RMV7rare: red genes were upregulated, and orange genes were downregulated. The two inner rings repeat this representation for changes in gene expression 20 min post-CSP.
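The relationship between the q values and the volcano plot axes can be sketched in Python as follows. This is an illustration only, not the pipeline used for the RNA-seq analysis: the function names and input values are hypothetical, β and the raw P values are assumed to be already available for each gene, and treating the threshold as a strict inequality (q < 10⁻³) is an assumption.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q values, in the input order.

    Each raw P value is scaled by m / rank (rank 1 = smallest), and a
    running minimum taken from the largest P value downwards keeps the
    q values monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    for rank in range(m - 1, -1, -1):
        idx = order[rank]
        candidate = p_values[idx] * m / (rank + 1)
        running_min = min(running_min, candidate)
        q_values[idx] = running_min
    return q_values


def volcano_point(beta, q, fdr_threshold=1e-3):
    """Return (x, y, highlighted) for one gene on a volcano plot.

    x is beta, the natural logarithm of the fold difference;
    y is -log10(q); highlighted is True when q is below the threshold.
    """
    return beta, -math.log10(q), q < fdr_threshold


# Hypothetical raw P values for four genes.
q = benjamini_hochberg([0.001, 0.008, 0.039, 0.041])

# A gene with beta = 1.2 and q = 1e-5 sits at y = 5 and is highlighted.
x, y, highlighted = volcano_point(1.2, 1e-5)
```

A threshold of 10⁻³ on q is equivalent to a height of 3 on the vertical axis, so highlighted points are those lying above that horizontal level.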

Figure 3. The effect of carbon source availability on transformation efficiency. (A) Violin plots showing the transformation efficiency of RMV7rare relative to mutant derivatives. The counter-selectable Janus cassette was used to disrupt the manLMN, tfoX and yjbK genes. This enabled the restoration of the tfoX and yjbK genes (similar data for manL are shown in Supplementary Figure S28). Each genotype was transformed in unsupplemented media, and in the presence of one of six carbon sources, as indicated by the plot colour (see key). Each point represents an independent experiment, and the horizontal line within each violin plot shows the median for each combination of recipient cell genotype and carbon source. For each genotype, two-tailed Wilcoxon rank-sum tests were used to test for evidence of changes in transformation efficiency caused by each carbon source, relative to the unsupplemented media. Significant differences are indicated by the black brackets at the top of the panel. (B) Violin plots showing the transformation efficiency of RMV7wt, relative to mutant derivatives, in the presence of different carbon sources. Data are displayed as in panel A. (C) Violin plots quantifying the effects of carbon source supplementation, and disruption of manLMN and the rlrA pilus islet, on adhesion of bacteria to an abiotic surface. The density of the biofilm formed following growth in GlcNAc-supplemented media was used as the comparator for two-tailed Wilcoxon rank sum tests, as all carbon source supplements increased biofilm formation relative to unsupplemented media. (D) Violin plots showing the transformation efficiency of double mutants constructed in the RMV7rare genotype. For comparison with panel A, the horizontal lines show the median transformation efficiencies of the corresponding single mutants in unsupplemented and GlcNAc-supplemented media: manLMN::Janus as solid lines; tfoX::Janus as dotted lines, and yjbK::Janus as dashed lines. (E) Violin plots showing the effect of GlcNAc supplementation on the transformation efficiency of tvr variants of the RMV5, RMV6 and RMV8 isolates. Across all panels, significance was coded as: P < 0.05, *; P < 0.01, **; P < 10⁻³, ***; P < 10⁻⁴, ****. All P values were subject to a Holm-Bonferroni correction within each panel.

Figure 4. Effect of removing PRCIdnaN on the transformation efficiency of RMV7wt. (A) Violin plots showing the number of transformants observed in assays of RMV7wt mutants in which different parts of PRCIdnaN were replaced with a Janus cassette. The genotypes are arranged left to right, and coloured black to green, to represent the increasing proportion of the element that was replaced by the cassette. Mutants removed different combinations of the dinD-like gene (IONPJBJN_00496); the regulatory genes (reg); the replication genes (rep); the integration genes (int), and the att site. The structure of RMV7 PRCIdnaN meant that replacing the int-rep region also deleted the intervening reg genes. Each point represents an independent transformation assay. The violin plot summarises the result for each mutant, with a horizontal line indicating the median. Asterisks at the top of the plot indicate significant differences in the number of observed transformants between mutants and the parental RMV7wt genotype, as calculated using two-tailed Wilcoxon rank sum tests. (B) Violin plot quantifying the effect of PRCIdnaN, and the chaperones HrcA and ClpP, on transformation efficiency in RMV7wt. The comparison of RMV7wt with a mutant in which the PRCI and its att site were removed was independent of the experiments presented in panel A, and more accurately quantified transformation efficiency as a frequency relative to the overall cell population. This approach was also used to compare the effects of single and double mutations that disrupted hrcA, clpP and PRCIdnaN. Asterisks at the top of the plot indicate significant differences in transformation efficiency between mutants and the parental RMV7wt genotype, as calculated using two-tailed Wilcoxon rank sum tests. (C) Violin plots showing the effect of the disruption of PRCIdnaN on gene expression, quantified as abundance relative to rpoA by qRT-PCR. IONPJBJN_00507 is a coding sequence within the PRCI that is absent from RMV7wt PRCIdnaN+att::Janus. The nine points for each gene correspond to three technical replicate assays on each of three biological replicates. The horizontal line on the violin plot shows the median relative abundance for each gene in each genotype. (D) The changes in chaperone gene expression in RMV7wt associated with PRCI disruption, calculated by applying a linear model (see Text S2) to the data in panel C. The points show the estimated difference in expression, and the error bars show the 95% confidence intervals. Across all panels, significance is coded as: P < 0.05, *; P < 0.01, **; P < 10⁻³, ***; P < 10⁻⁴, ****. All P values were subject to a Holm-Bonferroni correction within each panel.

Figure 5. The regulation of transformation by Ca2+ and heat shock in RMV7. (A) Violin plots showing the transformation efficiency of RMV7wt, RMV7rare and mutant derivatives in which hrcA and clpP were disrupted. Transformation was assayed during normal growth (35 °C) or a 40 °C heat shock. Each point corresponds to an independent transformation experiment, and the violin plots have a horizontal line indicating the median transformation efficiency of each mutant at each temperature. (B) Scatterplot showing the dose-dependent effect of CaCl2 on transformation efficiency in RMV7rare (cyan) and mutants in which hrcA had been disrupted (magenta), and then restored (dark blue). Each point represents an independent transformation assay of one genotype at the indicated CaCl2 concentration. The best-fitting dose response logistic models are shown, with the shaded areas corresponding to the 95% confidence intervals. (C) Expression of hrcA, measured as transcript abundance relative to rpoA by qRT-PCR, following the exposure of cells to CSP. CSP stimulated higher expression of hrcA, which was suppressed by co-administration of 12.5 mM CaCl2, demonstrating that CaCl2 addition affects HrcA regulatory activity. (D) Expression of the early competence gene comX and late competence gene comEA following the addition of CSP in RMV7rare (cyan), RMV7rare hrcA::Janus (magenta) and RMV7rare manLMN::Janus (peach), measured relative to the abundance of rpoA by qRT-PCR. (E) Combined effects of chaperone and carbon source regulation on transformation efficiency. Results are displayed as in panel A. Wilcoxon rank-sum tests were conducted between all pairs of genotypes. These found both the single mutants, lacking manLMN or hrcA, were significantly less transformable than the parental genotype. Furthermore, the double mutant was less transformable than either single mutant. Across all panels, significance is coded as: P < 0.05, *; P < 0.01, **; P < 10⁻³, ***; P < 10⁻⁴, ****. All P values were subject to a Holm-Bonferroni correction within each panel.

Figure 6. Comparison of the regulation of competence in (A) S. pneumoniae, from this work, and (B) V. cholerae, summarized from (129). In each, competence is regulated by a quorum-sensing system: CSP in S. pneumoniae, and the cholera autoinducer 1 (CAI-1) in V. cholerae. The production of CSP is known to be inhibited by the non-coding csRNAs, and the HtrA protease degrades the signal. Similarly, the srn206 non-coding RNA represses the ComD receptor of CSP (130). CAI-1 operates through inhibiting LuxO, thereby activating the HapR protein, which indicates a high-cell density environment. HapR activates competence through the Quorum-Sensing and TfoX-dependent Regulator (QstR). This regulator also senses the activation of TfoX in response to GlcNAc being detected by the transmembrane regulator TfoS, via the TfoR small RNA. TfoX activity is also regulated by the Catabolite Regulatory Protein (CRP), which is activated by 3',5'-cAMP, generated by the CyaA adenylate cyclase under carbon source starvation conditions. Hence there are parallels with the TfoX orthologue, and adenylate cyclase-like protein YjbK, responding to GlcNAc in S. pneumoniae RMV7. GlcNAc also appears to promote competence through a TfoX/YjbK-independent route, based on the behaviour of RMV7wt. In the pneumococcus, competence is also regulated by the chaperones ClpP and HrcA. ClpP represses the induction of competence in response to stresses, such as MGE replication, and has been previously shown to degrade ComX (131). One of HrcA's two conformations appears to have a similar effect, repressing competence in response to heat shock, albeit likely through a different mechanism. The other active conformation of HrcA, denoted HrcA*, appears to activate competence in response to Ca2+, again through an unknown mechanism.