Control of transcription elongation by GreA determines rate of gene expression in Streptococcus pneumoniae

Transcription by RNA polymerase may be interrupted by pauses caused by backtracking or misincorporation that can be resolved by the conserved bacterial Gre-factors. However, the consequences of such pausing in the living cell remain obscure. Here, we developed molecular biology and transcriptome sequencing tools in the human pathogen Streptococcus pneumoniae and provide evidence that transcription elongation is rate-limiting on highly expressed genes. Our results suggest that transcription elongation may be a highly regulated step of gene expression in S. pneumoniae. Regulation is accomplished via long-living elongation pauses and their resolution by elongation factor GreA. Interestingly, mathematical modeling indicates that long-living pauses cause queuing of RNA polymerases, which results in ‘transcription traffic jams’ on the gene and thus blocks its expression. Together, our results suggest that long-living pauses and RNA polymerase queues caused by them are a major problem on highly expressed genes and are detrimental for cell viability. The major and possibly sole function of GreA in S. pneumoniae is to prevent formation of backtracked elongation complexes.

and 0.4, respectively. The cells were centrifuged for 2 min at 14000 rpm and the cell pellet was resuspended in a volume of fresh medium containing 14.5% glycerol (v/v) that would result in an OD600nm of exactly 0.2 and 0.4, respectively. The cells were then aliquoted and stored at -80°C.
For transformation of S. pneumoniae, cells were grown in C+Y at 37°C to an OD600nm of approximately 0.12, then 100 ng/ml of synthetic CSP-1 was added and cells were incubated for 10 minutes at 37°C. Transforming DNA was added to the activated cells, followed by a 20-minute incubation at 30°C. Cells were then diluted 10-fold in fresh C+Y and incubated for 1 h at 37°C.
Transformants were selected by plating on Columbia agar supplemented with blood and, when needed, antibiotics.

Recombinant DNA techniques and oligonucleotides
Common DNA procedures such as DNA isolation, restriction, ligation, gel electrophoresis and transformation of E. coli were performed as described (4).
Chromosomal DNA of S. pneumoniae was isolated using the Promega Wizard Genomic DNA Purification Kit. Oligonucleotides used in this study are listed in Table S2 and were purchased from Biolegio (NL) or Metabion (DE). Enzymes were purchased from Roche (Mannheim, Germany), New England Biolabs (Ipswich, USA), Bioline (London, UK) and Fermentas (Burlington, Canada) and used as described by the manufacturer. For PCR amplification, Velocity polymerase (Bioline) or Pfu polymerase (Stratagene) was used.

Plasmids
To construct plasmid pJWV100, carrying a codon-optimised variant of superfolder gfp (gfp(Sp)), the synthetic construct from pUC57-gfp_sf was subcloned using the SphI/BlpI sites into similarly digested pPP2 (5), thereby replacing lacZ with gfp(Sp) flanked by terminators (6). To construct plasmid pJWV101, carrying the zinc-inducible czcD promoter in front of gfp(Sp), plasmid pMP2 (7) was digested with BamHI/EcoRI and the PczcD (600 bp) product was ligated into similarly digested pJWV100. The construction of plasmid pHK102 (P32-gfp) and derivatives carrying mutations in the -10 region of the P32 promoter will be described elsewhere (Jorgensen, Karsens and Veening).
To construct plasmid pLA01, a PCR was performed using the primers p5luc-F+BamHI and p5-luc-R+SpeI on the plasmid p5.00 (8). The PCR product was purified, cut with BamHI and SpeI and cloned into pJWV100 cut with BamHI/SpeI, thereby replacing the gfp_sf gene with the luc gene. To construct plasmid pLA13, the promoter of the late competence gene ssbB (PssbB) was amplified by PCR from D39 chromosomal DNA using the primers PssbB-F+NotI and PssbB-R+BamHI. The PCR product was purified, cut with NotI and BamHI and cloned into pLA01 cut with NotI/BamHI, thereby placing the luc gene under the control of PssbB (pLA13) as described (9).
To construct plasmid pLA18 (9), allowing expression of both luc and a gene encoding a superfolder variant of GFP (gfp(Bs)) under the control of PssbB, gfp(Bs) was subcloned from pUC57-gfp_sf_DSM (6) using the XbaI/BlpI sites into similarly digested pLA13.
To construct plasmid pPGs4, carrying greA under the control of the zinc-inducible czcD promoter (PZn), the greA gene was amplified with oligonucleotides sPG49 and sPG50, digested with BamHI and SpeI and ligated to an equally cut pJWV101.
Plasmid pPGs6 carries a constitutively expressed lacZ reporter gene in which a STOP codon (TAA) has been introduced in place of codon 15. The P32-lacZ G15stop cassette was amplified from plasmid pPGs3 (lab collection) with the oligonucleotide pair sPG57-sPG58, digested with BglII and SpeI and ligated to an equally cut pNZ8902 (10).
Plasmid pPGs9, carrying a catalytically inactive greA Spn-D43A/E46A mutant allele under control of PZn, was obtained by site-directed mutagenesis using plasmid pPGs4 as a template and by introducing the desired mutations with oligonucleotides sPG106 and sPG107.

Strains
S. pneumoniae strain PGs6 was constructed by replacing the greA gene with a chloramphenicol resistance cassette. Approximately 3,000 bp upstream and downstream of the greA coding sequence were amplified with the oligonucleotide pairs sPG13-sPG14 and sPG15-sPG16, respectively, using chromosomal DNA of strain D39 as a template. PCR products were digested with either AscI or NotI and ligated to an equally cut chloramphenicol resistance gene, amplified from plasmid pNZ8048 (11) with oligonucleotides sPG11 and sPG12. S. pneumoniae D39 was transformed with the ligation product, and transformants were selected on GM17 agar plates supplemented with 1% defibrinated sheep blood and 2 µg/ml chloramphenicol. Gene replacement was verified by PCR.
S. pneumoniae strain PGs30 was constructed by replacing the hexA gene with a trimethoprim resistance cassette. Approximately 2,500 and 3,000 bp upstream and downstream of the hexA coding sequence were amplified with the oligonucleotide pairs sPG59-sPG61 and sPG60-sPG62, respectively, using chromosomal DNA of strain D39 as a template. PCR products were digested with either AscI or NotI and ligated to an equally cut trimethoprim resistance gene, amplified from plasmid pKOT (12)

Fluorescence Microscopy
Cells were grown at 37°C in C+Y medium in half-filled 5 ml capped tubes. Where relevant, Nile red (Invitrogen) was added to a final concentration of 8 ng/ml, and DAPI was added to a final concentration of 0.2 μg/ml. 0.5 μl of the cell suspension was spotted on a microscope slide containing a slab of 1% PBS agarose. Images were taken with a Deltavision (Applied Precision) IX71 microscope (Olympus) using a CoolSNAP HQ2 camera (Princeton Instruments) with a 100X phase contrast objective. Emission/excitation filters were from Chroma. For DAPI, typical exposure times were between 1 and 2 seconds with 100% xenon light (300 W). For Nile red, typical exposure times were 200 ms with 32% of excitation light. Microscopy images were modified for publication using ImageJ (http://rsb.info.nih.gov/ij/).

Growth curves, luminescence and fluorescence assays
C+Y medium was inoculated with mid-exponential phase frozen cultures. Cells were grown at 37°C in 96-well plates (polystyrene, white, flat and clear bottom; Corning) in a microtiter plate reader (Tecan Infinite F200 pro). Throughout growth, absorbance (OD595nm), luminescence (expressed in RLU, relative luminescence units) and fluorescence (arbitrary units) were measured every 10 min. Expression of the luc gene results in the production of luciferase and thereby in the emission of light when the medium contains luciferin (13). For GFP fluorescence, the excitation wavelength was 475 nm and the emission wavelength was 504 nm. Other than shaking the plate for 10 s prior to every measurement, no measures were taken to maintain aeration, since oxygen limitation is not an issue for S. pneumoniae, which is a microaerophilic bacterium.

In vitro transcription assays
Elongation complexes were assembled in TB lacking Mg2+ with 13 nt-long RNA radioactively labeled at the 5' end as described (14), except that complexes were immobilized on streptavidin agarose beads (Fluka) through biotin at the 5' end of the DNA template strand (15). To form the misincorporated complex (mEC14), 10 mM ATP and 20 mM MgCl2 were added for 30 s. mEC14 was chased with 1 mM NTPs in the presence or absence of 10 nM GreA Spn for the times indicated in the figure. Reactions were stopped and products analyzed as above.
Permanganate footprinting of open complexes was performed on promoters 32P-kinased on either the template or the non-template strand, by addition of 5 mM KMnO4 for 30 s, in the presence of 0.5 mM GTP (initiating nucleotide).
Reactions were terminated by addition of β-mercaptoethanol to 330 mM, followed by phenol extraction, ethanol precipitation and 10% piperidine treatment.

RNA isolation, cDNA library construction and Illumina sequencing
Total RNA was isolated from mid-exponentially growing cultures as described (16). Sample preparation and sequencing were performed by Vertis (GER). In brief: the RNA samples were fragmented with ultrasound (7 pulses of 30 sec at 4°C) and then dephosphorylated with antarctic phosphatase followed by  17)). Bowtie settings were "-k 1 --best --strata". To extract RPKM values for rRNAs (Table S4), Rockhopper was employed (18).

Calculating transcription mistakes using RNA-Seq data
Transcription errors were tallied from the Bowtie output with a custom Python script that compares the sequence of each mapped read to the corresponding positions in the reference genome of Streptococcus pneumoniae D39 (NCBI annotation ID NC_008533.1, downloaded 20-2-2012); the Python script is available on request. The errors were tallied under the assumption that the error rate caused by the RNA-Seq procedure itself is equal for all samples processed with that same procedure, and therefore does not bias any comparison amongst them.
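The comparison of mapped reads to the reference can be illustrated with a minimal Python sketch. This is not the script used in this study: the function name tally_mismatches, the (offset, read) input format and the handling of N bases are assumptions made for illustration, and reads are assumed to be already given in reference orientation with 0-based mapping offsets.

```python
from collections import Counter


def tally_mismatches(reference, alignments):
    """Count substitutions between mapped reads and a reference sequence.

    reference  : reference sequence as a string (hypothetical input)
    alignments : iterable of (offset, read) pairs, where offset is the
                 0-based start of the read on the reference and the read
                 is given in reference orientation (assumption)
    Returns (counts, compared): counts maps (ref_base, read_base) to the
    number of mismatches of that type; compared is the number of bases
    that were actually compared.
    """
    counts = Counter()
    compared = 0
    for offset, read in alignments:
        for i, base in enumerate(read):
            ref_base = reference[offset + i]
            # Ambiguous bases carry no information about errors; skip them.
            if base == "N" or ref_base == "N":
                continue
            compared += 1
            if base != ref_base:
                counts[(ref_base, base)] += 1
    return counts, compared


if __name__ == "__main__":
    ref = "ACGTACGTAC"
    reads = [(0, "ACGT"), (2, "GTTC"), (6, "GNAC")]
    counts, compared = tally_mismatches(ref, reads)
    print(dict(counts), compared, sum(counts.values()) / compared)
```

Dividing the number of mismatches by the number of compared bases gives a per-base error frequency that can be compared between samples under the equal-background assumption stated above.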

RT-qPCR
Strains D39 and ΔgreA were grown in 50 ml C+Y acidic medium and samples for RNA isolation were taken between OD 600 = 0.  Table S6.

Stochastic transcription model
The model used to simulate transcription with and without GreA is based on the model of (20): elongation complexes are described as stochastic steppers on a one-dimensional lattice representing the DNA, with a lattice spacing corresponding to one nucleotide. Each elongation complex covers 50 sites.
Elongation complexes move by stochastic single-site (single-nucleotide) steps, which occur with the elongation rate ε ≈ 50 sec⁻¹. If the target site (the site to which an elongation complex attempts to step) is occupied by another elongation complex, the stepping attempt is rejected (exclusion rule). Elongation complexes are initiated at the promoter with the rate α, provided that the promoter is free.
This rate reflects the promoter strength and is varied to simulate genes with different expression levels. We refer to it as the initiation attempt rate, as the true initiation rate is smaller because the promoter is not always free. Termination is assumed to be rapid once an elongation complex reaches the termination site at the end of the system (gene or operon). Stall events are allowed to occur at specific sites, which are selected randomly at the beginning of the simulation with an average distance of 30 sites between them. These sites mimic pause sites seen in the in vitro transcription assays, but the presence of specific sites is not crucial for the results presented here. At these sites an elongation complex may enter the stalled state with rate f. Stalled elongation complexes are rescued and reactivated with rate 1/τ, where τ is the duration of the stall event.
The key assumption of the model for studying the role of GreA is that the presence of GreA strongly reduces τ. As we do not have precise quantitative estimates for the pause parameters, simulations were performed with a wide parameter range.
Results presented in the figures were obtained for f=ε/100, and τ=1000/ε (without GreA) or τ=1/(2ε) (with GreA), i.e. stalling durations of 20 sec and 0.01 sec, respectively. In addition, we test alternative scenarios where GreA is assumed to reduce the stepping rate ε or to increase the pause frequency f (Fig. S3). Qualitatively similar results are obtained for all situations where, for realistic initiation rates, transcription is limited by initiation in the presence of GreA, but becomes limited by elongation in its absence (Fig. S4). However, the scenario with a reduced stepping rate can be excluded because it is not consistent with the elongation experiment: it predicts a change of the slope of both the 3' and 5' probes as well as an increase in the time difference between the first appearance of the probes (Fig. S3F). The results for a reduction of the pause frequency (Fig. S3A-C) are similar to those for a reduction of the pause duration, although a small effect on the slope of the early probe, which is absent in the experimental data, is obtained in this case. However, since Gre factors in E. coli have been shown to stimulate transcript cleavage during backtracking pauses (14,21), we consider shortening of the pause duration more likely.
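The rules of the model can be sketched as a small Gillespie-type simulation in Python. This is only an illustration under stated assumptions, not the implementation used for the figures: the function name simulate, the gene length of 1000 sites, the simulated time, the seed and the simple linear event search are all choices made here, and immediate termination at the last site stands in for "rapid" termination.

```python
import random

EPS = 50.0                   # stepping rate in 1/sec, from the text
TAU_NO_GREA = 1000.0 / EPS   # stall duration without GreA: 20 sec
TAU_GREA = 1.0 / (2 * EPS)   # stall duration with GreA: 0.01 sec


def simulate(alpha, tau, eps=EPS, f=None, length=1000, footprint=50,
             pause_spacing=30, t_end=200.0, seed=1):
    """Return the number of transcripts completed within t_end seconds.

    length, t_end and seed are illustrative assumptions, not paper values.
    """
    rng = random.Random(seed)
    f = eps / 100.0 if f is None else f
    # Pause sites are drawn once, on average one per pause_spacing sites.
    pause_sites = {x for x in range(1, length)
                   if rng.random() < 1.0 / pause_spacing}
    pos, stalled = [], []    # index 0 is the most downstream complex
    t, transcripts = 0.0, 0
    while True:
        events = []          # (rate, kind, index of the complex)
        # Initiation is attempted only while the promoter is free.
        if not pos or pos[-1] >= footprint:
            events.append((alpha, "init", -1))
        for i, x in enumerate(pos):
            if stalled[i]:
                events.append((1.0 / tau, "rescue", i))
                continue
            # Exclusion rule: complexes stay at least one footprint apart.
            if i == 0 or pos[i - 1] - x > footprint:
                events.append((eps, "step", i))
            if x in pause_sites:
                events.append((f, "stall", i))
        total = sum(rate for rate, _, _ in events)
        t += rng.expovariate(total)
        if t > t_end:
            return transcripts
        # Pick one event with probability proportional to its rate.
        r = rng.random() * total
        for rate, kind, i in events:
            r -= rate
            if r < 0:
                break
        if kind == "init":
            pos.append(0)
            stalled.append(False)
        elif kind == "rescue":
            stalled[i] = False
        elif kind == "stall":
            stalled[i] = True
        else:
            pos[i] += 1
            if pos[i] >= length:     # termination is treated as immediate
                pos.pop(i)
                stalled.pop(i)
                transcripts += 1


if __name__ == "__main__":
    for label, tau in (("with GreA", TAU_GREA), ("without GreA", TAU_NO_GREA)):
        print(label, simulate(alpha=1.0, tau=tau))
```

With a high initiation attempt rate, the long stalls of the no-GreA setting let queues build up behind stalled complexes and reduce the number of completed transcripts, whereas the short stalls of the GreA setting leave output limited mainly by initiation.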
The stochastic model was simulated with the kinetic Monte Carlo method described by Klumpp and Hwa (20) with