Native elongation transcript sequencing reveals temperature dependent dynamics of nascent RNAPII transcription in Arabidopsis

Abstract Temperature profoundly affects the kinetics of biochemical reactions, yet how large molecular complexes such as the transcription machinery accommodate changing temperatures to maintain cellular function is poorly understood. Here, we developed plant native elongating transcripts sequencing (plaNET-seq) to profile genome-wide nascent RNA polymerase II (RNAPII) transcription during the cold-response of Arabidopsis thaliana with single-nucleotide resolution. Combined with temporal resolution, these data revealed transient genome-wide reprogramming of nascent RNAPII transcription during cold, including characteristics of RNAPII elongation and thousands of non-coding transcripts connected to gene expression. Our results suggest a role for promoter–proximal RNAPII stalling in predisposing genes for transcriptional activation during plant–environment interactions. At gene 3′-ends, cold initially facilitated transcriptional termination by limiting the distance of read-through transcription. Within gene bodies, cold reduced the kinetics of co-transcriptional splicing leading to increased intragenic stalling. Our data resolved multiple distinct mechanisms by which temperature transiently altered the dynamics of nascent RNAPII transcription and associated RNA processing, illustrating potential biotechnological solutions and future focus areas to promote food security in the context of a changing climate.


INTRODUCTION
Changes to ambient temperatures challenge the development and growth of living organisms. While mammals re-tain a stable body temperature, sessile organisms such as plants continually sense their environment and rely on molecular mechanisms that compensate for temperature changes (1). Alterations to the ambient temperature frequently lead to re-programming of the transcriptional output by RNA polymerase II (RNAPII) that reflects steadystate levels of messenger RNAs and non-coding RNAs in the cell (2,3). Sequence-specific transcription factors controlling the initiation of transcription often shape these responses. However, the significance of mechanisms regulating eukaryotic gene expression after initiation, for example through control of elongation of the nascent RNA chain is increasingly appreciated (4). Genome-wide profiling of transcriptionally engaged RNAPII complexes has identified low-velocity regions of RNAPII elongation at the beginning (i.e. promoter-proximal stalling) and the end (i.e. poly-(A) associated stalling) of genes (4,5). Organisms appear to alter the activity of RNAPII at these regions to re-program their transcriptional output to acclimate to temperature changes. The release from promoterproximal stalling at heat-shock genes facilitates rapid transcriptional induction in response to heat in Drosophila (6), and promoter-proximal stalling is reduced genome-wide when temperatures increase in mammalian cell cultures (7). RNAPII accumulation at gene ends is associated with the mechanism of transcriptional termination (8). Here, molecular complexes associated with nascent RNAPII transcript cleavage at the poly(A)-signal (PAS) regulate RNAPII activity to ensure accurate processing of the nascent transcript (8). RNAPII continues to transcribe past the PAS until 5 -to-3 exonucleases catch up with transcribing RNAPII to mediate transcriptional termination (8)(9)(10). Hence, transcriptional termination is determined by kinetic competition between the speed of RNAPII transcription after nascent transcript cleavage and the termination factor (11). Temperature increases the read-through transcription distance at gene ends in several organisms (11,12), suggesting connections between temperature, RNAPII stalling at gene borders and the efficiency of transcriptional termination. However, the immediate genome-wide effects of low temperatures on nascent RNAPII transcription in eukaryotes are unclear.
Transcriptionally engaged RNAPII complexes can be visualized by Native Elongating Transcript sequencing (NETseq) (13)(14)(15)(16). NET-seq provides a strand-specific snapshot of nascent RNAPII transcription at single-nucleotide resolution genome-wide (16). The capture of nascent RNA by NET-seq enables the detection of RNAs that are usually subjected to co-transcriptional RNA degradation. This advantage of NET-seq helps to detect long non-coding RNAs (lncRNAs), as these tend to be targeted for cotranscriptional RNA degradation by the nuclear exosome RNA degradation complex (17,18). Moreover, NET-seq in yeast and mammals allowed estimates of the average length of cryptic read-through transcription that allows quantitative analyses of the transcription termination mechanism (19,20). An additional advantage of NET-seq data are insights into co-transcriptional RNA splicing, since part of the spliceosome is co-purified with transcribing RNAPII complexes (15,21). Nascent RNAPII transcription slows down close to exon-intron boundaries in a splicing-dependent manner and is responsible for intragenic RNAPII stalling (15). Splicing regulation is essential for the cold-response in Arabidopsis (22,23) but how this is connected to molecular adjustments of nascent RNAPII transcription is largely unknown.
Here, we developed a NET-seq approach to study nascent transcription in the model plant Arabidopsis thaliana (plaNET-seq). We analyzed the temporal dynamics of nascent RNAPII transcription in response to cold. Our data revealed transient molecular adaptations of transcription that include changes to promoter-proximal stalling, elongation, termination and many novel non-coding transcription events overlapping gene expression domains. Our data provide genome-wide support for a transient reprogramming of nascent RNAPII transcription during cold exposure, highlighting a cellular compensation mechanism at the level of nascent RNAPII transcription to assist optimal growth of multicellular organisms in challenging environments.

Plant material and growth conditions
Arabidopsis thaliana seeds were surface-sterilized in ethanol and grown on 1 2 MS + 1% sucrose media in long day conditions (16 h light/8 h dark) at 22 • C/18 • C. Light intensity during day hours was ∼100 E m −2 s −1 . 10-day old seedlings were used for all experiments. The NRPB2-FLAG line was described in (24). The construct covers a lethal nrpb2-1 allele (SAIL 859B04). For inhibition of splicing, seedlings were grown on filter paper covered 1 2 MS + 1% sucrose for 10 days then transferred to DMSO, 5 M pladienolide B (Santa Cruz) or 5 M Herboxidiene (Focus Biomolecules) containing plates for 6 or 24 h. For low temperature treatment, 10-day old seedlings were transferred from 22 • C directly to 4 • C and ∼25 E m −2 s −1 for indicated times.

Total RNA isolation and RT-qPCR
Total RNA was isolated from Arabidopsis seedlings grown for 10 days and exposed to DMSO or splicing inhibitors for 6 or 24 h with RNeasy Plant Mini Kit (Qiagen) according to manufacturers' instructions. 5 g of total RNA was treated with Turbo DNaseI (Ambion) to remove any genomic DNA. Subsequently, 1 g of DNase-treated RNA was converted to cDNA using SuperScript IV (Invitrogen) with random primers according to manufacturers' instructions. Quantitative PCR was performed in three technical replicates with the GoTaq qPCR Master Mix (Promega) in 384-well plates. The PCR was run in a CFX384 Touch Real-Time PCR Detection System (BioRad) and monitored by the CFX Manager software (BioRad). Threshold values were subsequently exported to Excel and processed further. All oligos used for the PCR can be found in Supplementary  Table S3.

Preparation of plaNET-seq libraries and sequencing
Libraries were constructed using Bioo Scientific's NEXTflex Small RNA-seq kit v3 following a custom protocol. Unlike the original protocol provided by the manufacturer, our custom protocol incorporates RNA fragmentation step in order to avoid underrepresentation of longer molecules of nascent RNA compared to shorter ones (Supplementary Figure S1B). Approximately 100 ng RNA was used for each library. After the ligation of the 3 -linker, RNA was fragmented in alkaline solution (100 mM NaCO 3 pH 9.2, 2 mM EDTA) to a fragment size of 20-150 bp. After fragmentation, RNA was cleaned up with AMPure RNAclean XP beads, treated with PNK (20 U, NEB) for 20 min at 37 • C and then re-annealed with 8 M RT-primer (70 • C, 5 min; 37 • C, 30 min; 25 • C, 15 min. Oligo sequence: 5 -GCCTTGGCACCCGAGAATTCCA-3 ). The RNA was then re-introduced to the gel-free version of the manufacturer's protocol at the adapter inactivation step. For detailed step-by-step library preparation protocol, refer to Supplementary Figure S1B. Depending on the library, 10-16 cycles of PCR was used and the final library was validated with Agilent's DNA High Sensitivity kit on a Bioanalyzer 2100. Libraries were sequenced on the Illumina HiSeq-PE150 platform at Novogene (en.novogene.com).

Alignment of sequencing reads
The first 4 bases of both R1 and R2 reads in plaNET-seq are Unique Molecular Identifiers (UMIs). They were trimmed from read sequences and appended to read names using UMI-Tools v0.5.3. After UMI trimming, the 5 -terminal base of R2 corresponds to the 3 -end of original RNA molecule and thus denotes the genomic position of RNAPII active center. R2 reads were aligned to TAIR10 genome assembly using STAR v2.5.2b in transcriptome-guided mode with the following settings: -outSAMmultNmax 1 -alignEndsType Extend5pOfRead1 -clip3pAdapterSeq GA TCGTCGGACT. Ensembl Plants release 28 was used as the source of transcript annotation for alignment. The BAM files were sorted using Samtools v1.3.1. The following categories of reads were filtered out: (i) PCR duplicates (UMI-Tools); (ii) Reads aligned within 100 bp from any rRNA, tRNA, snRNA or snoRNA gene from Araport11 on either strand (BEDtools v2.17.0); (iii) Reads aligned with MAPQ < 10 (Samtools). The filtered BAM files were imported into R environment v3.5.1 using Genomi-cAlignments 1.18.1 library. The strand orientation of reads was flipped to restore strandedness of the original RNA molecules. 3 -terminal bases of flipped reads were found to overlap with 5 or 3 splice sites much more frequently than could be expected by chance. Such reads most likely represent splicing intermediates due to co-immunoprecipitation of the spliceosome together with FLAG-tagged RNAPII complexes. These reads were filtered out by overlap with the union of splice sites obtained from both Ensembl Plants 28 (TxDb.Athaliana.BioMart.plantsmart28 package) and Araport11 annotations. In addition, all split reads were removed as possible mature RNA contaminations. The remaining reads are expected to represent the nascent RNA population. Their genomic coverage was exported as strand-specific BigWig and bed-Graph files using rtracklayer 1.42.2. The full pipeline is provided in the 01-Alignment plaNET-Seq.sh and 02-Postprocessing plaNET-Seq.R scripts in the code repository.
Araport11 annotation was used throughout all further steps of data analysis because it is more comprehensive in terms of non-coding transcripts than both TAIR10 and Ensembl Plants 28 annotations. However, the downside or Araport11 compared to TAIR10 are the unrealistically long 5 -and 3 -UTRs (31). To compensate for this effect, we adjusted gene borders from Araport11 using TSS-seq and DR-seq data. If multiple TSS or PAS tag clusters were connected to the same gene, the strongest of them was chosen as the new border. The relevant code is available in 04-Adjustment Araport11.R script.

Metagene plots
To draw metagene plots of plaNET-seq and other datasets mentioned above, we merged biological replicates and normalized the tracks to 1 million reads in nuclear proteincoding genes. The X axes of metagene plots represent the genomic intervals of choice which were scaled to the defined number of bins. Intervals overlapping multiple annotated transcription units were excluded from consideration. In particular, both introns and exons were trimmed by 5 bp each side prior to scaling to avoid possible artifacts. The Y axes show the sequencing coverage averaged between the genomic intervals. The code required to reproduce metagene plots from bedGraph tracks is available in 05-Metagenes.R script.

Calling of novel transcripts
Transcript borders were called de novo from each plaNETseq sample using groHMM package (32). Intervals which have <50% reciprocal overlap on the same strand with any known transcription unit in Araport11 were considered as novel (previously unannotated) transcripts. The novel transcripts were clustered between plaNET-seq samples and merged to obtain a non-redundant set (n = 7228). They Nucleic Acids Research, 2020, Vol. 48, No. 5 2335 were further classified into divergent, convergent, PAS antisense, distal antisense or intergenic (for more details, see 06-groHMM pipeline.R).

Analysis of differentially transcribed genes
Differentially transcribed known genes and novel transcripts were called by DESeq2 (33) from unnormalized plaNET-seq tracks with FDR <0.05 and log 2 FC > 1 (see 07-DESeq2 pipeline.R)

Calculation of read-through length
To calculate the read-through (RT) length, we considered strongly transcribed genes (plaNET-seq FPKM in WT samples above 5). Genomic intervals for RT length estimation were defined to extend from PAS of the analyzed gene to the nearest downstream TSS. Coordinates of TSS and PAS clusters were called from TSS-seq and Direct RNAseq datasets as described above. For each gene of interest, the empirical distribution of plaNET-seq tag counts in 100 bp sliding window was obtained (the 'transcription' model).
The 'random' model corresponding to the untranscribed state was represented by Poisson distribution where the rate parameter was estimated from plaNET-seq tag counts in intergenic regions. Then plaNET-seq tags were counted in every 100 bp window moving in 10 bp steps along the candidate RT genomic interval. For each window, the probability to observe at most this tag count under the genespecific 'transcription' model was divided by the probability to observe at least this tag count under the alternative 'random' model. The start position of the first window where the probability ratio dropped below 1 was considered as the end of the read-through region. The code is available in 08-Readthrough distance.R

Calculation of stalling indexes
To calculate promoter-proximal RNAPII stalling index for each gene longer than 1 kb, we first found 100 bp windows with the highest plaNET-seq coverage within the interval [TSS -100 bp, TSS + 300 bp]. Center of this window was considered as the summit of promoter-proximal RNAPII peak. The stalling index was then calculated as the ratio of plaNET-seq coverage in this window vs the whole gene (normalized by gene length). Similarly, the intronic stalling index (ISI) was calculated for each intron longer than 50 bp: first we found the 'best' 10 bp window within the intron, and then we divided its plaNET-seq coverage by length-normalized coverage of the whole intron. Introns with FPKM-normalized plaNET-seq coverage >10 were further classified by their stalling index into 'strong' (ISI ≥ 5.5), 'medium' (3.5 < ISI < 5.5) and 'weak' (ISI ≤ 3.5). For more detailed description, refer to 09-Stalling index.R.

plaNET-seq robustly detects nascent RNAPII transcription in Arabidopsis
To purify RNAPII complexes, we relied on a FLAGimmunoprecipitation of the second-largest RNAPII subunit (NRPB2-FLAG). The NRPB2-FLAG construct covers lethal null-alleles of nrpb2, which makes these lines suitable to capture RNAPII as all complexes carry the tagged NRPB2 subunit (24). The stable line shows wild-type phenotype. We used the nuclear fraction of flash-frozen Arabidopsis seedlings as starting material ( Figure 1A). RNAPII complexes were immunoprecipitated with high efficiency (Supplementary Figure S1A), and nascent RNA was purified and used for library construction ( Supplementary Figure S1B). Processed reads were aligned to the Arabidopsis genome, identifying positions of the nascent RNA 3ends ( Figure 1B, upper panel). Visualized in a genome browser, plaNET-seq shows the characteristic 'spiky' pattern that represents the nascent RNAPII transcription at each nucleotide. Our plaNET-seq libraries showed high reproducibility (Pearson coefficient r > 0.98) between replicates and confirmed low-velocity nascent RNAPII transcription at gene boundaries (Supplementary Figure S1C-F). We also generated a mock-IP plaNET-seq library to assess the stringency of our protocol. The mock-IP signal showed weak correlation to the signal from a FLAG-IP library (Supplementary Figure S1G, Pearson coefficient r < 0.35). The signal of mock-IP plaNET-seq libraries was extremely low, supporting FLAG-IP specific signal corresponding to nascent RNAPII transcription in our samples ( Figure 1B, Supplementary Figure S1H). Our protocol is gel-free and differs in some steps from the published pNET-seq protocol. For example, the FLAG antibody is used instead of endogenous RNAPII antibodies at the IP step (Supplementary Figure S2A-C). Our libraries of nascent RNA appeared enriched for intronic reads and reads downstream of the annotated poly-(A)-site that represented RNAPII complexes undergoing termination of transcription. Steady-state methods such as RNAseq do not provide this information on nascent RNAPII transcription, further supporting our successful enrichment for nascent RNA ( Figure 1B). We called transcripts de novo from plaNET-seq data using the groHMM algorithm (32) and identified thousands of transcripts not annotated in Araport11 ( Figure 1C and D, Supplementary Table S1). The majority of these novel transcripts were in proximity to known genes, or overlapping them on the antisense strand ( Figure 1C and D). Overall, RNA-seq data correlated well with our plaNET-seq data for annotated transcripts but poorly for unannotated transcripts, emphasizing the power of plaNET-seq to capture transcripts undergoing rapid RNA degradation (Supplementary Figure S2D  U p s t r e a m D o w n s t r e a m I n t e r g e n ic D iv e r g e n t C o n v e r g e n t P A Sa n t is e n s e D is t a l A S

Characterization of divergent and convergent transcription
To further characterize the novel transcripts detected by plaNET-seq, we defined transcripts that start upstream (0-500 bp) from the TSS of a protein-coding gene but on the opposite strand as divergent non-coding transcripts (DNC) (Figures 1C and 2A). DNC represents an important source of lncRNA transcription in yeast and metazoans (16,(35)(36)(37), but the presence of DNC in Arabidopsis has been questioned (38). plaNET-seq provided evidence for DNC at 917 protein-coding genes and the DNC transcription start site (divTSS) was most often located 200-400 bp upstream from the coding TSS ( Figure 2B). Thus, these data support the presence of DNC in plant genomes, although to a lower extent compared to yeast or mammals. An example of DNC was identified at the At3g28140 locus ( Figure  2C). In general, genes driving DNC in plants had higher nascent RNAPII transcription on the coding strand compared to non-DNC genes ( Figure 2D), indicating that DNC was associated with Nucleosome Depleted Regions (NDRs) of highly expressed genes. Metagene analyses of DNC using TSS-seq data in the hua enhancer 2-2 mutant (hen2-2, a nuclear exosome mutant) (39) showed DNC degradation by the nuclear exosome in Arabidopsis ( Figure 2E), similar as in yeast and metazoans (36,40). DNC promoters had higher nucleosome density in the divergent non-coding direction compared to a control set of genes with similar transcription level (Supplementary Figure S3A). DNC promoters exhibited NDRs with well-defined flanking -1 and +1 nucleosomes (Supplementary Figure S3B). In conclusion, DNC transcription shares regulatory principles with budding yeast (41), an association with high definition of the -1 nucleosome, and is repressed by co-transcriptional RNA degradation (42).
In addition to DNC, groHMM detected 5313 novel transcripts that overlap a single annotated gene transcription unit fully or partially on the antisense strand ( Figure 3A). We considered novel transcripts as antisense transcripts when they either started internally of a host gene, or no more than 20% of its length downstream (n = 4922). We detected two preferential initiation sites for such antisense transcripts along the gene body ( Figure 3A). The predominant peak of initiation site frequency was found at the 3end of genes, defined as PAS-associated antisense transcription (n = 3223). The second peak was located within the first 50% of the gene body, and we defined these transcripts as convergent antisense transcripts (CAS; n = 1699). CASs have been detected in human cells (13,19) but have so far been uncharacterized in plants. The TSS of convergent transcripts (casTSS) most often initiated at a distance between 250 and 1000 bp from the sense TSSs ( Supplementary Figure S4A), exemplified by the At2g46710 gene ( Figure 3B). Interestingly, casTSSs showed a strong bias towards early exon-intron boundaries with a peak very close to the first 5 splice site (5 SS, Figure 3C). Over 50% of all CAS initiated from the host gene's first exon or intron (Supplementary Figure S4B). Moreover, genes harboring a CAS had a significantly longer first exon and intron, indicating that a specific 5 -gene structure correlates with CAS expression (Supplementary Figure S4B). The nucleosome density upstream of the casTSS showed a sharp decrease, suggest-ing an intragenic NDR (Supplementary Figure S4C). Interestingly, when we assigned previously described chromatin states (34) to the bodies of Arabidopsis genes and explored where CAS transcription initiated, we detected an overrepresentation of casTSS within the chromatin states we denoted as promoter-to-early elongation ( Supplementary Figure S4D and E). This indicated that the CAS initiation region coincided with a location where RNAPII complexes enter productive elongation. Genes giving rise to CAS had higher sense strand transcription compared to genes without detectable CAS ( Figure 3D). These data indicated an association of CAS with a subset of highly transcribed genes. In addition, a comparison of TSS-seq data in wild type Col-0 seedlings and hen2-2 mutants showed that CAS transcripts are nuclear exosome targets ( Figure 3E). Thus, we characterized Arabidopsis CAS as nuclear exosome targets that initiate from a NDR in promoter-proximal intervals of highly expressed genes with a long first exon and intron. All in all, our plaNET-seq data highlights the strength of a nascent RNA detection method to identify cryptic noncoding transcripts.

Low temperature lead to major re-programming of nascent RNAPII transcription
In addition to the capture of cryptic transcripts, NET-seq interrogates the RNAPII transcription dynamics over coding and non-coding transcription units, revealing regions of low-velocity transcription. The link between temperature and transcriptional output in plants (3) lead us to hypothesize that chilling temperatures may regulate nascent RNAPII transcription over these regions. Therefore, we exposed seedlings to early stages of cold-acclimation (3 and 12 h at 4 • C, Figure 4A). Numerous transcripts had significantly changed plaNET-seq signal over their transcription units in our conditions ( Figure 4B, Supplementary Table  S2). The number of differentially transcribed known genes at 3 h at 4 • C versus 22 • C greatly exceeded those detected as differentially expressed in the same conditions and identical cut off values by Transcription Start Site sequencing (TSSseq) (2). These data suggest that the detection of steadystate levels of RNA species (i.e. by TSS-seq) does not fully capture the actual changes in nascent transcription during exposure to 4 • C ( Figure 4C) (2). Strikingly, 47% and 50% of known transcripts which were upregulated or downregulated after 3 h at 4 • C, returned to baseline levels after 12 h at 4 • C ( Figure 4D), suggesting transient re-programming of nascent RNAPII transcription. Nascent transcription of the novel non-coding transcripts was also affected by the cold treatment, as shown on metagene plots for divergent, convergent and PAS-associated antisense transcripts (Figure 4E-G). We detected a rapid decrease of plaNET-seq signal after 3 h at 4 • C that reverted back to or close to control levels after 12 h at 4 • C. Thus, our results support the notion that transcription of many non-coding transcripts respond rapidly to a changing environment (43). Taken together, plaNET-seq detected genome-wide transcriptional changes with increased sensitivity compared to steady-state methods and revealed a major re-programming of nascent RNAPII transcription in response to chilling temperatures. DNC could be detected with TSS-seq data and DNC were targeted by the nuclear exosome. The shaded area shows 95% confidence interval for the mean.

Exons and co-transcriptional splicing represent transient transcriptional barriers at low temperature
The re-programming of nascent RNAPII transcription in response to chilling temperatures prompted us to look closer at the effects on coding regions in the genome. Eukaryotic genes have exon-intron architecture where introns are co-transcriptionally spliced out to form a functional mRNA. The close proximity of a transcribing RNAPII complex and the spliceosome is detected with NET-seq (15,21). Splicing intermediates can readily be detected in NET-seq data, in particular the 5 splice site (5 SS) that is protected by the co-purified spliceosome ( Figure 5A), as previously reported in human NET-seq (15). We thus filtered out these read positions in our analysis since the RNAPII-associated RNA 3 -ends through co-purification of the spliceosome may not precisely inform on the position of nascent RNAPII transcription (15). Interestingly, when we analyzed the fraction of 5 SS reads in our low temperature exposed plaNET-seq samples, we detected a strong genome-wide decrease of 5 SS reads after 3 h at 4 • C compared to 22 • C ( Figure 5B and C, Supplementary Figure  S5A). The decrease reverted back to control levels after 12 h at 4 • C, suggesting that the kinetics of the splicing reaction was initially affected by low temperature ( Figure 5B). Moreover, we detected a transient increase of the exon to intron ratio of nascent RNAPII transcription after 3 h at 4 • C compared to 22 • C and 12 at 4 • C ( Figure 5D). These data indicated a transiently increased nascent RNAPII transcription over exons at 4 • C. Consistently, many of the transcripts upregulated after 3 h were relatively long, multi-exonic genes compared to downregulated genes, whereas an inverse relationship was detected for expression changes from 3 h to 12 h at 4 • C (Supplementary Figure S5B and C).
The hypothesis that splicing kinetics may be transiently affected by low temperature prompted us to examine the connection between splicing and RNAPII transcription more closely. We applied the splicing inhibitors pladienolide B (plaB) and herboxidiene and confirmed their effect  on sensitive splicing events (44,45) with RT-qPCR ( Figure  5E, Supplementary Figure S6A). Next, we treated seedlings with DMSO or plaB for 6 hours and generated plaNETseq libraries. We detected a large genome-wide decrease in 5 SS reads in our plaB samples compared to the DMSO samples, confirming a successful inhibition of the splicing reaction ( Figure 5F and G, Supplementary Figure S6B). Our analysis identified small nuclear RNAs involved in splicing, confirming co-purification of the spliceosome with RNAPII complexes (Supplementary Figure S6C), consistent with earlier reports (15,21). Metagene profiles of internal exons revealed increased nascent RNAPII transcrip-tion upstream of the 5 SS in DMSO compared to plaB, supporting splicing-dependent RNAPII stalling before the end of exons ( Figure 5H, dashed box). This exonic RNAPII stalling was visible also in the re-analyzed pNET-Seq data (14), however only in the serine-5 phosphorylation (Ser5P) track which corresponds to NRPB1 phosphorylated at Ser5 position of its C-terminal domain ( Figure 5I). In our coldtreated samples, we detected an increased peak at the end of exons after 3 h 4 • C compared to 22 • C ( Figure 5J and K, dashed box). The increased height of the peak was transient and reverted to baseline levels after 12 h at 4 • C. In conclusion, our analyses support a splicing-dependent dynamic in-  crease of nascent RNAPII transcription at the end of exons during low temperature. These data may indicate that the kinetics of the splicing reaction is transiently reduced in the chilling response.

Identification of a novel intragenic RNAPII stalling site
In introns, plaNET-seq metagene profiles of our plaB and DMSO samples revealed a peak of nascent RNAPII transcription close to the 5 SS ( Figure 6A). Moreover, this intronic peak is most clearly visible in the Ser5P track of pNET-seq data (Supplementary Figure S7A). We called the peak coordinates in each intron using sliding window approach on Ser5p pNET-seq data. Next, we calculated an 'Intronic stalling index' (ISI) for each intron. Finally, we divided the introns based on ISI into those with strong, medium or weak stalling (for more details, see Methods). ISI for expressed introns can be found in Supplementary  Table S4. The intronic peak was most frequently observed at 25 nt downstream of the 5 SS, irrespective of the ISI level ( Figure 6B). An example of the intronic peak can be seen for intron 2 of At1g59870 ( Figure 6C). Grouping introns by ISI revealed that introns with higher ISI scores were on average longer than low ISI-score introns (Supplementary Figure  S7B). This insight made us return to genes with a convergent antisense transcript (CAS), which show a longer first intron compared to control genes. Indeed, CAS host genes showed a significantly higher ISI compared to control genes (Supplementary Figure S7C). Thus, there was a correlation between the presence of a CAS and sense RNAPII stalling in the long first intron found in CAS host genes. Next, we stratified introns according to their length to explore other potential effects of the intronic peak. We detected no evidence for increased nucleosome signal in short introns (60-250 bp; n = 97 558), Supplementary Figure S7D). However, we detected peaks of nucleosome density in longer introns (250-1000 bp, n = 15 991), suggesting that these included one or several phased nucleosomes (Supplementary Figure S7D). We next plotted nascent RNAPII transcription over long introns compared to a control set of short introns (obtained from the same genes to avoid any effect of gene expression level). We detected a higher plaNET-seq signal over longer introns, suggesting that long introns were transcribed more slowly compared to short introns (Supplementary Figure S7E). Thus, nucleosome barriers may contribute to a reduced transcription speed and increased plaNET-seq signal of longer introns. Interestingly, the intronic peak in short introns was largely plaB insensitive ( Figure 6D), whereas stalling in long introns was sensitive to plaB ( Figure 6E). Similarly, our cold-treated samples showed small effects of the intron peak for short introns ( Figure 6F) but a large increase of nascent RNAPII transcription after 3 h at 4 • C that reverted back to control levels after 12 h at 4 • C in long introns ( Figure 6G). This observation further supported a transient decrease in kinetics of the splicing reaction after low temperature exposure. All in all, our plaB and cold-treated samples provide key information to distinguish plaNET-seq signal that is dependent on the splicing reaction from peaks of RNAPII activity independent of splicing. Our data support a RNAPII stalling site 25 nt into plant introns. The sensitivity of this peak to plaB and to low temperature correlates with intron length, perhaps indicating RNAPII-stalling associated checkpoint to improve splicing accuracy of long introns. The intronic peak of RNAPII activity represents a novel site of RNAPII stalling during gene transcription that represents the third stalling site in addition to the positions at gene boundaries.

Low temperature affects promoter-proximal RNAPII stalling
To further investigate RNAPII stalling at gene boundaries, we first focused on the beginning of transcription units (i.e. promoter-proximal stalling). plaNET-seq detected a large fraction of reads at 5 -ends of genes, consistent with previous studies in plants and metazoans (5,14) (Supplementary Figure S1C). However, we found no clear association between the annotated TSS position and the maximal density of nascent RNA signal on the sense strand (Supplementary Figure S8A and B). To test if other genomic features could offer a better association we used nucleosome positioning data (MNase-seq). Metagene plots anchored at the center of the first nucleosome revealed a strong association with peaks of nascent RNAPII transcription ( Figure 7A, Supplementary Figure S8C), suggesting a nucleosome defined promoter proximal stalling mechanism in Arabidopsis. We found that 16.6% of the expressed genes (FPKM ≥ 1) showed RNAPII stalling in the promoter proximal region (Promoter-proximal Stalling Index ≥ 3, Supplementary Table S5). Metagene profiles for 0, 3 and 12 h at 4 • C indicated that low temperature affected RNAPII stalling at the first (i.e. +1) nucleosome ( Figure  7A). 3 h at 4 • C resulted in an increased peak around the center of the +1 nucleosome, indicating greater promoterproximal stalling. In contrast, the 12 h 4 • C samples resulted in decreased stalling. These results prompted us to investigate if pools of RNAPII engaged in promoter-proximal stalling may facilitate temperature-dependent gene regulation. We calculated a 'Promoter-proximal stalling index' from plaNET-seq data (i.e. relative nascent RNAPII transcription at the promoter proximal region versus the gene body) as previously described (14). Transcripts that were up-regulated after 3 h at 4 • C showed a significantly increased stalling index before low temperature treatment (22 • C). In addition, transcripts that were down-regulated after 3 h at 4 • C exhibited significantly decreased promoter proximal stalling compared to non-regulated transcripts ( Figure 7B). These results support a role for RNAPII promoter-proximal stalling to adjust transcription to low temperature. In conclusion, plaNET-seq revealed a nucleosome defined promoter-proximal RNAPII stalling mechanism that may facilitate reprogramming of gene expression in response to temperature changes.

Low temperature transiently reduces 3 -end associated RNAPII stalling and read-through transcription
In addition to promoter-proximal positions, RNAPII stalls near 3 -ends of Arabidopsis genes (14,38,46). We detected increased nascent RNAPII transcription downstream of the poly(A) sites (PAS) ( Figure 7C). We plotted the mean  plaNET-seq signal anchored on PAS sites to examine the effect of low temperature on PAS-associated RNAPII stalling. As expected, samples taken before the treatment (22 • C) and after 12 h at 4 • C showed that RNAPII stalled downstream of the PAS ( Figure 7C). Surprisingly, the peak of RNAPII stalled downstream of the PAS was abolished after 3 h at 4 • C, suggesting a major change in transcription dynamics associated with termination ( Figure 7C). RNAPII complexes transcribe beyond the PAS, representing the zone of transcription termination ( Figure 7D, upper panel). At control conditions (22 • C), we detected a median read-through distance of 497 bp ( Figure 7D). This distance was significantly decreased at 3 h 4 • C (median 462 bp, Figure 7D). However, at 12 h 4 • C, we detected a slightly increased read-through distance (median 524 bp, Figure 7D). Thus, genome-wide distribution of RNAPII such as PAS-associated stalling and read-through distance were transiently altered by low temperature.

plaNET-seq reveals novel transcription units near annotated genes
Here, we have used NET-seq to study how nascent RNAPII transcription adjusts to low temperature in Arabidopsis.
Our data detected numerous novel transcripts adjacent and antisense to coding sequences. We identified divergent transcription (DNC) from promoter NDRs in Arabidopsis, although at a limited number of genes ( Figure 2) compared to other eukaryotes (15,36  mRNA) and a well-defined NDR with well-positioned -1 and +1 nucleosomes ( Figure 2). However, highly expressed genes in Arabidopsis exist without evidence for DNC originating from their promoter NDR. We confirm the repressive effect of nuclear RNA degradation on the detection of DNC. Future studies will be required to elucidate the function of DNC, and the molecular mechanisms that direct RNAPII more strictly into the direction of mRNA transcription at shared promoter NDRs in Arabidopsis com-pared to metazoans. Arabidopsis protein-coding genes also show extensive antisense initiation from promoter proximal exon-intron boundaries (i.e. CAS; Figure 3), a common form of antisense transcription in human cells (13,19). In human and plants, promoter-proximal introns regulate gene expression and include many cis-elements for transcription factor binding (47,48), which may explain the favored site of initiation for CAS. A focused functional dissection of CAS is currently lacking, however CAS tran-scription may shape the chromatin environment of the corresponding sense promoter as suggested in yeast and human (13,19,49). The casTSSs overlapped frequently with chromatin states which correspond to the transition zone for RNAPII between initiation and productive elongation (Supplementary Figure S4E), thus highlighting the effects of intragenic chromatin dynamics on TSS selection (29). In summary, our identification of thousands novel transcription units enabled us to detect non-coding transcription linked to gene expression at equivalent positions of transcription units across eukaryotes.

Co-transcriptional splicing may decrease in response to low temperature
Our results reveal an intragenic peak of RNAPII activity located towards the end of exons ( Figure 5H-K). Exons have well-positioned nucleosomes in human (50) and Arabidopsis (51) that may alter RNAPII progression to result in gradual accumulation of nascent RNAPII transcription towards the end of exons. Our data show that the exonic peak is most pronounced after 3 h 4 • C, perhaps reflecting challenges to transcribe through nucleosome-rich regions during initial low temperature exposure. We detected a similar position of the major stalling site within exons close to the 5 SS in DMSO-treated samples, however this exonic RNAPII stalling was abolished when splicing was chemically inhibited by plaB ( Figure 5H). These data argue that the transient peak of RNAPII at the end of exons may reflect the impact of altered splicing kinetics on nascent RNAPII transcription (52). The decreased RNAPII speed nearby 5 SS may be used by the plant for regulation of alternative splicing events (53), a biologically essential mechanism for cold acclimation in Arabidopsis (23). In addition to the exonic peak, we detect a sharp peak of nascent RNAPII transcription at about 25 bp into introns. This intronic peak co-localized with RNAPII decorated by CTD-Ser5P, a post-translational RNAPII modification that has previously been linked to splicing (15). Interestingly, this peak has not been detected in yeast or human cells, arguing for diverse transcription dynamics within gene bodies between eukaryotes. We identified a reduction of nascent RNAPII transcription at the intronic peak in response to splicing inhibition by plaB treatment for long introns (i.e. 250-1000 bp). These data reveal unprecedented insight into the connections between RNAPII stalling, splicing and intron length that shape plant gene expression. We consider it plausible that this intronic RNAPII peak may represent a checkpoint (54) for accurate splicing of long introns, where we imagine the canonical splice sites to be in a greater competition with cryptic intronic splice sites.

Low temperature effects RNAPII stalling at gene boundaries
Our analyses of nascent RNAPII transcription highlights the relevance for mechanisms regulation 'post-initiation', in other words beyond RNAPII recruitment to gene promoters through sequence-specific transcription factors. At the 5 -end of genes, RNAPII stalls at the +1 nucleosome during Arabidopsis gene expression ( Figure 7A). In human, RNAPII complexes stall at a narrow window of 20-60 bp between the TSS and the +1 nucleosome boundary (15). The stalling in metazoans is influenced by the Negative Elongation Factor (NELF) complex that prevents RNAPII complexes to proceed into productive elongation (55). Interestingly, NELF is conspicuously absent in plants, which may reconcile our identification of the +1 nucleosome as the main determinant for promoter-proximal RNAPII stalling. An interesting question is if RNAPII is in a 'paused state' at this stage before entering productive elongation or if the increased transcriptional activity is due to pre-mature termination, something that has been reported in human cells (56). Our data support the idea that RNAPII complexes stalled at promoter-proximal positions may be released to adjust transcription in response to decreased temperature in Arabidopsis ( Figure 7B). A key modulator of temperature-dependent plant gene expression is the histone variant H2A.Z incorporated into the +1 nucleosome (57). It is tempting to speculate that temperature-regulated properties of the +1 nucleosome contribute to temperatureinduced expression changes of plant genes by effects on promoter-proximal RNAPII stalling. At the 3 -end of genes we find a transient chilling-induced contraction of transcription units ( Figure 7C and D). We calculate the read-through distance in Arabidopsis to a median length of 497 bp ( Figure 7D). This can be compared to the median read-through distance in S. cerevisiae (200 bp) (58) and human cells (3300 bp) (20). The difference in read-through distance may be connected to the level of genome compaction; both Arabidopsis and S. cerevisiae have gene-denser genomes compared to humans. Genedense genomes increase the probability of RNAPII collisions by read-through transcription with harmful consequences for genome stability (59,60). We have not failed to notice that the transient effects on RNAPII read-through distance during low temperature exposure could be consistent with changes in liquid phase viscosity implicated in Arabidopsis 3 -end formation (61). Perhaps, our 3 h 4 • C time-point captures cells during a metabolic adjustment of nuclear liquid environments including those promoting 3end formation.
In conclusion, the temperature-induced genome-wide adaptions required to maintain cellular functions provide insight into molecular alterations that promote organismal fitness during environmental change. Our work identifies key parameters of nascent RNAPII transcription that control the transcriptional cold-response in Arabidopsis and possibly other eukaryotes.

DATA AVAILABILITY
The scripts required to reproduce all results and figures are available at GitHub: [https://github.com/Maxim-Ivanov/ Kindgren et al 2019]. plaNET-seq data are available at NCBI GEO database with accession code GSE131733.