-
PDF
- Split View
-
Views
-
Cite
Cite
Vivek Kumar Raxwal, Sourav Ghosh, Somya Singh, Surekha Katiyar-Agarwal, Shailendra Goel, Arun Jagannath, Amar Kumar, Vinod Scaria, Manu Agarwal, Abiotic stress-mediated modulation of the chromatin landscape in Arabidopsis thaliana, Journal of Experimental Botany, Volume 71, Issue 17, 17 August 2020, Pages 5280–5293, https://doi.org/10.1093/jxb/eraa286
- Share Icon Share
Abstract
Limited information is available on abiotic stress-mediated alterations of chromatin conformation influencing gene expression in plants. In order to characterize the effect of abiotic stresses on changes in chromatin conformation, we employed FAIRE-seq (formaldehyde-assisted isolation of regulatory element sequencing) and DNase-seq to isolate accessible regions of chromatin from Arabidopsis thaliana seedlings exposed to either heat, cold, salt, or drought stress. Approximately 25% of regions in the Arabidopsis genome were captured as open chromatin, the majority of which included promoters and exons. A large proportion of chromatin regions apparently did not change their conformation in response to any of the four stresses. Digital footprints present within these regions had differential enrichment of motifs for binding of 43 different transcription factors. Further, in contrast to drought and salt stress, both high and low temperature treatments resulted in increased accessibility of the chromatin. Also, pseudogenes attained increased chromatin accessibility in response to cold and drought stresses. The highly accessible and inaccessible chromatin regions of seedlings exposed to drought stress correlated with the Ser/Thr protein kinases (MLK1 and MLK2)-mediated reduction and increase in H3 phosphorylation (H3T3Ph), respectively. The presented results provide a deeper understanding of abiotic stress-mediated chromatin modulation in plants.
Introduction
Optimal environmental conditions are vital for the fitness and survival of every organism. For plants, abrupt changes in environmental conditions such as temperature fluctuations, disturbed water availability, and soil salinity not only affect growth in the short term, but also impact their geographical distribution in the long term (Sakai and Weiser, 1973; Pigott and Huntley, 1981; Pigott and Pigott, 1993; Normand et al., 2009). In extreme cases, abiotic stresses can reduce the crop yield by >50% (Wang et al., 2003), thereby causing distress to farmers on one hand and challenging food security on the other. Plants, being sessile, exhibit multiple ways to adapt to the perturbations in the environmental conditions, and encounter the changes in quality and quantity of external stimuli by modulating their morphological, physiological, biochemical, and molecular processes (Haak et al., 2017).
The adaptive responses to stress are mainly governed by transcriptional changes in expression of genes (Hirayama and Shinozaki, 2010), which is directly dependent on accessibility of chromatin to the transcriptional machinery. Open chromatin regions (OCRs) constitute nucleosome-depleted regions of chromatin and contain cis-regulatory elements (CREs) which provide a binding platform for transcription factors (TFs) and other regulatory proteins (ENCODE Project Consortium, 2012; Roadmap Epigenomics Consortium, 2015). Chromatin of promoters and enhancers often exists in an open conformation to facilitate transcriptional regulation (ENCODE Project Consortium, 2012; Andersson et al., 2014). Determinants of nucleosome depletion also include epigenetic marks such as DNA methylation and histone modifications (ENCODE Project Consortium, 2012; Roadmap Epigenomics Consortium, 2015). Fine tuning of gene expression during developmental stages or on exposure to external stimuli is linked to differential epigenetic marks (Barrero et al., 2010; Hong et al., 2011; Trynka et al., 2013; Zhu et al., 2013; Roadmap Epigenomics Consortium, 2015). Although some information is available to associate genome-wide epigenetic changes with stress response, a direct measurement of the chromatin accessibility on a genomic scale in response to stress and its impact on gene expression is still lacking in plants.
Multiple approaches such as DNase I hypersensitive assay, MNase (micrococcal nuclease) and restriction enzyme-based assays, transposon accessibility-based assays (ATAC-seq), formaldehyde-assisted isolation of regulatory elements (FAIRE), and Sono-seq have been employed to identify genome-wide changes in chromatin accessibility (Gross and Garrard, 1988; Crawford et al., 2004; Giresi et al., 2007; Schones et al., 2008; Auerbach et al., 2009). In human cell types, genome-wide identification of open chromatin and its correlation with transcript abundance was performed within the framework of the ENCODE consortium (Thurman et al., 2012). Open chromatin was identified during development and in response to external stimuli in Arabidopsis and rice (Zhang et al., 2012a, b; Sullivan et al., 2014). In tomato, changes in the chromatin configuration were captured during the process of fruit ripening (Qiu et al., 2016). In maize, open chromatin regions co-localized with patterns of meiotic recombination and genetic variations associated with complex agronomic traits (Rodgers-Melnick et al., 2016).
The core methodologies used for identification of open chromatin in these studies have individual merits and demerits, resulting in a method-specific bias, and, therefore, portions of accessible chromatin can be identified more precisely by utilizing a combination of multiple methods and increased sequencing depth (Song et al., 2011; Nordström et al., 2019). In this study, two different next-generation sequencing (NGS)-based methods (DNase-seq and FAIRE-seq) were employed to discover the accessible portions of the genome in Arabidopsis seedlings that were exposed to different abiotic stresses.
Using a combinatorial approach involving two different methods for capturing open chromatin and greater sequencing depth, ~25% of the genome was observed to be in an open configuration. The majority of the regions identified by DNase-seq and FAIRE-seq were similar, although technique-specific unique regions were also captured. A substantial portion of the genome was found to be in a constitutively accessible state and had digital footprints (DFPs) of motifs on which condition-specific hosting and eviction of TFs was observed. A smaller portion of the accessible chromatin was found to be hyperactively modulated by abiotic stresses. The accessibility scores of hyperactive regions were largely similar in high and low temperature stress. Hyperactive chromatin regions representing highly accessible portions of the genome were enriched within pseudogenes. Drought stress resulted in chromatin opening of pseudogenes located in pericentric regions, and the hyperactive regions in drought stress were correlated with the changes in the phosphorylation status of histone H3. An understanding of chromatin accessibility-defining gene expression changes during abiotic stress will not only help in improving knowledge of the molecular mechanisms of plant adaptation, but will also open up additional avenues for crop improvement.
Materials and methods
Plant material, growth conditions, and stress treatments
Arabidopsis thaliana ecotype Col-0 was obtained from the Arabidopsis Biological Resource Center, USA (ABRC) and grown at 22±1 °C under a photoperiod of 16 h/8 h on half-strength Murashige and Skoog medium supplemented with 3% sucrose and 0.7% agar. Ten-day-old seedlings were subjected to heat stress at 38 °C for 6 h (Queitsch et al., 2000; Suzuki et al., 2013), cold stress at 4 °C for 24 h (Gilmour et al., 1988; Calixto et al., 2018; Yu et al., 2020), salt stress at 250 mM NaCl for 6 h (Taji et al., 2004; Kamei et al., 2005), and drought stress by transferring seedlings to Whatman paper no. 3 for 1 h (Ding et al., 2012; Lee et al., 2016). The 10-day-old seedlings grown at 22 °C were taken as the control sample. Three independent biological replicates were prepared for all the samples of DNase-seq and FAIRE-seq, except in cold stress where two biological replicates were prepared. A single biological replicate was used for RNA-seq.
Isolation of nuclei
Nuclei were isolated as per the protocol described by Dorweiler et al. (2000) with minor modifications. In brief, 0.5 g of tissue was homogenized and thoroughly mixed with 20 ml of chilled nuclei isolation buffer (NIB; 20 mM Tris pH 7.8, 250 mM sucrose, 5 mM MgCl2, 5 mM KCl, 40% glycerol, 0.25% Triton X-100, 0.1% β-mercaptoethanol). The slurry was first filtered through four layers of Miracloth followed by a 53 µm nylon mesh. Nuclei were pelleted by centrifugation at 6000 g for 20 min at 4 °C followed by washing four times with 20 ml of NIB. Finally, the nuclear pellet was re-suspended in 1 ml of appropriate buffer (1× DNase buffer or sonication buffer).
DNase-seq and FAIRE-seq
DNase-seq was performed essentially as described by He et al. (2012) with minor modifications. In brief, isolated nuclei were digested with varying concentrations (0, 0.1, 0.25, 0.5, and 1 U) of DNase I (Roche, GmBH) for 5 min. The reaction was stopped by adding an equal volume of 50 mM EDTA. DNA was purified by phenol:chloroform:isoamylalcohol and precipitated with two volumes of ethanol, 0.3 M Na-acetate pH 5.2, and 20 µg of glycogen followed by two washes with 70% ethanol. The dried pellet was dissolved in nuclease-free water and treated with RNase A and proteinase K, electrophoresed on an agarose gel followed by gel extraction of 100–400 bp sized fragments. Samples digested with 0.1, 0.25, and 0.5 U of DNase I were pooled together and processed for library construction.
For FAIRE-seq, 10-day-old stress-treated and control seedlings were cross-linked with 1% formaldehyde as described by Kaufmann et al. (2010). FAIRE was performed as per the described protocol (Giresi et al., 2007) with minor modifications. In brief, nuclei isolated from 0.5 g of cross-linked tissue were suspended in sonication buffer (Kaufmann et al., 2010) followed by sonication to obtain 200–500 bp sized DNA fragments. Sonicated DNA fragments were further purified by phenol:chloroform and precipitated with two volumes of ethanol, 0.3 M Na-acetate, and 20 µg of glycogen followed by washing the DNA pellet twice with 70% ethanol and dissolving it in nuclease-free water. DNA was treated with RNase A and proteinase K followed by reverse cross-linking at 65 °C for 16 h. DNA was further purified using a PCR clean-up kit (Qiagen, USA) following the manufacturer’s instructions. For both FAIRE-seq and DNase-seq, 100 ng of purified DNA was employed for preparation of Illumina-compatible sequencing libraries as per the manufacturer’s instructions (NEB, USA) and sequenced for 50 bp single ends on an Illumina Hi-seq 2000 platform.
Data analysis of FAIRE-seq and DNase-seq reads
The raw sequencing reads were quality filtered and aligned to the A. thaliana reference genome (TAIR 10) using Bowtie2 with default parameters (Langmead and Salzberg, 2012). Aligned reads (BAM format) were further sorted and indexed using Samtools (Li et al., 2009). These processed BAM files for FAIRE-seq and DNase-seq were used to identify open chromatin peaks using MACS2 (Zhang et al., 2008) with default parameters except the no-model option. Annotations of open chromatin peaks were carried out using Homer (Heinz et al., 2010). Bedtools was used to perform analyses such as merging/intersecting bed files and finding overlaps between bed files (Quinlan and Hall, 2010). The chromatin accessibility score (CAS) was calculated by merging non-redundant DNase-hypersensitive sites (DHSs) and FAIRE-identified regions (FIRs) to generate a superset of 109 176 OCRs, followed by mapping of the merged DNase- and FAIRE-seq reads. The number of mapped reads in OCRs were depth- and tag-normalized. Finally, the CAS value was calculated by dividing the difference of the value between the stress and control sample by the sum total of control and stress-normalized values.
RNA-seq and microarray data analysis
For RNA-seq, total RNA was extracted from 100 mg of tissue using the Plant RNA extraction kit (Qiagen, USA), and the transcriptome library was prepared using the TruSeq RNA sample preparation kit (Illumina Inc., USA). Libraries were sequenced for 50 bp single ends on an Illumina HiSeq 2000. The raw reads obtained from sequencing run were aligned to the A. thaliana reference genome (TAIR 10) using the splice junction aligner (Tophat2) with default parameters (Kim et al., 2013). The Tophat output was analyzed by Cufflinks to quantify each transcript (Trapnell et al., 2010). The differential expression of genes was calculated using Cuffdiff (Trapnell et al., 2013). The RNA-seq data were validated by re-analyzing publicly available microarray data sets from AtGenExpress (https://www.arabidopsis.org/portals/expression/microarray/ATGenExpress.jsp) and GSE80114 (Kilian et al., 2007; Das et al., 2016). Data were downloaded for similar stress conditions and re-analyzed with the R package affy (Gautier et al., 2004). The raw microarray readings were converted to expression values using the robust multiarray expression measure (RMA) with default parameters (quantile normalization and background correction).
Digital footprinting and motif analysis
Digital footprinting using DNase-seq data was performed using the Wellington algorithm enabled in pyDNase (Piper et al., 2013). The DFPs were further validated by DNase I cut data using pyDNase script dnase_average_profile.py. The overlap of DFPs between control and stress-treated samples was determined using the R package ChIPpeakanno (Zhu et al., 2010). The enrichment of TF motifs was performed using Homer with the plant motif database (Heinz et al., 2010).
Epigenetic re-modulation of pericentric region in response to drought stress
Data sets published by Wang et al. (2015) under GEO accession number GSE68370 were re-analyzed. In brief, reads were quality-filtered and aligned to the Arabidopsis genome using Bowtie2 with default parameters (Langmead and Salzberg, 2012). A genome-wide coverage for each sample in 1000 bp bins was generated using deepTools (Ramírez et al., 2014). The coverage of normalized reads over SACRs (stress-activated chromatin regions) and SRCRs (stress-repressed chromatin regions) of samples subjected to drought was plotted using deepTools (Ramírez et al., 2014).
Results
Characteristic features of open chromatin regions (OCRs) in response to abiotic stresses
In the present study, two cross-validating methodologies—DNase-seq and FAIRE-seq—were employed to determine the accessible regions of chromatin in Arabidopsis seedlings that were subjected to either cold, heat, drought, or salt stress (Fig. 1A). Approximately 600 million reads from DNase-seq and 500 million reads from FAIRE-seq samples mapped to the Arabidopsis genome, thereby providing genome coverages of 15×–68× in different samples (Supplementary File 1 at the Zenodo Repository, https://zenodo.org/record/3843539#.XuOBJ-fTWUk ). High confidence OCRs [with a 1% false discovery rate (FDR)] were determined with MACS2 (Zhang et al., 2008) and their number ranged between 45 000 and 70 000 in DNase-seq and between 63 000 and 90 000 in FAIRE-seq (Table 1). From here on, individually, DNase-seq- and FAIRE-seq-identified regions are referred to as DHSs and FIRs, respectively. Since both DHSs and FIRs consists of accessible chromatin regions, they are jointly referred to as OCRs in the text. The list of DHSs and FIRs identified by MACS2 in different samples is given in Supplementary Files 2 and 3 at Zenodo, respectively. DHSs discovered from control seedlings were compared with two previously published studies (Zhang et al., 2012b; Sullivan et al., 2014). An overlap of 44% (Fisher’s exact test, P-value <8.19e–260) and 58% (Fisher’s exact test, P-value <1.007e–308) with DHSs identified by Zhang et al. (2012b) and Sullivan et al. (2014), respectively, was observed (Supplementary Fig. S1 at Zenodo). Even though the control seedlings used in the current study and previous studies were at different developmental stages, a reasonably high degree of overlap in DHSs indicates that a substantial part of the genome is constitutively accessible throughout seedling development in Arabidopsis.
Samples . | FAIRE-seq . | Genome coverage (%) . | Unique FIRs . | % of genomic bases covered by FIRs . | DNase-seq . | Genome coverage (%) . | Unique DHSs . | % of genomic bases covered by DHSs . |
---|---|---|---|---|---|---|---|---|
Control | 72 789 | 12.95 | 103 990 | 24.66% | 70 807 | 15.92 | 84,865 | 21.32% |
Heat | 85 366 | 16.78 | 57 736 | 13.18 | ||||
Cold | 63 060 | 10.44 | 64 324 | 12.67 | ||||
Salt | 90 465 | 19.42 | 55 784 | 13.58 | ||||
Drought | 92 034 | 19.56 | 45 737 | 9.36 |
Samples . | FAIRE-seq . | Genome coverage (%) . | Unique FIRs . | % of genomic bases covered by FIRs . | DNase-seq . | Genome coverage (%) . | Unique DHSs . | % of genomic bases covered by DHSs . |
---|---|---|---|---|---|---|---|---|
Control | 72 789 | 12.95 | 103 990 | 24.66% | 70 807 | 15.92 | 84,865 | 21.32% |
Heat | 85 366 | 16.78 | 57 736 | 13.18 | ||||
Cold | 63 060 | 10.44 | 64 324 | 12.67 | ||||
Salt | 90 465 | 19.42 | 55 784 | 13.58 | ||||
Drought | 92 034 | 19.56 | 45 737 | 9.36 |
Samples . | FAIRE-seq . | Genome coverage (%) . | Unique FIRs . | % of genomic bases covered by FIRs . | DNase-seq . | Genome coverage (%) . | Unique DHSs . | % of genomic bases covered by DHSs . |
---|---|---|---|---|---|---|---|---|
Control | 72 789 | 12.95 | 103 990 | 24.66% | 70 807 | 15.92 | 84,865 | 21.32% |
Heat | 85 366 | 16.78 | 57 736 | 13.18 | ||||
Cold | 63 060 | 10.44 | 64 324 | 12.67 | ||||
Salt | 90 465 | 19.42 | 55 784 | 13.58 | ||||
Drought | 92 034 | 19.56 | 45 737 | 9.36 |
Samples . | FAIRE-seq . | Genome coverage (%) . | Unique FIRs . | % of genomic bases covered by FIRs . | DNase-seq . | Genome coverage (%) . | Unique DHSs . | % of genomic bases covered by DHSs . |
---|---|---|---|---|---|---|---|---|
Control | 72 789 | 12.95 | 103 990 | 24.66% | 70 807 | 15.92 | 84,865 | 21.32% |
Heat | 85 366 | 16.78 | 57 736 | 13.18 | ||||
Cold | 63 060 | 10.44 | 64 324 | 12.67 | ||||
Salt | 90 465 | 19.42 | 55 784 | 13.58 | ||||
Drought | 92 034 | 19.56 | 45 737 | 9.36 |
![Properties of DHSs and FIRs. (A) Schematic representation of the experimental plan. Nuclei were extracted from 10-d-old Arabidopsis seedlings for DNase-seq and from seedlings cross-linked with formaldehyde for FAIRE-seq. Total RNA was isolated followed by mRNA enrichment for RNA-seq. (B) A bar plot representing the percentage distribution of OCRs, identified by FAIRE-seq (FIRs) and DNase-seq (DHSs), in various gene elements such as the promoter (1000 bp upstream of the TSS), 5'-UTRs, exons, introns, transcription termination site (TTS), 3'-UTRs, and the intergenic region. The percentage distribution of each of these gene elements in the Arabidopsis genome (TAIR10) was also plotted (Genome). (C) A box plot representing the length distribution of DHSs (in brown) and FIRs (in purple) in various elements of the genes. (D) A box plot representing expression of genes [log10(FPKM)] whose various structual elements fall in OCRs. (B)–(D) were plotted from non-redundant OCRs obtained after merging OCRs from control and stress-treated samples.](https://oup.silverchair-cdn.com/oup/backfile/Content_public/Journal/jxb/71/17/10.1093_jxb_eraa286/1/m_eraa286f0001.jpeg?Expires=1747896251&Signature=Ali~VGhjszN2UQKd2pNh72Kzu8xBTddDpUjPiibUtFkRJwujaRAtkcmznSCbx4lrf~Ch3tlOJ0~lEVqd1z59gO-JakbCM92Or7m8STDFV1LkGm-wk4Zq9KZ89G75TYHnm4xOnH3jb4Qgz7fEu2zNtTrLwCxSsMi-Tgx-b53u9BJwBdSsi9HI-6IgJIlShSNpONwafCJGbF2J-8oc6L2xUnzExco-SpuJqBHbf5qWG3C~ym7fbLULYn070N1zEZYbMOBMcq0VaDdCsKEikbYZmI1Wrz-ACDhrF9FF-4-Y6InepUNFwiB-ESKUzvaiCLhfsgryBn4I8j~UBOM3khz71w__&Key-Pair-Id=APKAIE5G5CRDK6RD3PGA)
Properties of DHSs and FIRs. (A) Schematic representation of the experimental plan. Nuclei were extracted from 10-d-old Arabidopsis seedlings for DNase-seq and from seedlings cross-linked with formaldehyde for FAIRE-seq. Total RNA was isolated followed by mRNA enrichment for RNA-seq. (B) A bar plot representing the percentage distribution of OCRs, identified by FAIRE-seq (FIRs) and DNase-seq (DHSs), in various gene elements such as the promoter (1000 bp upstream of the TSS), 5'-UTRs, exons, introns, transcription termination site (TTS), 3'-UTRs, and the intergenic region. The percentage distribution of each of these gene elements in the Arabidopsis genome (TAIR10) was also plotted (Genome). (C) A box plot representing the length distribution of DHSs (in brown) and FIRs (in purple) in various elements of the genes. (D) A box plot representing expression of genes [log10(FPKM)] whose various structual elements fall in OCRs. (B)–(D) were plotted from non-redundant OCRs obtained after merging OCRs from control and stress-treated samples.
In order to determine the features of DHSs and FIRs, peaks from control and stress samples were merged to obtain data sets that consisted of 84 865 and 103 990 non-redundant DHSs and FIRs, respectively. The percentage of the genome covered across different samples by DHSs and FIRs varied between 9% and 16% and between 10% and 20%, respectively. Collectively, the non-redundant DHSs and FIRs covered 21.32% and 24.66% of the Arabidopsis genome, respectively (Supplementary File 1 at Zenodo). With the exception in the pericentromeric region, the OCRs were homogenously distributed within and across all the five chromosomes (Supplementary Fig. S2 at Zenodo). Comparisons of the distribution of OCRs in different genic elements indicated that exonic regions were particularly enriched in both DHSs and FIRs (Fig. 1B). Across different samples, 39–50% and 41–49% of all the identified DHSs and FIRs were present within exons (Fig. 1B). Putative promoter regions also had a reasonable number of OCRs, as 20–30% of the DHSs and 22–27% of FIRs were present in the region 1 kb upstream of the transcription start site (TSS). Substantially fewer OCRs were present in the intergenic regions (~5%) and untranslated regions (UTRs; ~1%; Fig. 1B).
Not only the number but also the length of DHSs and FIRs was found to be variable within different elements of the genes and intergenic regions. The merged non-redundant data set of DHSs and FIRs was used to generate box plots for distribution of their lengths. The length of DHSs varied between 99 bp and 4223 bp, and that of FIRs varied between 99 bp and 5708 bp. Within the entire non-redundant sets, the average median length of DHSs and FIRs was 255 bp and 236 bp, respectively (Fig. 1C). The median length of DHSs and FIRs ranged between 211 bp and 275 bp in the genic regions; however, in the intergenic regions, the median length of DHSs (194 bp) and FIRs (178 bp) was significantly smaller (Mann–Whitney test, P-value <2.2e–16; Fig. 1C). With the exception of 3′-UTRs, the median length of DHSs was significantly greater than that of the FIRs (Mann–Whitney test, P-value <0.0001) across all the other gene elements.
To correlate the gene expression with the presence of OCRs in genic elements, non-redundant sets of DHSs and FIRs were used. As a non-redundant set of DHSs/FIRs was derived after merging DHSs/FIRs from all the samples, an average of absolute gene expression values [herein fragments per kilobase of transcript per million mapped reads (FPKM) values] of all the samples was used for deducing the correlation. In agreement with published reports, it was observed that higher gene expression was generally associated with the presence of OCRs in 5'-UTRs (Fig. 1D; Supplementary Fig. S3 at Zenodo). Further, the presence of OCRs in introns and promoters was also invariably associated with an overall higher expression of genes compared with the presence of OCRs in the intergenic regions (Mann–Whitney test; P<6.8e–04). It is important to note that although introns are not particularly enriched with OCRs, however, where OCRs were present within introns, expression of such genes was high (Fig. 1D). The observed correlation between OCRs and gene expression was further validated with the help of microarray data, available at AtGenExpress, and similar results were obtained (Supplementary Fig. S3 at Zenodo).
DNase-seq and FAIRE-seq capture similar genomic regions
To investigate whether DNase-seq preferentially captures certain genomic regions that are not captured by FAIRE-seq, and vice versa, we separated the overlapping OCRs (ovOCRs; genomic regions that were captured by both the approaches) from the non-overlapping regions, followed by cataloguing and studying their properties. OCRs identified by both DNase-seq and FAIRE-seq were considered as overlapping if the distance between the summit points of their peaks was ≤ ±75 bp (Fig. 2A). In the only previously published report on comparison of DNase-seq and FAIRE-seq in human cell lines, nearly 30–40% of OCRs were found to be similar (Song et al., 2011). Here a relatively higher percentage of overlap (35–60%) between DHSs and FIRs (Fig. 2A) was observed. Approximately 35% of ovOCRs were identified in the control sample, while in other samples a higher number of ovOCRs were identified, for example, 42, 52, 52, and 60% of DHSs and FIRs were overlapping in cold, high temperature, drought, and salt stress samples, respectively (Fig. 2A). The non-overlapping DHSs and FIRs were referred as unique DHSs (uDHSs) and unique FIRs (uFIRs), respectively. On analyzing the enrichment of ovOCRs, uDHSs, and uFIRs, it was observed that ovOCRs as well as uOCRs were prevalent in exonic regions (Fig. 2B). The analysis was further extended to search for whether the number of OCRs vary in different exons, and first exons were observed to be more enriched in ovOCRs and uDHSs. This trend was observed in all the samples, except in salt stress where uFIRs were more prevalent in the first exon than uDHSs or ovOCRs (Supplementary Fig. S4 at Zenodo). Additionally, differences in the number of DHSs and FIRs were also observed in 5'-UTRs; for example, uDHSs were present at a higher frequency in the 5'-UTRs than uFIRs and ovOCRs (Fig. 2B). Taken together, the data suggest that although many genomic regions captured by DNase- and FAIRE-seq were similar, technique-specific and condition-specific enrichment of OCRs was observed. Therefore, to generate a comprehensive map of open chromatin, use of more than one approach should be considered.

DNase-seq preferentially identifies 5'-UTRs. (A) A bar plot showing overlap of DHSs with FIRs for controls and samples subjected to stress. Overlap was scored if the peak summit point of DHSs was present within ±75 bp of the FIRs summit point. Overlapping peaks between DHSs and FIRs were designated as ovOCRs, whereas peaks unique to FAIRE-seq and DNase-seq were labeled as uFIRs and uDHSs, respectively. (B) A bar plot showing enrichment of various elements of genes in uFIRs, uDHSs, and ovOCRs. The x-axis shows log2 fold change enrichment of OCRs in various gene elements.
A large proportion of the genome is constitutively accessible
To understand the effect of stress on the chromatin state, two approaches were employed. In the first approach, the DHSs and FIRs that overlapped between control and stress samples were identified. The overlaps were identified on the basis of the distance between the summit points of the peaks. A DHS/FIR was considered to be present in both control and stress as long as the distance between the summit points was ≤ ±75 bp; otherwise, DHSs/FIRs were classified as non-overlapping. This analysis showed that a large number of OCRs overlapped between the control and samples subjected to stress. Nearly 45–67% of DHSs and 57–74 % of FIRs were present in both control and stress samples (Fig. 3A). The maximum number of DHSs (67%) overlapped between control and salt stress samples, whereas control and heat stress samples had the least number (45%) of overlapping DHSs (Fig. 3A). On the other hand, maximum overlaps in FIRs were observed between control and high temperature-stressed samples (74%), while minimum overlaps were seen between control and cold-stressed samples (57%; Fig. 3A). A high degree of overlap in OCRs between control and stressed samples indicated that a large proportion of the chromatin is accessible under both control and stress conditions.

A large proportion of the genome remains open. (A) A bar plot showing the distance between the summit point of OCR peaks in control samples and the summit point of peaks in stress samples for DHSs and FIRs. 0 indicates that the summit point of OCRs in the control is present within ±75 bp of the summit point in stress. (B) The line graph showing the effect of stress on OCRs. A chromatin accessibility score (CAS; see the Materials and methods) was plotted (x-axis) for each OCR in cold, heat, salt, and drought stress. OCRs with a CAS >0.2 were designated as stress-activated chromatin regions (SACRs), whereas OCRs with a CAS of < –0.2 were labeled as stress-repressed chromatin regions (SRCRs). OCRs with a score between +0.2 and –2 were designated as no response open chromatin regions (nrOCRs). (C) Horizontal bar plot showing the percentage of OCRs in no response (CAS –0.2 to 0.2), open (CAS >0.2), or closed (CAS < –0.2) in various stress samples. (D) Venn diagram showing overlap of chromatin regions for categories of nrOCRs (CAS –0.2 to 0.2), SACRs (CAS >0.2), or SRCRs (CAS < –0.2) after exposure to cold, heat, salt, and drought stress.
Although the above approach of finding the overlaps between control and stress provides a reasonable idea about the chromatin state, it only considers the presence or absence of the OCRs in control and stress conditions. Another dimension of the chromatin state can be determined by its ability to be accessible or inaccessible. Therefore, to quantify this ability, a CAS was derived by merging non-redundant DHSs and FIRs to generate a superset of 109 176 OCRs (see the Materials and methods). The CAS indicates the strength of the chromatin accessibility in stress with respect to control. Within the defined numerical values ranging from –1 to +1, increasing positive values of CAS indicate an increased tendency of the chromatin to be open in stress with respect to control conditions. On the contrary, decreasing negative values indicate an increased tendency of the chromatin to close under stress. After plotting the number of OCRs with respect to their CAS, it was observed that the majority of OCRs were around the ‘0’ CAS value, further indicating that stress has little effect on accessibility of these OCRs (Fig. 3B). In samples of temperature treatments, the peak of the bell-shaped curve was in the positive range (CAS >0.2), thereby indicating that chromatin had a greater tendency to open under cold and heat stress. Under salt stress, however, a substantial number of OCRs had a tendency to close, as indicated by the presence of the peak in the negative range (close to a CAS value of –0.2). Under drought stress, the peak of the curve was present between CAS –0.2 and 0.2, suggesting that the majority of the regions do not change their accessibility (Fig. 3B). The list of OCRs with their respective CAS is presented in Supplementary File 4 at Zenodo).
The OCRs with a CAS from 0 to ±0.2 had a lower propensity for opening or closing under stress and therefore these OCRs were considered as non-responsive to abiotic stresses (referred as nrOCRs). Their percentage varied from 50% to 83%, with the highest number being present in drought stress (83%) and the lowest in salt stress (50%; Fig. 3C). A relatively high percentage of nrOCRs further gives credence to our observation that under stress, the majority of the chromatin is unaffected and remains constitutively accessible for binding of regulatory proteins. Gene Ontology (GO) analysis of nrOCRs reveals that biological processes such as protein modification and post-translational modification were enriched significantly in salt stress. None of the biological processes was enriched in other stresses (Supplementary File 5 at Zenodo).
To identify chromatin regions that open or close on imposition of stress, OCRs having a CAS score greater or less than +0.2 and –0.2 were considered. OCRs having a CAS score of >0.2 were those which open during stress and are, therefore, designated as stress-activated CRs (SACRs), whereas those with CAS scores less than –0.2 close during stress and are referred as stress-repressed CRs (SRCRs). The percentage of SACRs varied from 1% to 37%, with heat stress having the greatest number and salt stress having the least (Fig. 3C). GO analysis of SACRs revealed enrichment of stress-associated and developmental processes in response to heat, cold, and drought stress. Under salt stress, biological processes such as translation, oxidation, phosphorylation, and respiratory electron transport chain were enriched. The percentage of SRCRs varied from 6% to 48% ,with the highest in salt stress and least in cold stress. The GO analysis of SRCRs revealed enrichment of both stress- and development-related GO terms in cold, salt, and drought stress. In heat stress, GO terms of cell wall modification and organization were enriched. The enriched gene ontological terms are listed in Supplementary File 5 at Zenodo.
Plants undergo a major change in gene expression to tackle abiotic stresses. Some of the changes in gene expression are stress specific, while others occur irrespective of the imposed stress. Therefore, overlapping and unique accessible chromatin regions defining the gene expression changes were identified from nrOCR, SACR, and SRCR subsets. This analysis revealed that ~30% of OCRs had a similar response to all the stresses, while the remaining OCRs had stress-specific responses. In the nrOCR category, 26 935 genomic regions were open in the control and all the stress conditions (Fig. 3D). GO analysis of genes associated with the nrOCRs revealed enrichment of processes such as post-embryonic development, protein amino acid phosphorylation, post-translational protein modification, and phosphorylation. Further, 2783 regions of chromatin, common to all the stresses and attaining a closed conformation irrespective of the imposed stress (SRCRs) were identified (Fig. 3D). Biological processes such as post-embryonic development, post-embryonic morphogenesis, and secondary metabolic process were enriched in this category. In contrast to a higher number of nrOCRs and SRCRs, only 281 regions of chromatin were identified whose conformation switched from closed to open by imposition of any stress (Fig. 3D). GO analysis revealed that process related to translation was significantly enriched (P-value <e–10) in SACRs. The GO enrichment for each class of OCRs is provided in Supplementary File 6 at Zenodo. Whereas nrOCRs and SRCRs common to all the stresses were distributed in exons, promoters, and TTSs, SACRs were mainly present in the promoters.
Non-responsive OCRs have digital footprints for binding of regulatory proteins
A large proportion of OCRs comprise constitutively open chromatin regions (termed nrOCRs) and represent regions on which regulatory proteins are hosted and evicted conditionally (Fig. 3A, B). In order to decipher the motifs within the nrOCRs and indirectly the TFs binding to them, DFPs were identified in both the control and stress samples, utilizing the Wellington algorithm at an FDR of <0.01 (Piper et al., 2013). Reduced numbers of DNase I cutting sites were observed at the center of DFPs, in a 100 bp window encompassing DFPs, thereby validating that high confidence DFPs were identified (Fig. 4A). The list of DFPs present in control and stress conditions is presented in Supplementary File 7 at Zenodo.

Dynamics of TF binding in nrOCRs upon stress exposure. (A) A line plot showing DNase I activity, over digital footprints (DFPs) calculated as the count of the first base of the sequencing reads (5') on the sense (red) and antisense (blue) strands. Average DNase I activity is calculated after extending the length of DFPs to ±100 bp in nrOCRs of cold, heat, salt, and drought stress conditions. (B) A Venn diagram showing the overlap of DFPs in nrOCRs between control and stress samples. A heat map was constructed to represent the TF binding sites of the evicted (C) and newly formed (D) DFPs after exposure to different abiotic stresses.
To identify the sites for hosting and eviction of regulatory proteins, DFP overlaps were assessed in individual pairs of control and stress samples (Fig. 4B). DFPs that are present in both control and stress conditions are those whose cognate regulatory protein(s) remain bound to the DNA in control conditions as well as on exposure to stress. DFPs that are identified only from the control samples include footprints from which regulatory proteins are evicted on exposure to stress. On the other hand, stress-specific DFPs include genomic regions which are occupied by TFs/regulatory proteins only on exposure to stress. Our results in seedlings exposed to high temperature, cold, and drought stress indicate that a large number of regulatory proteins are evicted from their motifs; for example, on imposition of high temperature and cold stress, regulatory proteins are evicted from >10 000 positions. Similarly, drought stress resulted in eviction from 7538 genomic positions. In contrast, only 1094 genomic positions lost footprints of binding factors upon exposure to salt stress. Analysis of motifs from the evicted TF-binding sites revealed significant enrichment (P-value <e–10) of 32 different motifs (Fig. 4C). Binding site of ESE3 (ethylene and salt inducible), an AP2-type TF, was enriched in both low and high temperature stress. In drought stress, the binding motif FUF1 (FYF up-regulating 321 factor 1), a single AP2-domain containing TF, was highly enriched whereas in salt stress, the motif SRS7 (Shi-related sequence 7), a ring finger-type zinc finger protein, was enriched.
Assessing the overlaps also led to identification of DFPs to which regulatory proteins bind only after exposure to stress. The least number of such DFPs (106) were identified in cold stress, while the maximum number were discovered in salt stress. None of the known motifs was enriched in the cold-specific DFPs. Within the 1297 high temperature-specific DFPs, binding sites of TFs having high similarity to the C2H2-type zinc finger TF were enriched. Additionally, binding motifs for CAMTA3 as well as the MYB family of TFs were enriched in heat stress-formed DFPs (Fig. 4D). On exposure to salt stress, 1690 DFPs were formed, which were enriched in binding motifs of BOS1 (Botrytis Susceptible 1), a MYB TF. The motif for binding of ARF3 (auxin response factor 3) was also enriched in salt stress. Similarly, among 813 DFPs formed during drought stress, enrichment of motifs for binding of MYB52, FHY, and MYB3R4 TFs was observed (Fig. 4D). Taken together, these results suggest that constitutively accessible chromatin regions are dynamic, serve as hotbeds of transcriptional activity, and witness consistent hosting and eviction of TFs.
A small fraction of the genome consists of hyperactive regions
To identify the OCRs that had the strongest tendency of opening and closing under abiotic stresses, the top 1% SACRs and bottom 1% SRCRs were each shortlisted. These chromatin regions are henceforth referred to as hyperactive SACRs (Ha-SACRs) and hyperactive SRCRs (Ha-SRCRs). The enrichment of Ha-SACRs within different gene elements varied with the imposed stress condition; for example, in seedlings that were exposed to either high temperature or cold stress, Ha-SACRs were substantially enriched within 5'-UTRs and promoters, while under drought stress, Ha-SACRs were more enriched in the intergenic and exonic regions than 5'-UTRs and promoters. Under salt stress, Ha-SACRs were only enriched in the intergenic regions. Interestingly, exons were enriched with Ha-SACRs only in drought stress, whereas in other stresses exons were largely devoid of Ha-SACRs. Similarly, 3'-UTRs were enriched with Ha-SACRs only in cold stress (Fig. 5A). The presence of Ha-SRCRs in different gene elements also varied with the stress; for example, in salt and drought stress, intergenic regions were particularly enriched in Ha-SRCRs. Ha-SRCRs were also enriched in intergenic regions in high temperature stress; however, in samples treated with cold stress, intergenic regions were completely devoid of Ha-SRCRs. Further, we correlated the expression profile of genes associated with HaCRs. To achieve this task, log2 fold change was calculated for genes associated with Ha-SACRs and Ha-SRCRs in stress samples with respect to the control. Comparison between Ha-SACRs and Ha-SRCRs was drawn by generating box plots of the log2 fold change in expression of genes having these regions. Overall, with the exception of salt stress, the presence of Ha-SACRs was significantly associated with a higher gene expression than the presence of Ha-SRCRs. In contrast, during salt stress, the presence of Ha-SACRs (with respect to the presence of Ha-SRCRs) was significantly associated with lower gene expression, although the significance levels were much lower than observed in other stresses (Fig. 5B). Since only a single replicate of RNA-seq was utilized to draw the above conclusions, we also used publicly available gene expression data sets (see the Materials and methods) to validate these findings. Results for association of Ha-SACRs with higher gene expression in high temperature, low temperature, and drought stress were similar to the observations made by us; however, in the case of salt stress, no significant differences were observed between average gene expression due to the presence of Ha-SACRs and Ha-SRCRs, respectively (Supplementary Fig. S5A–D at Zenodo). To find out the reason behind this discrepancy, we performed an additional analysis with multiple microarray data sets (250 mM NaCl for 3 h, 250 mM NaCl for 27 h, 500 mM NaCl for 3 h, and 500 mM NaCl for 27 h) and observed that in the RNA-seq data generated by us, 327 and 1013 genes are associated with Ha-SACRs and Ha-SRCRs, respectively. However, the microarray data from GSE80114 (Arabidopsis array for Agilent) had fewer genes/probe sets associated with Ha-SACRs and Ha-SRCRs (147 and 910, respectively). Therefore, for a reliable one to one comparison, and due to the absence of another publicly available RNA-seq data set (similar to the stress imposed by us), we carried out specific down sampling of our data and re-analyzed them by utilizing the lesser number of genes (similar to the probe sets present in the microarray). Association of Ha-SACRs and Ha-SRCRs showed that when a smaller data set was used, the association of Ha-SACRs with lower gene expression as compared with Ha-SRCRs was no longer significant (Supplementary Fig. S5E at Zenodo). The significant difference observed by us is due to the contribution from additional genes in the RNA-seq data, whose probe sets are absent in microarray data sets.

Response of the hyperactive OCRs. (A) Enrichment of various gene elements associated with Ha-SACRs and Ha-SRCRs in different stress samples. (B) A box plot showing relative expression of genes (log2 fold change) corresponding to Ha-SACRs (green) and Ha-SRCRs (purple). (C) A heat map showing hierarchical clustering of Ha-SACRs and Ha-SRCRs based on CAS scores indicating dynamics of highly responsive chromatin regions under different stresses.
In order to deduce the behavior of Ha-SACRs and Ha-SRCRs across all the four stresses, stress-wise hierarchical clustering was performed, and heat maps were generated on the basis of CASs. We found that Ha-SACRs and Ha-SRCRs identified from samples subjected to high temperature and cold stress clustered together, whereas those identified from salt- and drought stress-treated samples clustered in a separate clade, indicating that the majority of hyperactive CRs behaved similarly with respect to their accessibility in response to fluctuating temperatures (Fig. 5C). Similarly, the majority of Ha-SACRs and Ha-SRCRs identified from drought and salt stress samples displayed comparable accessibility. Taken together, we identified hyperactive regions of chromatin whose accessibility changes due to exposure to abiotic stresses. We also observed that temperature fluctuations caused similar changes in accessibility of chromatin. Furthermore, changes in chromatin accessibility due to drought as well as salt stress were similar. We also observed regions of the chromatin which have either increased (CAS >0.2) or decreased (CAS < –0.2) accessibility in all of the stresses studied. A total of 281 regions of the chromatin were found to have increased accessibility after exposure to any of the stresses studied. These regions were associated with 125 genes. Similarly, 1159 regions of the Arabidopsis genome showed decreased accessibility of the chromatin on exposure to any of the stresses studied. These regions were associated with 1098 genes. The list of these genomic regions with coordinate information and annotation details has been presented in Supplementary File 8 at Zenodo.
Epigenetic re-modulation of the pericentric region in response to drought stress
Our analysis revealed the presence of 16 707 OCRs in 4221 out of the 4827 annotated pseudogenes of the Arabidopsis genome (TAIR 10). To identify stress-responsive OCRs in the pseudogenes, nrOCRs were removed to derive a set of 8312 OCRs that were found to be associated with 3418 pseudogenes. The maximum number of SACRs and SRCRs (i.e. having CAS >0.2 or < –0.2) within pseudogenes were identified in heat stress (5249) followed by cold (3510), drought (2630), and salt (2469; Fig. 6A, B). More SACRs than SRCRs were observed in cold and drought, which was in contrast to salt and heat stress, where SRCRs had a larger contribution. Hierarchical clustering of these 8312 OCRs enriched in pseudogenes revealed that a majority of them have similar tendencies of opening and closing in heat, cold, and drought stress (Fig. 6C).

Distribution of OCRs in pseudogenes during different abiotic stresses and epigenetic re-modulation of OCRs in drought stress. (A) A bar chart showing the distribution of OCRs associated with pseudogenes in samples subjected to stress. A total of 16 707 OCRs were identified in 4221 annotated pseudogenes of the Arabidopsis genome, out of which 8312 OCRs were responsive to stress. (B) The percentage of stress-responsive OCRs in pseudogenes is plotted as a horizontal bar graph where OCRs with a CAS >0.2 were designated as SACRs, while SRCRs were OCRs with a CAS < –0.2 in various stress samples. (C) The hierarchical clustering of pseudogenes-associated OCRs was performed on the basis of CAS in different stresses. (D) Heat map of histone marks within ±3 kb region surrounding the Ha-OCRs in drought stress. The start position of the OCRs is represented as S and the end position is represented as E.
Further, when we looked at the HaCRs, it was observed that in drought stress, >30% of the Ha-SACRs were present within the pseudogenes, whereas only 3% of Ha-SRCRs could be linked to them. Interestingly, the majority of the pseudogenes having Ha-SACRs were localized close to centromeric regions (Supplementary Fig. S6 at Zenodo). Further, on mapping the drought stress-activated and repressed chromatin regions on all five chromosomes, it was observed that the majority of the drought-activated chromatin regions were present near the centromeres, whereas the drought stress-repressed CRs were mainly present in chromosome regions other than pericentromeric regions (Supplementary Fig. S6 at Zenodo). Our result suggests that drought stress triggers opening of chromatin regions of the pseudogenes located in the pericentromeric regions.
It is likely that under drought stress, the activation and repression of chromatin regions is epigenetically regulated, and there is recent evidence that suggests so. Wang et al. (2015) reported that phosphorylated H3 (H3T3Ph), a mark of heterochromatin, is modulated by polyethylene glycol (PEG)-induced drought stress and is dependent on MUT9-like kinases (MLK1 and MLK2). To understand the relationship of H3T3Ph with OCRs, the metaprofile of H3T3Ph marks was mapped on the top (Ha-SACRs) and bottom (Ha-SRCRs) 1% of drought-modulated OCRs. In a 3 kb region surrounding the drought-activated chromatin regions, a marked reduction in H3T3 phosphorylation was observed. Inversely, the H3T3 phosphorylation increased in the bottom 1% of drought-repressed OCRs. The pattern of the euchromatic histone marks such as H3K4me1 and H3K4me3 was opposite to that observed for H3T3Ph; that is, the frequency of these marks was higher in the top 1% drought-activated OCRs and lower in the bottom 1% drought-repressed OCRs (Fig. 6D). As can be seen from the height and slope of peaks, loss of MLK1 and MLK2 (the mlk1mlk2 double mutant) resulted not only in reduction of H3T3Ph marks, but also in spreading of marks within the 3 kb region surrounding the OCR peaks. Taken together, these results suggest that the OCRs in drought stress have epigenetic marks that correlate with the accessible and inaccessible chromatin.
Discussion
Imposition of abiotic stress leads to massive reprogramming of gene expression that is modulated by changes in chromatin accessibility, DNA methylation, and histone modifications. The extent of chromatin accessibility has been captured previously at specific stages of plant development (Zhang et al., 2012a, b; Qiu et al., 2016; Rodgers-Melnick et al., 2016). However, to date, a comprehensive study had not been undertaken for assessing the accessible chromatin regions in response to abiotic stresses. In this study, we demonstrate that changes in the environmental conditions (low and high temperature, salinity, and water deficit) reshape the landscape of chromatin architecture in Arabidopsis and lead to eviction and binding of specific families of TFs.
Two different approaches (DNase-seq and FAIRE-seq) were used to capture open chromatin and we were able to identify a larger proportion of the genome (up to 25%) in open conformation than was previously reported (Sullivan et al., 2014). We believe that this was due to the use of two independent approaches for capturing open chromatin. In fact, OCR peaks identified independently by DNase-seq and FAIRE-seq had lower genome coverage than their aggregated non-redundant data set. The accessible regions were largely found to be enriched within exons and promoters. Similar enrichments have also been reported previously in other plants species such as rice, maize, and tomato (Zhang et al., 2012a, b; Qiu et al., 2016; Rodgers-Melnick et al., 2016). CRE-mediated transcriptional regulation is facilitated by increased DNA accessibility at the promoter, thereby altering the gene expression. Reduced methylation at the first exon as compared with other regions of the gene body, suggesting differential DNA accessibility of gene body elements, has been observed previously (Brenet et al., 2011). In concordance, we too observed a higher enrichment of DHSs and FIRs in first exons, indicating that regions immediately upstream and downstream of the TSS are equally important for gene expression.
In contrast to the only report on human cell lines where OCRs were identified utilizing both DNase- and FAIRE-seq (Song et al., 2011), we identified a higher percentage of overlapping OCRs, which could be attributed to the differences in the complexity of the two genomes and higher sequencing depth in the current study. This was despite the fact that Song et al. (2011) categorized DHS and FIR peaks as overlapping even if they had a single base pair overlap. In the current study, we adopted a more stringent approach and overlaps were only considered if the distance between the center of the DHS and FIR peaks was within ±75 bp. In spite of substantial similarity, certain regions of the genome were identified exclusively by only one of these methods. This could be attributed to the differencein methods (enzymatic versus sonication) used for identificationof OCRs for DNase-seq and FAIRE-seq. Overall a higher number of FIRs were obtained as compared with DHSs, though the length of DHSs was greater than that of FIRs, thus compensating for the overall genome coverage. The number of FIRs identified in the case of salt and drought stress was substantially greater than the number of discovered DHSs in these conditions. Reports suggest that use of crowding agents results in chromatin compaction (Richter et al., 2007). Both high salt concentrations and cellular water deficit mimic the action of crowding agents by disturbing the water balance in a cell. Although we do not have direct experimental evidence, it is likely that salt and drought stress change the accessibility of chromatin to the DNase I, thereby resulting in a lower number of DHSs. On the other hand, the effect of water loss on chromatin can be negated owing to physical force provided by sonicaton during FAIRE-seq.
Approximately 5% of OCRs were discovered in the intergenic regions. Accessible chromatin regions along with overlapping signatures of histone modifications and non-coding RNAs have been used to predict enhancers in Arabidopsis (Zhu et al., 2015; Zhao et al., 2018). Many of the discovered OCRs in the intergenic regions display stress-specific accessibility and therefore can be used to determine stress-specific enhancers. A smaller percentage of OCRs was also discovered in introns, and previous evidence suggests that the introns that are retained during alternative splicing are more likely to have accessible chromatin (Ullah et al., 2018). The data presented in the current study could therefore be used to determine intron retention with respect to accessible chromatin during abiotic stresses. The average length of the OCRs in Arabidopsis is of the size of two nucleosomes, and this is in agreement with the previous observation for the length of OCRs in Bombyx mori (Zhang et al., 2017). The length of OCRs across various elements on the genome was found to be similar between DHSs and FIRs, with the exception of intergenic regions, where the length of OCRs was significantly smaller. Enrichment of uDHSs and uFIRs in different gene elements was similar except in the case of 5'-UTRs which showed enrichment of uDHSs and, contrastingly, depletion of uFIRs. Kumar et al. (2013) analyzed the sensitivity of DNase-seq and FAIRE-seq on the promoter and non-promoter regions and found that at equal sequencing depth DNase-seq is generally more sensitive than FAIRE-seq, and this difference is much more pronounced at the promoter regions. In our case, more sequencing depth was achieved in the majority of DNase-seq samples (Supplementary Table 1 at Zenodo) and therefore we observed more enrichment of uDHSs in 5'-UTRs.
A qualitative comparison of chromatin accessibility in OCRs between controls and samples subjected to stress revealed that the majority of OCRs do not display a change in their accessibility on stress imposition and are in a state poised for transcriptional activity, suggesting that these regions are fertile ground for binding of regulatory proteins. This was confirmed by the discovery of DFPs within the constitutively accessible chromatin. The presence of DFPs specifically in control samples suggested that these motifs were evicted by TFs during stress and were therefore not captured by DNase-seq. Enrichment analysis of the motifs revealed that these motifs present binding sites for at least 32 different TFs. Interestingly, the TFs binding to the highly enriched motifs were involved with hormone homeostasis, most notably of ethylene and auxin. Increasing evidence suggests profound hormonal regulation during abiotic stresses (Peleg and Blumwald, 2011; Verma et al., 2016), and chromatin accessibility data from the current study support this evidence. Motif enrichment from the DFPs that were exclusively present in nrOCRs of stress-treated samples led to the identification of 11 TFs that bind to these motifs specifically under stress. Notably, we identified AtMyb52 as one of the TFs binding to CREs under drought stress, which has previously been shown to enhance drought tolerance (Park et al., 2011).
Peaks of the curves (plotted with respect to the CASs) for cold and high temperature stress were in the positive range, exceeding the cut-off of 0.2, indicating that a larger proportion of chromatin regions were open under these stresses, whereas in the case of drought stress, the peak of the curve was in the no change category (between CAS –0.2 and 0.2), indicating that the majority of chromatin regions were not changing their accessibility in response to drought stress. In salt stress, the peak of the curve was near the CAS value –0.2 indicating that a large proportion of chromatin was closed. The electrostatic repulsion created by negatively charged DNA is a deterrent for compaction of chromatin. Positively charged histone counters this to some extent; however, charge stoichiometry between DNA and histone (1:0.5) makes chromatin overall negatively charged. External application of positively charged ions (in this case imposition of salt stress by NaCl) favors compaction of chromatin (Korolev et al., 2010; Allahverdi et al., 2015). Therefore, the greater tendency of chromatin to close under salt stress (reduction in the CAS) might be due to the effect of Na+ ions on chromatin. Except in salt stress, Ha-SACRs were associated with higher average gene expression as compared with Ha-SRCRs. In salt stress, higher gene expression was found for genes associated with Ha-SRCRs as compared with Ha-SACRs. This discrepancy might be due to the salt stress-induced post-transcriptional stabilization and localization of RNA to the liquid–liquid phase separating bodies such as P-bodies. A subset of RNAs has been shown to be retained in P-bodies and have increased RNA stability upon exposure to salt stress (Steffens et al., 2015). Therefore, compaction of chromatin and at the same time retention of transcripts in P-bodies upon exposure to salt stress are the likely reasons for observing association of Ha-SRCRs with higher gene expression.
The CASs led to identification of chromatin regions that were either highly or poorly accessible (Ha-SACRs and Ha-SRCRs). Recently, bivalent H3K4me3–H3K27me3 marks were shown to be associated with more accessible chromatin regions (Zeng et al., 2019), and it will be interesting to evaluate the presence of the bivalent histone marks in Ha-SACRs and Ha-SRCRs. Abiotic stresses are known to relieve repression of transposable elements and pseudogenes (Capy et al., 2000; Bucher et al., 2012; Makarevitch et al., 2015), and mapping of epigenetic marks on Ha-OCRs shows that Ha-SACRs are enriched with euchromatic H3K4 methylation marks and depleted of heterochromatic H3T3 phosphorylation marks in drought stress. It is likely that in other stresses as well, the OCRs have differential histone modifications that define not only their numbers, but also their abilities to be accessible. Overall, our study provides genome-wide abiotic stress-mediated modulation of accessible chromatin regions and could potentially be used to mine stress-specific enhancers and stress-specific alternatively spliced transcripts from within these regions. Further, this study also lays the ground for the discovery of motifs and their cognate TFs, which can be functionally validated in future studies and used for enhancing tolerance to abiotic stresses in plants.
Data availability
All the sequencing data (FAIRE-seq, DNase-seq, and RNA-seq) have been submitted to the Gene Expression Omnibus (GEO) repository under the accession number GSE145742. All the supplementary figures and tables are deposited at Zenodo under doi: 10.5281/zenodo.3843539 (Raxwal et al., 2020).
Acknowledgments
This research was supported by the Department of Biotechnology (DBT), India, an R&D grant from the University of Delhi, India, the Department of Science and Technology (PURSE grant), India, the Ministry of Education, Youth and Sports of the Czech Republic, the European Regional Development Fund-Project ‘REMAP’ (no. CZ.02.1.01/0.0/0.0/15_003/0000479), and the Czech Science Foundation (16-18578S). VKR is grateful for research fellowships from the Council of Scientific and Industrial Research (CSIR), India. We thank Dr Sridhar Sivasubbu, Dr Rajesh Pandey, and Dr Shamsudheen from the Institute of Genomics and Integrative Biology (IGIB), Delhi, India for helping with the sequencing.
Comments