Jingjing Gao and others, Toward an understanding of the detection and function of R-loops in plants, Journal of Experimental Botany, Volume 72, Issue 18, 30 September 2021, Pages 6110–6122, https://doi.org/10.1093/jxb/erab280
Abstract
Although lagging behind studies in humans and other mammals, studies of R-loops in plants have recently entered an exciting stage in which the roles of R-loops in gene expression, genome stability, epigenomic signatures, and plant development and stress responses are being elucidated. Here, we review the strengths and weaknesses of existing methodologies, which were largely developed for R-loop studies in mammals, and then discuss the potential challenges of applying these methodologies to R-loop studies in plants. We then focus on recent advances in the functional characterization of R-loops in Arabidopsis thaliana and rice. Recent studies in plants indicate that there are coordinated relationships between R-loops and gene expression, and between R-loops and epigenomic signatures that depend, in part, on the types of R-loops involved. Finally, we discuss the emerging roles of R-loops in plants and directions for future research.
Introduction
The R-loop is a three-stranded nucleic acid structure consisting of two open antiparallel DNA strands, one of which forms a DNA–RNA hybrid with a complementary RNA strand present in the milieu while the other remains as single-stranded DNA (ssDNA). Although the mechanisms underlying R-loop formation are still subject to debate, R-loops are generally considered to be formed co-transcriptionally, a process that is influenced by local DNA topology, genomic context, and chromatin structures (Al-Hadid and Yang, 2016; Stolz et al., 2019). R-loops initially drew attention owing to detrimental effects on human genome integrity and stability caused by the accumulation of unscheduled R-loops, which have been linked to human diseases such as cancer, neurodegeneration, and inflammatory diseases (Mannini and Musio, 2011; Sollier and Cimprich, 2015). Currently, the proposed role for R-loops in undermining genome stability is based on observations that R-loop formation results in stalling at DNA replication forks (Tuduri et al., 2009), DNA damage (Brambati et al., 2020), and chromosomal DNA rearrangements and recombination (Gan et al., 2011). Additional studies are needed to confirm that such changes are sufficient to induce genome instability severe enough to generate the mutational load that leads to the diseases mentioned above.
Generally speaking, R-loops are abundant and well scheduled, covering approximately 5–10% of genomic regions in yeast, plants, and mammals (Sanz et al., 2016; Xu et al., 2017; Fang et al., 2019). This suggests that R-loops are well tolerated and that only a small subset of poorly defined unscheduled “toxic” R-loops may contribute to genome instability and disease (Costantino and Koshland, 2015).
Even when R-loops, especially “pathogenic” ones, impede transcription or DNA replication, they are not necessarily detrimental. Most organisms have developed different strategies to reduce or prevent transcription–replication conflicts, from genome organization favoring co-orientation of replication and transcription to specific mechanisms to avoid or resolve such collisions (Garcia-Muse and Aguilera, 2016). The important factors that diminish conflicts are the transcription machinery itself and mRNA-processing proteins, as well as factors that help or facilitate the progression of replication, such as DNA helicases and topoisomerases or chromatin-remodeling complexes (Garcia-Muse and Aguilera, 2016). These factors play roles in, for instance, the phosphorylation/dephosphorylation of transcription or replication complex proteins (Canal et al., 2018), removal of RNA polymerase II (Poli et al., 2016), and mediation of the nuclear RNA-binding ribonucleoprotein particle (Santos-Pereira et al., 2013). Moreover, most organisms have developed a network of cellular pathways that sense and repair DNA lesions caused by aberrant R-loops, known as DNA damage response pathways (Crossley et al., 2019), for example, base excision repair, nucleotide excision repair, mismatch repair, homologous recombination, and non-homologous end joining, allowing cells to repair DNA damage (Chatterjee and Walker, 2017). Therefore, R-loops do not necessarily have detrimental impacts in most organisms and only a small subset of R-loops may become aberrant in some unusual conditions, resulting in genome instability and mutations (Costantino and Koshland, 2015).
As genomic structures, R-loops have been continually evolving with DNA-related biological activities; it is therefore rational to conjecture that, in genomes where they occur, R-loops may have essential functions in DNA-dependent processes. Over time, studies regarding the roles of R-loops in gene regulation, DNA repair, and genome stability have intensified (Santos-Pereira and Aguilera, 2015; Garcia-Muse and Aguilera, 2019; Niehrs and Luke, 2020). While studies regarding the detection and functions of R-loops in mammals have been reviewed repeatedly in the past few years (Allison and Wang, 2019; Crossley et al., 2019; Garcia-Muse and Aguilera, 2019; Gomez-Gonzalez and Aguilera, 2019), a review of R-loops in plants is lacking. Such a review has been eagerly anticipated by the plant research community to clarify the status of R-loop studies, including appropriate detection methodologies, their functional roles, and current and future research directions.
Methodologies for mapping R-loops across the genome
The utilization of appropriate detection methods is critically important in studying R-loops. The methods for R-loop detection used in the past 50 years can be classified into two categories: structure-based methodologies (Milman et al., 1967; Thomas et al., 1976; Boguslawski et al., 1986b; Drolet et al., 1994; Daniels and Lieber, 1995; Yu et al., 2003; Szekvolgyi et al., 2007; Brown et al., 2008; Pohjoismaki et al., 2010; Skourti-Stathaki et al., 2011; Bhatia et al., 2014; Loomis et al., 2014; Skourti-Stathaki et al., 2014; Nadel et al., 2015), and sequence-based methodologies, including DNA immunoprecipitation with quantitative PCR (Skourti-Stathaki et al., 2011) and high-throughput methodologies (Yu et al., 2003; Ginno et al., 2012; Chan et al., 2014; El Hage et al., 2014; Jenjaroenpun et al., 2015; Lim et al., 2015; Sanz et al., 2016; Zeller et al., 2016; Wahba et al., 2016; Chen et al., 2017; Dumelie and Jaffrey, 2017; Xu et al., 2017; Kuznetsov et al., 2018; Fang et al., 2019; Yan et al., 2019; Malig et al., 2020) (Fig. 1).
Fig. 1. Milestone methodologies for R-loop identification in the past 50 years. R-loop structure-based detection methods (upper panel) and sequence-based detection methods (lower panel) are shown. RNA PII, RNA polymerase II.
In the past 8 years, a combination of either immunoprecipitation or non-denaturing bisulfite treatment experiments with high-throughput technologies has revolutionized R-loop studies by enabling genome-wide profiling of R-loops. For example, a global R-loop profiling method that combines DNA:RNA immunoprecipitation (DRIP) with sequencing (known variously as DRIP-seq, RDIP-seq, S1-DRIP-seq, or DRIPc-seq) or chip (DRIP-chip) (Ginno et al., 2012; Chan et al., 2014; Nadel et al., 2015; Sanz et al., 2016; Wahba et al., 2016) and DNA:RNA in vitro enrichment sequencing (DRIVE-seq) was developed in humans (Ginno et al., 2012). Chromatin immunoprecipitation (ChIP)-based methods (Skourti-Stathaki et al., 2011; El Hage et al., 2014; Loomis et al., 2014; Chen et al., 2015; Wahba et al., 2016; Zeller et al., 2016), such as S9.6 ChIP-seq (El Hage et al., 2014), R-ChIP-seq (Chen et al., 2017), and CUT&RUN-related MapR (Yan et al., 2019), have been used for the detection of R-loops in humans and yeast. In addition, the non-denaturing bisulfite footprinting approach was initially applied to prove the existence of R-loops at endogenous immunoglobulin class switch regions in the human epithelial cell line 293/EBNA1 and mouse spleen B cells (Yu et al., 2003); it was then combined with DRIP-seq (bisDRIP-seq) to globally map R-loops in MCF-7 cells (Dumelie and Jaffrey, 2017). A more recent method, non-denaturing bisulfite treatment followed by single-molecule PacBio sequencing, also referred to as single-molecule R-loop footprinting (SMRF-seq), in NTERA-2 cells represents an orthogonal approach for profiling R-loops compared with DRIP-seq (Malig et al., 2020). ssDRIP-seq, a single-stranded DNA ligation-based library preparation technique, was used for genome-wide profiling of R-loops in Arabidopsis thaliana (Xu et al., 2017). At the same time, R-loop-forming sequences (RLFS) were created to permit the computational prediction of R-loops in humans (Kuznetsov et al., 2018).
Weaknesses and strengths of existing strategies for global characterization of R-loops
Three widely used sequence-based methods for R-loop detection are S9.6 antibody-based methods, RNase H-based methods, and non-denaturing bisulfite footprinting methods (Table 1). Because they rely on different principles, each method (as used primarily in mammals) has its own intrinsic strengths and weaknesses (Vanoosthuyse, 2018). Here, we focus on the features of each existing method that bear on its direct use, or its adaptation, for identifying and characterizing R-loops in plants. We pay particular attention to the most recently applied methods, such as MapR in HEK293 and U87T cell lines (Yan et al., 2019), SMRF-seq in NTERA-2 cells (Malig et al., 2020), and BisMapR in E14 mouse embryonic stem cells (Wulfridge and Sarma, 2021, Preprint), in comparison with earlier methods such as S9.6-based methodologies, DRIVE-seq in NTERA-2 cells, R-ChIP in HEK293T cells, HB-GFP in HeLa cells (Bhatia et al., 2014), and bisDRIP-seq in MCF-7 cells. The advantages and disadvantages of each high-throughput technology for mapping R-loops are summarized in Table 1.
Table 1. Summary of high-throughput technologies for mapping R-loops
| Method | Strategy | Organism | Advantages | Disadvantages | References |
|---|---|---|---|---|---|
| DRIVE-seq | Catalytically deficient RNase H1 | Human cell line | Detection of RNase H-sensitive R-loops | Low capture efficiency, in vitro | Ginno et al., 2012 |
| DRIP-seq | S9.6-based detection | Human cell line | Consistent, reproducible, widely adapted | High input material required, low signal-to-noise ratio, low resolution, non-strand-specific, lengthy experiment times, poor sensitivity for dynamic R-loops | Ginno et al., 2012 |
| DRIP-seq | S9.6-based detection | Human and yeast | As above | As above | Lim et al., 2015 |
| DRIP-seq | S9.6-based detection | Caenorhabditis elegans | As above | As above | Zeller et al., 2016 |
| DRIP-chip | S9.6-based detection | Yeast | Detection of R-loops in repetitive regions or in regions of interest | Low resolution, lower throughput relative to sequencing | Chan et al., 2014 |
| S9.6 ChIP-seq | S9.6-based detection | Yeast | Detection of stable or transient R-loops in vivo | Non-strand-specific, relatively low resolution, underestimation of in vivo R-loop binding by trans-acting factors | El Hage et al., 2014 |
| S1-DRIP-seq | S9.6-based detection | Yeast | High resolution | Non-strand-specific | Wahba et al., 2016 |
| DRIPc-seq | S9.6-based detection | Human cell line | Strand-specific, high resolution | Higher amount of starting material compared with DRIP and MapR, potential binding of S9.6 to dsRNA | Sanz et al., 2016 |
| ssDRIP-seq | S9.6-based detection | Arabidopsis thaliana | Strand-specific, fewer steps required for library construction | Low resolution, special library preparation | Xu et al., 2017 |
| DRIP-seq | S9.6-based detection | Rice | Strand-specific, widely applied in other plants | Low resolution, high input material required, lengthy experiment times | Fang et al., 2019 |
| bisDRIP-seq | S9.6-based detection | Human MCF-7 cells | Strand-specific, high resolution, improved specificity by targeting both DNA:RNA hybrids and ssDNA | Excludes specific genomic regions that form RNA–DNA hybrids on both DNA strands | Dumelie and Jaffrey, 2017 |
| R-ChIP-seq | Catalytically deficient RNase H1 | Human cell line | Strand-specific, high resolution, in vivo | Requires the generation of a stable cell line expressing a catalytic mutant RNase H1 | Chen et al., 2017 |
| MapR | RNase H catalytic mutant fused to micrococcal nuclease (RHΔ-MNase) | Human cell line | Antibody-independent, low input requirements, high resolution, less time consuming, more sensitive for dynamic R-loops | Non-strand-specific | Yan et al., 2019 |
| BisMapR | RHΔ-MNase with non-denaturing bisulfite treatment | Mouse embryonic stem cells | Antibody-independent, high signal-to-noise ratio, high resolution, strand-specific | Potential loss of targeted R-loop DNA during non-denaturing bisulfite treatment | Wulfridge and Sarma, 2021, Preprint |
| SMRF-seq | Non-denaturing bisulfite footprinting | Human NTERA-2 cells | Antibody-independent, high resolution | Underestimates R-loops with highly methylated C and poly(A) tracts, relatively high false-positive “R-loops” | Malig et al., 2020 |
S9.6-based methodologies
The S9.6 antibody is a mouse monoclonal antibody that was generated against a ΦX174 bacteriophage-derived synthetic DNA–RNA antigen (Boguslawski et al., 1986a). S9.6 binds to DNA–RNA hybrids with a high affinity (Konig et al., 2017; Phillips et al., 2013) but not to the entire three-stranded R-loop structure in vivo or in vitro. Moreover, S9.6 exhibits sequence-context-dependent affinity for DNA–RNA hybrids (Konig et al., 2017) and is sensitive to treatment with different ribonucleases (Smolka et al., 2021). Furthermore, S9.6 has a relatively high affinity for RNA–RNA hybrids (Konig et al., 2017; Hartono et al., 2018). Together, these properties suggest a potential sequence-related bias in R-loop recognition by this antibody. Technically speaking, current DRIP and DRIP-derivative methods are capable of specifically identifying RNase H-sensitive R-loops in a genome because these methods usually incorporate RNase H pre-treatment of a portion of the same sample as a negative control (Sanz et al., 2016; Fang et al., 2019; Xu et al., 2020; Liu et al., 2021). Consequently, these methods are not suitable for identifying R-loops that are insensitive to RNase H cleavage. Moreover, R-loop profiles are inconsistent among different studies, frequently because of experimental variations, as discussed previously (Halasz et al., 2017; Vanoosthuyse, 2018; Brambati et al., 2020). Combining S9.6 with other approaches, such as the development of an alternative monoclonal antibody that specifically recognizes R-loops or the generation of transgenic plants expressing a tagged, modified RNase H, will enable additional substantiation of the existing DRIP results in plants.
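The logic of the RNase H negative control described above can be illustrated with a minimal sketch. It assumes that peaks have already been called and that a normalized signal is available for each peak in both the untreated and the RNase H pre-treated sample. The `Peak` class, the `min_fold_drop` threshold, and the pseudocount are hypothetical choices for illustration only and are not taken from any of the cited pipelines.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    chrom: str
    start: int
    end: int
    drip_signal: float    # normalized S9.6 IP signal, untreated sample
    rnaseh_signal: float  # signal over the same region after RNase H pre-treatment


def rnase_h_sensitive(peaks, min_fold_drop=2.0, pseudocount=1.0):
    """Keep peaks whose signal drops at least min_fold_drop-fold after RNase H.

    Both the threshold and the pseudocount are illustrative assumptions.
    """
    kept = []
    for p in peaks:
        fold = (p.drip_signal + pseudocount) / (p.rnaseh_signal + pseudocount)
        if fold >= min_fold_drop:
            kept.append(p)
    return kept


# Toy data: the first peak collapses after RNase H treatment, the second does not.
peaks = [
    Peak("Chr1", 1000, 1500, drip_signal=40.0, rnaseh_signal=5.0),
    Peak("Chr1", 9000, 9400, drip_signal=30.0, rnaseh_signal=28.0),
]
sensitive = rnase_h_sensitive(peaks)
print([(p.chrom, p.start, p.end) for p in sensitive])
```

By construction, a filter of this kind discards any RNase H-insensitive signal, which is why such R-loops fall outside what DRIP-type methods can report.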
RNase H-based methodologies
MapR is a sensitive and convenient method for mapping R-loops in human cell lines (Yan et al., 2019). To identify R-loops, MapR combines the binding of modified RNase H to DNA–RNA hybrids with CUT&RUN, a chromatin immunocleavage-based genome-wide mapping method in which antibody-targeted controlled cleavage by micrococcal nuclease (MNase) releases specific protein–DNA complexes into the supernatant for sequencing (Skene and Henikoff, 2017). MapR therefore avoids the issues related to variable expression levels of RNase H and affinity purification, and it can be scaled up or down depending on the amount of initial input. MapR can also detect R-loops formed with transient RNAs, such as enhancer RNA (Yan et al., 2019). However, the current MapR does not allow strand-specific mapping of R-loops. MapR targets DNA–RNA hybrids and releases intact and broken three-stranded R-loops when MNase is activated by the addition of calcium. Consequently, it is unlikely that the library preparation of MapR can selectively capture only DNA–RNA hybrids for sequencing, although this might be ameliorated if additional library preparation procedures were incorporated to recruit ssDNA and DNA–RNA hybrids. The efficacy of the current MapR method relies entirely on the binding affinity of the recombinant protein for DNA–RNA hybrids and the subsequent cleavage efficiency of MNase. Thus, MapR could potentially be affected by the binding affinity and specificity of mutated RNase H for DNA–RNA hybrids and by the enzymatic activity of MNase. In addition, the reproducibility of MapR with fewer than 1×10⁵ cells still needs to be improved. Moreover, when MapR is used for multicellular organisms such as plants, the penetration of both RHΔ–MNase (a catalytically inactive RNase H mutant fused to MNase) and calcium into the nuclei of all cells can become a problem.
BisMapR combines MapR with non-denaturing bisulfite treatment for global, strand-specific mapping of R-loops in mouse embryonic stem cells (Wulfridge and Sarma, 2021, Preprint). To some extent, it can be considered an improved version of MapR that integrates non-denaturing bisulfite treatment for strand-specific library preparation. It is still unknown whether MapR can be combined directly with strand-specific library preparation as described by Nadel et al. (2015); more effort to test this is merited. If successful, such a modification would allow MapR to produce outcomes similar to those achieved by BisMapR while skipping the non-denaturing bisulfite treatment, thereby avoiding the potential loss of targeted R-loop DNA during that step.
Non-denaturing bisulfite footprinting-related methodologies
SMRF-seq is an orthogonal methodology independent of S9.6 and RNase H. It enables footprinting of DNA strands within R-loops in NTERA-2 cells at a resolution of nearly one nucleotide (Malig et al., 2020). However, this approach has limitations beyond those already discussed in the literature (Malig et al., 2020). Since SMRF-seq depends on the conversion of unpaired, unmethylated cytosines into uracils in a non-denaturing manner, it is not suitable for the detection of R-loops within highly methylated C and poly(A) tracts (Wahba et al., 2016). An earlier study showed that the C-to-T conversion rate under non-denaturing conditions is 86% (Malig et al., 2020), which is lower than the >99% C-to-T conversion rate under denaturing conditions. At present, it is still unclear whether the lower conversion rate causes an underestimation of R-loops, particularly R-loops with lower C content. The non-denaturing bisulfite assay primarily targets ssDNA and detects stable, non-helical secondary DNA structures in mammals and plants (Raghavan et al., 2006; Gentry and Hennig, 2016; Amparo et al., 2020) in addition to R-loops, G-quadruplexes, and cruciform DNA. Therefore, this approach may identify excessive false-positive “R-loops” and should be combined with other approaches to exclude non-R-loop loci.
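To give a feel for why a lower conversion rate could matter most for C-poor R-loops, the following sketch computes a simple binomial probability. It assumes that each unpaired, unmethylated cytosine converts independently at the quoted rate, and it uses a hypothetical calling rule under which a footprint is recognized only if at least a given number of cytosines in the window are converted. Neither assumption comes from the SMRF-seq study itself; the value 0.99 is used as a stand-in for the ">99%" figure quoted above.

```python
from math import comb


def prob_at_least(n_cytosines: int, min_converted: int, rate: float) -> float:
    """Probability that at least min_converted of n_cytosines are converted,
    assuming independent conversion of each cytosine at the given rate."""
    return sum(
        comb(n_cytosines, k) * rate**k * (1.0 - rate) ** (n_cytosines - k)
        for k in range(min_converted, n_cytosines + 1)
    )


NON_DENATURING_RATE = 0.86  # rate quoted in the text for non-denaturing conditions
DENATURING_RATE = 0.99      # stand-in for the ">99%" quoted for denaturing conditions

# Hypothetical strict rule: every cytosine in the window must be converted.
for n in (5, 10, 20):
    p_nd = prob_at_least(n, n, NON_DENATURING_RATE)
    p_d = prob_at_least(n, n, DENATURING_RATE)
    print(f"{n} cytosines: non-denaturing {p_nd:.3f}, denaturing {p_d:.3f}")
```

Under a more tolerant rule (a smaller `min_converted`), the gap between the two conditions narrows, so the size of any underestimation depends on how footprints are called as well as on the C content of the region.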
In addition to the intrinsic weaknesses of each method, as mentioned above or previously (Vanoosthuyse, 2018), plants have some unique features that may pose some challenges for the direct application of these methods to plants. For example, the presence of chloroplast genomic RNA or DNA-related R-loops may increase background or artifacts for mapping nuclear R-loops in plants. Plant tissues, for example, roots and stems, are highly heterogeneous, containing mixed cell types and/or various developmental stages, thus affecting the representation of R-loops in plants. In addition, current RNase H-based methods use RNase H enzyme from Escherichia coli (Yan et al., 2019) or humans (Chen et al., 2017). The sequence and functional divergence of genes encoding RNase H have been documented in plants (Majorek et al., 2014; Kucinski et al., 2020). Thus, the binding specificity and efficiency reported for RNase H need to be validated in plants. Moreover, RNase H may need to be optimized to facilitate R-loop mapping in plants. Some plant-related issues are discussed in the next section.
In summary, there is no one method that can accurately and quantitatively map R-loops in all circumstances. For profiling R-loops in plants, SMRF-seq, DRIP, and its relatives can be directly applied to wild-type plants, whereas the mutated RNase H-related approaches such as R-ChIP require the generation of transiently or stably transformed plants. Generally, DRIP is best for mapping stable R-loops, R-ChIP is ideal for detecting highly dynamic RNase H-sensitive R-loops, and SMRF-seq is best suited to the detection of R-loops with unmethylated and GC-rich content. Thus, the use of DRIP in combination with either R-ChIP or SMRF-seq will enable the comprehensive characterization of genome-wide R-loops.
The challenges of using existing methodologies for mapping R-loops in plants
To date, DRIP is the only method that has been utilized to globally profile R-loops in A. thaliana (Xu et al., 2017; Xu et al., 2020) and rice (Fang et al., 2019). As in mammals, it is necessary to employ complementary approaches (i.e. RNase H- and non-denaturing bisulfite footprinting-related methodologies) in parallel to provide comprehensive information about R-loop stability at specific loci or across a genome. The R-loops reported were primarily derived from whole plants or specific tissues under normal or stress conditions; as a result, they contained a mixture of cell types and various cell-cycle profiles. Without doubt, any of the methods described above can identify R-loops in one or more tissues with heterogeneous cell populations, but they may underestimate those R-loops present in only a small portion of cell populations. Due to their low abundance, it will be more demanding to characterize cell-, tissue-, or growth-condition-specific R-loops in plants. The development of new methodologies with high sensitivity for either a single cell or a small number of cell populations will be instrumental for elucidating the role of R-loops in cell differentiation and tissue-specific development in plants. Identifying R-loops with novel functions in plant development will be a valuable step forward.
When R-ChIP is implemented in plants, it will require transient or constitutive expression of mutated RNase H proteins in target cells or tissues. Inconsistent expression of RNase H between different cells or cell types can cause mapping biases for R-loops. However, the development of cell-type-specific expression of RNase H will enable R-ChIP to be used for mapping cell-type-specific R-loops. It must be noted that mammals primarily have a CG context (Bird, 2002), but there are three distinct types of cytosine contexts—CG, CHG, and CHH (where H represents A, T, or C)—in the plant genomes (Gruenbaum et al., 1981), and R-loops display distinctly different methylation levels in different contexts. In particular, certain R-loops in plants are associated with highly methylated cytosines (Fang et al., 2019); some of these R-loops could be missed when using non-denaturing bisulfite footprinting-related methods in plants, since they largely rely on the conversion rate of unmethylated cytosines to uracils. These methods would underestimate R-loops with methylated cytosines in plant genomes. As stated earlier, different mapping approaches should be implemented in parallel to provide complementary profiles of R-loops in plants.
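The three cytosine contexts mentioned above can be made concrete with a short sketch that classifies each cytosine on a single strand by its two downstream bases, using the definition of H (A, T, or C) given in the text. The function names are hypothetical, only the strand supplied is scanned, and cytosines too close to the end of the sequence or followed by ambiguous bases are left unclassified.

```python
from collections import Counter

H_BASES = {"A", "T", "C"}  # H = A, T, or C, as defined in the text


def cytosine_context(seq: str, i: int):
    """Return 'CG', 'CHG', or 'CHH' for the cytosine at position i, or None
    if the context cannot be determined (sequence end or ambiguous base)."""
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError("position i is not a cytosine")
    downstream = seq[i + 1 : i + 3]
    if downstream[:1] == "G":
        return "CG"
    if len(downstream) == 2 and downstream[0] in H_BASES:
        if downstream[1] == "G":
            return "CHG"
        if downstream[1] in H_BASES:
            return "CHH"
    return None


def count_contexts(seq: str) -> Counter:
    """Count classifiable cytosine contexts along the given strand."""
    counts = Counter()
    for i, base in enumerate(seq.upper()):
        if base == "C":
            context = cytosine_context(seq, i)
            if context is not None:
                counts[context] += 1
    return counts


example = count_contexts("CGCAGCTTC")
print(dict(example))
```

A per-context tally of this kind, computed over R-loop regions, is one way to gauge how many cytosines in a region could in principle report on strand pairing in a bisulfite-based assay, once methylation data are overlaid.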
Conservation and divergence of R-loop distribution across mammals and plants
The DRIP method and its derivatives have been widely applied to profile R-loops in both mammalian and plant genomes. Conservation of R-loop formation has been documented within human cell types and across species in mammals (Sanz et al., 2016). However, DRIP studies in mammals have shown inconsistent outcomes between different laboratories (Ginno et al., 2012; Nadel et al., 2015; Grunseich et al., 2018). There are two possible explanations for these discrepancies: (i) a substantial portion of R-loops are dynamic, that is, they form in a cell-type- and species-dependent manner, reflecting divergence between cell types and species; or (ii) experimental parameters, including sample preparation and treatment, sequencing depth, and data analysis, vary between studies, and such technical variability can result in deviations in R-loop detection. With the large number of R-loops that are naturally formed in the genome, it is difficult to declare that any one method is absolutely superior to another.
R-loops are prevalent across the entire genome of both mammals and plants, but variation in the subgenomic distribution of R-loops in different species and cell types is perceptible. Representative results are summarized in Table 2. Compared with mammals, plants have far fewer R-loops in the terminal regions of genes; this may be due to the much shorter average length of 3′ untranslated regions in plant genes than in mammalian genes. Approximately 42–47% of R-loops are distributed in intergenic regions in rice; this proportion is much higher than in the A. thaliana, human, and mouse genomes. A higher percentage of R-loops (>50%) is present in promoters in human K562 and HEK293T cells than in other human cell types, mouse cells, A. thaliana, and rice, indicating that R-loops in K562 and HEK293T cells may be closely associated with transcriptional activity. Similar percentages of sense and antisense R-loops occur in A. thaliana and rice genomes, whereas sense R-loops are dominant (>90%) in the human genome. Thus, subgenomic divergence of R-loops is perceivable between mammals and plants.
Table 2. Summary of subgenomic distribution of R-loops in mammalian and plant genomes. All values are percentages. The first six numeric columns give the genomic distribution of R-loops; the last two give genic R-loops.

| Species | Cell type/tissue | Methodology | Promoter | Gene body | Terminal | Intergenic | Promoter or terminal | Antisense | Genic sense | Genic antisense | References |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Human | K562 | DRIP-seq | 24.5 | 33.3 | 19.7 | 12.4 | 10.1 | NA | NA | NA | Sanz et al., 2016 |
| Human | NTERA-2 | DRIP-seq | 24.7 | 35.8 | 20.8 | 9 | 9.7 | NA | NA | NA | Sanz et al., 2016 |
| Human | Fibroblast | DRIP-seq | 21.9 | 40.5 | 20.6 | 8 | 9 | NA | NA | NA | Sanz et al., 2016 |
| Human | NTERA-ra2 | DRIPc-seq | 12.8 | 51.5 | 19.4 | 8.3 | 3.6 | 4.7 | >90 | <10 | Sanz et al., 2016 |
| Human | HEK293T | R-ChIP | 59.3 | 17.2 | 6.6 | 16.9 | NA | NA | NA | NA | Chen et al., 2017 |
| Human | K562 | R-ChIP | 53.3 | 18.4 | 5.9 | 22.4 | NA | NA | NA | NA | Chen et al., 2017 |
| Mouse | E14 | DRIP-seq | 15.8 | 49.9 | 18.1 | 12.4 | 3.8 | NA | NA | NA | Sanz et al., 2016 |
| Mouse | NIH3T3 | DRIP-seq | 20.5 | 44.7 | 22 | 7.8 | 5 | NA | NA | NA | Sanz et al., 2016 |
| Arabidopsis | Seedling | ssDRIP-seq | 16.9 | 57 | 10.3 | 15.8 | NA | NA | 54.2 | 45.8 | Xu et al., 2017 |
| Rice | Seedling | DRIP-seq | 21.7 | 22.7 | 8.9 | 46.7 | NA | NA | 45.8 | 54.2 | Fang et al., 2019 |
| Rice | Callus | DRIP-seq | 16.5 | 35 | 6 | 42.5 | NA | NA | 48.1 | 51.9 | Fang et al., 2019 |
NA, not applicable.
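As a quick consistency check on Table 2, the following Python sketch transcribes the genomic-distribution columns and verifies two things stated in the text: that each row accounts for roughly 100% of R-loops, and that the rice samples have the highest intergenic fractions. The species labels on rows where the table leaves the species cell blank are inferred from the row grouping, and the helper names (`row_total`, `top_intergenic`) are hypothetical.

```python
# Genomic distribution of R-loops (%), copied from Table 2.
# Tuple order: promoter, gene body, terminal, intergenic,
# promoter-or-terminal, antisense. None marks an NA entry.
TABLE2 = {
    ("Human", "K562", "DRIP-seq"): (24.5, 33.3, 19.7, 12.4, 10.1, None),
    ("Human", "NTERA-2", "DRIP-seq"): (24.7, 35.8, 20.8, 9.0, 9.7, None),
    ("Human", "Fibroblast", "DRIP-seq"): (21.9, 40.5, 20.6, 8.0, 9.0, None),
    ("Human", "NTERA-ra2", "DRIPc-seq"): (12.8, 51.5, 19.4, 8.3, 3.6, 4.7),
    ("Human", "HEK293T", "R-ChIP"): (59.3, 17.2, 6.6, 16.9, None, None),
    ("Human", "K562", "R-ChIP"): (53.3, 18.4, 5.9, 22.4, None, None),
    ("Mouse", "E14", "DRIP-seq"): (15.8, 49.9, 18.1, 12.4, 3.8, None),
    ("Mouse", "NIH3T3", "DRIP-seq"): (20.5, 44.7, 22.0, 7.8, 5.0, None),
    ("Arabidopsis", "Seedling", "ssDRIP-seq"): (16.9, 57.0, 10.3, 15.8, None, None),
    ("Rice", "Seedling", "DRIP-seq"): (21.7, 22.7, 8.9, 46.7, None, None),
    ("Rice", "Callus", "DRIP-seq"): (16.5, 35.0, 6.0, 42.5, None, None),
}


def row_total(row):
    """Sum the reported categories, ignoring NA (None) entries."""
    return round(sum(v for v in row if v is not None), 1)


def top_intergenic(table, n=2):
    """Return the n samples with the highest intergenic percentage."""
    ranked = sorted(table, key=lambda k: table[k][3], reverse=True)
    return ranked[:n]


if __name__ == "__main__":
    for key, row in TABLE2.items():
        print(key, "total =", row_total(row))
    print("highest intergenic:", top_intergenic(TABLE2))
```

Because the categories differ between methods (for example, R-ChIP reports no promoter-or-terminal class), the totals are comparable across rows but the individual categories are only loosely comparable between methodologies.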
Similarly, the majority of R-loops are well conserved between different tissues in rice (seedling versus callus) and A. thaliana (seedling versus root) (Fang et al., 2019; Xu et al., 2020). R-loop profiles are also conserved in different vegetative developmental stages (young and old leaves) and under different environmental conditions in A. thaliana (Xu et al., 2020). However, some dynamic R-loops are involved in developmental transitions, including the transition from the vegetative to the reproductive stage, seed germination, and recovery after exposure to environmental stresses (Xu et al., 2017). In addition, distinct genomic distributions of R-loops have been observed in rice (Fang et al., 2019) compared with A. thaliana (Xu et al., 2017), indicating species-related divergence in plants.
Functions of R-loops in plants
Although R-loop study in plants is in its infancy, the findings thus far regarding their basic functions in plants are similar to those in humans. First, R-loops in plants have both negative and positive regulatory roles. Second, R-loops in plants play fundamental regulatory roles. In most cases, R-loops occur co-transcriptionally and undergo dynamic turnover (Sanz et al., 2016). Moreover, mounting evidence indicates that R-loops play crucial roles in plant-specific developmental processes. A comparison of R-loop functions documented in humans and plants is given in Fig. 2.
Fig. 2. Comparison of R-loop functions in humans (left) and in plants (right). All functions listed in the diagram have been documented. Pink check marks indicate conserved functions of R-loops reported in both humans and plants; brown check marks indicate R-loop functions reported in humans; green check marks indicate R-loop functions reported in plants.
As in mammals (Crossley et al., 2019), R-loops in plants are directly or indirectly involved in maintaining genome stability; in plants, this includes both the chloroplast (Yang et al., 2017; Yang et al., 2020) and nuclear (Yuan et al., 2019) genomes. AtRNH1C, an R-loop-removing enzyme, plays key roles in maintaining chloroplast genome stability (Yang et al., 2017). The bacterial Rho-like DNA–RNA helicase, RHON1, can function individually or coordinately with RNase H1 to mitigate chloroplast genome stress caused by aberrant R-loop accumulation in A. thaliana (Yang et al., 2020). Depletion of the R-loop reader proteins AtALBA1 and AtALBA2 destabilizes the A. thaliana genome but does not affect R-loop accumulation (Yuan et al., 2019), suggesting that neither protein is directly involved in R-loop formation in vivo; they may instead function as a bridge to coordinate functions of R-loops. AtNDX, a homeodomain-containing protein in A. thaliana, can stabilize the COOLAIR R-loop by binding to the displaced ssDNA, functioning as an important factor regulating the expression of the long non-coding RNA (lncRNA) COOLAIR (Sun et al., 2013). Moreover, RNase H1 antagonizes the formation of R-loops and appears to be conserved from prokaryotes to eukaryotes (Posse et al., 2019). These findings suggest that plants have evolved mechanisms to regulate R-loop homeostasis.
R-loops are implicated in plant growth and development by their presence in critically important developmental genes. Knockdown of DNA topoisomerase 1 (OsTOP1) expression leads to misregulation of the auxin signaling gene OsARF19 and the transporter genes OsABCB14 and OsABCB13; plants with defects in each of these genes had elevated R-loops, resulting in morphological defects in rice roots (Shafiq et al., 2017). This indicates that DNA topology modifiers can modulate gene expression and development through R-loops. Non-coding RNAs (ncRNAs) such as lncRNA and single-stranded, covalently closed circular RNA (circRNA) form R-loops that participate in the regulation of flowering time (Sun et al., 2013) and floral homeotic phenotypes (Conn et al., 2017) in A. thaliana. For example, expression of the FLC gene is self-regulated by a COOLAIR-formed R-loop in A. thaliana (Sun et al., 2013), whereas a circRNA derived from SEPALLATA3 back-splicing modulates alternative splicing of its cognate mRNA in A. thaliana (Conn et al., 2017). In addition, the involvement of R-loops formed with stem-differentiating xylem-related circRNA in alternative splicing has also been reported in Populus trichocarpa (Liu et al., 2021). The lncRNA APOLO (AUXIN-REGULATED PROMOTER LOOP) modulates in cis the expression of its neighboring gene PINOID (PID) through forming an R-loop that decoys the plant Polycomb Repressive Complex 1 (PRC1) component LHP1 to alter the local chromatin three-dimensional conformation. APOLO also acts in trans to interact with WAG2 and AZG2 through R-loop formation; these genes are involved in auxin-directed lateral root development in A. thaliana (Ariel et al., 2014; Ariel et al., 2020; Mas and Huarte, 2020). APOLO initiates R-loop formation using the key TTCTTC boxes to pair with its complementary DNA. More recently, an APOLO lncRNA was reported to be capable of recognizing multiple independent distant loci encoding auxin-responsive genes in the A. thaliana genome, forming R-loops by sequence complementarity (Ariel et al., 2020). APOLO recognizes a locus upstream of ROOT HAIR DEFECTIVE 6 (RHD6), where it interacts with the transcription factor WRKY42 and modulates its binding to the RHD6 promoter (Moison et al., 2021), resulting in root hair growth. This is the first report in plants to show that the formation of an R-loop can facilitate the binding of a transcription factor to the promoter of a developmental gene through altering the three-dimensional chromatin conformation. Additionally, integration of R-loop data with ncRNA sequencing data confirms the global involvement of ncRNAs in R-loop formation in both A. thaliana (Xu et al., 2017) and rice (Fang et al., 2019).
Correlation analysis indicates a functional divergence between antisense-only (ASO)-R-loops and sense-only (SO)-R-loops, corresponding to promotional and repressive roles of R-loops, respectively, in regulating the expression of overlapping genes in rice (Zhang et al., 2019). In addition, the higher fraction of antisense (AS)-R-loops in plants than in mammals may be caused by differences in directional selection in evolution. A recent study indicated that R-loops are neutral during plant vegetative stages or in response to environmental stimuli, but R-loops may play a vital role in the transition from the vegetative to the reproductive state or in seed germination in A. thaliana (Xu et al., 2020). Results from other species or from other orthogonal approaches will be helpful to corroborate current observations.
Although R-loops play conserved roles and their dynamics are required for the normal plant life cycle and heat stress responses (Xu et al., 2020), studies to date have failed to find a correlation between R-loops and RNA dynamics in different cell types/tissues or in response to stress in plants; in particular, direct experimental evidence supporting the function of R-loops in heat stress is still lacking (Fang et al., 2019; Xu et al., 2020). There are several possible explanations: (i) cell- or tissue-specific or stress-response-related R-loops are highly dynamic and are not easily identified using the DRIP method, which is suitable only for capturing stable R-loops; (ii) to date, R-loop data have primarily been acquired from whole seedlings or tissues that contain mixed cell types; (iii) the peaks of cell-type-specific R-loops are generally lower than those of common R-loops in mixed-cell-type tissues, and therefore a lower cut-off threshold may be required to identify cell-type-specific R-loops; or (iv) most RNA species have a short lifetime (average 2 min) (Baudrimont et al., 2017).
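Explanation (iii) can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that the bulk signal at a locus is the cell-number-weighted average of the signal in cells that carry the R-loop and a background level in cells that do not. The function name `bulk_signal` and all numbers are hypothetical and are not measurements from any study.

```python
def bulk_signal(fraction_with_rloop: float, per_cell_signal: float,
                background: float = 1.0) -> float:
    """Expected bulk enrichment when only a fraction of cells carry the R-loop.

    Illustrative assumption: bulk signal is the weighted average of the
    per-cell signal (cells with the R-loop) and the background level
    (cells without it).
    """
    if not 0.0 <= fraction_with_rloop <= 1.0:
        raise ValueError("fraction_with_rloop must be between 0 and 1")
    return (fraction_with_rloop * per_cell_signal
            + (1.0 - fraction_with_rloop) * background)


if __name__ == "__main__":
    common = bulk_signal(1.0, 10.0)   # R-loop present in every cell
    rare = bulk_signal(0.05, 10.0)    # same R-loop present in 5% of cells
    print(f"common: {common:.2f}, cell-type-specific: {rare:.2f}")
```

Under these assumed numbers, an R-loop that is tenfold enriched in every cell keeps a tenfold bulk signal, whereas the same R-loop confined to 5% of cells yields a bulk signal of only about 1.45 times background, which a twofold cut-off would discard.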
Further locus-specific experimental validation is required to exemplify various biological functions of R-loops in plants; this will provide a better understanding of the breadth of R-loop functions in gene regulation. Moreover, cell-type- or cell-cycle-specific mapping of R-loops under normal developmental or stress conditions with combinations of different orthogonal methods will advance our understanding of R-loop functions in plants. Therefore, the thrust of R-loop studies in plant species should focus on in-depth characterization and uncovering novel functions, as well as on how to take advantage of R-loops to genetically engineer plants for greater benefit to humans.
R-loops and epigenetic regulation in plants
In mammals, R-loops either prevent a DNA methyltransferase, DNMT3B1 or DNMT1, from accessing promoter CpG islands (Ginno et al., 2012; Grunseich et al., 2018) or promote the recruitment of TET, a DNA demethylase (Arab et al., 2019), to promoter CpG islands, resulting in demethylation in promoters. Currently, little is known about the mechanisms underlying R-loop-dependent DNA methylation in plants. In both A. thaliana (Xu et al., 2017) and rice (Fang et al., 2019), integration analysis showed that there is less CG methylation, but more CHG and CHH methylation, in R-loop regions compared with R-loop-deficient regions. Overall, a positive correlation between total DNA methylation levels (5mC and 6mA) and the number of R-loops was recently revealed in rice (Fang et al., 2019). The relationship between R-loops and DNA methylation becomes more complex when associations with transposable element (TE) and non-TE genes are considered (Fang et al., 2019). Therefore, additional detailed studies are needed to elucidate the complex relationships between R-loops and methylation in various genomic contexts.
Connections between R-loops and chromatin patterning have been revealed in mammals (Sanz et al., 2016). For instance, R-loops facilitate the recruitment of Enhancer of zeste homolog 2 (EZH2) and RING1B, resulting in chromatin compaction and silencing of related genes (Skourti-Stathaki et al., 2019). On the other hand, Mdm2 and PRC1, Polycomb repressive complex (PRC)-related chromatin modifiers, prevent R-loop formation (Klusmann et al., 2018), suggesting conditional connections between R-loops and methylation at H3K27 in the genome. In addition, the histone deacetylase mSin3A prevents the accumulation of aberrant R-loops (Salas-Armenteros et al., 2017), indicating a connection between R-loops and histone acetylation. In mammals, arginine methyltransferase 5 (PRMT5) facilitates the elimination of R-loops (Mersaoui et al., 2019). R-loops also play key roles in recruiting heterochromatin-related components such as the G9a histone lysine methyltransferase at transcription termination regions, resulting in enrichment of the repressive mark H3K9me2 and the heterochromatin protein 1γ (HP1γ) (Skourti-Stathaki et al., 2014). The suppression of G9a leads to H3K9 demethylation, leading to the unscheduled production of R-loops at rDNA loci in humans (Zhou et al., 2020). METTL8, a 3-methylcytidine-specific RNA methyltransferase, forms a large SUMOylated nuclear RNA-binding protein complex (~0.8 MDa) that induces R-loop accumulation (Zhang et al., 2020). R-loops foster methylation at H3K4 and H3K36 in nucleosomes flanking R-loops by recruiting SET1/MLL, H3K4 methyltransferase family members (Tomson and Arndt, 2013).
In contrast to mammals, the interactions between R-loops and histone marks have not been well studied in plants, and therefore, the underlying mechanisms remain largely unknown. It has been reported that R-loops and the repressive histone mark H3K27me3 are co-localized at the loci recognized by the lncRNA APOLO in A. thaliana (Ariel et al., 2020), which agrees well with the global enrichment of R-loops and the H3K27me3 marker in genic regions in both rice and humans, but disagrees with that in the gene-body regions in A. thaliana (as shown in Fig. 3), suggesting that a subset of R-loops in A. thaliana may exhibit differential epigenetic features compared with others in the same genome. APOLO can modulate local chromatin conformation around its target genes, such as PID, WAG2, and AZG2, through decoying the PRC1 component LHP1 (Ariel et al., 2020). Furthermore, a more recent report showed that interplay among R-loop stabilization, m6A modification, and co-transcriptional RNA processing can trigger chromatin silencing at the FLC locus in A. thaliana (Xu et al., 2021). This process is mediated by several trans-acting factors, such as FCA, and FY/WDR33 and other factors involved in 3′-end processing and H3K4me1 demethylation (Xu et al., 2021). Based on limited information from genome-wide studies on the association between R-loops and histone marks in A. thaliana (Xu et al., 2017) and rice (Fang et al., 2019), we have summarized the similarities and differences in histone marks associated with R-loops in rice, A. thaliana, and humans (Fig. 3). The creation of chromatin-modifier-related mutations will help unravel how histone marks affect R-loop formation in plants. Attention needs to be paid to the fact that the majority of R-loop data in mammals was derived from relatively homogeneous cells/cell types, while plant data were acquired from plant tissues with multiple cell types and mixed cell-cycle stages. Plant data therefore represent an average case of mixed cells. In the future, single-cell technology will permit the acquisition of data from more synchronized epigenetic landscapes and consistent R-loop distributions.
Fig. 3. Heat map showing the enrichment of histone marks associated with R-loops in rice, Arabidopsis thaliana, and humans. The heat map was produced from published data in rice (Fang et al., 2019), A. thaliana (Xu et al., 2017), and humans (Sanz et al., 2016). NS, not significant; NA, not applicable for the mark.
Conclusion and future perspectives
The goals of R-loop studies in plant species are to advance our understanding of gene-regulatory mechanisms and to identify effective approaches to manipulate phenotypes as a means of enhancing biotic/abiotic stress tolerance and increasing biomass and productivity. Toward these objectives, R-loop formation in various genomic contexts (haploids, diploids, and polyploids) and the roles of R-loops in many plant-specific biological processes (e.g. photosynthesis, flowering, seeding, fruiting, nodulation, biotic and abiotic stress responses) are of interest (Fig. 4).
Since plant research involves fewer ethical issues and restraints compared with studies in humans and other mammals, plants lend themselves well to these studies and can be used to answer fundamental questions pertaining to R-loops. Possible future directions for plant R-loop research are proposed in Fig. 4. These include the formation and regulatory functions of R-loops, and their interactions with DNA/histone methylation and with DNA and nucleosome proteins. Plants can be genetically transformed to test numerous genes, and a large number of the corresponding transgenic lines can be propagated to enable DRIP-seq and other experiments that require a significant amount of material to profile genome-wide R-loops and to conduct various molecular characterizations. In addition, since plants can be haploid, diploid, or polyploid, and also vary in heterozygosity, they permit the investigation of R-loop formation in different genomic contexts to advance our understanding of R-loop functions.
In summary, R-loop studies in plants can clarify not only fundamental aspects of R-loops, including R-loop formation and their interactions with R-loop-binding proteins and epigenetic marks, but also the roles of R-loops in plant-specific developmental processes and adaptation to environmental conditions and stresses. The elucidation of these mechanisms will advance our understanding of the regulation of plant development and differentiation, and open new avenues for the genetic manipulation of gene regulation toward even greater benefit. The adaptation of RNase H-based methods, MapR/BisMapR, and SMRF-seq for use in plants could provide more exhaustive profiling of plant R-loops, thereby advancing the understanding of R-loop biology in plants.
Acknowledgements
We were unable to cite all the R-loop-related literature due to space constraints, and apologize to any colleagues whose work may have been omitted. We thank Dr Jennifer Sanders for helpful discussion and scientific editing. We thank the Bioinformatics Center at Nanjing Agricultural University for providing facilities to assist with primary sequencing data analysis. This work was supported by research grants from the National Natural Science Foundation of China (32070561 and U20A2030).
Author contributions
WLZ and JJG conceived the layout of the article; JJG collected and summarized the key information; HRW and WLZ wrote the manuscript; PYZ analyzed the data and drew the figures; XXL prepared Table 1 and modified Fig. 2; WQW conducted proofreading.
References
Author notes
These authors contributed equally to this work.



