G-quadruplex–R-loop interactions and the mechanism of anticancer G-quadruplex binders

Abstract Genomic DNA and cellular RNAs can form a variety of non-B secondary structures, including G-quadruplexes (G4s) and R-loops. G4s are constituted by stacked guanine tetrads held together by Hoogsteen hydrogen bonds and can form at key regulatory sites of eukaryote genomes and transcripts, including gene promoters, untranslated exon regions and telomeres. R-loops are 3-stranded structures wherein the two strands of a DNA duplex are melted and one of them is annealed to an RNA. Specific G4 binders are intensively investigated to discover new effective anticancer drugs based on a common rationale, namely the selective inhibition of oncogene expression or the specific impairment of telomere maintenance. However, despite the high number of known G4 binders, such a selective molecular activity has not been fully established and several published data point to a different mode of action. We review published data that address the close structural interplay between G4s and R-loops in vitro and in vivo, and how these interactions can have functional consequences in relation to G4 binder activity. We propose that R-loops can play a previously underestimated role in G4 binder action, in relation to DNA damage induction, telomere maintenance, genome and epigenome instability and alterations of gene expression programs.


INTRODUCTION
The cell genome is constituted by B-form duplex DNA wrapped around histone octamers forming the highly conserved nucleosomal structure. Nevertheless, genomic DNA and cellular RNA can form a variety of non-B secondary structures, including G-quadruplexes (G4s), which can play major roles in the regulation of nucleic acid functions and genome stability in living cells. A G4 is formed by two or more stacked guanine tetrads held together by Hoogsteen hydrogen bonds and stabilized by K+ and Na+ (Figure 1). G4s are structurally polymorphic: guanines can come from the same or different strands (intra-strand and inter-strand G4s, respectively), different numbers of nucleotides can separate the tetrad guanines, forming loops of different length, and the strand orientation can be parallel, antiparallel or a mix of the two (Figure 1). Interestingly, G-rich sequences have been shown to adopt alternative conformational structures in vitro (1,2), raising the possibility that conformational changes of G4s are critical for regulation of cellular functions.
These structures can primarily form in G-rich stretches of the genome, such as in CpG islands, microsatellite and telomeric repeats, as well as in G-rich segments of RNAs (3). Convincing evidence for their formation primarily comes from genetic investigations of G4 functions (3,4), evolutionary conservation of potential G4-forming sequences (PQS) (5,6), the discovery of several G4-binding proteins in cells and viruses (3,4,7), NMR studies (8) and the visualization and genome mapping of G4s by chemical probes or specific antibodies (3,9). Bioinformatic tools have been developed to scan entire prokaryotic and eukaryotic genomes to predict PQS (9,10). Even though the number of PQS can vary from genome to genome, PQS are consistently enriched at key regulatory sites in eukaryotes, notably replication origins, gene promoters, untranslated exon regions, short sequence repeats and telomeres (11,12).
Research on G4 structures and functions is highly interrelated with an equally intense search for specific G4 binders endowed with therapeutic activity, in particular anticancer effects. Hundreds of compounds able to bind and stabilize G4s (Figure 2) have been developed with the aim of specifically targeting telomeric or oncogene promoter G4s in cancer cells (13-16). The general rationale was based on the observation that cancer cells can be addicted to activated driver oncogenes and/or to a proper regulation of telomeres to prevent senescence. Thus, specific downregulation of driver oncogenes or telomere destabilization could cause cell death, or at least inhibition of cell proliferation, resulting in anticancer activity. Nevertheless, despite the high number of known G4 binders, such a selective molecular activity has not been definitively established in cancer models. In addition, certain G4 binders can interact with i-motifs (17) (a non-canonical DNA structure formed by C-rich strands (1)), raising questions about target specificity in living cells. More recent studies also showed that a nuclear enzyme, DNA Topoisomerase II, may be involved in the action of some G4 binders (18,19). Although Topoisomerase II-dependent molecular mechanisms remain to be fully defined, it is noteworthy that the enzyme may contribute to the cell-killing activity of pyridostatin but not PhenDC3 (19), two structurally different ligands (Figure 2). Thus, as few G4 binders have entered early phases of clinical trials and none has shown good efficacy in cancer patients yet (10-12), a deeper understanding of the mechanism of action of G4-interacting agents is needed to provide a strong rationale for the development of ligands effective in cancer patients.
Here, we discuss published data that point to a different molecular action of G4 binders, as these compounds may exert their biological activity through a more general mechanism rather than the inhibition of specific oncogenes or impairment of telomere maintenance. General mechanisms can be the induction of DNA damage and replication stress, an overall impairment of the regulation of transcription and translation, and the triggering of genomic and epigenetic instabilities. Such a mode of action should not be seen as a disadvantage in the discovery of effective anticancer G4 binders, as several FDA-approved drugs have pleiotropic effects at cellular and molecular levels and still achieve a specific pharmacological action in killing cancer cells. We do not discuss the important topics of G4 functions, enzymes resolving G4s and nanotechnological developments, as they have been the subjects of recent excellent reviews (1-4,9-12,20-23).

Lack of specific recognition of G4 structures by known ligands
Many of the known G4 binders have a planar aromatic moiety (Figure 2), which binds to G4 structures via π-π stacking interactions with a terminal G-quartet (11,16). The ligand-G4 complex is further stabilized by electrostatic attraction between the G4 backbone phosphates and the protonated groups of the ligand. This type of non-specific molecular recognition allows effective binding to a number of G4 targets; however, most ligands do not discriminate among different conformational classes of G4 structures. Efforts have been reported to modulate the molecular recognition of small molecules towards a higher selectivity of G4-ligand interactions (14). However, the design of compounds targeting only one or a few G4 topologies or sequences is challenging, as in vivo G4 polymorphisms are not fully determined and multiple G4 folds may be present at single sites in living cells.
Computational analyses of genomes have shown that PQS are widespread and non-randomly distributed in several different species (10,12). A feature that underlines the importance of G4s in biological processes is the high conservation of PQS in yeast species (5,6) and mammals (24-26). PQS are more conserved than expected by chance, and the nucleotides required to promote G4 formation are more conserved than the surrounding nucleotides. More than 370 000 sequences have been predicted to form a G4 in the human genome (25,26). A recent study of genomic PQS showed that single-nucleotide-loop PQS (such as G3NG3NG3NG3) are the most abundant in the genome of several species and are also significantly conserved in the human genome (6). The most conserved sequence in vertebrates (G3AG3AG3AG3) forms the least stable G4 in vitro and is the least prone to induce genomic instability in cells. Its frequency is higher than expected specifically in mammals (6). These findings suggest that the most stable PQS are negatively selected in favour of less stable G4s, reducing their effects on genome stability while maintaining G4 structures with a biological function.
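The kind of sequence-based PQS prediction discussed above can be illustrated with a minimal sketch. It assumes the widely used consensus of four runs of at least three guanines separated by three loops of 1-7 nucleotides; this consensus, the loop limits and the function names `find_pqs` and `reverse_complement` are illustrative assumptions and are not taken from the studies cited here. Such a pattern matches single-nucleotide-loop motifs like G3NG3NG3NG3 and G3AG3AG3AG3.

```python
import re

# Assumed consensus (not taken from the cited studies): four runs of >= 3 guanines
# separated by three loops of 1-7 nucleotides each.
PQS_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}?G{3,}){3}")


def find_pqs(sequence):
    """Return (start, end, motif) for non-overlapping PQS matches on the given strand."""
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in PQS_PATTERN.finditer(seq)]


def reverse_complement(sequence):
    """Return the reverse complement, so that the opposite strand can be scanned too."""
    complement = str.maketrans("ACGT", "TGCA")
    return sequence.upper().translate(complement)[::-1]


if __name__ == "__main__":
    example = "TTGGGAGGGAGGGAGGGTT"  # contains the G3AG3AG3AG3 motif
    print(find_pqs(example))
    # A C-rich sequence has no match on this strand but does on the opposite one.
    print(find_pqs(reverse_complement("CCCTCCCTCCCTCCC")))
```

A pattern of this kind only reports canonical motifs; folds with longer loops, bulges or only two quartets would require a more permissive pattern or an experimental approach.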
A different approach based on a polymerase-stop assay and Illumina sequencing was used to identify G4 structures in the human genome (27). The approach allowed the identification of PQS at sites of polymerase stops caused by G4 folding of the template DNA in the presence of pyridostatin (Figure 2). Interestingly, the authors found >700 000 PQS in the human genome, 63% of which were not predicted by bioinformatic analyses with Quadparser (24,25), including long-looped and bulged structures. The experimental mapping approach was recently improved and applied to identify PQS in 12 different species (28). Interestingly, PQS are enriched at genic control regions, such as promoters, in human and mouse genomes, but not in other species, suggesting a potential role in transcription regulation that seems to be specific for mammals and a few other distant species (28). Moreover, the authors showed that pyridostatin stabilizes many different DNA G4 structures. In particular, G4s with only two guanine quartets were highly enriched in ligand-treated samples (28). Overall, pyridostatin can stabilize several different G4 structures in the human genome, including bulged, long-looped and two-quartet structures. Thus, the data indicate that a G4-interacting compound can bind and stabilize several different G4 assemblies, and this is likely to result in low target selectivity in living cells.

G4 formation: strand separation is necessary but likely not sufficient
G4 formation in living cells is under complex regulation mechanisms likely governed by protein factors and physical conditions of nucleic acids. The impact of helicases, endonucleases, polymerases and other specific factors on G4 regulation has been discussed recently (4,10,20,21). In living cells, a first constraint to G4 formation is the presence of chromatin structure, which likely poses a prerequisite for G4 formation: the removal of nucleosomes at PQS, as supported by the high G4 density observed at DNase I-hypersensitive sites (24,29). In addition, G4s have been mapped at nucleosome-free DNAs adjacent to fixed nucleosomes, suggesting a role of G4s in nucleosome positioning (30). Thus, a nucleosome at a PQS would counter G4 folding of the sequence, and repressive chromatin structures can prevent dangerous consequences such as G4-induced genomic instability (3,10).
Binding interactions of G4s with specific ligands have mostly been studied using single-stranded DNA substrates. However, with the exception of 5'-TTAGGG repeats at the 3' end of telomeres, G-rich sequences are usually paired with the complementary C-rich sequences throughout the genome. Many G4s can readily form in free single-stranded DNAs; nevertheless, in vitro analyses showed that G-rich sequences prefer to form a duplex rather than a G4 under physiological conditions (31-34). Thus, with the exception of telomeres, the melting of a DNA duplex is the second prerequisite in living cells for the subsequent formation of G4s or other non-B structures. The energy for melting the two strands of a duplex can likely come from the negative torsional tension generated by an elongating RNA polymerase, as predicted by the twin-supercoiling domain model (35). Behind the RNA polymerase, negative supercoils can be transformed into strand separation, and this would allow the formation of G4s in G-rich stretches of melted strands. However, detailed analyses of G4 formation in relaxed and negatively supercoiled plasmid DNAs have shown that negative superhelicity is not sufficient to drive the formation of G4s in plasmids in vitro (30,36). Even though negative torsional tension may accumulate locally at higher levels in chromatin than in a plasmid in vitro (37), these findings contrast with results showing the ready formation of other non-B DNA structures (Z-DNA, cruciforms etc.) in negatively supercoiled plasmids (38,39). The formation of non-B structures may, however, follow distinct mechanisms depending on the nature of the specific structure; in particular, G4 assembly likely proceeds via a slower reaction involving discrete pre-folded intermediates (40).
In fact, in comparison with other non-B secondary structures, G4 folding may be characterized by a higher kinetic barrier, as the melting of a relatively longer duplex region is required for even the initial G-quartet formation (36). Therefore, negative torsional tension can be insufficient to drive G4 formation through strand separation.
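To make the link between negative supercoiling and strand separation more concrete, the following sketch uses the standard textbook definitions of the relaxed linking number (Lk0 = N/h) and the superhelical density (sigma = (Lk - Lk0)/Lk0). These definitions, the helical repeat of about 10.5 bp per turn and the example plasmid are assumptions introduced here for illustration; they are not values reported in the studies discussed above.

```python
# Approximate helical repeat of relaxed B-DNA in bp per turn (assumed textbook value).
HELICAL_REPEAT_BP = 10.5


def relaxed_linking_number(length_bp, helical_repeat=HELICAL_REPEAT_BP):
    """Lk0: linking number of a relaxed closed-circular DNA of the given length."""
    return length_bp / helical_repeat


def superhelical_density(linking_number, length_bp, helical_repeat=HELICAL_REPEAT_BP):
    """sigma = (Lk - Lk0) / Lk0; negative values indicate negative supercoiling."""
    lk0 = relaxed_linking_number(length_bp, helical_repeat)
    return (linking_number - lk0) / lk0


def turns_absorbed_by_melting(melted_bp, helical_repeat=HELICAL_REPEAT_BP):
    """Helical turns removed (negative supercoils absorbed) when melted_bp are unwound."""
    return melted_bp / helical_repeat


if __name__ == "__main__":
    # Hypothetical 5250-bp plasmid with 30 negative supercoils (Lk = 470, Lk0 = 500).
    print(superhelical_density(470, 5250))
    # Melting a 21-bp stretch absorbs about two negative supercoils.
    print(turns_absorbed_by_melting(21))
```

Under these assumptions, melting a longer duplex region absorbs proportionally more supercoils, which only describes the energetic bookkeeping; it does not capture the kinetic barrier of G4 folding discussed above.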
In contrast with the above investigations, another study showed that a c-myc promoter G4 can readily form in negatively supercoiled but not relaxed plasmids (41). At this specific promoter sequence, however, G4 assembly is coupled with the concomitant formation of an unusually stable i-motif on the opposite strand under physiological conditions (41). An i-motif is a non-canonical structure of a C-rich DNA strand containing C-C+ base pairs that requires slightly acidic conditions in vitro (1,11). Thus, the data document that the i-motif on one strand can stabilize the G4 on the other strand, which can explain the ready formation of G4s in a supercoiled plasmid (41). Interestingly, a different study reported that short peptide nucleic acids (PNA) able to bind a C-rich strand region of the human BCL2 promoter induced the formation of a G4 in the opposite G-rich strand and, vice versa, that the invasion of a PNA into the promoter DNA duplex required G4 formation (42). Collectively, these studies are therefore consistent with the hypothesis that the formation of a G4 in living cells requires both the negative torsional tension of the DNA duplex and a concomitant stabilization of the complementary strand, which may help in overcoming the kinetic barrier of G4 folding.
Contrasting results have been shown in the case of a microsatellite DNA found at about 350 bp from the insulin gene (insulin-linked polymorphic region) where either G4 or i-motif, but not both, can form in a linear duplex DNA under acidic conditions (43). The results may therefore depend on the experimental conditions that were not physiological in this case. However, non-B structures can compete with each other, regulating transcription as shown for the BCL2 gene (44,45). In nuclear chromatin, topological domains can dynamically change and supercoiling density is likely not uniform along a given genomic region (46). Nearby non-B structures can compete to absorb the mechanical energy stored in negative supercoils generated by the transcriptional apparatus (47). Thus, in living cells, the formation of any given G4 may also be dependent on the competing folding of nearby non-B structures, particularly in negatively-supercoiled DNA regions.

Interplay among G4s, R-loops and DNA supercoiling
A single-stranded DNA in a promoter can be stabilized by several means, including binding to specific transcription factors (48), formation of a secondary structure (such as an i-motif) or binding to a nucleic acid (DNA or RNA) other than the original complementary DNA strand. Among these possibilities, the formation of a hybrid DNA:RNA duplex on the opposite strand occurs in the case of an R-loop, another non-canonical secondary structure. R-loops are three-stranded structures wherein the two strands of a DNA duplex are melted and one of them is annealed to an RNA, forming a hybrid duplex, while the other strand is displaced (Figure 3A). R-loops are favored by G-rich sequences on the coding strand, as established by thermodynamic measurements of the stability of DNA:DNA and DNA:RNA duplexes showing that the hybrid duplex is favored mainly in the presence of Gs in the RNA strand (49,50). Thus, in genomic regions with a G-rich coding strand, the formation of an R-loop would lead to a displaced strand that is rich in Gs, therefore allowing the assembly of a G4 structure. It is worth considering that the length of a PQS is in the order of a few tens of bases, whereas an R-loop can extend for hundreds or thousands of bases, as shown in genome mapping studies (51,52) (see also Figure 3B). Therefore, interactions between the two structures are unlikely to involve the entire length of the displaced strand of R-loops.
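As a rough illustration of what a G-rich coding strand means in practice, the sketch below slides a window along a coding-strand sequence and reports the windows in which the G fraction exceeds the C fraction by a chosen margin. The window size, step, threshold (`min_excess`) and function names are arbitrary assumptions made for this example; they are not criteria used in the mapping or thermodynamic studies cited above.

```python
def base_fraction(sequence, base):
    """Fraction of positions in sequence occupied by the given base."""
    return sequence.count(base) / len(sequence) if sequence else 0.0


def g_rich_windows(coding_strand, window=50, step=25, min_excess=0.1):
    """Return (start, G fraction, C fraction) for windows where G exceeds C by min_excess.

    The thresholds are illustrative assumptions, not validated predictors of R-loops.
    """
    seq = coding_strand.upper()
    hits = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        g_frac = base_fraction(chunk, "G")
        c_frac = base_fraction(chunk, "C")
        if g_frac - c_frac >= min_excess:
            hits.append((start, round(g_frac, 3), round(c_frac, 3)))
    return hits


if __name__ == "__main__":
    # Synthetic 200-nt sequence: a G-rich half followed by a C-rich half.
    synthetic = "GGGA" * 25 + "CCCT" * 25
    for start, g_frac, c_frac in g_rich_windows(synthetic, window=50, step=50):
        print(start, g_frac, c_frac)
```

With the synthetic sequence only the two windows in the G-rich half are reported, which corresponds to the region where the displaced strand of an R-loop would be rich in guanines.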
As DNA:RNA hybrid formation also requires strand separation of the original DNA duplex, the most important factors for R-loop formation are the negative torsional tension of the DNA duplex and the presence of a complementary RNA transcript, which naturally occurs in transcribed genes. The close structural relationship among negative DNA supercoiling, transcription and R-loops was first recognized in the 1970s. As nascent transcripts in vitro were found in a nuclease-resistant complex with DNA depending on the superhelical density of the DNA template, Richardson (53) clarified that the nuclease-resistant transcript was bound to the template strand by base pairing, forming a hybrid duplex, and that the formation of such a hybrid is markedly affected by the superhelical tension of the bacteriophage PM2 DNA studied. This early observation established that in vitro transcription-dependent R-loop formation strongly depends on negative supercoiling of the template.
The relationships between R-loops and DNA topology, and the biological consequences of R-loop formation, were then investigated in detail in living cells by the rigorous work of M. Drolet and collaborators in the 1990s (54,55), anticipating the broad interest in R-loop biology of the last two decades. Working with plasmids and Escherichia coli, they established that R-loops are highly sensitive to the supercoiling level of the template DNA both in vivo and in vitro, and that altering the balance of R-loop levels in the genome can lead to severe growth and segregation defects in bacteria (56,57). In agreement with this knowledge, bacterial DNA topoisomerase I, an enzyme able to reduce negative torsional tension of the DNA template, is a critical factor in modulating R-loop levels in E. coli cells (58,59).
More recent works have extended our knowledge of R-loop biology and DNA topology, providing data showing that R-loops are a common non-canonical structure widely present in bacterial genomes and in eukaryotic nuclear and mitochondrial genomes (52,60-62). R-loops are dynamically formed at highly transcribed genes, in particular at their 5' and 3' ends. In mammalian cells, negative supercoiling can accumulate at transcription start sites of active genes, thus favoring strand separation of the DNA template (30,37,63,64). Even though a low level of R-loops is present at any given locus at steady state, R-loops can cover 3-5% of the genome, as determined in many organisms by several groups (65-71). In agreement with findings in bacterial cells, DNA topoisomerase I, which is very active in reducing negative supercoils, is a main player in modulating R-loop levels in nuclear chromatin of yeast (72), plants (73) and human cells (51). Consistently, human DNA topoisomerase I has been shown to be recruited along transcribed genes and to be activated by the elongating RNA polymerase II to relax the DNA supercoils generated by the polymerase, thus achieving an efficient transcription process (74). The impact of negative DNA torsional tension on DNA structure is not limited to R-loops and G4s, as dynamic topological changes can affect the formation of other non-canonical structures of the DNA template (30). Notably, the interplay between DNA supercoiling and non-canonical structures can have a critical function in regulating the transcription of the c-myc gene (37,46). R-loop mapping along the genome provided information on R-loop length, showing that R-loops can extend for several hundreds of bases at each locus (64), therefore providing the melting of long sequences that is required for G4 formation on the opposite strand. An early demonstration that this can occur came from N. Maizels' laboratory (75), working in in vitro systems as well as in bacterial cells.
By using electron microscopy and biochemical assays, they showed that high transcription rates induced the formation of a novel structure termed G-loop, constituted by G4 structures on the G-rich non-template strand and a stable RNA/DNA hybrid on the template. Formation of G-loops was dependent on transcription, on the G-richness of the non-template strand, which can also reflect a higher stability of hybrid duplexes, and on negative supercoiling of the template DNA. In agreement with these findings, a recent report showed that an intramolecular DNA G4 and a hybrid duplex promptly form upon in vitro transcription of a plasmid and, once formed, the two structures persist for several hours at physiological temperature even after transcription was stopped (76). Moreover, formation and persistence of the G-loop strongly depended on the hybrid duplex (76). Atomic force microscopy was used to analyze G-loops generated by transcription of a murine immunoglobulin Sγ3 switch region cloned into a plasmid, showing that the structure was dependent on the presence of the hybrid duplexes, as G-loops disappeared upon RNaseH1 treatment (77). Another study showed that the structural organization of the non-template strand is a fundamental feature of R-loops, even though the structural characteristics of the non-template strand were not clearly defined (78). Interestingly, depletion of telomeric RNA transcripts (TERRA) in human cells resulted in a decrease in telomeric G4 structures, suggesting that R-loops may affect G4 formation at telomeres (79). Overall, the findings are consistent with the hypothesis that the displaced strand of an R-loop can fold into G4 structures also in chromatin and that G4s and R-loops can likely assemble at the same time at highly active genes in living cells.
Nucleic Acids Research, 2020, Vol. 48, No. 21
Interestingly, ssDNA-binding proteins have been reported to bind the displaced strand of R-loops. In particular, RPA (replication protein A), a heterotrimeric protein complex, can bind and stabilize ssDNA segments, playing a critical role at replication forks to coordinate sensing of excess ssDNA, activation of DNA damage checkpoints, replication and recombination (80,81). RPA can therefore sense and stabilize the displaced strand of an R-loop and then recruit different factors to regulate R-loops during transcription, preventing genome instability (82). Consistently, the Arabidopsis ssDNA-binding protein AtNDX can also bind to the displaced strand and stabilize the R-loop at the COOLAIR promoter, resulting in the inhibition of COOLAIR transcription (83). Altogether, the findings therefore support the view that the formation of a G4 in one strand is highly favored by a hybrid duplex in the opposite strand and, vice versa, that the formation of an R-loop is highly favored by the stabilization of the displaced strand by G4s or ssDNA-binding proteins (Figure 3A).

Hybrid G4s and R-loops
G4 structures are heterogeneous and many distinct conformations have been reported (1,2,11,84); however, their roles in the mechanism of action of G4 binders are not known. A peculiar inter-strand G4 structure has been described in the conserved sequence block II (CSB II) of the human mitochondrial genome (85,86). CSB II is a G-rich sequence that critically regulates the initiation of leading-strand replication starting from an RNA primer generated by the transcriptional apparatus. During transcription of CSB II, a long-lived R-loop can form, likely due to the formation of a stable hybrid G4 constituted by the non-template DNA strand and the nascent RNA. The stability is likely due to the RNA, which is annealed to the DNA template with its 5' portion while forming the hybrid G4 with its 3' portion. This peculiar structure leads to the termination of transcript synthesis, therefore allowing the use of the RNA as a primer for DNA synthesis in human mitochondria. The hybrid G4 can therefore play a main role in transcription inhibition and regulation of mitochondrial DNA replication (85,86). A recent paper showed that RHPS4, a known G4 binder, is unable to induce DNA damage in the nucleus at low doses; however, it can trigger mitochondrial respiratory-complex depletion by specifically impairing mitochondrial transcription (87). Although the paper did not identify the target of RHPS4, the authors suggest a specific impairment of transcription regulation at the CSB II locus, raising the question of whether RHPS4 may specifically interact with the hybrid G4 structure involving the CSB II sequence.
The discovery of R-loop/hybrid G4 structures emphasizes the high conformational potential of non-B secondary structures and their specific functions. Hybrid G4s may also be present at many loci of the nuclear genome, as suggested by bioinformatic analyses and by the in vitro detection of such a structure at the human NRAS promoter (88,89) and at human telomeres (90,91). Interestingly, hybrid G4s may also occur at the immunoglobulin heavy-chain (IgH) locus during class switch recombination (CSR) (92). The IgH locus has G-rich sequences at the switch region that likely form G4 structures, which can in turn bind and recruit the AID enzyme, promoting DNA mutations (93). The switch region can also form R-loops, which trigger the recombination mechanism leading to the change of the constant region of immunoglobulins (94). New findings showed that DEAD-box RNA helicase 1 (DDX1) is needed for CSR in murine B lymphocytes. Interestingly, DDX1 interacts with RNA G4s formed in the transcript from the switch region, promoting their resolution and the annealing of the same RNA sequence to the template DNA strand, forming an R-loop over the switch region (92). Consistently, stabilization of the RNA G4 with an RNA-specific ligand prevented the formation of the R-loop. As the same PQS is present in both the transcript and the DNA, the authors proposed that hybrid DNA:RNA G4s might form at the IgH locus switch region involving the non-coding RNA and the displaced strand of the R-loop. The new DDX1-dependent mechanism therefore explains the recruitment of AID to the displaced strand of the R-loop, thus allowing the mutation of the DNA strand by AID (92,95). It is noteworthy that non-toxic doses of RHPS4 (Figure 2) have been reported to inhibit CSR in murine B cells in culture and in animals, showing a therapeutic effect on allergic inflammation (96).
Thus, the findings overall provide evidence of the occurrence in living cells of DNA:RNA hybrid G4s and their close interplay with R-loops. Interestingly, R-loop formation may be required to recruit RNA G4s at specific sites, such as the mitochondrial CSB II motif and the IgH locus switch region. Structures constituted by hybrid G4s and R-loops are exciting novel themes in the G4 field, as they may offer more structural opportunities to design target-specific compounds.

Dynamics of G4s and R-loops in human cells
The formation of G4s and R-loops can be visualized in cultured cells by immunofluorescence microscopy (IF) using the specific antibodies BG4 and S9.6. BG4 is a useful tool to detect G4s in cells, as it specifically recognizes intramolecular and intermolecular DNA and RNA G4s with high affinity (Kd = 0.5-2 nM) (97,98). S9.6 is a mouse monoclonal antibody developed by Boguslawski et al. (99) that binds to DNA:RNA hybrids with nanomolar affinity. However, as it can also target double-stranded RNA (100), appropriate controls are always needed in S9.6-based assays to detect R-loops in cells (51,64,71).
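To give a feeling for what a dissociation constant in the 0.5-2 nM range implies, the following sketch evaluates a simple one-site binding isotherm, fraction bound = [L] / ([L] + Kd). The 1:1 binding model, the approximation that the free antibody concentration equals the added concentration, and the 10 nM example are assumptions made for illustration; they are not part of the cited antibody characterizations.

```python
def fraction_bound(free_ligand_nm, kd_nm):
    """Fractional occupancy of a target for simple 1:1 binding at equilibrium.

    Both arguments are concentrations in nM; free ligand is assumed to be in excess
    over the target so that it approximates the total added concentration.
    """
    if kd_nm <= 0:
        raise ValueError("Kd must be positive")
    if free_ligand_nm < 0:
        raise ValueError("ligand concentration cannot be negative")
    return free_ligand_nm / (free_ligand_nm + kd_nm)


if __name__ == "__main__":
    # Occupancy at a hypothetical 10 nM antibody for the two ends of the Kd range.
    for kd in (0.5, 2.0):
        print(f"Kd = {kd} nM -> fraction bound = {fraction_bound(10.0, kd):.2f}")
```

Under these assumptions, a ligand concentration equal to Kd gives half-maximal occupancy, and a hypothetical 10 nM antibody would occupy roughly 95% and 83% of accessible targets at Kd values of 0.5 and 2 nM, respectively.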
G4 foci were commonly observed after long treatments (24 h) with G4 binders (97); nevertheless, under these conditions it remains undetermined whether G4s were directly stabilized by the binder or whether the increase was indirectly mediated by cellular mechanisms. Recent kinetic analyses provided strong evidence that a number of structurally unrelated G4 binders can stabilize G4 structures and increase the level of nuclear G4 foci in cultured cells rapidly and transiently (71,101). G4 stabilization in cells follows biphasic kinetics, with an immediate (5 min) increase of G4 foci and a second phase of G4 level reduction, as detected by IF with the BG4 antibody (71). Cellular kinetics are very rapid, as the number of G4 foci returns to initial levels in 20-30 min. The immediate induction of G4 foci indicates that the studied compounds can act at their targets and stabilize G4 structures in nuclear chromatin of cancer cells.
Very similar kinetics were observed for R-loop levels in cells treated with G4 ligands (71) and with Topoisomerase I poisons (102,103). An immediate increase of R-loop levels, as detected with the S9.6 antibody specific for DNA:RNA hybrids, was followed by a marked decrease of R-loops. In the case of Topoisomerase I poisons, the biphasic kinetics paralleled the levels of Top1ccs (DNA-enzyme cleavage complexes, which are the molecular signature of poison activity), consistent with the hypothesis that poisoning of Top1 can directly cause the R-loop increase (102). Immediate molecular perturbations by Topoisomerase I poisons at the cellular level are not restricted to R-loop levels and have been discussed previously (104).
Both G4s and R-loops can form at transcribed genes in living cells (52,60); nevertheless, they are highly dynamic, as several helicases, RNA-binding factors, endonucleases and DNA topoisomerases are active in cells to dissolve the structures, restoring B-DNA duplexes and nucleosomes. The mechanisms underlying the biphasic curves remain to be established; however, a simple hypothesis is that helicases or other enzymes able to dissolve G4/R-loop structures are promptly activated in response to the rise in these structures. A steady-state equilibrium is generally set at low levels in cells and is likely a balanced outcome of molecular activities that induce the formation of G4s and R-loops, on the one hand, and factors promoting their dissolution, on the other. Thus, the global levels of G4s and R-loops are perturbed in a dynamic manner by external agents, and cellular regulatory mechanisms promptly respond to restore the initial overall levels. Such a dynamic R-loop/G4 interplay can have significant biological consequences, as discussed for replication and DNA repair in a recent review (105).
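The balance between formation and dissolution described above can be illustrated with a deliberately simple toy model. It is only a sketch of the hypothesis, not a fitted or published model: the two rate equations, the integral-type feedback that raises the resolving activity while the structure level is above baseline, the parameter values and the function name `simulate_biphasic` are all assumptions chosen to reproduce a qualitative rise-and-return curve.

```python
def simulate_biphasic(k_form=1.0, k_res=1.0, stabilization=5.0,
                      feedback_gain=0.1, t_end=60.0, dt=0.01):
    """Euler integration of a toy model of G4/R-loop levels after binder addition.

    Assumptions (all hypothetical, arbitrary units, time in minutes):
      d(level)/dt   = k_form - (k_res / stabilization + induced) * level
      d(induced)/dt = feedback_gain * (level - baseline)
    The binder is added at t = 0 and lowers the basal resolution rate; the induced
    resolving activity grows while the level stays above its pre-treatment baseline.
    """
    baseline = k_form / k_res            # steady-state level before treatment
    k_res_bound = k_res / stabilization  # slower basal resolution of bound structures
    level, induced = baseline, 0.0
    times, levels = [0.0], [level]
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        d_level = k_form - (k_res_bound + induced) * level
        d_induced = feedback_gain * (level - baseline)
        level += d_level * dt
        induced += d_induced * dt
        times.append(i * dt)
        levels.append(level)
    return times, levels, baseline


if __name__ == "__main__":
    times, levels, baseline = simulate_biphasic()
    peak = max(levels)
    print("baseline:", baseline)
    print("peak level:", round(peak, 2), "at t =", round(times[levels.index(peak)], 1), "min")
    print("level at end:", round(levels[-1], 3))
```

With these arbitrary parameters the level rises within a few minutes and then relaxes back towards the pre-treatment baseline, which is the qualitative shape of the biphasic curves; the actual molecular mechanisms remain to be established.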

G-loop role in the induction of DNA damage by G4 binders
It is now established that chemical G4 stabilization can trigger genome-wide DNA double-stranded breaks (DSB) and genome instability; nevertheless, the mechanism of damage is not yet fully clarified, even though different molecular pathways leading to DSB formation have been proposed. One mechanism can be the direct cleavage of DNA at G4 structures, which may occur in some circumstances. G4s are recognized by several protein factors but are generally resistant to nucleases without a prior resolution of the secondary structure. Interestingly, the multifaceted DNA2 enzyme, critical for telomere stability, has been shown to have such an activity. DNA2 can bind and unwind telomeric G4s in vitro. In addition, its nuclease activity is activated by interactions with RPA, which is a determinant factor for specific G4 cleavage by DNA2 (106). The nuclease has been proposed to function during telomere replication and processing of Okazaki fragments at replication forks (106,107), possibly explaining G4-mediated DNA cleavage at telomeres and elsewhere in the genome during S phase. Interestingly, DNA2 depletion was shown to increase fragile telomeres induced by two G4 binders, TMPyP4 and telomestatin (Figure 2) (107); however, the role of DNA2 and other endonucleases in the induction of DNA cleavage by G4 binders needs to be established in relation to basic nuclear processes and genomic regions.
An important mechanism of DNA damage production can be replication fork stalling and collapse at a G4 structure on the template strand. G4 structures can constitute a physical impediment to replication progression at leading and lagging strands (3,21), and unresolved replication barriers trigger recombination-dependent restart of DNA synthesis and DSB formation, likely by structure-specific endonucleases (for instance, MUS81 and XPF) active at yet undefined replication intermediates (23). DSB are usually investigated by determining the formation of γH2AX and p53BP1 foci, which are molecular markers of DSB, and by assessing cellular effects of the ligand such as cell cycle arrest at G2/M phase and activation of the DNA damage response pathway (DDR). The induction of replication-dependent DNA damage by G4 binders was first established at telomeres using different ligands causing a fragile telomere phenotype (108-110). However, it was soon established that G4 binders can cause DNA damage across the entire genome in cultured cells, showing a high dependence on S-phase and DNA replication (111). G4 binders may have a stronger effect in cells deficient for the homologous recombination DSB repair (HRR) pathway, supporting a main role of this mechanism in G4 binder-induced DSBs. This knowledge came from observations in cancer cells deficient for BRCA1/2 genes, critical players of HRR, as G4 binders were shown to be more active in DSB accumulation and persistent checkpoint activation in these cell types (112). In addition, G4 binders were also more active in reducing cell proliferation and inducing chromosomal aberrations. As BRCA1/2 gene mutations have prognostic value in cancer patients, the findings can have a high impact on the development of G4-interacting compounds effective in clinical settings.
DNA breaks caused by a replication-blocking G4 have been investigated in the model organism Caenorhabditis elegans, revealing peculiar effects of G4 stabilization and a main error-prone repair mechanism leading to genetic instability (113). Genetic evidence showed that a persistent G4 structure can be transmitted through multiple mitotic divisions to daughter cells. As the persistent G4 can block replication causing a strand break gap in the annealed strand, the structure then leads to a DSB at the next round of DNA duplication in the daughter cell. The DSB is then repaired by an error-prone mechanism based on DNA polymerase theta (POLQ), leading to specific DNA deletions (113). Interestingly, translesion synthesis DNA polymerases were found to protect C. elegans from genomic deletions caused by G4-induced replication blocks (114). As molecular and genetic differences exist between C. elegans and mammalian cells, these mechanisms remain to be defined in human cells.
Another important mechanism of G4 binder-induced DNA damage is more related to the transcription process, as transcribing PQS can be challenging and can lead to genome instability. This mechanism has been suggested by investigating DNA damage caused by pyridostatin (Figure 2). Treatment of cancer cells with the ligand resulted in a fraction of γH2AX foci that was transcription dependent, and DSBs induced by pyridostatin were mapped at transcribed genes enriched for PQS, such as ribosomal genes and the SRC oncogene, but not at inactive genomic regions (111). Some insights into the transcription-dependent mechanism were reported recently. Stabilization of G4s by three structurally unrelated ligands (pyridostatin, FG and Braco-19, Figure 2) in cancer cells was shown to increase the global levels of nuclear G4/R-loop structures (71). The genome-wide mapping of R-loops in cells treated with either one of two binders (pyridostatin or FG) showed that G4 binder-induced R-loop peaks were commonly found in transcribed genes and were longer than corresponding peaks in untreated cells (Figure 3B) (71). The findings supported a simple mechanistic model in which binder-stabilized G4s in the displaced strand of an R-loop can result in an overall stabilization of G4/R-loop structures with a longer hybrid duplex at transcribed genes (Figure 3), in agreement with transcription-dependent G-loop formation in E. coli and plasmids (75). These findings were recently confirmed with monohydrazone derivatives that specifically bind to G4s (101).
Interestingly, the G4/R-loop increase preceded the formation of DNA damage, as shown by the formation of γH2AX foci, and overexpression of RNaseH1 in cells abolished DNA damage and cell death induced by the studied G4 binders (71). Moreover, G4 binders caused the generation of micronuclei at later times of cell growth in an R-loop-dependent manner, and BRCA2 silencing enhanced the overall effect (71). Micronuclei are chromatin fragments enveloped by a nuclear double-layer membrane and are generated following cell division defects including missegregation of chromosomes or chromosome portions. As micronuclei constitute a clear marker of genome instability, these recent findings show that G4 binders can increase genome instability in cancer cells through a mechanism dependent on R-loops and recombination repair (Figure 4). Collectively, the findings thus support a role for G-loops in the induction of DNA cleavage and micronuclei by G4 binders. The mechanism of DSB formation is unknown; however, R-loops can be substrates of structure-specific endonucleases such as XPF and XPG, which are main players of transcription-coupled repair pathways (115-118). The cleavage of both DNA strands by XPF/XPG at the boundary of the hybrid may then lead to a DSB (Figure 4); however, whether these or other nucleases (117) can process G-loops remains to be established (Table 1).
Table 1. The list indicates whether an R-loop role has been established in molecular mechanisms reported for the G4 binders discussed in the text.

Genome stability is challenged by perturbations of steady-state levels of G-loop structures
G-loops may occur in nuclear chromatin more often than previously recognized, and G4 binders may exert their biological activity through interference with these complex structures. An extensively studied genomic site is the murine IgH locus, where G-loops form during immunoglobulin class switch recombination (CSR) (94). R-loops and G4s form on the template and non-template strands, respectively, during transcription of switch regions to trigger DNA breaks and the recombination pathway. Interestingly, one function proposed for G-loops is related to the link between CSR and cell replication. Replication origins are present in switch sequences, and G4s can function as a loading substrate for the recruitment of the origin recognition complex (ORC) (119-121). R-loops have been shown to favor the physical proximity (synapsis) of replication origins firing at multiple sites within the 3-12 kb-long recombining switch regions. Therefore, R-loops may promote DSB resolution by regulating long-distance origin interactions, thus explaining the dependence of CSR on S phase and cell proliferation (122). The mechanism of transcription-induced genome instability has been investigated in yeast using the murine Sμ switch region (123,124), which can readily form G-loops during transcription. Using genetic screens, Topoisomerase I was identified as a critical factor in suppressing gross chromosomal rearrangements and loss of heterozygosity at the PQS of the switch region. It is known that Topoisomerase I can regulate R-loops, likely by reducing negative supercoils of the template (51,72); however, the authors documented that this enzyme function is not sufficient to fully suppress genome instability at PQS (124). Interestingly, Topoisomerase I can directly bind to G4 structures in vitro (125,126), localize to telomeres in yeast (127) and cleave telomeric G-rich repeats (128). In addition, the enzyme activity is inhibited by G4 aptamers (reviewed in (129)).
Thus, the findings in yeast on the Sμ switch region may suggest a complex mechanism of G-loop regulation by Topoisomerase I during transcription, which may involve the binding of the enzyme to G4s to recruit other factors, such as helicases, that resolve the structure and prevent gene rearrangements (124).
Unbalanced G4 stabilization is a consequence of mutation or cellular reduction of G4 resolvases, such as Pif1 and the RecQ-type Bloom helicase (BLM) (4). Fibroblasts from Bloom syndrome patients show significant genome instability in comparison to healthy fibroblasts, including a 10-fold increase in sister chromatid exchanges, which often occur at G4-promoting sequences and at fragile sites of transcribed genes (130). This may suggest that recombination at active genes might involve R-loops and is a major contributor to genome instability in BLM-deficient cells. Pif1 and related family members are among the most active helicases in resolving G4 structures and have been shown in yeast to prevent replication-dependent genetic and epigenetic alterations (131,132). In these studies, the authors showed that stabilized G4s caused replication fork arrests in a much wider region than the PQS. As Pif1 is also very active in resolving DNA:RNA hybrids (4), it is possible that G4 structures may contribute to the formation of longer G-loops, explaining the reported regional rather than site-specific fork arrests and the consequent impediment to the replication machinery (131).
Eukaryotic telomeric sequences can assemble into R-loops and G4s, suggesting that a functional interplay can occur between the two structures for telomere maintenance. The replication rate of telomeric DNA is slower in the presence of PhenDC3, a G4 stabilizer, or upon BLM depletion (133). BLM-deficient cells exhibited more G4 structures in telomeres than BLM-proficient cells, and the data indicated that the RecQ helicases allowed telomere replication by modulating G4s during G-rich strand synthesis (133). Telomeric G4s may either compete or cooperate with R-loops depending on the mechanism. For instance, an RNA:DNA hybrid occurs at telomeres due to telomerase, a reverse transcriptase that uses an RNA molecule as a template to extend the telomeric DNA. In addition, telomeric repeat-containing RNA (TERRA) can form R-loops in trans at telomeres to trigger the repair of short telomeres via RAD51-mediated homology-directed repair, as seen by DRIP-seq genome mapping (134-136). Interestingly, it has been demonstrated that ATRX (a G4-binding chromatin remodeling factor) can be recruited at tandem repeats with a high GC content, including telomeric repeats, in a transcription-dependent manner (137). ATRX is recruited at telomeres to suppress replication stalling, DNA damage and recombination pathways. The report showed that the absence of ATRX causes an increase of co-transcriptional G4/R-loops at telomeres, which can impair the proper maintenance of telomere structure (137). One can speculate that ATRX may be recruited at telomeres by telomeric G4s stabilized by the DNA:RNA hybrid on the opposite strand, as in G-loops.
RTEL1 (regulator of telomere length) is a critical helicase for maintenance and regulation of telomere length (4). Its depletion impairs the disassembly of the telomere-protecting T-loop (a lasso-like telomere organization) and enhances murine telomere fragility induced by the G4 binder TMPyP4 (138). Recent reports have extended the role of RTEL1 to protecting not only telomeres but the whole genome from replication/transcription conflicts at G-loop-forming regions (139,140). Interestingly, RTEL1 helicase activity has a critical role in completing DNA duplication during mitosis at fragile loci prone to form G-loops. In the absence of RTEL1, cells accumulate G-loops at those sites, which remain under-replicated, causing marked genome instability in daughter cells (140).

G4/R-loop disturbance of replication can impair epigenetic memory
Recent findings connect DNA and RNA G4s to DNA methylation and histone post-translational modifications, revealing novel functions of G4 structures in different mechanisms of chromatin and epigenetic regulation (reviewed in (3)). Interestingly, stabilization of G4s has been shown to impair epigenetic memory by gene silencing in a manner dependent on DNA replication. In the BU-1 locus of chicken DT40 cells, G4s can arrest processive replication at the leading strand, uncoupling DNA synthesis from histone recycling and nucleosome reassembly, so that the parental pattern of histone modifications is not precisely propagated (141,142). Several specialized DNA helicases (such as PIF1, WRN and BLM) and TLS polymerases can assist replication through G4 structures, and the helicase FANCJ and the polymerases Rev1 and PrimPol have been shown to be critical to prevent leading strand replication block by G4 structures at the chicken BU-1 locus (141,142). In particular, PrimPol can bind to G4s and is then able to reprime DNA synthesis a few bases downstream of the G4 structure. As PrimPol deletion causes epigenetic instability and loss of gene expression, G4s may often form impediments to leading strand replication (142). Interestingly, specific G4 binders can affect epigenetic memory at the chicken BU-1 locus, leading to the proposal to use such compounds in epigenetic reprogramming therapies (143). G4 binders could increase replication inhibition specifically at the locus, triggering local epigenetic instability by causing a loss of H3K4me3 and subsequently an increase of DNA methylation (143). This emphasizes the possibility of exploiting G4 binders as modulators of the epigenetic memory of specific somatic tissues in pathological contexts.
Epigenetic instability has also been reported to depend on R-loop levels at the same chicken BU-1 locus. In particular, increased genome-wide R-loop levels were detected upon PrimPol deletion and replication impediments caused by a purine-rich repeat, (GAA)10, or PQS during S phase (144). RNaseH1 overexpression resulted in a marked reduction of epigenetic instability, showing that R-loop formation enhanced the G4-dependent block of leading strand replication, while PrimPol repriming activity inhibited unscheduled R-loop formation (144). Although the structural relationships between the DNA:RNA hybrid and G4, or other secondary structures, need to be fully clarified, it is possible that a G-loop may form during replication at the BU-1 locus of chicken DT40 cells. An elucidation of G4/R-loop structures at this locus in S-phase cells may also be relevant to better understand replication/transcription conflicts caused by oncogene-induced high transcription rates or by cell treatments with G4 binders and other antitumor drugs in cancer cells (60,145).

G4 binders as triggers of altered gene expression programs
The stabilization of G4/R-loop structures by G4-interacting compounds may lead not only to replication arrest, DNA damage and genomic instability, but also to alterations of gene transcription. R-loops have been shown to pause RNA polymerases at different steps of the elongation process (62,64) and to regulate epigenetic mechanisms such as DNA methylation (60,146). As G4 binders can have a stabilizing effect on R-loops at active genes, this action may explain transcription rate alterations observed with G4 binders. Interestingly, a recent report has investigated the structural and functional interplay among G4s, R-loops and T7 RNA polymerase transcription using biophysical assays in vitro (147). The data show that transcription elongation efficiency depends on the relative orientation of PQS. In particular, when G4s form in the non-template strand, they increase the final RNA product level due to the formation of co-transcriptional R-loops. R-loop formation in turn favors the next round of T7 RNA polymerase binding to the promoter and, hence, transcription (147). On the other hand, G4 formation in the template strand can directly block RNA polymerases. As the mechanism by which G4 binders modulate gene transcription is not yet fully defined in living cells, it is of high interest to define R-loop roles in the transcriptional effects of G4 binders in future studies.
Although G4s can likely form at a multitude of active gene promoters, few studies have determined the genome-wide effects of G4 stabilization. Depletion of BLM or WRN helicases, which likely leads to G4 stabilization, causes not only genome instability as discussed above but also specific alterations of gene expression profiles (148-150). This has mechanistically been ascribed to G4 regulation of transcription, as reduced expression was observed mainly at genes harboring PQS. By studying the effects of a naphthalene diimide derivative, CM03 (Figure 2), using an RNA-seq approach on two pancreatic cancer cell lines, Marchetti et al. (151) showed after 6 h treatments a large down-regulation of many genes, which are rich in PQS and involved in essential pathways of pancreatic cancer cell survival and tumor progression. In another study (152), transcriptional expression profiles affected by the G4 binder AQ1 (Figure 2), an anthraquinone derivative developed to target the KIT promoter, were determined using an RNA-seq approach in canine and human cell lines. The findings confirmed the KIT gene expression down-regulation but also showed the down-regulation of MYC-related pathways and the up-regulation of p53, apoptosis and hypoxia-response pathways in both species. Beauvarlet et al. (153) investigated the effects of the triarylpyridine derivative, 20A (Figure 2), on growth arrest of cancer cells and antitumor activity in mice. Gene expression profiles were altered by the ligand in a manner dependent on the G4 density of genes. However, the ligand was able to induce global DNA damage, activating an ATM-dependent response and autophagy pathways, which would affect gene expression as well. The authors showed that ATM depletion could markedly reduce autophagy and senescence, leading cancer cells to death (153).
RNA G4s have been implicated in splicing regulation, and several mechanisms of alternative splicing alterations due to intronic and/or exonic RNA G4s have been proposed at selected genes, such as hTERT, TP53 and FMR1 (154-156). G4 binders may also affect splicing regulation by interacting with G4s folded in the pre-mRNA during maturation. Recent bioinformatic analyses of emetine (Figure 2) showed that this G4 binder could have a global effect in regulating RNA G4-dependent alternative splicing. In particular, the authors found that 60% of emetine-regulated exon skipping events contained potential G4 structures proximal to the splice site (157).
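As an illustration of how potential G4 structures are commonly located in sequence analyses of this kind, the following minimal Python sketch scans a sequence for the widely used canonical motif of four runs of at least three guanines separated by loops of 1-7 nucleotides. The names `PQS_PATTERN` and `find_pqs` and the example sequence are hypothetical choices made here, and the sketch is not the pipeline used in the cited study (157), whose criteria may differ.

```python
import re

# Canonical PQS motif: four runs of >= 3 guanines separated by three loops of 1-7 nt.
# U is accepted so that RNA sequences can be scanned as well.
# Assumption: only this simple motif is searched; other G4 topologies are ignored.
PQS_PATTERN = re.compile(
    r"G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}"
)

def find_pqs(sequence):
    """Return (start, end) spans of non-overlapping PQS matches on the given strand."""
    return [match.span() for match in PQS_PATTERN.finditer(sequence.upper())]

# Hypothetical example: human telomeric repeats flanked by arbitrary bases.
example = "ACGT" + "GGGTTAGGGTTAGGGTTAGGG" + "ACGT"
print(find_pqs(example))
```

The sketch only inspects the strand it is given; a PQS on the complementary strand would appear as the corresponding C-rich motif and would need a second scan of the reverse complement.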
Overall, these investigations demonstrate that genetic or chemical G4 stabilization can lead to reduced gene expression by inhibiting transcription or splicing, suggesting that G4 structures can constitute a barrier to RNA polymerase elongation. However, steady-state levels of transcripts were often determined after long times of treatment, and the data therefore cannot distinguish the contribution of cell response mechanisms from the direct effects of the studied ligands on transcription elongation. Even though further investigations are needed to fully clarify the mechanisms, the findings overall show that G4 binders might exert a pharmacological activity by specifically altering gene expression programs of cancer cells.
Gene expression can also be altered by interfering with the functions and/or stability of mRNAs and with protein synthesis by ribosomes. Initially, studies on G4s focused on DNA; however, these secondary structures can also form in RNA strands, where they are conformationally more stable as the ribose 2′-hydroxyl groups establish new intramolecular interactions (158,159). The single-stranded nature of transcripts may likely favor G4 folding of G-rich sequences of RNAs in vivo. A critical role of RNA G4s in cellular processes has been supported by computational analyses showing their enrichment at regulatory 5′- and 3′-UTRs, by enzymatic and/or chemical footprinting, largely used to reveal G4s in transcripts, and by RNA G4 structure visualization in living cells (98,160-162). Many studies focusing on single gene transcripts, such as FMRP, KRAS, NRAS, BCL-2 and VEGF, provided data supporting ligand inhibition of translation by stabilizing G4 folding in the studied mRNA (163-167). High-throughput RNA sequencing technologies allowed RNA G4 mapping throughout the entire transcriptome (168), emphasizing a role in translation regulation (169), mRNA localization, turnover and metabolism (170,171). By using a specific G4-RNA chemical pull-down followed by sequencing, it has been shown that transient G4 formation can occur in the human transcriptome and that two distinct G4 binders (Braco-19 and RHPS4, Figure 2) can influence the global G4 transcriptome landscape (172), documenting that they may exert a biological activity also by interfering with RNA G4-dependent regulation mechanisms.
Moreover, as for DNA G4s, RNA G4 helicases can play a role in mRNA regulation. In a genome-wide study, the inhibition by a natural compound of the helicase activity of eukaryotic initiation factor 4A (eIF4A) resulted in the translational down-regulation of a subset of genes harboring PQS at their 5′-UTRs, which included oncogenes and super-enhancer-associated transcription factors (173). Other specific RNA helicases can also affect protein synthesis, such as the cytoplasmic DHX36 helicase, which has been shown to promote mRNA translation (3).
Collectively, the findings show that DNA and RNA G4s can regulate mRNA structure, translation and overall gene expression, and G4 binders may thus alter gene expression programs of cancer cells. A common hallmark of cancer is the impairment of transcriptional regulation mechanisms, which can generate a dependency of cancer cells on altered transcriptional processes (174). Moreover, cancer cells may be characterized by an overall enhanced level of gene expression, mainly due to a high transcription rate driven by c-myc oncogene amplification or overstimulation (175). In this context, on the one hand, G4 and R-loop levels can be higher at cancer genes due to enhanced transcriptional activity (12,30). On the other hand, the transcriptional addiction of cancer cells may be exploited to develop G4 binders that are more effective in down-regulating the overall cancer transcriptional program by targeting many G4 structures at both gene and transcript levels.

CONCLUSION
The double-stranded nature of the human genome is not permanent but changes dynamically to allow fundamental processes. This offers an opportunity to modulate cellular activity by small compounds able to bind specifically to non-canonical nucleic acid secondary structures, such as G4s or even R-loops. Evidence shows that G4s are not only structurally compatible with R-loops but can also form contextually with them in in vitro systems and living cells. R-loop levels can be increased by G4 binders in cancer cells and are needed for the induction of DNA damage, cell killing and genome instability. Even though non-R-loop-mediated pathways of DNA damage induction by G4s and/or G4 binders are known (Figure 4), R-loops may play a critical role in the mechanism of action of G4 binders, and novel insights are likely to provide a new rationale to discover new clinically effective anticancer G4 binders.
Several thousand G4s can likely form in DNA and RNA strands, folding in several different conformations; however, known G4 binders cannot bind to only one or a few G4s, but rather they bind and stabilize different conformations and many genome-wide G4s. Thus, it is unlikely that known ligands can act at very few genomic loci. Several published data support a more general mechanism of action of known G4 binders, which can induce DNA damage, genome and epigenome instabilities and modification of gene expression programs, thus exerting biological activities including cell death and proliferation arrest. Therefore, future efforts need to focus on specific targets, mechanisms of action and global ligand effects in the context of specific cell and tissue types to gain new insights for anticancer G4 binder discovery.