NMR structure of a G-quadruplex formed by four d(G4C2) repeats: insights into structural polymorphism

Abstract The most frequent genetic cause of amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) is a largely increased number of d(G4C2)n•(G2C4)n repeats located in the non-coding region of the C9orf72 gene. Non-canonical structures, including G-quadruplexes, formed within expanded repeats have been proposed to drive repeat expansion and pathogenesis of ALS and FTD. Oligonucleotide d[(G4C2)3G4], which represents the shortest oligonucleotide model of d(G4C2) repeats with the ability to form a unimolecular G-quadruplex, forms two major G-quadruplex structures in addition to several minor species, all of which coexist in solution with K+ ions. Herein, we used solution-state NMR to determine the high-resolution structure of one of the major G-quadruplex species adopted by d[(G4C2)3G4]. Structural characterization of the G-quadruplex named AQU was facilitated by a single substitution of dG with 8Br-dG at position 21 and revealed an antiparallel fold composed of four G-quartets and three lateral C–C loops. The G-quadruplex exhibits high thermal stability and is favored kinetically and under slightly acidic conditions. An unusual structural element distinct from a C-quartet is observed in the structure. Two C•C base pairs are stacked on the nearby G-quartet and are involved in a dynamic equilibrium between symmetric N3-amino and carbonyl-amino geometries and a protonated C+•C state.

Figure S1. Imino and aromatic regions of the 1H NMR spectra of sl21 in the presence of K+ ions. Folding conditions are indicated next to each spectrum. A) Fast annealing in the presence of 30 or 120 mM KCl and pH 5.8. B) Slow annealing in the presence of 30 or 120 mM KCl and pH 7.2. Spectra were recorded at 800 (A) or 600 MHz (B), 25 °C in 10% 2H2O, 90% H2O, 30 or 120 mM KCl, pH 5.8 or 7.2 (20 mM K-phosphate buffer) and oligonucleotide concentrations around 0.1 mM.
Figure S2. The H6/H8-H2′/H2′′ region of NOESY (τm=200 ms) spectra and sequential walk for AQU adopted by wt22 (A) and sl21 (B). Assignments are shown next to the intraresidual H6/H8(n)-H2'(n) cross-peaks. The lines that connect the aromatic H6/H8-H2' cross-peaks are depicted with solid (between sequential residues) and dotted red lines (between non-sequential residues). Assignments corresponding to syn guanines are shown in pink, anti guanines in red and cytosine residues in black. The G10H2'-C12H6 cross-peak, visible at higher vertical scale, is marked with a letter X. NOESY spectra were recorded at 600 MHz, 5 °C in 100% 2H2O, 30 mM KCl, pH 5.8 (20 mM K-phosphate buffer) and oligonucleotide concentration of 0.4 mM.
Figure S3. Assignment of guanine residues in syn glycosidic conformation and H5-H6 cross-peaks of cytosine residues in AQU adopted by sl21. Aromatic-sugar region of A) NOESY (τm=80 ms) and B) TOCSY (τm=60 ms) spectra of sl21. Assignments of guanine residues in syn conformation are shown next to intraresidual H8-H1' cross-peaks. Assignments marked with rectangles denote H5-H6 cross-peaks of cytosine residues in AQU. The H5-H6 cross-peaks that correspond to cytosine residues of NAN, which is present as a minor species, are indicated by stars. Spectra were recorded at 800 (A) and 600 MHz (B), 25 °C in 100% 2H2O, 30 mM KCl, pH 5.8 (20 mM K-phosphate buffer) and oligonucleotide concentration of 0.4 mM.
Figure S6. The H6/H8-H1′/H3' region of NOESY (τm=300 ms) spectrum of AQU adopted by sl21 recorded at 45 °C. Assignments are shown next to the cross-peaks. The 1D 1H NMR spectrum of sl21 is shown above the 2D plot with assignment of H6 and H8 signals corresponding to AQU. Signals of NAN, which is present as a minor species, are indicated with stars. NOESY spectrum was recorded at 800 MHz, 45 °C in 10% 2H2O, 90% H2O, 30 mM KCl, pH 5.8 (20 mM K-phosphate buffer) and oligonucleotide concentration of 1.0 mM.

Assignment of C5, C6, C17 and C18 with the help of 5Me-dC substitutions
Symmetry, spectral overlap and signal broadening made the assignment of C5, C6, C17 and C18 difficult.
However, most of the signals corresponding to C5, C6, C17 and C18 protons could be assigned (Table S1).
The following signals were not visible in any of the recorded spectra of sl21: i) amino NH2 protons of C5, C6, C17 and C18; ii) aromatic H5 protons of C5 and C17; iii) sugar H4' protons of C6 and C18. The assignment of C5, C6, C17 and C18 was confirmed by comparison between the NOESY spectra of sl21 and oligonucleotides with dC to 5Me-dC substitutions (Table S2).
*Modified residue 5Me-dC is represented as Me C. Br G represents G21 which is substituted with 8Br-dG in all of the sequences.

In 5Me-dC, the H5 on the base moiety is replaced by a methyl group. Consequently, the H5-H6 cross-peak in the NOESY spectrum is not visible when dC is replaced by 5Me-dC in the sequence. NOESY spectra of oligonucleotides with dC to 5Me-dC substitutions show that one of the H5-H6 cross-peaks disappears only when C6 and C18 are simultaneously replaced by 5Me-dC (Figure S7). This indicates that a single H5-H6 cross-peak corresponds to overlapped cross-peaks of C6 and C18. The H5-H6 cross-peak of C5 and C17 was not visible in the NOESY and TOCSY spectra of sl21. In addition, the H5-H6 cross-peak of C5 is missing in the NOESY spectra of the methylated analogues of sl21, in which the C5H5-H6 would be expected (Figure S7).
Table S2. NOESY spectra were recorded at 25 °C in 10% 2H2O, 90% H2O, 30 mM KCl, pH 5.8 (20 mM K-phosphate buffer) and oligonucleotide concentration around 0.3 mM (5Me-dC substituted oligonucleotides) and 0.4 mM (sl21).
Assignment of the H6 signals of C5, C6, C17 and C18 was further corroborated by spectral similarity of double 5Me-dC substituted oligonucleotides. The NOESY spectra of sl21 analogues with dC to 5Me-dC substitutions are very similar to the NOESY spectrum of sl21, which suggests that the general fold of AQU remains very similar upon methylation (Figures S7, S8 and S9). In sl21, the expected C5H6-G4H2',H2", C17H6-G16H2',H2", C5H6-C5H2',H2", and C17H6-C17H2',H2" cross-peaks in the H6/H8-H2'/H2" region of the NOESY (τm=300 ms) spectrum are not visible since they are broadened to the baseline (panel A of Figure S4). Similarly, spectral analysis of oligonucleotides with 5Me-dC substitutions was complex, since some of the expected cross-peaks were not visible. For example, the NOE cross-peaks on the H6 resonance line of C5, C17, Me C5 and Me C17 in the aromatic H6/H8-H2'/H2" region of NOESY (τm=250 ms) spectra of the 5Me-dC substituted oligonucleotides are missing (Figure S8). However, the expected H6(n)-H2'/H2"(n) cross-peaks of C6, C18, Me C6 and Me C18 are visible (Figure S8). By comparing the NOESY spectra of different methylated analogues of sl21, we could confirm the C2-axis of symmetry in the C5-C6 and C17-C18 loops and unequivocally show that C5 and C17, as well as C6 and C18, are isochronous. This is clearly demonstrated in Figure S8, where it can be seen that only certain combinations of double dC to 5Me-dC substitutions in the C5-C6 and C17-C18 loops break the apparent C2-symmetry. In sl21, there is a single set of overlapped NOE cross-peaks along the resonance line which corresponds to C6H6 and C18H6. Breaking of symmetry is expected to appear as two distinct sets of cross-peaks. Symmetry was retained when C6 and C18 were simultaneously replaced with 5Me-dC, which is seen as a single set of cross-peaks on the H6 resonance line of Me C6 and Me C18.
Similarly, a single set of cross-peaks was observed for C6 and C18, when residues C5 and C17 were simultaneously
Table S2. Orange rectangles are shown around the CH2',H2"(n-1)-CH6(n) and CH2',H2"(n)-H6(n) cross-peaks on the resonance line of H6
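The symmetry argument used in this section can be illustrated with a minimal sketch. It assumes that the apparent C2 symmetry simply pairs C5 with C17 and C6 with C18, and that a pattern of 5Me-dC substitutions keeps the symmetry only if it maps onto itself under this pairing. The names `C2_PARTNER`, `retains_c2_symmetry` and `expected_h6_sets` are hypothetical and serve only as an illustration of the reasoning, not as part of the original analysis.

```python
# Hypothetical illustration of the C2 pairing of loop cytosines described
# in the text: C5 <-> C17 and C6 <-> C18.
C2_PARTNER = {5: 17, 17: 5, 6: 18, 18: 6}


def retains_c2_symmetry(substituted):
    """Return True if the set of 5Me-dC positions maps onto itself under C2."""
    positions = set(substituted)
    unknown = positions - set(C2_PARTNER)
    if unknown:
        raise ValueError(f"not a loop cytosine position: {sorted(unknown)}")
    return {C2_PARTNER[p] for p in positions} == positions


def expected_h6_sets(substituted):
    """Sets of cross-peaks expected per isochronous pair (assumption:
    one set if the symmetry is retained, two distinct sets if it is broken)."""
    return 1 if retains_c2_symmetry(substituted) else 2


if __name__ == "__main__":
    for pattern in ([6, 18], [5, 17], [5, 6], [5, 18], [5, 6, 17, 18]):
        print(pattern, "->", expected_h6_sets(pattern), "set(s) of cross-peaks")
```

Under these assumptions, simultaneous substitution of C6 and C18 (or of C5 and C17) gives a single set of cross-peaks, whereas a double substitution such as C5 with C6 is predicted to give two.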

In 5Me-dC, there is a methyl group instead of the H5 which is present at position 5 in residue dC.
Oligonucleotides where residues C5, C6, C17 and C18 were replaced one-by-one or in pairs with 5Me-dC did not display methyl-H6 cross-peaks in their corresponding NOESY (τm=250 ms) spectra (Figure S8). Dynamics of the C5-C6 and C17-C18 loops is possibly affected by the presence of the methyl groups on 5Me-dC residues. When C5, C6, C17 and C18 are simultaneously replaced by 5Me-dC in sl21[C5, C6, C17, C18], two weak methyl-H6 cross-peaks are visible in the NOESY (τm=400 ms) spectrum recorded at 25 °C (Figure S9). One of the methyl-H6 cross-peaks can be assigned to residues Me C6 and Me C18 and one to the residues Me C5 and Me C17. The methyl-H6 cross-peaks corresponding to NAN, which is present as a minor species at around 20% population, are also visible in the spectrum. In addition, several NOE cross-peaks involving the methyl protons (MeH) of Me C5, Me C6, Me C17 and Me C18 could be assigned, namely a) Me C5MeH-G4H8, Me C17MeH-G16H8, b) Me C6MeH-H6, Me C18MeH-H6, c) Me C5MeH-G4H2", Me C17MeH-G4H2", d) Me C6MeH-Me C5H2", Me C18MeH-Me C17H2", e) Me C5MeH-H6, Me C17MeH-H6 (Figure S9). If we consider that the methyl group in 5Me-dC is analogous to the H5 proton in dC, the observed NOE contacts involving the methyl protons in the NOESY spectrum of sl21[C5, C6, C17, C18] correlate well with the position of the loop residues C5, C6, C17 and C18 in the structure of AQU adopted by (unmethylated) sl21.

Sugar puckering
The H2'/H2" proton resonances were stereospecifically assigned using DQF-COSY and NOESY (80 ms mixing time) spectra. Sugar puckering was assessed through analysis of DQF-COSY and TOCSY spectra and revealed that all guanine residues and C12 exhibit large 3JH1'-H2' coupling constants, which is consistent with S-type sugar conformation. C11 displays a large 3JH3'-H4' coupling constant and 3JH1'-H2" larger than 3JH1'-H2', which is consistent with bias towards the N-type sugar conformation.
Cross-peaks of medium and weak intensity in the 2D NOESY spectrum with a mixing time of 60 ms were classified as strong (1.8-3.6 Å) and medium (2.6-5.0 Å), respectively. Cross-peaks that were observed with medium intensity at 150 ms were also classified as medium (2.6-5.0 Å). Cross-peaks that appeared in the 2D NOESY spectrum with a mixing time of 300 ms were classified as weak (3.5-6.5 Å). NOE contacts that involved protons of residues C5, C6, C17 and C18 were applied in calculations as distance restraints with looser upper boundaries: (1.8-5.0 Å) for cross-peaks classified as strong and medium and (2.6-6.5 Å) for cross-peaks classified as weak. Torsion angle χ around the glycosidic bonds was restrained to a range between 25 and 95° for residues that were assigned a syn conformation and between 200 and 280° for residues that were assigned an anti conformation. Torsion angle χ around the glycosidic bond for residues C11 and C12 was restricted between 170 and 310°, while the glycosidic torsion angles of C5, C6, C17 and C18 were left unrestrained. The pseudorotation phase angle (PPA) was used to restrict the sugar conformation into S-type with PPA between 162.0 and 180.0° and N-type with PPA values between 0.0 and 18.0°. All guanine residues as well as C12 were restricted to S-type sugar conformation, while C11 was restricted to N-type. Sugar conformation for C5, C6, C17 and C18 was left unrestrained.
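The restraint ranges listed above can be collected in a small lookup, shown here as a minimal sketch. The numerical bounds are taken from the text; the function names (`noe_bounds`, `chi_range`, `ppa_range`) and the residue labels used as strings are hypothetical, and the sketch does not reproduce the input format of any particular structure calculation program.

```python
# Distance bounds in Å for NOE categories, as stated in the text.
STANDARD_BOUNDS = {"strong": (1.8, 3.6), "medium": (2.6, 5.0), "weak": (3.5, 6.5)}
# Looser bounds for contacts involving C5, C6, C17 or C18.
LOOSE_BOUNDS = {"strong": (1.8, 5.0), "medium": (1.8, 5.0), "weak": (2.6, 6.5)}
FLEXIBLE_LOOP = {"C5", "C6", "C17", "C18"}

# Glycosidic torsion angle ranges in degrees.
CHI_RANGES = {"syn": (25.0, 95.0), "anti": (200.0, 280.0)}


def noe_bounds(residue_a, residue_b, category):
    """Return (lower, upper) distance bounds for one NOE contact."""
    loose = bool({residue_a, residue_b} & FLEXIBLE_LOOP)
    return (LOOSE_BOUNDS if loose else STANDARD_BOUNDS)[category]


def chi_range(residue, conformation=None):
    """Return the chi range in degrees, or None if chi is unrestrained."""
    if residue in FLEXIBLE_LOOP:
        return None
    if residue in {"C11", "C12"}:
        return (170.0, 310.0)
    return CHI_RANGES[conformation]  # guanines: "syn" or "anti"


def ppa_range(residue):
    """Return the pseudorotation phase angle range in degrees, or None."""
    if residue in FLEXIBLE_LOOP:
        return None
    # C11 is N-type; all guanines and C12 are S-type.
    return (0.0, 18.0) if residue == "C11" else (162.0, 180.0)


if __name__ == "__main__":
    print(noe_bounds("G4", "C5", "weak"))
    print(chi_range("G1", "syn"), ppa_range("C12"))
```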

SA protocol
All calculations were initiated with random velocities. A generalized Born implicit model was used to account for solvent effects. The cut-off for non-bonded interactions was 20 Å and the SHAKE algorithm for hydrogen atoms was used with a tolerance of 0.0005 Å. In the first 50 ps of SA the temperature was raised from 300 to 1000 K. Molecules were held at a constant temperature of 1000 K for 20 ps and then cooled to 300 K in the next 30 ps, after which the temperature was scaled down to 0 K in the last 30 ps. Restraints were included with the following force constants: 40 kcal mol⁻¹ Å⁻² for hydrogen bond restraints, 20 kcal mol⁻¹ Å⁻² for NOE distances, 100 kcal mol⁻¹ rad⁻² for sugar pseudorotation phase angle restraints, 150 kcal mol⁻¹ rad⁻² for torsion angle χ, 20 kcal mol⁻¹ Å⁻² for G-quartet base planarity restraints and 10 kcal mol⁻¹ rad⁻² for chirality restraints.
Planarity restraints for G-quartets were excluded in the last 30 ps of SA. All 100 structures were minimized with a maximum of 20 000 steps of energy minimization. Planarity restraints were omitted in the minimization steps.
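The temperature schedule of the SA protocol can be sketched as a piecewise function of simulation time. The breakpoints (50, 20, 30 and 30 ps, 130 ps in total) come from the text; the linear interpolation between them is an assumption, since the shape of the ramps is not stated. The names `SCHEDULE`, `target_temperature` and `planarity_active` are hypothetical.

```python
# (time in ps, temperature in K) breakpoints taken from the SA protocol.
SCHEDULE = [
    (0.0, 300.0),     # start
    (50.0, 1000.0),   # heated over the first 50 ps
    (70.0, 1000.0),   # held at 1000 K for 20 ps
    (100.0, 300.0),   # cooled to 300 K over 30 ps
    (130.0, 0.0),     # scaled down to 0 K over the last 30 ps
]


def target_temperature(t_ps):
    """Target temperature in K at time t_ps, assuming linear ramps."""
    if not 0.0 <= t_ps <= SCHEDULE[-1][0]:
        raise ValueError("time outside the 0-130 ps SA window")
    for (t0, temp0), (t1, temp1) in zip(SCHEDULE, SCHEDULE[1:]):
        if t_ps <= t1:
            return temp0 + (temp1 - temp0) * (t_ps - t0) / (t1 - t0)


def planarity_active(t_ps):
    """G-quartet planarity restraints are excluded in the last 30 ps."""
    return t_ps < 100.0


if __name__ == "__main__":
    for t in (0, 25, 60, 85, 115, 130):
        print(t, target_temperature(t), planarity_active(t))
```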