Binding of hairpin pyrrole and imidazole polyamides to DNA: relationship between torsion angle and association rate constants

N-methylpyrrole (Py)-N-methylimidazole (Im) polyamides are small organic molecules that bind to DNA with sequence specificity and can be used as synthetic DNA-binding ligands. In this study, five hairpin eight-ring Py–Im polyamides 1–5 with different numbers of Im rings were synthesized, and their binding behaviour was investigated by surface plasmon resonance assay. The association rate constant (ka) of the Py–Im polyamides with their target DNA decreased as the number of Im rings in the polyamide increased. The structures of four-ring Py–Im polyamides derived from density functional theory revealed that the dihedral angle of the Py amide carbonyl is 14–18°, whereas that of the Im is significantly smaller. As the minor groove of DNA has a helical structure, planar Py–Im polyamides need to change their conformation to fit it upon binding. The data suggest that an increase in planarity of the Py–Im polyamide, induced by the incorporation of Im, reduces the association rate. This fundamental knowledge of the binding of Py–Im polyamides to DNA will facilitate the design of hairpin Py–Im polyamides as synthetic DNA-binding modules.


INTRODUCTION
N-Methylpyrrole (Py)-N-methylimidazole (Im) polyamides are small organic molecules that can recognize specific DNA sequences in the minor groove of B-form DNA, according to DNA recognition rules (1,2). Py favours T, A and C bases, whereas Im favours a G base. A lone electron pair on N-3 in Im forms a hydrogen bond with the 2-amino hydrogen of guanine (G). Thus, antiparallel pairings of Im/Py and Py/Im specify GC and CG, respectively, and antiparallel pairings of Py/Py specify AT or TA degenerately (1,2). Aliphatic β-alanine (β) can be substituted for Py. It has been used effectively when molecules have more than five consecutive Py or Im residues, by adjusting the pitch between the amide bonds of Py-Im polyamides and the accepting residues of the minor groove. Antiparallel pairings of Py/β and β/Py specify AT or TA degenerately, and antiparallel pairings of Im/β and β/Im specify GC and CG, respectively (3,4).
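The pairing rules above can be written as a small lookup table. The following Python sketch is only an illustration: the function name `hairpin_core_site`, the residue labels and the use of the degenerate code W (A or T) are our own choices, and the sketch assumes that the N-terminal strand of the hairpin is listed in the 5′-to-3′ direction of the targeted DNA strand. It covers only the ring pairs and ignores the turn and tail positions.

```python
# Pairing rules as stated in the text, keyed by
# (residue on the N-terminal strand, residue on the C-terminal strand).
# The value is the base read on the targeted DNA strand; "W" means A or T.
PAIR_RULES = {
    ("Im", "Py"): "G",
    ("Py", "Im"): "C",
    ("Py", "Py"): "W",
    ("Py", "beta"): "W",
    ("beta", "Py"): "W",
    ("Im", "beta"): "G",
    ("beta", "Im"): "C",
}


def hairpin_core_site(n_strand, c_strand):
    """Return the core site predicted for the ring pairs of a hairpin.

    Both arguments list residues from N-terminus to C-terminus.
    Assumption: the N-terminal strand runs 5'->3' along the target strand.
    """
    if len(n_strand) != len(c_strand):
        raise ValueError("both strands must contain the same number of residues")
    site = []
    # Antiparallel pairing: the first residue of the N-terminal strand sits
    # opposite the last residue of the C-terminal strand, and so on.
    for top, bottom in zip(n_strand, reversed(c_strand)):
        if (top, bottom) not in PAIR_RULES:
            raise ValueError(f"no pairing rule for {top}/{bottom}")
        site.append(PAIR_RULES[(top, bottom)])
    return "".join(site)


# ImPyImPy paired with ImPyImPy gives the GC-rich core discussed in the text.
print(hairpin_core_site(["Im", "Py", "Im", "Py"], ["Im", "Py", "Im", "Py"]))  # GCGC
```

A hairpin built from two PyPyPyPy subunits gives `WWWW` under the same rules, that is, a fully degenerate A/T core.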
As Py-Im polyamides can bind to DNA with sequence specificity comparable with that of DNA-binding proteins, they can be substituted for the DNA-binding domain of a transcription factor. Py-Im polyamides that bind to a promoter region have been designed to inhibit gene expression (5-9). Furthermore, Py-Im polyamides have been conjugated with a peptide or a small organic molecule to create synthetic transcriptional activators that stimulate gene expression (10-13). The dissociation equilibrium constants (KD) of these Py-Im polyamides with their target DNA sequences have been determined extensively by the Dervan group using DNase I footprinting. However, only a few of the corresponding association and dissociation rate constants have been reported (14-16). For the design of a synthetic DNA-binding module, it may be crucial to determine not only KD but also the association rate constant (ka) and the dissociation rate constant (kd) of Py-Im polyamides, because the association and dissociation rate constants differ from one transcription factor to another (17,18).
A zinc finger is a representative transcription factor DNA-binding domain that binds to GC-rich sequences (19). Many eukaryotic genes contain highly GC-rich sequences in the promoter region (20); thus, it is potentially valuable to synthesize Py-Im polyamides that recognize the GC-rich promoter region. However, 5′-GCGC-3′ and 5′-CGCG-3′ have been identified as difficult recognition sequences for Py-Im polyamides (3,21). It is not known why these sequences are difficult to recognize.
In this study, we synthesized five eight-ring Py-Im polyamides 1-5 with different numbers of Im rings, and we measured the ka and kd values of the DNA-Py-Im polyamide complexes by surface plasmon resonance (SPR) assay. We also estimated the structures of four-ring Py-Im polyamides by density functional theory (DFT) and ab initio quantum chemical calculation. Our SPR data and the calculated structures elucidated the relationship between the structure of the Py-Im polyamides and the association rate of the Py-Im polyamides with their target DNA.
Electrospray ionization time-of-flight mass spectrometry (ESI-TOFMS) was carried out on a BioTOF II (Bruker Daltonics) mass spectrometer to determine the molecular weight of Py-Im polyamides 1-5.

Polyamide synthesis
Py-Im polyamides 1-5 were synthesized in a stepwise reaction using a previously described Fmoc solid-phase protocol (21). Syntheses were performed using a pioneer peptide synthesizer (PSSM-8, Shimadzu) with a computer-assisted operation system on a 36 mM scale (100 mg of Fmoc-β-alanine Wang resin). After the synthesis, Dp was mixed with the resin for 4 h at 55 °C and the mixture was shaken at 550 r.p.m. to detach the Py-Im polyamides from the resin. Purification of Py-Im polyamides 1-5 was performed using a high-performance liquid chromatography (HPLC) PU-2080 Plus series system (JASCO), using a 10 × 150 mm ChemcoPak Chemcobond 5-ODS-H reverse-phase column in 0.1% TFA in water, with acetonitrile as eluent, at a flow rate of 3 ml/min and a linear gradient elution of 20-60% acetonitrile over 20 min, with detection at 254 nm. Collected fractions were analysed by ESI-TOFMS.

SPR assay
All SPR experiments were performed on a BIACORE X instrument at 25 °C as described previously (21,22).
The sequences of biotinylated hairpin DNAs containing target sequences are shown in Figure 2 and Supplementary Figure S1. The hairpin DNAs were immobilized on a streptavidin-coated SA sensor chip at a flow rate of 20 µl/min to obtain the required immobilization level (up to ~1400 resonance units (RU) rise). Experiments were carried out using HBS-EP buffer (10 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES), 150 mM NaCl, 3 mM ethylenediaminetetraacetic acid and 0.005% Surfactant P20) with 0.1% DMSO at 25 °C, pH 7.4. A series of sample solutions were prepared in HBS-EP buffer with 0.1% DMSO and were injected at a flow rate of 20 µl/min. To determine the dissociation equilibrium constant (KD) and the association and dissociation rate constants (ka and kd), data processing was performed with an appropriate fitting model using the BIAevaluation 4.1 program. All sensorgrams were fitted using the 1:1 binding model with mass transfer. The values of KD, ka and kd for all data are summarized in Table 1 and Supplementary Table S1.
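For orientation, the relationship between KD, ka and kd in a 1:1 binding model can be sketched as follows. This is a simplified illustration that omits the mass-transfer term used in the actual fitting, and it is not the BIAevaluation implementation; the function names and the numerical values in the usage lines are hypothetical and are not taken from Table 1.

```python
import math


def equilibrium_dissociation_constant(ka, kd):
    """KD (M) from ka (M^-1 s^-1) and kd (s^-1) for 1:1 binding."""
    return kd / ka


def association_response(t, conc, ka, kd, rmax):
    """Response (RU) at time t (s) during injection of analyte at conc (M).

    Simple 1:1 model without mass transfer:
    dR/dt = ka * conc * (rmax - R) - kd * R, with R(0) = 0.
    """
    k_obs = ka * conc + kd                # observed rate constant (s^-1)
    r_eq = rmax * ka * conc / k_obs       # plateau response at this conc
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation_response(t, r0, kd):
    """Response (RU) at time t (s) after the injection ends, from level r0."""
    return r0 * math.exp(-kd * t)


# Hypothetical values chosen only to illustrate the calculation.
ka_example = 1.0e6   # M^-1 s^-1 (assumed)
kd_example = 2.0e-3  # s^-1 (assumed)
kd_eq = equilibrium_dissociation_constant(ka_example, kd_example)
print(f"KD = {kd_eq:.1e} M")
print(f"R(60 s) = {association_response(60.0, 10e-9, ka_example, kd_example, 100.0):.1f} RU")
```

In this model an analyte concentration equal to KD gives a plateau of half the maximal response, which is why ka and kd must be resolved from the curve shape rather than from the plateau alone.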

Structural model calculation
DFT and Hartree-Fock calculations were performed with the Gaussian 09 software package. Structures were first energy-minimized with the Parameterized Model number 3 (PM3) method and then optimized either by DFT with a 6-311+G* polarization basis set or by the Hartree-Fock method with a 6-31G* polarization basis set.

Hairpin eight-ring Py-Im polyamide synthesis
To investigate the binding properties of Py and Im in hairpin Py-Im polyamides, we designed and synthesized five hairpin eight-ring Py-Im polyamides 1-5 (Figure 1) by the Fmoc-chemistry solid-phase synthesis method. Two β-alanines were attached to the N-terminus of 1-5 (Figure 1) for the optional construction of a fluorescent Py-Im polyamide conjugate (Han et al., unpublished data). We purified 1-5 by reverse-phase HPLC, and then confirmed that the purity of 1-5 was >95% by analytical HPLC and ESI-TOFMS. The β-Dp linker at the C-terminus has a ~100-fold steric preference for AT or TA relative to GC or CG (23). Based on the recognition rules of polyamides, the target DNA sequences of 1-5 are 5′-WWWWWW-3′, 5′-WWWWCW-3′, 5′-WWWGCW-3′, 5′-WWCGCW-3′ and 5′-WGCGCW-3′, respectively, and we prepared five 5′-biotinylated hairpin DNAs (ODN1-5) (Figure 2). However, because two β-alanines are attached to the N-terminus of 1-5, steric hindrance between the tails of 1-5 and the DNA minor groove may suppress the steric preference for AT or TA relative to GC or CG. To characterize the effect of the two β-alanines, we also prepared five 5′-biotinylated hairpin DNAs (ODN6-10) (Supplementary Figure S1).
Table 1. Binding affinity of 1-5
Previously, Crothers and coworkers reported the ka and kd values of ImPyPyPy-γ-ImPyPyPy-β-Dp; the kd value was 0.002 s⁻¹, which is consistent with our data. However, the ka value was 7.0 × 10⁷ M⁻¹ s⁻¹, which is 46-fold higher than that of 1. Dervan and coworkers (24) suggested that the Im located at the C-terminal end of each four-ring Py-Im polyamide subunit is somehow less capable of strong hydrogen bond formation than the N-terminal residues. Therefore, the binding affinity of ImPyPyPy-γ-ImPyPyPy-β-Dp may be relatively high, like that of ImImPyPy-γ-ImImPyPy-β-Dp, as discussed later in the text.

Structures of four-ring Py-Im polyamide estimated by density functional theory
To obtain insight into the differences among the observed KD values, we calculated the model structures of five four-ring Py-Im polyamides, PyPyPyPy, ImPyPyPy, PyPyImPy, ImPyImPy and ImImImIm, by DFT, as described in the 'Material and Methods' section. Py-Im polyamide 1 contains two PyPyPyPy, 2 contains ImPyPyPy and PyPyPyPy, 3 contains ImPyPyPy and PyPyImPy, 4 contains ImPyImPy and PyPyImPy and 5 contains two ImPyImPy. For comparison, the structure of ImImImIm was also calculated. The optimized structures and selected torsion angles of the four-ring Py-Im polyamides are shown in Figure 4.
It was found that PyPyPyPy has a helical structure, and the dihedral angles of the Py carbonyl of the amide (N1-C2-CO-O) are ~17°, with a slightly smaller dihedral angle at the C-terminal Py. This large dihedral angle of the Py residue is derived from a large steric hindrance between the H of C-3 in Py and the H of the contiguous amide. In clear contrast, the corresponding dihedral angles of the Im residues in the four-ring Py-Im polyamides are small (<1-2°) because of the lack of steric hindrance, and the planarity of the molecules is increased. As a result, ImImImIm becomes almost planar (Figure 4). Because of the double bond between C2 and N3 in the Im, a lone electron pair of N3 faces an H of the contiguous amide, and a hydrogen bond may form between N3 in the Im and the H of the contiguous amide. Compared with Py, this results in a shorter C-N distance in the amide, a shorter distance between the two N atoms of the contiguous amides, a more acute angle between the two ring-to-amide bonds, and coplanarity of the Im ring and the contiguous amide (Supplementary Figure S3). The angles of the two ring-to-amide bonds in each Py and Im ring were ~147° and 138°, respectively (Supplementary Figure S3), which is consistent with a previous result (25).
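A dihedral (torsion) angle such as N1-C2-CO-O is defined by the Cartesian coordinates of four consecutive atoms. The Python sketch below shows one standard way to compute it from optimized coordinates; the function name `dihedral_degrees` and the test coordinates are our own and are not taken from the calculated structures.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) for atoms p0-p1-p2-p3.

    0 degrees corresponds to a planar cis arrangement and 180 degrees to a
    planar trans arrangement of p0 and p3 about the p1-p2 bond.
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)   # normal of the plane p0-p1-p2
    n2 = _cross(b2, b3)   # normal of the plane p1-p2-p3
    b2_len = math.sqrt(_dot(b2, b2))
    y = b2_len * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates (arbitrary units) with p3 rotated out of plane.
print(round(abs(dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))), 1))  # 90.0
```

Applied to the four atoms N1, C2, the carbonyl C and the carbonyl O of each residue, a value near 0° or 180° indicates that the ring and the amide are coplanar, whereas a deviation of the size reported for Py indicates a twist out of the plane.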
We also determined the N-to-N distances at both ends of each of the four-ring Py-Im polyamides. The N-to-N distances for PyPyPyPy, ImPyPyPy, PyPyImPy, ImPyImPy and ImImImIm were 15.59, 14.83, 13.72, 12.91 and 10.64 Å, respectively (Figure 4). We also calculated the four-ring structures by ab initio quantum chemical calculation. The calculated structures were almost the same as those derived by DFT (Supplementary Figure S4). The calculated structural data suggest that 1 is the least curved Py-Im polyamide, and 5 is the most curved Py-Im polyamide among 1-5.

Implication of binding of hairpin Py-Im polyamide to DNA minor groove
It has been pointed out by the Dervan group that the twisted shape of Py-Im polyamides fits well into the minor groove of DNA. In a DNA-cyclic polyamide complex, relatively large torsion angles were observed in Im residues and Py residues (26,27). These results indicate that a large conformational change is necessary for Im residues on DNA binding. The angles of the two ring-to-amide bonds in each Py and Im ring were ~147° and 138°, respectively, which indicates that Py has less curvature and Im has too much curvature. This also suggests that a conformational change of the Im residues is needed to match regular B-form DNA, and that 5 requires more energy than the other four Py-Im polyamides 1-4 to change its structure for binding to the target DNA, resulting in the slowest association rate of 5 with the target DNA among 1-5.
As reported previously, replacement of Py with an aliphatic β-alanine can increase binding affinity and provide flexibility in the polyamide structures, and the binding affinity of Im-β-ImPy-γ-Im-β-ImPy-β-Dp, which recognizes 5′-GCGC-3′, was 100-fold higher than that of ImPyImPy-γ-ImPyImPy-β-Dp (3). Measurement of the KD, ka and kd values of Py-Im polyamides containing Py/β and/or Im/β pairs is important for the next step of Py-Im polyamide design. We have also replaced two Py in 5 with β-alanine, resulting in β-β-Im-β-ImPy-γ-Im-β-ImPy-β-Dp, and we measured its KD, ka and kd. Interestingly, the ka and kd of β-β-Im-β-ImPy-γ-Im-β-ImPy-β-Dp were improved by ~10-fold, compared with those of 5 for ODN5 or ODN10 (Y.-W. Han et al., unpublished data). Further analysis of Py-Im polyamides containing Py/β and/or Im/β pairs is now in progress.

CONCLUSION
In this study, using SPR assays, we measured the KD, ka and kd of Py-Im polyamides 1-5 to characterize Py and Im in hairpin Py-Im polyamides in more detail. Because the ka and kd of some transcription factors have been determined and differ from one transcription factor to another, the measurement of ka and kd of Py-Im polyamides is also crucial for the design of a Py-Im polyamide as a synthetic DNA-binding module of a transcription factor. SPR data demonstrated that the kd values of 1-5 were between 0.0039 and 0.014 s⁻¹. The ka values of the Py-Im polyamides decreased as the number of Im in the Py-Im polyamides increased. DFT calculations suggest that an increase in planarity, induced by the incorporation of Im, reduced the association rate of Py-Im polyamides. These data indicate that the number and the position of Im in Py-Im polyamides influence the ka but not the kd of the Py-Im polyamides, thus enabling us to estimate the DNA binding kinetics of Py-Im polyamides.
In this study, we synthesized Py-Im polyamides 1-5, which contain two β-alanines at the N-terminus, and the SPR data also demonstrated that the β-Dp linker at the C-terminus of 1-5 had a slight steric preference for AT or TA relative to GC or CG.