Hatim T. Allawi, John SantaLucia, Jr, Thermodynamics of internal C·T mismatches in DNA, Nucleic Acids Research, Volume 26, Issue 11, 1 June 1998, Pages 2694–2701, https://doi.org/10.1093/nar/26.11.2694
Abstract
Thermodynamics of 23 oligonucleotides with internal single C·T mismatches were obtained by measuring UV absorbance as a function of temperature. Results from these 23 duplexes were combined with three measurements from the literature to derive nearest-neighbor thermodynamic parameters for seven linearly independent trimer sequences with internal C·T mismatches. The data show that the nearest-neighbor model is adequate for predicting thermodynamics of oligonucleotides with internal C·T with average deviations for ΔG°37, ΔH°, ΔS° and Tm of 6.4%, 9.9%, 10.6%, and 1.9°C respectively. C·T mismatches destabilize the duplex in all sequence contexts. The thermodynamic contribution of C·T mismatches to duplex stability varies weakly depending on the orientation of the mismatch and its context and ranges from +1.02 kcal/mol for GCG/CTC and CCG/GTC to +1.95 kcal/mol for TCC/ATG.
Introduction
DNA mismatches occur as a result of errors during replication (1), heteroduplex formation during homologous recombination (2), and damage from mutagenic chemicals, ionizing radiation or spontaneous deamination (3). Mismatches also occur in the secondary structures of single-stranded DNA viruses (4–6). In addition to the stable canonical Watson-Crick base pairs (G·C and A·T) there are eight possible mispairs of varying stability and structure, namely A·A, A·C, C·C, C·T, G·G, G·A, G·T and T·T. To understand the origins of various mismatch occurrences and to aid the interpretation of mismatch recognition and repair mechanisms, the thermodynamics and structures of these mismatches need to be determined.
Several molecular biological techniques require accurate predictions of matched versus mismatched hybridization thermodynamics, such as PCR (7), sequencing by hybridization (8), gene diagnostics (9) and antisense oligonucleotide probes (9–11). In addition, recent developments of oligonucleotide chip arrays as a means for biochemical assays and DNA sequencing require accurate knowledge of hybridization thermodynamics and population ratios at matched and mismatched target sites (8,12,13).
We and others showed that a nearest-neighbor model is sufficient to accurately predict the stability and thermodynamics of DNAs with Watson-Crick pairs (14–20). Thereafter, we derived nearest-neighbor thermodynamic parameters for internal G·T and G·A mismatches and showed that, when combined with the thermodynamics of Watson-Crick pairs, they give accurate predictions of the thermodynamics of duplexes with G·T and G·A mismatches, with average deviations for ΔG°37, ΔH°, ΔS° and Tm of 5%, 8%, 8%, and 1.5°C respectively (17,21). To add to our mismatch parameter database and to test whether the nearest-neighbor model is applicable to unstable mismatches, such as C·T mismatches (22–24), we obtained thermodynamic measurements of 28 DNA duplexes containing internal C·T mismatches and combined them with three literature values (24,25) to derive nearest-neighbor parameters for internal C·T mismatches in DNA. The availability of internal C·T mismatch nearest-neighbor parameters along with Watson-Crick nearest-neighbors allows reliable prediction of duplex stability from sequence.
Materials and Methods
Absorbance versus temperature melting curves
Oligonucleotides were synthesized on solid supports using standard phosphoramidite techniques (26) and deblocked and purified as described previously (17). Absorbance versus temperature profiles were determined using an AVIV 14DS UV-vis spectrophotometer with a heating rate of 0.8°C/min as described previously (16). Oligonucleotides were dissolved in 1.0 M NaCl, 20 mM sodium cacodylate and 0.5 mM Na2EDTA, adjusted to pH 7.0 or 5.0 with 1.0 M HCl. Prior to the beginning of each melt, the samples were annealed and degassed by raising the temperature to 85°C for 5 min and then slowly cooling the samples to −1.5°C. While at high temperature, oligonucleotide absorbances at 260 nm were recorded and used to calculate single-strand total concentrations (CT) using extinction coefficients calculated for dinucleoside monophosphates and nucleosides (27). Absorbance melting curves for each duplex were measured at 260 and 280 nm from 0 to 85 or 90°C at 8–10 different concentrations.
Data analysis
Sequence design and rationale
Sequences were designed to have a melting temperature between 30 and 55°C and to minimize the potential for forming alternative competing secondary structures (i.e. hairpins or ‘slipped’ duplexes), which maximizes the likelihood of observing two-state transitions. Throughout this paper nearest-neighbors are represented in an antiparallel fashion with a slash separating the two strands and an underline indicating the position of C·T mismatches. For example, the sequence AC/TT means 5′-AC-3′ paired with 3′-TT-5′. In this study, the eight different C·T mismatch-containing dimers are evenly represented and occur with the following frequencies: AC/TT = 6, AT/TC = 9, CC/GT = 9, CT/GC = 8, GC/CT = 10, GT/CC = 9, TC/AT = 9, TT/AC = 8. In addition, all 16 possible Watson-Crick surrounding contexts are represented at least once in the data set.
Determination of C·T mismatch contribution to duplex stability
Determination of thermodynamics of linearly independent sequences with C·T mismatches
Linear regression analysis of C·T mismatch nearest-neighbors
Error analysis of the data
Resampling analysis of the data
To independently evaluate the error in the derived C·T mismatch nearest-neighbor parameters, and to identify sequences that are either outliers in the fit or that have a substantial effect on the solution obtained by SVD analysis, we performed a resampling analysis of the data. The solution obtained by performing SVD analysis on all 26 sequences is over-determined (i.e. 26 equations with seven unknowns). Resampling has the advantage that it determines the uncertainties of the C·T mismatch nearest-neighbor parameters independently of any prior assumptions about the errors in the measurements (17,36). The resampling analysis was performed for ΔG°37, ΔH° and ΔS°. We performed 30 resampling trials, in each of which eight randomly selected sequences were removed. For each resampling trial, the number of non-zero singular values was confirmed to be seven. For each nearest-neighbor, the results of the 30 resampling trials were averaged and standard deviations determined. The averaged nearest-neighbors from the resampling trials were within round-off error of the values obtained from an SVD analysis with all 26 sequences. The standard deviations from resampling agree with the errors propagated in SVD.
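The resampling procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the 26 × 7 stacking matrix and the "true" parameters below are synthetic placeholders (a real matrix would hold the count of each linearly independent trimer in each duplex), and `resample_fit` is a hypothetical helper name. `numpy.linalg.lstsq` solves the least-squares problem through an SVD and reports the rank, which stands in here for counting non-zero singular values.

```python
import numpy as np

def resample_fit(design, observed, n_trials=30, n_removed=8, seed=0):
    """Refit after dropping random rows; return mean and std of the parameters."""
    rng = np.random.default_rng(seed)
    n_rows, n_params = design.shape
    fits = []
    while len(fits) < n_trials:
        keep = rng.choice(n_rows, size=n_rows - n_removed, replace=False)
        # lstsq is SVD-based; `rank` counts the non-negligible singular values.
        params, _, rank, _ = np.linalg.lstsq(design[keep], observed[keep], rcond=None)
        if rank < n_params:
            continue  # this subset no longer determines all parameters; redraw
        fits.append(params)
    fits = np.array(fits)
    return fits.mean(axis=0), fits.std(axis=0, ddof=1)

# Synthetic stand-in data (NOT values from this study): 26 duplexes, 7 unknowns.
rng = np.random.default_rng(1)
design = np.vstack([np.eye(7), rng.integers(0, 3, size=(19, 7))]).astype(float)
true_params = np.linspace(1.0, 2.0, 7)   # hypothetical parameters, kcal/mol
observed = design @ true_params          # noise-free, for clarity

mean_params, std_params = resample_fit(design, observed)
```

With noise-free synthetic data every trial returns the same parameters, so the standard deviations collapse to round-off; with real measurements the spread across trials is the resampling estimate of the parameter uncertainty.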
1H-NMR spectroscopy
Oligomers were dissolved in 90% H2O and 10% D2O with 1 M NaCl, 10 mM disodium phosphate and 0.1 mM Na2EDTA at pH 7.0 or 5.0. Duplex concentrations were between 0.2 and 1.0 mM. 1H-NMR spectra were recorded using a Varian Unity 500 MHz NMR spectrometer. One-dimensional exchangeable proton NMR spectra were recorded at 10°C using the WATERGATE pulse sequence with ‘flip-back’ pulse to suppress the water peak (37,38). Spectra were recorded with the carrier placed at the solvent frequency and with high power and low power pulse widths of 10.0 and 1800 µs, a sweep width of 12 kHz and a gradient field strength of 10.0 G/cm and duration of 1 ms. 512–1024 transients were collected for each spectrum. Data were multiplied by a 4.0 Hz line broadening exponential function and Fourier transformed with a Silicon Graphics Indigo2Extreme computer with Varian VNMR software. No baseline correction or solvent subtraction was applied. 3-Trimethylsilyl propionic-2,2,3,3-d4 acid (TSP) was used as the internal standard for chemical shift reference. One-dimensional NOE difference spectra were acquired as described above, but with selective decoupling of individual resonances during the 1 s recycle delay. Each resonance was decoupled with a power sufficient to saturate <80% of the signal intensity, so that spillover artifacts would be minimized. The spectra were acquired in an interleaved fashion in blocks of 16 scans to minimize subtraction errors due to long term instrument drift. 3200–6400 scans were collected for each FID.
Results
Thermodynamics of DNA duplexes with C·T mismatches
Plots of Tm⁻¹ versus ln CT for all the duplexes in this study were linear (correlation coefficient >0.99; not shown). Thermodynamic parameters for helix to coil transitions for 28 sequences, obtained from averages of the fits of melting curves and from Tm⁻¹ versus ln CT plots, are listed in Table 1. A widely used method for determining applicability of the two-state model to melting curves is comparison of the ΔH° values obtained from the averages of the fits and from the Tm⁻¹ versus ln CT plots. If the ΔH° parameters from both methods agree within 15%, the duplex to random coil transition is assumed to be two-state (16,17,20). However, agreement of the ΔH° values within 15% does not necessarily rule out non-two-state behavior (17,31). Twenty-three of the sequences in Table 1 have a ΔH° agreement from the two methods of ≤15% and showed monophasic transitions, indicating bimolecular two-state behavior. Five duplexes in Table 1 melt with non-two-state transitions. The non-two-state behavior of these duplexes is manifested in the >15% disagreement in ΔH° values derived by the two methods. These non-two-state sequences may be able to form alternative conformations, such as hairpins or slipped duplexes, during the duplex to random coil transition. For duplexes with two-state transitions, the thermodynamics obtained from the fits and from the Tm⁻¹ versus ln CT plots are equally reliable (16,17,20,21) and thus their averages are the experimental values listed in Table 2.
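As an illustration of the analysis behind these plots, the sketch below fits the reciprocal melting temperature against ln CT and applies the 15% enthalpy-agreement check. It assumes the standard van't Hoff relation for a bimolecular two-state transition, 1/Tm = (R/ΔH°) ln(CT/4) + ΔS°/ΔH° for non-self-complementary strands (ln CT for self-complementary ones), since the equations are not reproduced in this section. The function names, the synthetic concentrations and the definition of percent agreement (difference relative to the mean of the two values) are assumptions made for the example.

```python
import numpy as np

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def vant_hoff_fit(total_conc_molar, tm_celsius, self_complementary=False):
    """Fit 1/Tm against ln(CT/4) (or ln CT) and return dH, dS, dG37."""
    ct = np.asarray(total_conc_molar, dtype=float)
    inv_tm = 1.0 / (np.asarray(tm_celsius, dtype=float) + 273.15)
    x = np.log(ct) if self_complementary else np.log(ct / 4.0)
    slope, intercept = np.polyfit(x, inv_tm, 1)
    delta_h = R_KCAL / slope                  # kcal/mol
    delta_s = 1000.0 * intercept * delta_h    # cal/(mol*K), i.e. e.u.
    delta_g37 = delta_h - 310.15 * delta_s / 1000.0
    return delta_h, delta_s, delta_g37

def enthalpies_agree(dh_curve_fit, dh_plot, tolerance=0.15):
    """Two-state check: do the two enthalpy estimates agree within tolerance?"""
    mean = 0.5 * (dh_curve_fit + dh_plot)
    return abs(dh_curve_fit - dh_plot) / abs(mean) <= tolerance

# Synthetic melting temperatures generated from hypothetical dH and dS values.
true_dh, true_ds = -60.0, -165.0
conc = np.array([5e-6, 1e-5, 2e-5, 5e-5, 1e-4, 2e-4, 4e-4, 8e-4])
inv_tm_true = (R_KCAL / true_dh) * np.log(conc / 4.0) + (true_ds / 1000.0) / true_dh
tm_c = 1.0 / inv_tm_true - 273.15

dh, ds, dg37 = vant_hoff_fit(conc, tm_c)
```

In practice the enthalpy from this plot would be compared, through `enthalpies_agree`, with the average enthalpy from fits of the individual melting curves.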
Nearest-neighbor thermodynamics of unique trimer sequences with internal C·T mismatches
Table 3 lists thermodynamic parameters obtained using SVD analysis for all 16 unique trimer sequences with internal C·T mismatches. According to the nearest-neighbor model, seven of these trimer sequences are linearly independent and can be used in linear combination to obtain parameters for the other nine trimer sequences. The errors listed in Table 3 are the standard deviations from resampling analysis of the data (see Materials and Methods). These errors are the same as the errors obtained by propagating the experimental and Watson-Crick nearest-neighbor errors in the SVD analysis. The parameters listed in Table 3, along with Watson-Crick nearest-neighbor and initiation parameters (17), predict the thermodynamics of all 26 duplexes with two-state thermodynamics (Table 2) with average deviations for ΔG°37, ΔH°, ΔS° and Tm of 0.45 kcal/mol, 5.9 kcal/mol, 18.0 e.u., and 1.9°C respectively.
Table 1. Thermodynamics of duplex formation of oligonucleotides with internal C·T mismatches
Table 2. Experimental and predicted thermodynamics of oligonucleotides with C·T mismatches
Non-unique C·T mismatch nearest-neighbor thermodynamics
As stated previously, analysis of internal C·T mismatches in terms of dimer sequences results in eight nearest-neighbors that are not a unique solution. The non-uniqueness of these dimer sequences results from having all C·T mismatches located internally (17,33). Table 4 lists nearest-neighbor parameters for dimer sequences with C·T mismatches obtained by fitting the data to eight parameters. The eight dimer parameters listed in Table 4 are an alternative representation of the seven trimer parameters listed in Table 3. However, in the SVD analysis of eight dimer sequences, the number of non-zero singular values is seven, indicating that the stacking matrix is rank deficient and that the parameters are non-unique. To clarify the non-uniqueness of the parameters in Table 4 one could show that a linear combination of the parameters in Table 4 can be used to derive parameters for the seven linearly independent trimer sequences in Table 3, but not vice versa unless an eighth parameter is given (SVD assumes the eighth parameter is zero) (14). Nonetheless, the parameters in Tables 3 and 4 result in the same predictions and, thus, one could use either representation of the data, keeping in mind that both apply only to internal C·T mismatches.
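The rank deficiency can be checked directly. The sketch below builds the stacking matrix that expresses each of the 16 trimers 5′-XCY-3′/3′-X′TY′-5′ as the sum of its two dimers and counts the non-zero singular values. It is an illustration rather than the authors' code: the helper name `dimers_in_trimer` and the numerical threshold are assumptions, and the right-hand dimer is rewritten from the opposite strand so that the names match the eight dimers listed in Materials and Methods.

```python
import itertools
import numpy as np

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def dimers_in_trimer(five_prime, three_prime):
    """Return the two dimers of 5'-XCY-3'/3'-X'TY'-5' in the paper's notation."""
    left = f"{five_prime}C/{COMPLEMENT[five_prime]}T"
    # 5'-CY-3'/3'-TY'-5' read from the other strand is 5'-Y'T-3'/3'-YC-5'.
    right = f"{COMPLEMENT[three_prime]}T/{three_prime}C"
    return left, right

trimers = list(itertools.product("ACGT", repeat=2))  # (X, Y) flanking bases
dimer_names = sorted({d for x, y in trimers for d in dimers_in_trimer(x, y)})

# One row per trimer, one column per dimer.
stacking = np.zeros((len(trimers), len(dimer_names)))
for row, (x, y) in enumerate(trimers):
    for dimer in dimers_in_trimer(x, y):
        stacking[row, dimer_names.index(dimer)] += 1

singular_values = np.linalg.svd(stacking, compute_uv=False)
n_nonzero = int(np.sum(singular_values > 1e-10))
```

Every internal mismatch contributes exactly one dimer with the mismatch at its 3′ side and one with it at its 5′ side, so the eight columns satisfy one linear constraint and only seven singular values are non-zero.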
Table 3. Nearest-neighbor thermodynamic parameters for 16 trimer sequences with internal C·T mismatches in 1 M NaCl
Thermodynamics of C·T mismatches at pH 5.0
To test the thermodynamic effects of protonation of a C·T mismatch (i.e. C+·T versus C·T), thermodynamic measurements were made on four duplexes with C·T mismatches at pH 5.0 and pH 7.0. The pKa of protonation for cytosine in the context of a C·T mismatch has been reported to be ∼5.7 (39); thus, at pH 5.0, ∼83% of C·T mismatches should be protonated. The four sequences were selected to represent different C·T mismatch nearest-neighbor contexts. On average, for the four sequences tested, changing the pH from 7.0 to 5.0 decreased the stability of the duplex by 0.3 kcal/mol in ΔG°37 and by 1.1°C in Tm. These data suggest that C·T mismatches are slightly less stable at pH 5.0 than at pH 7.0.
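The protonated fraction at a given pH can be estimated from the Henderson-Hasselbalch relation, fraction = 1/(1 + 10^(pH − pKa)). The short sketch below applies it with the reported pKa of ∼5.7; `fraction_protonated` is a hypothetical helper name, and the calculation assumes a single protonation site with no coupling to duplex melting.

```python
def fraction_protonated(ph, pka):
    """Henderson-Hasselbalch: fraction of sites in the protonated form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

PKA_CT = 5.7  # reported pKa of cytosine in a C·T mismatch (ref. 39)

at_ph5 = fraction_protonated(5.0, PKA_CT)
at_ph7 = fraction_protonated(7.0, PKA_CT)
print(f"pH 5.0: {at_ph5:.1%} protonated; pH 7.0: {at_ph7:.1%} protonated")
```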
NMR and pairing geometry of C·T and C+·T mismatches
C·T mismatches have been proposed to form at least four different structures depending on sequence context and solution conditions (Fig. 1; 39–42). To determine the pairing geometry for C·T mismatches in this study, one-dimensional exchangeable proton NMR spectra of five DNA duplexes with different C·T mismatch contexts were acquired at pH 7.0 and 5.0. Figures 2 and 3 show a representative imino region (9–15 ppm) of two of the duplexes studied containing C·T mismatches at pH 7.0 and 5.0. Resonances between 12–13 and 13–15 ppm are usually the imino protons of canonical Watson-Crick G·C and A·T pairs. At pH 7.0, an imino peak is observed around 11.5 ppm (Figs 2a and 3a) which broadens out at pH 5.0 (Figs 2b and 3b). Irradiation of this resonance did not show any observable NOEs (not shown), probably due to rapid chemical exchange with water. Previous structural studies on C·T and C·U mismatches in DNA and RNA showed that at neutral pH these mismatches can pair with two hydrogen bonds, one of which, due to the repulsion of the carbonyl groups of the cytosine and thymine (43), is possibly mediated via a water molecule (Fig. 1b; 39–42). Our data are most consistent with NMR observations on C·T mismatches at neutral pH and, thus, we tentatively assign the resonance at 11.5 ppm as the imino proton of thymine hydrogen bonded to N3 of cytosine via a water molecule (39,40,42). At pH 5.0, the protonation of N3 of cytosine results in a change in the pairing geometry of the C·T mismatch which broadens the imino resonance of the thymine in the C·T mismatch (11.5 ppm). This broadening of the imino resonance might be a result of chemical exchange between protonated and non-protonated C·T mispairs. Previous structural studies of C·T mismatches under acidic conditions suggest that the imino proton of thymine becomes hydrogen bonded to the carbonyl group of cytosine, possibly via a water molecule, making it exchange faster with water (Fig. 1c and d; 39,40). 
In contrast, C·C and A·C in RNA (44) and in DNA (H.T.Allawi and J.SantaLucia Jr, unpublished results) are often stabilized at acidic pH.
Discussion
Applicability of the nearest-neighbor model to internal C·T mismatches
Table 2 compares experimental results of 26 duplexes with C·T mismatches with predictions made by the parameters listed in Table 3 (or Table 4) and Watson-Crick nearest-neighbor parameters (17). For single mismatches in DNA, we have previously shown that a nearest-neighbor model can accurately predict duplexes with internal G·A and G·T with average deviations for ΔG°37, ΔH°, ΔS° and Tm of 5.0%, 8.0%, 8.0%, and 1.5°C respectively (17,21). In this study, we find that analysis of C·T mismatch contributions to duplex stability in terms of a nearest-neighbor model results in parameters that predict the thermodynamics of sequences with two-state transitions with an average deviation for ΔG°37, ΔH°, ΔS° and Tm of 6.4%, 9.9%, 10.6% and 1.9°C respectively. These average deviations are slightly higher than what was observed for G·A and G·T mismatches (17,21). Nonetheless, considering how unstable C·T mismatches are, one might expect that C·T mismatches are capable of disrupting double-helical DNA in a fashion that may extend to next-nearest-neighboring Watson-Crick pairs. However, results from this study suggest that if there are any next-nearest-neighbor effects for C·T mismatches they are very small and can be neglected. Hence, the nearest-neighbor parameters in Tables 3 and 4 make predictions that are adequate for most applications. An alternative way to test the applicability of the nearest-neighbor model is to synthesize oligonucleotides with different sequences but the same nearest-neighbor composition (17,45–47). In this study, three pairs of duplexes have the same nearest-neighbor composition (Tables 1 and 2). For example, the duplexes CGTGCCTCC·GGAGTCACG and GGAGCCACG·CGTGTCTCC have different sequences but the same nearest-neighbors and their ΔG°37, ΔH°, ΔS° and Tm agree within 0.17 kcal/mol, 0.1 kcal/mol, 0.3 e.u., and 1.0°C respectively. 
The average deviations from the mean for the three pairs of duplexes with the same nearest-neighbors are 0.06 kcal/mol, 0.4 kcal/mol, 1.2 e.u., and 0.3°C for ΔG°37, ΔH°, ΔS° and Tm respectively.
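Whether two duplexes share the same nearest-neighbor composition can be checked mechanically. The sketch below counts the dimer stacks in a duplex, treating a stack and the same stack read from the opposite strand as identical; `nn_composition` is a hypothetical helper, and both strands are assumed to be written 5′→3′ and to pair end to end without overhangs.

```python
from collections import Counter

def nn_composition(strand1, strand2):
    """Count nearest-neighbor dimers in a duplex; both strands given 5'->3'."""
    if len(strand1) != len(strand2):
        raise ValueError("strands must have equal length")
    bottom = strand2[::-1]  # 3'->5', aligned base by base with strand1
    counts = Counter()
    for i in range(len(strand1) - 1):
        forward = f"{strand1[i:i + 2]}/{bottom[i:i + 2]}"
        # The same stack read 5'->3' along the other strand.
        reverse = f"{bottom[i + 1]}{bottom[i]}/{strand1[i + 1]}{strand1[i]}"
        counts[min(forward, reverse)] += 1  # pick one canonical spelling
    return counts

duplex_a = nn_composition("CGTGCCTCC", "GGAGTCACG")
duplex_b = nn_composition("GGAGCCACG", "CGTGTCTCC")
same_composition = duplex_a == duplex_b
```

For the pair of duplexes quoted above, the two counters are identical, which is why the nearest-neighbor model predicts the same thermodynamics for both.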
Figure 1. Four hydrogen bonded structures of the C·T mispair at neutral pH (a and b) and at acidic pH (c and d).
Trends in C·T mismatch thermodynamics
Trimer mismatch free energy (ΔG°37) contributions for internal C·T mismatches vary weakly, depending on the mismatch orientation and context (Tables 3 and 4). The most stable trimer sequences (GCG/CTC and CCG/GTC) destabilize the duplex by +1.02 kcal/mol and the least stable trimer (TCC/ATG) destabilizes the duplex by +1.95 kcal/mol. This range of 0.93 kcal/mol for ΔG°37 indicates that there is a weak stacking contribution to stability of a C·T mismatch. For trimer sequences with the cytosine of the C·T mismatch on the top strand, the general trend for the 5′-end closing Watson-Crick pair (with decreasing order of stability) is G·C ≈ C·G > A·T >> T·A. However, when the thymine of the C·T mismatch is on the top strand (i.e. T·C), the trend on the 5′-end becomes (with decreasing order of stability) C·G > A·T ≈ T·A > G·C. Close inspection of these trends reveals an interesting result. Generally, a G·C base pair (which has three hydrogen bonds) is expected to have a stabilizing effect on duplexes that is larger than an A·T pair (which has two hydrogen bonds). However, G·C pairs stacked on the 5′-end of a T·C mismatch destabilize the duplex by 0.98 kcal/mol and A·T pairs stacked on the 5′-end of a T·C mismatch destabilize the duplex by 0.73 kcal/mol. Therefore, in this case, a 5′ A·T pair stabilizes T·C mismatches more than does a 5′ G·C. Thus, stacking interactions, more than hydrogen bonding, play a major role in the stability of duplexes with internal C·T mismatches. This is also evident when a G·C pair stacked on a T·C mismatch (GT/CC) is compared with a C·G (CT/GC), which are destabilizing by 0.98 and 0.40 kcal/mol respectively (Table 4).
Figure 2. 500 MHz 1H-NMR spectra of the exchangeable imino region (9–15 ppm) in 1 M NaCl, 10 mM disodium phosphate and 0.1 mM Na2EDTA at 10°C in 90% H2O/10% D2O of CATGTTACTAC●GTACTCACATG at (a) pH 7.0 and (b) pH 5.0.
Figure 3. 500 MHz 1H-NMR spectra of the exchangeable imino region (9–15 ppm) in 1 M NaCl, 10 mM disodium phosphate and 0.1 mM Na2EDTA at 10°C in 90% H2O/10% D2O of (CGTCTCATGATACG)2 at (a) pH 7.0 and (b) pH 5.0.
Comparison of thermodynamics of C·T mismatches and Watson-Crick pairs
No correlation is observed when comparing thermodynamics of trimer sequences with internal C·T mismatches with the corresponding trimer sequences with either G·C or A·T Watson-Crick base pairs (17). Free energies of Watson-Crick trimer sequences with a central A·T or G·C pair vary over a range of 2.95 kcal/mol, whereas those of trimer sequences with internal C·T mismatches vary over only 0.93 kcal/mol. The most stable Watson-Crick trimer sequence is GCG/CGC (ΔG°37 = −4.41 kcal/mol) and the least stable is ATA/TAT (ΔG°37 = −1.46 kcal/mol) (17). For internal C·T mismatches, the most stable trimer contexts are GCG/CTC and CCG/GTC (+1.02 kcal/mol); the former is the same context as the most stable Watson-Crick sequence (GCG/CGC). However, the trimer sequence TCC/ATG, which is the least stable C·T context, differs from the least stable Watson-Crick sequence (ATA/TAT).
Comparison of C·T, G·T and G·A mismatch thermodynamics
Comparison of internal C·T mismatch thermodynamics (Table 3) with previously published parameters for internal G·A (21) and G·T (17) mismatches indicates that C·T mismatches are among the most unstable mismatches in DNA, consistent with previous observations (23,24). The most stable C·T trimer sequences are GCG/CTC and CCG/GTC (ΔG°37 of +1.02 kcal/mol), whereas the most stable G·A and G·T trimer sequences are GGC/CAG and CGC/GTG (ΔG°37 of −0.78 and −1.05 kcal/mol respectively). Moreover, the least stable C·T trimer sequence is TCC/ATG (ΔG°37 of +1.95 kcal/mol), whereas the least stable G·A and G·T trimer sequences are TGA/AAT and AGA/TTT (ΔG°37 of +1.16 and +1.05 kcal/mol respectively). The average free energy contribution of all 16 unique trimer sequences with internal C·T mismatches is +1.43 kcal/mol. Average internal G·A and G·T mismatch free energy contributions for all 16 unique trimer sequences, on the other hand, are +0.17 and +0.05 kcal/mol respectively. Furthermore, stabilities of G·A and G·T mismatches are spread over ranges of 1.94 and 2.10 kcal/mol respectively, while C·T mismatch stabilities are spread over a range of 0.93 kcal/mol. Thus, while the contributions of internal C·T mismatches depend slightly on the neighboring bases, they are not as sensitive to the surrounding base pair context as those of G·A and G·T mismatches.
















