Massive Parallel Analysis of DNA–Hoechst 33258 Binding Specificity with a Generic Oligodeoxyribonucleotide Microchip

A generic oligodeoxyribonucleotide microchip was used to determine the sequence specificity of Hoechst 33258 binding to double-stranded DNA. The generic microchip contained 4096 octadeoxynucleotides in which all possible 4^6 = 4096 hexadeoxynucleotide sequences are flanked on both the 3′- and 5′-ends with equimolar mixtures of four bases. The microchip was manufactured by chemical immobilization of presynthesized 8mers within polyacrylamide gel pads. A selected set of immobilized 8mers was converted to double-stranded form by hybridization with a mixture of fluorescently labeled complementary 8mers. Massive parallel measurements of melting curves were carried out for the majority of 2080 6mer duplexes, in both the absence and presence of the Hoechst dye. The sequence-specific affinity for Hoechst 33258 was calculated as the increase in melting temperature caused by ligand binding. The dye exhibited specificity for A:T but not G:C base pairs. The affinity is low for two A:T base pairs, increases significantly for three, and reaches a plateau for four A:T base pairs. The relative ligand affinity for all trinucleotide and tetranucleotide sequences (A/T)3 and (A/T)4 was estimated. The free energy of dye binding to several duplexes was calculated from the equilibrium melting curves of the duplexes formed on the oligonucleotide microchips. This method can be used as a general approach for massive screening of the sequence specificity of DNA-binding compounds.


INTRODUCTION
Sequence-specific interactions of nucleic acids with each other, with proteins, and with other compounds play an important role in their structure and functional organization. As potential drugs for the treatment of diseases, many compounds are intensively screened and analyzed for sequence specificity of binding to regulate replication, transcription and translation. The screening includes testing the agents for specificity and the stability of their complexes with nucleic acids. The most comprehensive results are achieved by comparative thermodynamic analysis of the complexes of a ligand with nucleic acids of various sequences. Such an analysis of the complexes in solution requires much effort and time. Alternative, simpler procedures for large screening tests include protecting the nucleic acid in complexes against chemical modification (1) and nuclease digestion, using footprinting assays (2,3), separating the complexes by electrophoresis (4) and others. However, these tests are restricted in the number of sequences that can be analyzed and often provide qualitative rather than quantitative results.
Microarrays of oligodeoxyribonucleotides and DNA immobilized on filters or glass have been effective for parallel hybridization analysis of a large number of DNA and RNA sequences to identify genetic mutations and gene polymorphisms (5,6), to monitor gene expression (7–9), and to detect different microorganisms (10).
The use of gel pads as an immobilization support in oligonucleotide, DNA and protein arrays provides essential advantages over the use of probes attached to the solid support. Three-dimensional immobilization in gel pads provides higher capacity and a more homogeneous environment than heterophase immobilization on glass or filters. Arrays of gel pads have been used to carry out chemical and enzymatic reactions (11), hybridization with immobilized oligonucleotides (5), and thermodynamic analysis of DNA duplexes that contain conventional and modified nucleotides (12). One study found a direct correlation between thermodynamic parameters for DNA duplexes in solution and in the gel pads of the microchips (12). Recently, generic gel pad microchips containing all 4096 possible hexadeoxynucleotides were manufactured for comprehensive analysis of DNA duplexes and their complexes (Timofeev et al., manuscript in preparation).
The fluorescent dye Hoechst 33258 is widely used in biochemical applications (13). The dye binds preferentially to four consecutive A:T base pairs from the side of the minor groove of the DNA B-form helix. Some details of the dye sequence specificity have been revealed by physicochemical (14) and footprinting (15–17) studies, as well as by X-ray analysis (18,19).
*To whom correspondence should be addressed at: Biochip Technology Center, Argonne National Laboratory, 9700 South Cass Avenue, Argonne, IL 60439, USA. Tel: +1 630 252 3161; Fax: +1 630 252 3387; Email: amir@everest.bim.anl.gov
Here we describe the application of generic hexadeoxynucleotide microchips for massive parallel analysis of the binding of Hoechst 33258 to different 6mer duplexes. Thermodynamic analysis of some complexes was carried out. Sequence-specific motifs were identified and compared quantitatively by statistical analysis of the melting curves of a large number of duplexes formed on the microchip in the presence and absence of the dye. This approach can be extended to mass screening of the sequence specificity of other ligands, as well as proteins.
The 4096 octadeoxyribonucleotides used for manufacturing generic microchips were purchased from CyberSyn (USA). These 8mers have the structure 5′-NH2-MNNNNNNM-3′, where M is a 1:1:1:1 mixture of the four bases, N is one of the four bases of the core representing all 4096 possible 6mers, and NH2 is an amino linker used to immobilize the 8mers in the polyacrylamide gel pads of the microchip. Two 8mer mixtures, 5′-MM(A/G)MM(A/G)MM-NH2-3′ and 5′-MM(A/C)MM(A/C)MM-3′-NH2, were synthesized on an Applied Biosystems 394 DNA/RNA synthesizer, using standard phosphoramidite chemistry and 3′-C(7) amino modifier CPG (Glen Research, USA). The 8mer mixtures were fluorescently labeled with Texas Red sulfonyl chloride dye (Molecular Probes, USA) according to the manufacturer's protocol. The fluorescently labeled 20mer 5′-Fl-TACACCCATGAATTTGATGG-3′ was synthesized on the same synthesizer using a standard procedure.

Generic microchip
The generic microchips were manufactured in two steps. First, arrays of 4200 (60 × 70) 5% polyacrylamide gel pads (100 × 100 × 20 µm pads spaced by 200 µm) were prepared by photopolymerization (20). Second, 1 nl droplets of a 1 mM solution of an oligonucleotide in water were applied to each gel pad on a hydrophobic glass slide (5) and the oligonucleotides were immobilized by reductive coupling of their amino groups with aldehyde groups of the gel (21).

Hybridization and melting curve measurements
Hybridization of the generic microchip with the mixture of fluorescently labeled 8mers in a 100 µl hybridization chamber was carried out at 0°C for 24 h. The hybridization solutions contained 200 µM oligonucleotides, 1 M NaCl, 10 mM NaHPO4, 5 mM EDTA, pH 6.8, and 0.1% Tween-20. After hybridization, the microchip in the chamber was placed on the fluorescence microscope thermotable and the melting curves were recorded for all elements of the microchip while increasing the temperature from 0 to 50°C at the rate of 2°C/h (12) in 1°C steps. After measuring the melting curves of the duplexes in the absence of Hoechst 33258, the fluorescently labeled oligonucleotides were washed off the microchips with water. A second round of hybridization and melting experiments was performed under the same conditions in the presence of 10 µM Hoechst 33258.
Hybridization of the 8mer microchip with a 1 µM solution of fluorescently labeled 20mer in the absence and presence of 2 µM Hoechst 33258 was performed under the same conditions. The equilibrium melting curves were measured to determine the free energy of binding on the microchip as described (12).
All measurements of the melting curves were performed with an automated 3.5 × 3.5 mm field epifluorescence microscope with a mercury lamp as the excitation source and a filter set for Texas Red dye (LOMO, Russia), equipped with a CCD camera (Princeton Instruments, USA), a Peltier thermotable with temperature controller (Melcor, USA), and a computer supplied with a data acquisition board (National Instruments, USA) (12). The fluorescence intensity measurements were carried out at each temperature by scanning the generic microchip as several fields containing 100 gel pads. One acquisition of a 100 gel pad image took 2 s. The scanning system consisted of a two-coordinate table, stepper motors and controller (Newport, USA). Special software was designed for experimental control and data processing using the LabVIEW virtual instrument interface (National Instruments, USA).

Massive parallel measurement of Hoechst 33258 binding to duplexes on a generic microchip
The generic 6mer microchip contains all possible 4096 (4^6 = 4096) single-stranded hexadeoxyribonucleotides NNNNNN (N = one of four bases) immobilized within individual gel pads. These core 6mers are incorporated within 8mers of the general structure 5′-gel-MNNNNNNM-3′, where M is a 1:1:1:1 mixture of four nucleotides.
Hoechst 33258 shows no binding to single-stranded DNA and significant base pair specificity in binding to double-stranded DNA. DNA duplexes are stabilized in complexes with the dye and have a higher melting temperature (Tm). The specificity of dye binding can be estimated according to the increase in Tm of duplexes upon ligand binding. To perform such measurements, the 4096 single-stranded oligonucleotides of the generic microchip need to be converted into double-stranded form. This can be achieved by hybridization of the microchip with the mixture of 4096 fluorescently labeled 8mers of a similar structure, 5′-MNNNNNNM-TR-3′.
There are 2016 = (4096 − 64)/2 different non-palindromic 6mer duplexes that can be formed on the 6mer generic microchip. The 64 palindromic, self-complementary 6mers cannot form duplexes on the microchip because they self-hybridize in solution.
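The duplex count above can be checked by direct enumeration. The following Python sketch is an illustration added for the reader, not part of the original analysis software; the function and variable names are hypothetical.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# All 4^6 hexamers that can occupy the core of an immobilized 8mer.
hexamers = ["".join(p) for p in product("ACGT", repeat=6)]

# A 6mer is self-complementary (palindromic) if it equals its reverse complement.
palindromes = [h for h in hexamers if h == reverse_complement(h)]

# Each non-palindromic duplex is counted once, whichever strand is on the chip.
duplexes = {
    frozenset((h, reverse_complement(h)))
    for h in hexamers
    if h != reverse_complement(h)
}

print(len(hexamers), len(palindromes), len(duplexes))  # 4096 64 2016
```

Adding the 64 palindromic sequences back gives 2016 + 64 = 2080, the total number of distinct 6mer duplexes.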
To avoid competitive oligonucleotide hybridization between solution and microchip, six mixtures containing non-complementary oligonucleotides can be devised. It appears that the predominant portion of the relevant data can be obtained with duplexes formed by hybridization of the microchip with two non-self-complementary mixtures: 5′-MM(A/G)MM(A/G)MM-TR-3′ and 5′-MM(A/C)MM(A/C)MM-TR-3′. Each of these 8mer mixtures contains 1024 different core 6mers, but together they form only 1536 core 6mer duplexes because the structure MMAMMAMM is present in both of them. These two mixtures include the A/T-rich duplexes needed for detection of most of the Hoechst-specific sequences.
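The figures of 1024 core 6mers per mixture and 1536 combined duplexes can be reproduced by enumeration. The Python sketch below is an added illustration under one stated assumption: the core 6mer is taken as positions 2 to 7 of the 8mer, so the restricted bases fall at core positions 2 and 5. All names are hypothetical.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def core_hexamers(allowed):
    """Core 6mers of the labeled 8mer mixture MM(x/y)MM(x/y)MM.

    Assumption: the core is positions 2-7 of the 8mer, so the restricted
    bases sit at core positions 2 and 5; all other positions are free.
    """
    choices = ["ACGT", allowed, "ACGT", "ACGT", allowed, "ACGT"]
    return {"".join(p) for p in product(*choices)}


def duplexes(cores):
    """Duplexes as unordered strand pairs, so each duplex is counted once."""
    return {frozenset((c, reverse_complement(c))) for c in cores}


mix_ag = core_hexamers("AG")  # 5'-MM(A/G)MM(A/G)MM-TR-3'
mix_ac = core_hexamers("AC")  # 5'-MM(A/C)MM(A/C)MM-TR-3'

shared = duplexes(mix_ag) & duplexes(mix_ac)
combined = duplexes(mix_ag) | duplexes(mix_ac)
labeled_cores = mix_ag | mix_ac

print(len(mix_ag), len(mix_ac), len(shared), len(combined), len(labeled_cores))
# 1024 1024 512 1536 1792
```

In this enumeration the 512 shared duplexes are those with one strand of the form NANNAN or NGNNGN; the latter pair with NCNNCN strands and are therefore reachable from both mixtures through different strands. The 1792 distinct labeled cores correspond to 1792 distinct microchip oligonucleotides that can be hybridized.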
A microchip was hybridized with each of the two mixtures of fluorescently labeled 8mers. Non-equilibrium melting curves for all duplexes formed on the microchips were recorded at increasing temperature. The hybridization and melting were repeated on the same microchip in the presence of Hoechst 33258 in the hybridization buffer.
Figure 1 shows examples of two such melting curves measured for a duplex in the presence and absence of Hoechst 33258. The temperature at which the hybridization signal decreased to 10% of its initial level at 0°C was considered the non-equilibrium melting temperature, Tm. This allowed exclusion of the effect of mismatch-containing duplex structures, which melt before perfect ones. Presence of the ligand shifted the melting curves to higher temperatures, increasing the Tm by up to 13°C (∆Tm) depending on the duplex structure. Therefore, the ∆Tm was used to assess the affinity of Hoechst 33258 for the duplex. Figure 2 summarizes the ∆Tm values measured for 1680 oligonucleotides that formed duplexes on the generic microchip by hybridization with the two mixtures of fluorescently labeled oligonucleotides. One hundred and twelve oligonucleotides were excluded from consideration due to weak hybridization signals.
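The 10% threshold definition of the non-equilibrium Tm can be sketched in a few lines of Python. This is an added illustration, not the original LabVIEW software: the linear interpolation between the two bracketing temperature points is an assumption, the function names are hypothetical, and the curves in the usage example are invented numbers.

```python
def nonequilibrium_tm(temps, signals, fraction=0.10):
    """Temperature at which the signal first falls to `fraction` of its initial value.

    temps   -- temperatures in deg C, in increasing order
    signals -- fluorescence at each temperature; signals[0] is the initial level
    Linear interpolation between bracketing points is an assumption.
    Returns None if the signal never drops to the threshold.
    """
    threshold = fraction * signals[0]
    for i in range(1, len(temps)):
        s_prev, s_curr = signals[i - 1], signals[i]
        if s_prev > threshold >= s_curr:
            frac = (s_prev - threshold) / (s_prev - s_curr)
            return temps[i - 1] + frac * (temps[i] - temps[i - 1])
    return None


def delta_tm(temps, without_dye, with_dye):
    """Increase in non-equilibrium Tm caused by the ligand (TmB - TmA)."""
    tm_a = nonequilibrium_tm(temps, without_dye)
    tm_b = nonequilibrium_tm(temps, with_dye)
    if tm_a is None or tm_b is None:
        return None
    return tm_b - tm_a


# Invented example curves (arbitrary fluorescence units).
temps = [0, 10, 20, 30, 40, 50]
curve_without = [100, 60, 20, 0, 0, 0]   # crosses 10% at 25 deg C
curve_with = [100, 90, 70, 30, 10, 0]    # reaches 10% at 40 deg C
print(delta_tm(temps, curve_without, curve_with))  # 15.0
```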
Both separately synthesized sets contain the same oligonucleotides, 5′-MNANNANM-TR. The values of ∆Tm for the oligonucleotides from the first set proved to be generally 1.15 times higher than the values for the same oligonucleotides from the second set, probably due to differences in the component concentrations in the two mixtures. Therefore, the values of all ∆Tm were normalized by equalizing ∆Tm for the same duplexes over the two experiments. Values of ∆Tm for the duplexes corresponding to the microchip 6mers MNGNNGNM and MNCNNCNM were averaged because they form the same duplexes linked to the gel through different strands. This resulted in a set of ∆Tm values for 1474 duplexes. One would expect the oligonucleotides that form the same duplex to show the same ∆Tm. However, since the experiments were performed with different oligonucleotide mixtures and different microchips, the above difference was <1.5°C (3 times the standard error of ∆Tm) for as many as 12% of the duplexes.
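The text does not specify how the two data sets were equalized. One plausible reading, sketched below in Python as an assumption rather than the authors' procedure, is to estimate a single scale factor from the oligonucleotides present in both sets (least squares through the origin), rescale the second set, and average the shared entries. All names and the example values are hypothetical.

```python
def scale_factor(set1, set2):
    """Least-squares factor k minimizing sum((set1[d] - k * set2[d])**2)
    over the sequences measured in both sets."""
    shared = set1.keys() & set2.keys()
    num = sum(set1[d] * set2[d] for d in shared)
    den = sum(set2[d] ** 2 for d in shared)
    return num / den


def normalize(set1, set2):
    """Rescale set2 onto set1 and merge; shared entries are averaged."""
    k = scale_factor(set1, set2)
    merged = {d: k * v for d, v in set2.items()}
    for d, v in set1.items():
        merged[d] = (v + merged[d]) / 2 if d in merged else v
    return k, merged


# Invented dTm values (deg C) keyed by core 6mer; two sequences are shared.
first = {"CAAATG": 2.3, "GAAAAC": 4.6, "CGTGCA": 1.0}
second = {"CAAATG": 2.0, "GAAAAC": 4.0, "TACGAT": 3.0}
k, merged = normalize(first, second)
print(round(k, 2))  # 1.15
```

A ratio of means or a median ratio would be an equally reasonable estimator; the choice matters little when the two sets differ by a nearly constant factor.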

Analysis of Hoechst 33258-specific motifs in duplexes
The data from Figure 2 show that Hoechst 33258 has a stronger affinity for A/T- than G/C-containing duplexes. Ligand-specific motifs need more detailed analysis. The following computer-assisted approach was developed to analyze the large mass of experimental data obtained on the generic microchips.
Figure 3 is a histogram of the number of duplexes having specified ∆Tm values. It reveals a sharp peak centered near 0°C corresponding to the duplexes that are not stabilized by Hoechst 33258, and a broad peak beginning at 1.5°C and ending with a long tail at 13°C. Therefore, ∆Tm > 1.5°C is considered an indication that Hoechst 33258 is binding to the duplex. There were 300 such duplexes.
We define the binding motif as the sequence that is sufficient to place the duplex into the set of 300 Hoechst 33258-stabilized ones. The variables used to characterize sequence motifs are: D, the total number of measured duplexes containing a certain motif; S, the number of duplexes with this motif having ∆Tm > 1.5°C; H, the total number of Hoechst-stabilized duplexes (H = 300 in our case).
The ratio S/D indicates whether or not the presence of a motif is sufficient for duplex stabilization by Hoechst 33258. Ideally the ratio equals 1, but in practice this may not be the case, due to experimental error. The ratio S/H indicates what proportion of the Hoechst 33258-binding sequences are constituted by the motif. Similarly to S/D, this ratio ideally equals 1 when the motif involved is the only binding motif. The closer both S/D and S/H are to 1, the better the motif describes the set of 300 Hoechst 33258-stabilized duplexes.
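The quantities D, S and H and the two ratios can be computed directly from a table of ∆Tm values. The Python sketch below is an added illustration with hypothetical names. It assumes that a motif is written as a regular expression, for example AA[AT] for AA(A/T), and that a duplex contains the motif if either strand matches; the text does not state how strands were treated. The data are invented.

```python
import re

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def motif_stats(delta_tm, motif, threshold=1.5):
    """Return (D, S, H, S/D, S/H) for a motif given as a regular expression.

    delta_tm maps one strand (5'->3') of each duplex to its measured dTm.
    Assumption: a duplex contains the motif if either strand matches.
    """
    pattern = re.compile(motif)

    def has_motif(seq):
        return bool(pattern.search(seq) or pattern.search(reverse_complement(seq)))

    h = sum(1 for v in delta_tm.values() if v > threshold)
    d = sum(1 for s in delta_tm if has_motif(s))
    s_count = sum(1 for s, v in delta_tm.items() if v > threshold and has_motif(s))
    s_over_d = s_count / d if d else float("nan")
    s_over_h = s_count / h if h else float("nan")
    return d, s_count, h, s_over_d, s_over_h


# Invented dTm values (deg C) for five duplexes.
toy = {"CAAATG": 6.2, "GCAATC": 0.4, "CGCGCA": 0.1, "GTATAG": 2.0, "CTTTGG": 5.0}
D, S, H, s_over_d, s_over_h = motif_stats(toy, "AA[AT]")
print(D, S, H)  # 3 2 3
```

In the toy data, CTTTGG counts as containing AA(A/T) because its complementary strand CCAAAG does, which is the strand-symmetric treatment assumed here.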
Figure 4 presents S/H and S/D values for the six most stable dinucleotide and trinucleotide motifs. Among all dinucleotides, AA, (A/T)A and A(A/T) have high S/D values, which indicates that they may be part of a Hoechst 33258-binding motif. However, they do not constitute a whole motif, as indicated by their low S/H values. These data indicate that a Hoechst 33258-binding motif should contain AA, (A/T)A or A(A/T). Among the trinucleotides, the most favorable is AA(A/T) with S/H = 0.78 and S/D = 0.80. All other trinucleotide motifs have either S/H or S/D values lower than 0.78. The longer tetranucleotide motifs are less favorable (not shown), generally due to the small number of duplexes they represent, which results in a low S/H ratio.
These results lead us to conclude that the sequence motif responsible for Hoechst 33258 binding to a DNA duplex is AA(A/T). However, the ligand has substantially higher affinity for some longer motifs.
More detailed analysis of the effect of the length of (A/T)n in duplexes on their affinity for Hoechst 33258 is shown in Figure 5. Hexanucleotides containing single or two adjacent A:T base pairs showed negligible affinity for the ligand. Affinity became noticeable for three adjacent A:T base pairs and plateaued for four A:T base pairs. A small affinity enhancement for stretches of five and six A:T base pairs may be due to the presence of two or more 3-4 bp long Hoechst 33258-binding sites in the duplexes; this is consistent with previous results (17).
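Grouping duplexes by the length of their longest run of adjacent A:T base pairs is straightforward to sketch. The Python example below is an added illustration with hypothetical names and invented values; it is not the analysis code behind Figure 5. Because A pairs with T, the run length is the same whichever strand is read.

```python
from collections import defaultdict
from itertools import groupby


def longest_at_run(seq):
    """Length of the longest stretch of consecutive A or T bases in seq."""
    runs = [
        len(list(group))
        for is_at, group in groupby(seq, key=lambda base: base in "AT")
        if is_at
    ]
    return max(runs, default=0)


def mean_dtm_by_run_length(delta_tm):
    """Average dTm of duplexes grouped by their longest A/T run."""
    groups = defaultdict(list)
    for seq, value in delta_tm.items():
        groups[longest_at_run(seq)].append(value)
    return {n: sum(v) / len(v) for n, v in sorted(groups.items())}


# Invented dTm values (deg C).
toy = {"GCGCGC": 0.0, "CAAATG": 6.0, "GAATTG": 8.0, "CGTACG": 0.5}
print(mean_dtm_by_run_length(toy))  # {0: 0.0, 2: 0.5, 4: 7.0}
```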

Hoechst 33258 specificity for duplexes containing (A/T)4 tetranucleotides
The results of an analysis of Hoechst 33258 affinity for different (A/T)4-containing duplexes are shown in Table 1. Both the sequences of these duplexes and their flanking base pairs may have a strong effect on the affinity and complicate the analysis. For example, duplexes containing five or more consecutive A:T base pairs may include several ligand-binding tetranucleotide motifs with different affinities. To simplify the comparison, only three types of duplexes containing (A/T)4 and restricted on one or both ends with G:C base pairs were analyzed: M(A/T)4(G/C)NM, MN(G/C)(A/T)4M and M(G/C)(A/T)4(G/C)M (see Table 1). The affinity of the same A/T tetranucleotide varies significantly in the presence of different flanking sequences. However, independently of this effect of the flanking base pairs, all (A/T)4 except one can be arranged according to their Hoechst 33258 specificity: AATT > AAAT > AAAA > AATA > TAAT > ATAT > TAAA > …
The six mixtures of non-complementary oligonucleotides that avoid competitive hybridization between solution and microchip are as follows: MMM(A/G)(A/G)MMM, MMM(A/C)(A/C)MMM, MM(A/G)MM(A/G)MM, MM(A/C)MM(A/C)MM, M(A/G)MMMM(A/G)M and M(A/C)MMMM(A/C)M. This set of mixtures can form all 2016 non-self-complementary duplexes on the generic microchip.

Figure 1. Non-equilibrium melting curves for a microchip duplex measured in the absence (A) and presence (B) of Hoechst 33258 dye. A duplex gel-MTTTTCGM-3′/5′-MCGAAAAM-TR-3′ was formed by hybridization of the mixture 5′-MM(A/G)MM(A/G)MM-Texas Red with the generic microchip. The non-equilibrium melting temperature, Tm, was defined as the temperature at which the hybridization signal drops to 10% of the initial value measured at 0°C. The Hoechst 33258 affinity for a duplex was measured as ∆Tm = TmB − TmA.

Figure 2. Increase in the melting temperature, ∆Tm, caused by binding of Hoechst 33258 dye to duplexes formed on the microchip. The generic 6mer microchip was hybridized successively with two mixtures of fluorescently labeled oligonucleotides 5′-MM(A/G)MM(A/G)MM-TR and 5′-MM(A/C)MM(A/C)MM-TR in the absence and presence of the dye, and the melting curves for duplexes were measured. The duplexes of the microchip are arranged as a 2-dimensional matrix according to the 3′-halves (in rows) and 5′-halves (in columns) of their 6mer cores of the fluorescent oligonucleotides in solution. The data are available in electronic form upon request from the corresponding author.

Figure 3. Histogram showing the number of duplexes, N, demonstrating the specified ∆Tm. About 300 duplexes have strong affinity for Hoechst 33258, ∆Tm > 1.5°C (arrow).

Figure 4. Statistical analysis of Hoechst 33258-binding motifs for the presence of some di- and trinucleotide sequences. D is the total number of measured duplexes containing a certain motif; S is the number of duplexes with this motif having ∆Tm > 1.5°C; H is the total number of Hoechst 33258-stabilized duplexes (H = 300 in our case). S/D indicates whether or not the presence of a motif is sufficient for stabilization of the duplex by Hoechst 33258.