Histone H1 preferentially binds and aggregates scaffold-associated regions (SARs) via the numerous homopolymeric oligo(dA).oligo(dT) tracts present within these sequences. Here we show that the mammalian somatic subtypes H1a–e and H1°, and the male germline-specific subtype H1t, all preferentially bind to the Drosophila histone SAR. Experiments with the isolated domains show that whilst the C-terminal domain maintains strong and preferential binding, the N-terminal and globular domains show weak binding and poor specificity for the SAR. The preferential binding of SAR by the H1 molecule thus appears to be determined by its highly basic C-terminal domain. Salmine, a typical fish protamine, which could have its evolutionary origin in histone H1, also shows preferential binding to the SAR. Distamycin, a minor groove binder with high affinity for homopolymeric oligo(dA).oligo(dT) tracts, abolishes preferential binding of the C-terminal domain of histone H1 and protamine to the SAR, suggesting the involvement of the DNA minor groove in the interaction.
Received July 27, 2004; Revised and Accepted October 29, 2004
H1 linker histones are thought to be primarily responsible for the condensation of the nucleosome chain in the thick chromatin fibre. It is currently accepted that histone H1 could have a regulatory role in transcription through the modulation of chromatin higher-order structure. H1 has been described as a general transcriptional repressor because it contributes to chromatin condensation, which limits the access of the transcriptional machinery to DNA. Other studies indicate that H1 may regulate transcription at a more specific level, participating in complexes that either activate or repress specific genes (1–8). Preferential binding to SARs (for scaffold-associated regions; also termed matrix-associated regions, MARs) (9) and participation in nucleosome positioning (10) are other mechanisms by which H1 could contribute to transcriptional regulation.
H1 has multiple isoforms. In mammals, six somatic subtypes, designated H1a–e and H1°, a male germline-specific subtype, H1t, and an oocyte-specific subtype, H1oo, have been identified (11–14). The subtypes differ in their timing of expression (15), extent of phosphorylation (16) and turnover rate (17,18). Analysis of the evolution of vertebrate H1 subtypes has shown that amino acid substitution rates differ among subtypes by almost one order of magnitude, suggesting that each subtype has acquired a unique function (19). Differences in DNA condensing capacity have also been demonstrated for some subtypes (20–22).
Histone H1 has been characterized as an SAR-binding protein (9). SARs were identified as DNA sequences rich in AT base pairs (>70%) and in homopolymeric oligo(dA).oligo(dT) tracts (A-tracts) (23) that were specifically bound by the nuclear and the metaphase scaffold. SARs have been proposed as DNA elements that would define the bases of chromatin loops in eukaryotic cells (23), and that could also be involved in chromosome dynamics (24). In addition to H1, several SAR-binding proteins have been identified, including topoisomerase II, lamin B1, nucleolin, HMG I/Y, SATB1, SAF-A and SAF-B (25–30). It has been proposed that regulated H1 dissociation or assembly with SARs is implicated in the regional opening or closing of chromatin loops and, consequently, contributes to transcriptional regulation (9). The high affinity cooperative binding of H1 and other SAR-binding proteins to SARs is determined by the presence of the A-tracts, rather than a precise base sequence (25,31). A-tracts have peculiar structural properties, including a narrower minor groove (32,33). The importance of the conformational features of the DNA in H1 binding is stressed by the preferential affinity of H1 for supercoiled DNA, cruciforms and DNA fragments with intrinsic curvature (34–36).
H1 linker histones present a tripartite structure consisting of a central globular domain flanked by highly basic N- and C-terminal tail-like domains (37). The N- and C-terminal domains are very different in length, the C-terminal tail comprising almost 50% of the protein. The distribution of charge in the C-terminal domain is extremely uniform in spite of the variation in sequence of the different H1 subtypes (38). The structure of the globular domain has been described by X-ray diffraction (39) and nuclear magnetic resonance (40,41). It contains a three-helix bundle, which resembles the winged-helix motif found in some sequence-specific DNA-binding proteins, and a C-terminal β-sheet. The N- and C-terminal domains are largely unstructured in solution. However, they acquire a substantial amount of secondary structure upon interaction with DNA. Helix and turn elements, inducible upon interaction with the DNA, have been described in both the N- and C-terminal domains (42–45). The distinct structure of the H1 domains suggests that they could play specific functions in chromatin structure. The N-terminal domain could contribute to the binding stability of the globular domain; the globular domain is very likely to localize the molecule in the nucleosome; while the C-terminal domain is the main region of the molecule involved in chromatin condensation through binding and neutralization of the charge of the linker DNA.
The tripartite structure of histone H1 raises the question of the involvement of the individual H1 domains in the preferential binding of the protein to SARs. Here, we identify the C-terminal domain as the main determinant of the SAR-binding properties of histone H1. We also show that salmine, a typical protamine that could have its evolutionary origin in histone H1 (46), preferentially binds to SARs. The possible significance of these findings in relation to H1 and SAR functions is discussed.
MATERIALS AND METHODS
Peptides and protamine
The peptides Ac-EKTPVKKKARKAAGGAKRKTSG-NH2 (NE-1) and Ac-TENSTSAPAAKPKRAKASKK-NH2 (NH-1) were synthesized by standard methods (NE-1 by Neosystem Laboratoire, Strasbourg, France, and NH-1 by DiverDrugs, Barcelona, Spain). Peptide homogeneity was determined by HPLC on Kromasil C8. The peptide composition was confirmed by amino acid analysis and the molecular mass was checked by mass spectrometry. The sequence of NE-1 corresponds to residues 15–36 at the N-terminus of mouse H1e and that of NH-1 to residues 1–20 of mouse H1°. Protamine (salmine) was from salmon (Sigma P4005).
Separation of histone H1 subtypes
Nuclei were isolated from the brain of adult mice (47,48). Histone H1 was extracted with 0.35 M NaCl, following the method of García-Ramírez et al. (49). The mixture of subtypes was digested with alkaline phosphatase to eliminate small amounts of phosphorylated forms that could be present. H1° was purified by gel-filtration chromatography, according to Böhm et al. (50). The subtypes H1a–e were separated by reverse phase HPLC according to Brown et al. (51). All subtypes were obtained as homogeneous peaks, except H1d and H1e, which largely overlapped. The latter subtypes were separated by acetic acid/urea gel electrophoresis (52) and recovered by electroelution using a Biotrap chamber (Schleicher & Schuell). H1t was purified from mouse testes according to Khadake et al. (21). Before being used in binding experiments, H1 subtypes were subjected to a cycle of denaturation/renaturation by stepwise dialysis from 6 M urea into, successively, 3.0 M, 1.5 M, 0.7 M, 0.3 M and 0.0 M urea, in 0.2 M NaCl, 0.01 M phosphate buffer, pH 7.0. Finally, the proteins were dialysed against 0.14 M NaCl, 0.01 M phosphate buffer, pH 7.0. The concentration of protein was estimated by amino acid analysis.
Cloning, expression and purification of the globular and C-terminal domains of histone H1 subtypes
The sequences encoding the globular domains of histones H1° and H5, and the C-terminal domains of histones H1e, H1° and H1t were cloned and expressed. All gene fragments were amplified from mouse genomic DNA by PCR. The primers were 5′-GGCCGCCCATATGTCCACGGACCACCCCAAG-3′ and 5′-CTTGGATCCCTACGACCTCTTGGGCTC-3′ for the globular domain of H1°; 5′-GGCATCGCATATGTCGGCATCGCACCCCACC-3′ and 5′-GCCGGATCCTTAGGACCTCTTGGCC-3′ for the globular domain of H5; 5′-CCACCATGGATGAGCCTAAAAGGTC-3′ and 5′-GGAGATCTCTTCTTCTTGCTGGCCCTCT-3′ for the C-terminal domain of H1°; 5′-AAACCATGGCTGCTTCCGGTGAGGCTAA-3′ and 5′-ACAGATCTCTTTTTCTTGGCTGCGGTTTT-3′ for the C-terminal domain of H1e; and 5′-GTACCATGGCGGCTTCAGGGAACGAC-3′ and 5′-ACGGATCCCTTCCTCCCTGCTGCCTTCCT-3′ for the C-terminal domain of H1t. The amplification products of H1° and H5 globular domains were cloned in the pET11b vector (Novagen), using NdeI and BamHI restriction sites to yield the expression vectors pGH1° and pGH5, respectively. The C-terminal domains were cloned in the pQE-60 vector (Qiagen) using the NcoI and BglII restriction sites to yield the expression vectors pCTH1°, pCTH1e and pCTH1t.
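As a quick consistency check on the primer design, the recognition sites of the cloning enzymes can be searched for in the primer sequences listed above. The following is a minimal Python sketch and not part of the original protocol: the dictionary labels and the function name `sites_in` are illustrative assumptions, whereas the primer sequences and the enzymes are those given in the text. The recognition sequences used are the standard ones (NdeI CATATG, BamHI GGATCC, NcoI CCATGG, BglII AGATCT); all four are palindromic, so scanning each primer as written is sufficient.

```python
# Standard recognition sequences of the enzymes named in the text.
SITES = {"NdeI": "CATATG", "BamHI": "GGATCC", "NcoI": "CCATGG", "BglII": "AGATCT"}

# Primers copied from the text; the keys are illustrative labels only
# (G = globular domain, CT = C-terminal domain, "o" stands for the degree sign).
PRIMERS = {
    "GH1o_fwd": "GGCCGCCCATATGTCCACGGACCACCCCAAG",
    "GH1o_rev": "CTTGGATCCCTACGACCTCTTGGGCTC",
    "GH5_fwd": "GGCATCGCATATGTCGGCATCGCACCCCACC",
    "GH5_rev": "GCCGGATCCTTAGGACCTCTTGGCC",
    "CTH1o_fwd": "CCACCATGGATGAGCCTAAAAGGTC",
    "CTH1o_rev": "GGAGATCTCTTCTTCTTGCTGGCCCTCT",
    "CTH1e_fwd": "AAACCATGGCTGCTTCCGGTGAGGCTAA",
    "CTH1e_rev": "ACAGATCTCTTTTTCTTGGCTGCGGTTTT",
    "CTH1t_fwd": "GTACCATGGCGGCTTCAGGGAACGAC",
    "CTH1t_rev": "ACGGATCCCTTCCTCCCTGCTGCCTTCCT",
}


def sites_in(primer):
    """Return the sorted names of the enzymes whose site occurs in the primer."""
    primer = primer.upper()
    return sorted(name for name, site in SITES.items() if site in primer)


if __name__ == "__main__":
    for label, seq in PRIMERS.items():
        print(f"{label:10s} {', '.join(sites_in(seq)) or 'none'}")
```

With the sequences as printed, the scan finds NdeI and BamHI sites in the globular-domain primers and NcoI and BglII sites in the H1° and H1e C-terminal primers. The H1t reverse primer as printed carries a BamHI site rather than a BglII site; BamHI and BglII generate compatible 5′-GATC overhangs, so such a product could still be ligated into a BglII-cut vector.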
The recombinant plasmids pGH1° and pGH5 were transformed into E. coli BL21(DE3). Cells were grown to an OD600 of 0.8 and then induced with 1 mM IPTG, allowing expression to proceed for 4 h at 37°C. Cells were then harvested and stored at −80°C. The protein was purified according to the protocol described for the globular domain of chicken histone H5 (53).
The expression vectors pCTH1°, pCTH1e and pCTH1t were transformed into E. coli M15 (Qiagen). Cells were grown and induced as described above for pGH1° and pGH5. Cells were lysed in lysis buffer (0.05 M NaH2PO4, 0.75 M NaCl, 0.02 M imidazole) plus 4 M guanidine hydrochloride, pH 8.0, for 15 min at room temperature. Guanidine hydrochloride was found to be necessary to avoid degradation and aggregation of the expressed protein. The extract was centrifuged at 20 000 g for 25 min. The supernatants were loaded on a HiTrap chelating HP column (Amersham Biosciences) equilibrated with lysis buffer. The column was then washed in three steps with lysis buffer containing increasing amounts of imidazole: 40, 60 and 80 mM. Finally, the proteins were eluted with 250 mM imidazole in lysis buffer and desalted by gel filtration through Sephadex G-25 (Amersham Biosciences).
Preparation of DNA fragments
An SAR fragment of 657 bp from the histone cluster of Drosophila melanogaster was obtained by digestion of p1314 (9) with KpnI and BamHI. Another DNA fragment, of 587 bp, was excised from pUC19 by digestion with HaeIII. Both inserts were separated on agarose gels and electroeluted using a Biotrap chamber (Schleicher & Schuell). The 587 bp pUC19 fragment was extended from both ends to obtain a fragment of 763 bp. The long fragment was obtained by PCR on pUC19 using the primers 5′-GCGGTTAGCTCCTTCGGTCCTC-3′ and 5′-CACCCGCTGACGCGCCCTGACG-3′. An AT-rich sequence (75% AT) was prepared by polymerization of the 5′-phosphorylated oligomer 5′-CTATGATATATAGATAGTTAATGTAATATGATATAGATATAGGGATCC-3′, annealed with a complementary sequence that left five overhanging nucleotides. The annealed DNA was ligated overnight at 16°C with T4 ligase (Roche). The products of the ligation reaction were separated on agarose gels. Fragments ranging in size from ∼500 to 5000 bp were electroeluted using a Biotrap chamber (Schleicher & Schuell).
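The two properties required of the control oligomer, high AT content and absence of A-tracts, can be checked directly on the sequence given above. The following Python sketch is illustrative only; the function names are assumptions, and the only input is the oligomer as printed in the text. Because an A-run on one strand is a T-run on the other, scanning a single strand for runs of A or of T covers both strands.

```python
import re

# Oligomer sequence copied from the text (top strand, 5' to 3').
OLIGO = "CTATGATATATAGATAGTTAATGTAATATGATATAGATATAGGGATCC"


def at_fraction(seq):
    """Fraction of positions that are A or T."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def longest_homopolymer_at_run(seq):
    """Length of the longest run of consecutive A's or of consecutive T's."""
    runs = re.findall(r"A+|T+", seq.upper())
    return max((len(run) for run in runs), default=0)


if __name__ == "__main__":
    print(f"AT fraction: {at_fraction(OLIGO):.2f}")
    print(f"longest A or T run: {longest_homopolymer_at_run(OLIGO)}")
```

For the printed sequence this gives an AT fraction of 0.75 and no run of A or T longer than two nucleotides, in line with the description of an AT-rich sequence lacking A-tracts.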
Binding experiments were performed by mixing equivalent amounts of SAR and pUC19 fragments with the proteins at different ratios. Binding conditions were 0.01 M phosphate buffer, pH 7.0, 5% glycerol and either 0.07 or 0.14 M NaCl. After 1 h of incubation at 37°C, the mixtures were centrifuged at 14 000 g for 10 min. The pellets and the supernatants were digested with proteinase K at 37°C overnight and the DNA was purified by phenol–chloroform extraction. The proportions of the SAR and pUC19 fragments in the complex and the supernatant were analysed by electrophoresis on 2% agarose (Metaphor) gels.
In some experiments, histone H1 C-terminal domain and protamine binding was performed in the presence of the DNA-binding drug distamycin A (Sigma). The DNA was preincubated with the drug for 30 min at 37°C before adding the proteins. The mixture was then incubated for 1 h at 37°C, as described.
RESULTS
All histone subtypes are SAR-binding proteins
The specific interaction of H1 with SARs was originally established for the SAR of the Drosophila histone–gene cluster, using total H1 from rat liver and DNA fragments from pBR322 as competitors (9). The H1 complement from rat liver is dominated by the subtypes H1e and H1c, while H1a, b, d and H1° are either present in low amounts or absent. In order to determine whether the six mammalian somatic subtypes, H1a–e and H1°, and the more divergent male germline-specific subtype, H1t, are all SAR-binding proteins, we performed binding experiments with the minimal 657 bp SAR (75% AT) derived from the Drosophila histone–gene cluster and each of the purified subtypes. An HaeIII/HaeIII fragment of similar length from pUC19 (587 bp, 56% AT) was used as a competitor. The experiments consisted of mixing approximately equal amounts of the SAR and the pUC19 fragments with a limited amount of one of the subtypes. The experiments were performed at physiological salt concentration (0.14 M NaCl), where at subsaturating protein concentrations fast-sedimenting fully complexed DNA molecules co-exist with free DNA (54). Analysis of the DNA in the soluble and insoluble fractions showed that under conditions of limited protein, H1 bound first to the SAR whilst the pUC19 fragment remained in the supernatant (Figure 1, lanes 2–15). The pUC19 fragment appeared in the pellet only when the SAR was saturated and thus absent from the supernatants; starting with a 1:1 mixture of SAR and pUC19, this occurred at a protein/(SAR+pUC19) weight ratio >0.5 (Figure 1, lanes 16 and 17). Above a weight ratio of 1.0, all DNA was saturated and further added protein was found as free protein in the supernatants. As shown in Figure 1, all subtypes, including H1° and H1t, showed strong cooperative binding to the SAR fragment.
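The weight ratios quoted above are mutually consistent under a simple sequential-binding picture: if all DNA is saturated at a protein/DNA weight ratio of about 1.0 and the SAR makes up half of the DNA, the SAR alone should be saturated at an overall ratio of about 0.5. The Python sketch below works through this arithmetic. It is an idealization introduced here for illustration, not a model used in the study: it assumes strictly all-or-none binding to the SAR before any binding to the competitor, and the function and parameter names are hypothetical.

```python
def bound_fractions(ratio, sar_fraction=0.5, saturation_ratio=1.0):
    """Fractions of SAR and competitor DNA complexed at a protein/DNA weight ratio.

    Idealized assumption: the protein fills the SAR first, and only the
    protein left over after SAR saturation binds the competitor fragment.
    """
    # Fraction of the total DNA mass that the added protein can saturate.
    covered = min(ratio / saturation_ratio, 1.0)
    sar_bound = min(covered / sar_fraction, 1.0)
    competitor_bound = max(covered - sar_fraction, 0.0) / (1.0 - sar_fraction)
    return sar_bound, competitor_bound


if __name__ == "__main__":
    for ratio in (0.25, 0.5, 0.75, 1.0):
        sar, puc = bound_fractions(ratio)
        print(f"ratio {ratio:.2f}: SAR bound {sar:.2f}, pUC19 bound {puc:.2f}")
```

In this idealized picture the competitor stays entirely in the supernatant up to a ratio of 0.5 and is fully complexed at 1.0, which is the pattern described for Figure 1.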
The preferential binding of histone H1 is determined by its C-terminal domain
To establish the contribution of the C-terminal domain to the SAR-binding character of the entire H1 molecule, we performed experiments similar to those described for the entire H1 molecule, using the purified recombinant C-terminal domains of subtypes H1e, H1° and H1t in physiological salt. As with the entire molecule, in competition experiments the C-terminal domains bound first to the SAR fragment, and only when it was saturated did they bind to the pUC19 fragment, which occurred, starting with a 1:1 mixture of SAR and pUC19, at a protein/(SAR+pUC19) weight ratio of about 0.3 (Figure 2). The preferential binding of the C-terminal domain to the SAR was maintained in 70 mM NaCl, although under these conditions the binding was slightly less cooperative, as indicated by the presence of a faint pUC19 band in the pellet, co-existing with a similarly faint band of SAR in the supernatant (Figure 2D). In spite of the large sequence divergence between the C-terminal domains of H1e, H1° and H1t, all showed the same strong preference for the SAR sequences (Figure 3).
To make sure that the slightly longer size of the SAR fragment over the pUC19 fragment had no influence on the preference of the C-terminal domain for the SAR, we performed a competition experiment with an extended pUC19 fragment of 763 bp that included the 587 bp pUC19 sequence. It can be seen in Figure 2E that the preference of the H1 C-terminal domain for the SAR remained unaffected.
It has been reported that the preferential binding of histone H1 to SARs is not determined by their high AT content, but by the presence of abundant A-tracts (21,35). As a control for the preferential binding of the H1 C-terminal domain, we included a competition between the SAR and a mixture of multimers of a 50 bp sequence containing 75% AT, but lacking A-tracts (see Materials and Methods). The size distribution of the multimers ranged from ∼500 to 5000 bp (Figure 4A). As shown in Figure 4B, the C-terminal domain bound with extremely high preference to the SAR in spite of the excess and longer size of a large fraction of the polymerized sequence.
In the study of the interaction of the N-terminal domain with the SAR, we used peptides that had been characterized previously (44,45). One peptide corresponded to the N-terminal domain of H1° (residues 1–20). Another peptide comprised the basic region of the N-terminal domain of H1e (residues 15–36). The affinity of these peptides for the DNA was much lower than that of the C-terminal domain, as expected from the lower number of positive charges involved in the interaction: 6–9 charges in the N-terminal domain compared with 33–46 in the C-terminal domain, depending on the subtype. Both N-terminal peptides precipitated the DNA, but a large excess of peptide was necessary. The N-terminal peptide of H1e had a moderate preference for the SAR in physiological salt (Figure 5A). The preference was most apparent at protein/DNA ratios that precipitated a small amount of DNA, and was lost at higher protein/DNA ratios that precipitated most of the DNA. In 70 mM NaCl, the preference for the SAR fragment was completely lost (Figure 5B). The N-terminal peptide of H1° did not show any preference for the SAR, either in 140 mM or in 70 mM NaCl (Figure 5C and D).
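The range of 6–9 positive charges quoted for the N-terminal peptides can be reproduced by counting the Lys and Arg residues in the two peptide sequences given in Materials and Methods. The Python sketch below is illustrative; the dictionary and function names are assumptions. Both peptides are N-acetylated and C-amidated, so the termini contribute no charge, and only Lys and Arg are counted (neither peptide contains His).

```python
# Peptide sequences copied from Materials and Methods (one-letter code,
# without the acetyl and amide end groups).
PEPTIDES = {
    "NE-1": "EKTPVKKKARKAAGGAKRKTSG",  # mouse H1e, residues 15-36
    "NH-1": "TENSTSAPAAKPKRAKASKK",    # mouse H1°, residues 1-20
}


def basic_residues(seq):
    """Number of Lys (K) plus Arg (R) residues in a peptide sequence."""
    return sum(seq.upper().count(aa) for aa in "KR")


if __name__ == "__main__":
    for name, seq in PEPTIDES.items():
        print(f"{name}: {len(seq)} residues, {basic_residues(seq)} Lys+Arg")
```

The count is 9 for NE-1 and 6 for NH-1, matching the 6–9 charges stated in the text. Each peptide also contains a single Glu, which is not included in this count of positive charges.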
With the globular domain, precipitation experiments could not be performed in 140 mM NaCl because the interaction with the DNA was too weak in this salt concentration. Binding experiments were carried out instead in 70 mM NaCl, where with a high enough protein/DNA ratio all the DNA could be precipitated. The globular domain of H1° showed no preference for the SAR fragment, not even at protein/DNA ratios that precipitated a small proportion of the DNA (Figure 5E). The globular domain of H5, an avian subtype closely related to H1°, also failed to bind preferentially to the SAR (Figure 5F).
Protamines also bind preferentially to SARs
Typical protamines are sperm basic proteins with an arginine content of 60% or higher. We used salmine in the binding experiments, a typical protamine from salmon sperm. It has 32 residues, 21 of which are arginine. The Arg residues are mostly present in clusters of up to six residues. Figure 6 shows that protamine also bound with high preference to the SAR fragment, both in physiological salt and in 70 mM NaCl. The preference is, however, not as high as that displayed by the C-terminal domain, as shown by the presence of a small amount of competitor pUC19 in the pellets coexisting with a small amount of SAR in the supernatants. The preference for the SAR was maintained when the longer pUC19 fragment (763 bp) was used in the experiments (Figure 6). As in the case of the H1 C-terminal domain, protamine also bound preferentially to the SAR in competition with the polymerized sequence lacking A-tracts (Figure 4C).
Distamycin abolishes the SAR-binding character of the C-terminal domain of H1 and protamine
Distamycin is an antibiotic drug with high selectivity for A-tract DNA (55,56). Binding to A-tracts by distamycin abolishes the specific interaction of SARs with the nuclear scaffold and SAR binding proteins. In particular, binding of distamycin results in the suppression of preferential binding of histone H1 to SAR containing DNA, leading to a redistribution of histone H1 between SAR and non-SAR DNA (31). Distamycin has been used to confirm the SAR-binding properties of several proteins. We examined the effect of distamycin on the binding of the C-terminal domain of H1 and protamine, which were shown to bind highly preferentially to the SAR fragment. In both cases, distamycin abolished the preferential binding of the proteins to the SAR (Figure 7). The effect was so clear as to reverse the binding preferences, binding to pUC19 being stronger than to the SAR in the presence of distamycin.
DISCUSSION
Histone H1 has been generically described as an SAR-binding protein. Here, we have shown that the highly conserved mammalian subtypes, H1a–e, and the highly divergent H1° and H1t are all SAR-binding proteins. The C-terminal domain of histone H1 contributes to a large extent to the binding free energy of the entire molecule to DNA, providing the localized charge neutralization of the DNA necessary for the chromatin folded state (57–59). It was thus of interest to investigate whether the SAR-binding character of H1 subtypes was a property of the C-terminal domain. We studied the C-terminal domains of subtypes H1e, the most abundant somatic subtype, H1°, a subtype that accumulates in quiescent cells, and H1t, the male germline-specific subtype. In spite of their large sequence divergence (44.3% sequence identity between H1e and H1°, and 27.9% between H1° and H1t; Figure 3), all three C-terminal domains appear to have very strong preference for the SAR. H1e and H1° contain SPKK motifs, presumably contributing to SAR-binding specificity; however, this motif is not present in H1t, which indicates that its presence is not essential to determine the SAR-binding character of histone H1 subtypes. As shown previously for the entire molecule (31), distamycin, a minor groove binder with high affinity for A-tracts, abolishes the preferential binding of the C-terminal domain to the SAR fragment. This suggests involvement of the DNA minor groove in the interaction with the C-terminal domain, although it does not exclude the possibility that the C-terminal domain also binds to the wide groove (60). As shown by X-ray crystallography, distamycin occupies the minor groove with preference for narrow minor grooves (32). However, distamycin may also cause subtle changes in the structural parameters and mechanoelastic properties of the DNA that could contribute to the suppression of the preferential binding of SAR-binding proteins (56).
In agreement with its reduced number of positive charges compared with the C-terminal domain, the N-terminal domain exhibits much weaker DNA binding, and a large excess of protein is necessary to precipitate the DNA. The N-terminal peptide of H1° showed no preference for the SAR fragment. The N-terminal domain of H1e had a moderate preference for the SAR fragment in physiological salt, but in 70 mM salt the binding preference was completely lost. With the globular domain, binding experiments had to be performed at 70 mM NaCl, as in 140 mM the binding was too weak to observe a precipitate even with a large excess of protein. Under these conditions, the globular domain did not show any preference for the SAR sequences over the pUC19 fragments. Given the low selectivity and weak binding of the N-terminal peptides and the globular domain, it is unlikely that they make a substantial contribution to the SAR-binding character of the whole protein. It thus appears that the preferential binding of histone H1 to SARs is basically determined by its C-terminal domain.
Phosphorylation is the main post-translational modification undergone by histone H1. Most phosphorylation sites are located in the C-terminal domain. Phosphorylation weakens the binding of H1 to DNA, and could thus facilitate SAR chromatin opening through cooperative H1 dissociation (61,62). The C-terminal domain could also be responsible for the targeting of H1 molecules to SARs. At substoichiometric concentrations, the higher affinity of H1 for SARs would guarantee the saturation of SAR sequences with H1 in preference to other sequences.
Evolutionary evidence has recently been obtained for chordate protamines having originated from histone H1 through general substitution of Lys residues by Arg (46). In view of the preferential binding of the C-terminal domain of H1 to SARs, we investigated whether such a property was conserved in protamines. The results show that protamine also binds to the SAR fragment with high specificity. Moreover, the binding preference is also determined by the oligo-A tracts as shown by the suppression of preferential binding to the SAR by distamycin. The effect is so strong as to reverse the binding preferences of protamine. This is indicative of minor groove recognition by protamine. The role, if any, of the preferential binding of protamines to SARs cannot be ascertained at this stage; however, it is possible that this property could lend some spatial and temporal order to the substitution of core histones by protamines in spermatogenesis, the SARs providing nucleation sites for the substitution process.
The features that confer SAR-binding specificity to a protein or protein motif have yet to be established. Mammalian HMGA non-histone proteins contain multiple copies of a DNA-binding motif called ‘AT hook’ that preferentially binds to the narrow minor groove of AT sequences (63); the core of the AT hook is the sequence Arg–Gly–Arg. The interaction of the RGR element with the minor groove has been characterized by NMR spectroscopy (64). No RGR motifs are found in either histone H1 or protamine. While histone H1 is lysine-rich, the core motif of AT hooks and protamine both contain Arg. The C-terminal domain is an even better SAR-binding protein than protamine. The choice between Arg and Lys does not therefore appear essential in itself in determining the SAR-binding character. Presumably, the interaction should as a rule involve minor groove recognition, as all SAR-binding proteins so far described are competed by distamycin. The narrower minor groove associated with A-tracts would be preferred because it would give better van der Waals contacts between the walls of the minor groove and the ligands (32). In the case of highly cationic ligands, such as the C-terminal domain of histone H1 and protamine, potentially interacting with the DNA phosphates, a narrower minor groove would also offer a more intense electrostatic potential along the minor groove path that could also contribute to the preferential binding.
We thank Prof U. K. Laemmli for generously providing the plasmid p1314 containing the Drosophila SAR. This work was financed by the Ministerio de Educación y Ciencia (DGICYT, BMC2002-00087) and the Generalitat de Catalunya (CIRIT, Grups de Recerca de Qualitat SGR00199).