The human nuclear retinoic acid (RA) receptor alpha (hRARα) is a ligand-dependent transcriptional regulator, which is controlled by a phosphorylation cascade. The cascade starts with the RA-induced phosphorylation of a serine residue located in the ligand-binding domain, S(LBD), allowing the recruitment of the cdk7/cyclin H/MAT1 subcomplex of TFIIH through the docking of cyclin H. It ends with the subsequent phosphorylation by cdk7 of another serine located in the N-terminal domain, S(NTD). Here, we show that this cascade relies on an increase in the flexibility of the domain involved in cyclin H binding, following the phosphorylation of S(LBD). Owing to the functional importance of RARα in several vertebrate species, we investigated whether the phosphorylation cascade was conserved in zebrafish (Danio rerio), which expresses two RARα genes: RARα-A and RARα-B. We found that in zebrafish RARαs, S(LBD) is absent, whereas S(NTD) is conserved and phosphorylated. Therefore, we analyzed the pattern of conservation of the phosphorylation sites and traced back their evolution. We found that S(LBD) is most often absent outside mammalian RARα and appears late during vertebrate evolution. In contrast, S(NTD) is conserved, indicating that the phosphorylation of this functional site has been under ancient high selection constraint. This suggests that, during evolution, different regulatory circuits control RARα activity.
Retinoic acid (RA), the main active metabolite of vitamin A, plays a critical role in many biological processes, such as cell proliferation and differentiation, embryonic development, and adult homeostasis (Bour et al. 2006; Mark et al. 2009; Theodosiou et al. 2010). RA acts through nuclear receptors, RARs, which have been identified in a wide variety of animals.
There is one unique RAR ancestral gene for which an ortholog is known in some protostomes, such as mollusks (Lottia gigantea) and annelids (Capitella capitata) (Campo-Paysaa et al. 2008; Albalat and Canestro 2009), and in some invertebrate deuterostomes, such as echinoderms (Strongylocentrotus purpuratus; Canestro et al. 2006; Marletaz et al. 2006), cephalochordates (Branchiostoma floridae; Escriva et al. 2002a; Fujiwara 2006), and urochordates (Ciona intestinalis and Polyandrocarpa misakiensis; Hisata et al. 1998; Fujiwara 2006). Early during vertebrate evolution, the total number of genes markedly increased through two rounds (2R) of whole-genome duplication (Dehal and Boore 2005). This is why vertebrates have three RAR paralogous genes that encode the three known subtypes of receptors: α (NR1B1), β (NR1B2), and γ (NR1B3) (Escriva et al. 2006; Germain et al. 2006). Note that in teleost fishes, a third round (3R) of whole-genome duplication combined with gene losses occurred (Amores et al. 1998; Postlethwait et al. 1998), giving rise to four RAR genes in zebrafish (Bertrand et al. 2007). The history of RARs with regard to genome duplications has been addressed in several phylogenetic studies that clearly validated this evolutionary scenario (Escriva et al. 2002b; Jaillon et al. 2004; Robinson-Rechavi et al. 2004; Bertrand et al. 2007; Kuraku et al. 2009). Note that the timing of the genome duplications was inferred from a recent analysis of the RAR synteny group by Kuraku et al. (2009).
RARs are ligand-dependent transcriptional regulators (for review, see Rochette-Egly and Germain (2009) and references therein), which bind to specific sequence elements located in the promoters of target genes. They have a well-defined domain organization, consisting mainly of a central DNA-binding domain (DBD) linked to a C-terminal ligand-binding domain (LBD) and an N-terminal domain (NTD) (fig. 1A). Although the NTDs are naturally not structured and not conserved (Dyson and Wright 2005; Lavery and McEwan 2005), DBDs and LBDs are highly structured and display a significant degree of conservation between vertebrate species (Escriva et al. 2006; Campo-Paysaa et al. 2008; Theodosiou et al. 2010). Briefly, the DBD contains two typical cysteine-rich zinc-binding motifs and two alpha helices, which cross at right angles, folding into a globular conformation to form the core of the DBD. The LBD shows a common fold comprising 12 conserved alpha helices and a short beta turn, arranged in three layers to form an antiparallel "alpha-helical sandwich" (Renaud et al. 1995) (fig. 1B). Ligand binding triggers conformational changes in the LBD that direct the dissociation/association of several coregulator protein complexes and thereby the transcription of target genes (Rochette-Egly and Germain 2009).
In addition to this scenario, a new concept emerged according to which RARs are also subjected to rapid phosphorylation cascades. Recent studies from our laboratory (Bruck et al. 2009) demonstrated that, via nongenomic effects, RA activates rapidly the p38MAPK/MSK1 pathway, which in turn leads to the phosphorylation of the RARα subtype (mouse and human) at two serine residues located in solvent-accessible regions of the receptor. One serine is located in the LBD (S[LBD]), in a loop between helices 9 and 10 (L9–10) (fig. 1A and B) and belongs to an arginine-lysine-rich motif that corresponds to a consensus phosphorylation motif for MSK1 (fig. 1A). The other serine residue is located in the NTD (S[NTD]), in a proline-rich motif (fig. 1A) and is phosphorylated by cdk7 (Rochette-Egly et al. 1997; Bastien et al. 2000), which forms with cyclin H and MAT1 the CAK subcomplex of the general transcription factor TFIIH. Most interestingly, the correct positioning of cdk7 and thereby the efficiency of the NTD phosphorylation rely on the docking of cyclin H at a specific site of the LBD located in loop L8–9 and the N-terminal part of helix 9 (H9) (fig. 1A and B) (Bour et al. 2005).
In the case of human and mouse RARα, we previously demonstrated that the phosphorylation of the two serines results from a coordinated phosphorylation cascade starting with the phosphorylation by MSK1 of S(LBD) (fig. 1B) (Bruck et al. 2009). Phosphorylation of this residue increases the binding efficiency of cyclin H to the nearby loop L8–9 (fig. 1B), allowing the right positioning of cdk7 and the phosphorylation of the serine located in the NTD (fig. 1A) by this kinase (Gaillard et al. 2006). Finally, phosphorylation of S(NTD) leads to the recruitment of RARα to promoters (Bruck et al. 2009). Whether it also controls the association/dissociation of specific coregulators as described for the other RAR subtypes (Vucetic et al. 2008; Lalevee et al. 2010) is still unknown.
In RARs, ligand binding is conserved at least in chordates (Escriva et al. 2006), indicating that the ligand-triggered conformational changes are a common feature of all chordate species. Interestingly, the high regulatory potential of the phosphorylation cascade also makes phosphorylations prime candidates for evolutionary studies. It must be noted that S(LBD) and S(NTD) are conserved in the different human and mouse RAR subtypes α, β, and γ (Rochette-Egly 2003; Rochette-Egly and Germain 2009), but the above cascade has been described only in the context of RARα (Bruck et al. 2009), which has ubiquitous or quite widespread expression patterns. There are still no indications whether this cascade also occurs in the context of the other RAR paralogs (RARβ and RARγ), which show rather complex tissue-specific expression (Dolle 2009). Therefore, we analyzed the pattern of conservation of the S(NTD) and S(LBD) phosphorylation sites focusing on the RARα subtype.
First, we demonstrated that in nonmammalian vertebrates exemplified by zebrafish, the S(NTD) of RARα is conserved, whereas S(LBD), the phosphorylation of which increases the flexibility of L8–9 that is required for cyclin H binding, is absent. However, this absence was compensated by changes in the sequence of L8–9 mimicking the conformation/flexibility changes induced by phosphorylation. Then, we traced back the evolution of chordate RARα phosphorylation sites. This work led to the conclusion that in RARα, S(NTD) is evolutionarily conserved, indicating that the phosphorylation of this functional site has been under ancient strong selection constraint. However, S(LBD) is most often absent outside mammalian RARα. This indicates that the fine-tuned phosphorylation cascade of RARα, starting at S(LBD), appears late during vertebrate evolution. Thus, the evolution of phosphorylation sites appears to provide a reservoir of changes that supply additional levels of regulation of critical functional proteins.
Materials and Methods
Sequence Alignment and Ancestral Sequence Reconstruction
RAR protein sequences were found in the nuclear receptor database (NureXbase) (http://nurexbase.prabi.fr) and by Blast and gene homology (NCBI). Multiple sequence alignments were performed with the MUSCLE software (Edgar 2004) and analyzed with ClustalX. Sequence assignment was verified by phylogenetic reconstruction as in Escriva et al. (2006). Ancestral sequences were estimated with PAML (Yang 1997) under the JTT + γ substitution model from a data set of 71 sequences containing a 232 amino acid long portion of the LBD. Some sequences containing obvious prediction errors or indels at unambiguous positions were manually corrected by parsimony. Other sequences with too many uncertainties were excluded from the reconstruction data set. PhyML (Guindon and Gascuel 2003) generated the starting tree.
Molecular Dynamics Simulations
Given the lack of an experimental structure for the LBD of human (h) RARα in an agonist form (holo) at the start of this work, a model structure was assembled from closely related structures available in the Protein Data Bank (Berman et al. 2000). The majority of the structure that includes helix 1 (H1) to helix 10 (H10) was taken from the structure of human nuclear RA receptor alpha (hRARα) (PDBID 1DKF) bound to the selective antagonist BMS614 (Bourguet et al. 2000). Structural information for an agonist conformation of H11 and H12 was taken from the structures of hRARγ (PDBID 1FCZ) and RARβ (PDBID 1XAP) in the agonist forms (Klaholz et al. 2000; Germain et al. 2004). Side chains specific to hRARα were positioned using the Scwrl3.0 software (Canutescu et al. 2003). The structures of 9-cis RA and of a fragment of the TRAP220 coactivator were obtained from the structure of RARβ (PDBID 1XDK) in an agonist conformation (Pogenberg et al. 2005). The protonation states of all titratable groups at physiological pH (7.4) were determined as described in Schaefer et al. (1998), and all were found to favor their standard protonation states. Our model shows a very high degree of correspondence with an experimental structure of hRARα in an agonist form, which was deposited in the Protein Data Bank (3A9E) (Sato et al. 2010) after the completion of this work.
Structural models were also constructed for apo hRARα, that is, in the absence of ligand and coactivator peptide. Under these conditions, the C-terminal end of the LBD, in particular H12, extends toward the solvent where it displays significant conformational flexibility (Renaud and Moras 2000). In the absence of any experimental apo hRARα structure, a model was constructed that kept H1–H10 in the same conformation as the holo structure but repositioned the C-terminal end based on the apo RXR structure (PDB 3A9E) (Sato et al. 2010). This model was constructed using the Modeler 9v8 program (Sali and Blundell 1993). Given that in the apo structures, H12 is conformationally mobile, variants of this apo model were constructed with different initial positions of H12. As all the initial apo models gave similar simulation results, we presented the data corresponding to the initial structure (Sato et al. 2010).
The LBDs of zebrafish (zf) RARα(-A and -B) were constructed by modifying all residues that differ from hRARα, maintaining the backbone conformation and modifying the side chains using the SCWRL4 program (Canutescu et al. 2003). For zfRARα-A, this involved the following side chain modifications: E183D, V184T, G185E, E186Q, L187M, E189D, K190R, A201S, N211S, Q216R, S219A, I222V, I335L, P345A, R347K, M350V, V361I, K365N, S369H, and R370K. For zfRARα-B, the modifications were E183D, V184T, G185E, E186K, L187M, K190Q, A201S, S214A, E215D, Q216H, S219A, I222V, E280D, I335L, P345S, R347K, M350E, V361I, K365N, S369H, and R370K. A similar protocol was used to construct hRARα mutants (hRARαP345G/D346A and hRARαP345A). The models for the apo forms of zfRARαs and the hRARα mutants were constructed as above.
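The two substitution lists above can be compared programmatically. The following is a minimal sketch, not part of the original protocol: it parses the substitution strings exactly as listed in the text and reports which changes are shared by zfRARα-A and zfRARα-B. The helper names (`parse_substitution`, `to_map`) are hypothetical and introduced only for illustration.

```python
import re

# Substitutions relative to hRARα, copied from the text above.
ZF_A = ("E183D V184T G185E E186Q L187M E189D K190R A201S N211S Q216R "
        "S219A I222V I335L P345A R347K M350V V361I K365N S369H R370K").split()
ZF_B = ("E183D V184T G185E E186K L187M K190Q A201S S214A E215D Q216H "
        "S219A I222V E280D I335L P345S R347K M350E V361I K365N S369H R370K").split()


def parse_substitution(code):
    """Split a code such as 'S369H' into (position, (human residue, zebrafish residue))."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    human, position, zebrafish = match.groups()
    return int(position), (human, zebrafish)


def to_map(codes):
    """Map each substituted position to its (human, zebrafish) residue pair."""
    return dict(parse_substitution(code) for code in codes)


zf_a = to_map(ZF_A)
zf_b = to_map(ZF_B)

# Positions where both zebrafish paralogs carry the same replacement.
shared = sorted(pos for pos in zf_a if zf_b.get(pos) == zf_a[pos])

# Positions substituted in both paralogs but with different residues.
divergent = sorted(pos for pos in zf_a if pos in zf_b and zf_b[pos] != zf_a[pos])

for pos in (345, 350, 369):
    print(pos, zf_a[pos], zf_b[pos])
```

With these lists, position 369 (the S(LBD) site) is a shared serine-to-histidine change, whereas positions 345 and 350 in the cyclin H-binding region are substituted differently in the two paralogs.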
All molecular dynamics simulations were performed using the CHARMM program (Brooks et al. 1983) and the all-atom parameter set of CHARMM27 (MacKerell et al. 1998), with CMAP corrections (MacKerell et al. 2004). Hydrogen atoms were added using the HBUILD module (Brunger and Karplus 1988). Bonds between heavy atoms and hydrogen atoms were constrained using SHAKE (Ryckaert et al. 1977). We employed a shift-type cutoff at 14 Å for electrostatic interactions and a switch-type cutoff at 12.0 Å for the van der Waals energy terms.
The system was energy minimized using the steepest descent algorithm after placing harmonic constraints on the backbone and side chain heavy atoms with force constants of 50 and 100 kcal mol⁻¹ Å⁻², respectively. The force constants were systematically scaled by a factor of 0.65, and minimization was repeated until there were no constraints on the protein. The protein was then solvated with a shell of explicit TIP3P water molecules (Gaillard et al. 2009) extending 12 Å from the protein surface. The system was equilibrated in two phases. In the first phase, a 20 ps molecular dynamics simulation of the water around the fixed protein was performed with a time step of 2 fs. In the second phase, the entire solvated protein was heated to 300 K and equilibrated. During heating, velocities were assigned every 50 steps from a Gaussian distribution function. During equilibration, velocities were scaled by a single factor only when the average temperature was lying outside the 300 ± 10 K window. This was followed by a 10 ns production phase without any further intervention.
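The staged release of the harmonic constraints can be illustrated with a short sketch. This is not the CHARMM input used in the study: it only lists the sequence of force constants obtained by repeated scaling by 0.65. Because geometric scaling never reaches zero exactly, the sketch assumes a hypothetical cutoff (0.1 kcal mol⁻¹ Å⁻²) below which the constraints are switched off; that cutoff is not given in the text.

```python
def restraint_schedule(k_backbone=50.0, k_sidechain=100.0, scale=0.65, cutoff=0.1):
    """List (backbone, side chain) force constants for successive minimization rounds.

    Starting values and the scaling factor come from the text; `cutoff` is an
    assumed threshold below which the constraints are removed entirely.
    """
    schedule = []
    while max(k_backbone, k_sidechain) >= cutoff:
        schedule.append((k_backbone, k_sidechain))
        k_backbone *= scale
        k_sidechain *= scale
    # Final round: unconstrained minimization.
    schedule.append((0.0, 0.0))
    return schedule


schedule = restraint_schedule()
for round_index, (kb, ks) in enumerate(schedule):
    print(f"round {round_index:2d}: backbone {kb:7.3f}  side chain {ks:7.3f}")
```

Each round would correspond to one steepest descent minimization with the listed force constants; the number of rounds depends on the assumed cutoff.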
Simulations were stable as measured by the backbone root-mean-square coordinate differences (RMSD), which were all less than 1.24 Å with respect to the initial starting structure. The phosphate group was assigned a charge of −2 based on the pKa of 6.5 (Kast et al. 2010).
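The choice of a −2 charge can be checked with the Henderson-Hasselbalch relation. The sketch below is illustrative only: it takes the pKa of 6.5 from the text and assumes the physiological pH of 7.4 used for the other titratable groups, and estimates the fraction of phosphate groups in the doubly deprotonated (−2) state.

```python
def fraction_deprotonated(pka, ph):
    """Henderson-Hasselbalch fraction of the deprotonated species at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# pKa from the text; pH 7.4 is assumed to apply to the phosphate group as well.
dianion_fraction = fraction_deprotonated(pka=6.5, ph=7.4)
print(f"fraction carrying a -2 charge: {dianion_fraction:.2f}")
```

Under these assumptions, close to 90% of the phosphate groups are in the −2 state, which is consistent with modeling the group as a dianion.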
Using the above protocol, simulations were run for the LBD of hRARα, unphosphorylated or phosphorylated at S(LBD), in either the apo or holo form. Simulations were also run for hRARαP345G/D346A, hRARαP345A, and zfRARα-A and -B in the apo forms. Upon completion of the simulations, the root-mean-square fluctuations (RMSfl) as well as the RMSD were calculated from the trajectories.
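For readers unfamiliar with the two quantities, the following is a minimal sketch of how RMSD and RMSfl can be computed from a trajectory. It is not the CHARMM analysis used in the study. It assumes that the frames have already been superposed on a common reference, and it represents a trajectory as a plain list of frames, each a list of (x, y, z) coordinates in Å; the function names are hypothetical.

```python
import math


def rmsd(frame, reference):
    """Root-mean-square coordinate difference between one frame and a reference."""
    squared = sum((a - b) ** 2
                  for atom, ref_atom in zip(frame, reference)
                  for a, b in zip(atom, ref_atom))
    return math.sqrt(squared / len(reference))


def rmsfl(frames):
    """Per-atom root-mean-square fluctuation about the time-averaged position."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    fluctuations = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in frames) / n_frames for k in range(3)]
        # Mean squared displacement from that average.
        msf = sum((frame[i][k] - mean[k]) ** 2
                  for frame in frames for k in range(3)) / n_frames
        fluctuations.append(math.sqrt(msf))
    return fluctuations


# Toy trajectory: atom 0 is fixed, atom 1 oscillates by 1 Å along x.
toy = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
]
print(rmsfl(toy))
print(rmsd(toy[1], toy[0]))
```

RMSD measures how far a structure has drifted from the starting coordinates, whereas RMSfl measures how much each atom moves around its own average position; averaging the per-atom values over the atoms of a residue gives the per-residue profile.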
The pSG5- and pGEX-2T-based expression vectors for hRARα1 have been previously described (Bour et al. 2005). The full-length or truncated cDNAs of zfRARα-A and zfRARα-B were amplified by polymerase chain reaction (PCR) and inserted into pSG5-hER-B10-tag or pGEX-2T vectors. The cDNA of hcyclin H (a gift from D. Busso, Institut de Génétique et de Biologie Moléculaire et Cellulaire [IGBMC]) was inserted into pCX-HA-FLAG. The cDNA of zfcyclin H (Liu et al. 2007) was inserted into pCX-HA or pET-15b. All constructs were generated using standard cloning procedures and were verified by PCR, restriction enzyme analysis, and DNA sequencing. The sequences of the primers used for PCR amplifications are available upon request.
Mouse monoclonal antibodies recognizing hRARα phosphorylated at S77, cyclin H, and the epitope B of the estrogen receptor (B10) were previously described (Ali et al. 1993; Bruck et al. 2009). Rabbit polyclonal antibodies recognizing the N-terminal part of cyclin H and anti-FLAG monoclonal antibodies were from Sigma.
Mouse monoclonal antibodies recognizing zfRARα-B phosphorylated at the conserved serine residue located in the N-terminal proline-rich domain (S72) were generated by immunization of BALB/c mice with a synthetic phosphopeptide (EEMVPSSPS(p)PPPPPRVYKPC). Six-week-old female BALB/c mice were injected intraperitoneally (thrice at 2-week intervals) with 100 μg of peptide coupled to ovalbumin and 100 μg of poly I/C as adjuvant. Mice with positive sera were reinjected 4 days prior to hybridoma fusion and spleens were fused with Sp2/0.Ag14 myeloma cells. After hybridoma cell selection and cloning (de StGroth and Scheidegger 1980), the culture supernatants were tested by differential enzyme-linked immunosorbent assay with the phosphopeptide, the corresponding nonphosphopeptide, and an irrelevant phosphopeptide. Positive clones were confirmed by immunoblotting and cloned twice on soft agar. Ascites fluids were prepared by injection of 2 × 10⁶ hybridoma cells into pristane-primed BALB/c mice.
Cell Lines, Transfections, and Immunoprecipitation Experiments
COS-1 cells were grown and transiently transfected as described (Bour et al. 2005). ZF13 cells were grown at 27 °C in Leibovitz L-15 medium (Invitrogen) supplemented with 5% fetal calf serum and 15 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES) and transiently transfected by using FuGene 6 reagent (Roche). Immunoprecipitations were performed with cell extracts prepared from paraformaldehyde-fixed cells (Bruck et al. 2009).
In vitro Binding and Phosphorylation Experiments
Glutathione S-transferase (GST) and GST-fusion proteins expressed in Escherichia coli were immobilized onto glutathione–Sepharose beads and incubated with recombinant human cyclin H overexpressed in insect Sf9 cells (Bour et al. 2005) or with purified bacterially expressed zfcyclin H. Bound proteins were immunoprobed and quantified by using the Chemigenius XE imaging system as described (Bour et al. 2005). Data were analyzed according to standard statistical procedures using GraphPad Prism 5.0 and compared using Tukey's test in conjunction with analysis of variance.
In vitro phosphorylation experiments were performed with equimolar amounts of immobilized GST-RARα proteins (5 μg). Phosphorylation by the purified cdk7/cyclin H complex (Bour et al. 2005) was performed as in Bruck et al. (2009) and detected by immunoblotting with antibodies specifically recognizing the phosphorylated forms. Phosphorylation by recombinant active MSK1 (Millipore Upstate Chemicon) (30 ng) was performed in the presence of γ[32P] as described (Rochette-Egly et al. 1995) and visualized by autoradiography.
In Mammalian RARα, Phosphorylation of S(LBD) Increases the Dynamics/Flexibility of the Cyclin H-Binding Domain
In hRARα, the upstream serine residue of the phosphorylation cascade, S369 [S(LBD)], is located in the LBD, in loop L9–10 within an arginine–lysine-rich motif (fig. 1A and B). This serine is in the vicinity of a specific domain of the LBD, encompassing loops L8–9 and the N-terminal tip of H9 and involved in the binding of cyclin H (fig. 1A and B) (Bour et al. 2005). Phosphorylation of S(LBD) has been shown to increase the ability of hRARα to interact with cyclin H, with a characteristic downstream consequence on the phosphorylation by cdk7 of the serine located in the NTD (S[NTD]) (Gaillard et al. 2006; Bruck et al. 2009, fig. 1A). This is a typical model of substrate recognition by a protein kinase through association via another substrate-binding subunit.
To further investigate the consequences of phosphorylation on the LBD of hRARα, molecular dynamics (MD) simulations were performed (see Materials and Methods) to analyze whether phosphorylation of S(LBD) generates conformational changes affecting the cyclin H-binding domain located at a 30 Å distance, in L8–9 (fig. 1B).
Simulations were first performed with the holo form of hRARα (i.e., in the presence of RA and of a coactivator peptide), which is closest to the in vivo experimental phosphorylation studies (Bruck et al. 2009).
Average structures of the native and phosphorylated forms of hRARα were calculated and the RMSD from the initial structures were measured (fig. 2A). RMSD of the backbone atoms forming secondary structure was less than 1.0 Å, indicating that the overall structure of the LBD is conserved upon phosphorylation. However, local conformational changes of loops L8–9 were observed with RMSD values on the order of 4 Å. These conformational changes are shown in figure 2A where the average structures of unphosphorylated and phosphorylated hRARα are superposed. An upward displacement of L8–9 is clearly visible, linked to an upward bending of helix H9. These changes likely result from the locally enhanced electrostatic environment due to the −2 charge of the phosphate moiety.
More significant, however, is the increase in the local conformational dynamics or flexibility of L8–9 as measured by the atomic RMSfl averaged by residue (Fidelak et al. 2010). RMSfl are directly related to the temperature factors determined during an X-ray crystallography structural study and are a direct calculation of local short-time scale dynamics of L8–9. As shown in figure 2B, in the absence of S(LBD) phosphorylation, L8–9 was generally more flexible than the neighboring helices, with an RMSfl in the order of 0.9 Å. However, with S(LBD) phosphorylated, the average RMSfl of L8–9 increased by a factor of 2. Thus, the simulations clearly indicate that S(LBD) phosphorylation affects the cyclin H-binding domain. In the absence of an experimental structure of the hRARα–cyclin H complex, the investigation of the detailed molecular mechanism of this allosteric signaling was, however, beyond the scope of this study.
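The link between RMSfl and crystallographic temperature factors can be made concrete with the standard isotropic relation B = (8π²/3)·⟨Δr²⟩, where ⟨Δr²⟩ is the mean squared fluctuation (the square of the RMSfl). The sketch below is illustrative only: this relation is a textbook approximation and is not stated in the text, and the function name is hypothetical. It converts the RMSfl values quoted above into equivalent B-factors.

```python
import math


def rmsfl_to_bfactor(rmsfl_angstrom):
    """Equivalent isotropic B-factor (Å^2) for a given RMS fluctuation (Å)."""
    return (8.0 * math.pi ** 2 / 3.0) * rmsfl_angstrom ** 2


# RMSfl of L8-9 without phosphorylation (about 0.9 Å, from the text)
# and after a twofold increase upon S(LBD) phosphorylation.
b_unphosphorylated = rmsfl_to_bfactor(0.9)
b_phosphorylated = rmsfl_to_bfactor(2 * 0.9)
print(f"{b_unphosphorylated:.1f} Å^2 -> {b_phosphorylated:.1f} Å^2")
```

Because the B-factor scales with the square of the fluctuation, a twofold increase in RMSfl corresponds to a fourfold increase in the equivalent B-factor, from about 21 Å² to about 85 Å² under this approximation.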
Then, molecular dynamics simulations were repeated with or without S(LBD) phosphorylated but with the apo form of hRARα in order to assess whether phosphorylation can still affect L8–9 structural dynamics in the absence of RA and of a coactivator peptide and with H12 in an extended conformation (see Materials and Methods). RMSfl analysis shows that L8–9 exhibits an increased flexibility when the apo form is phosphorylated at S(LBD) (fig. 2C). This suggests that S(LBD) phosphorylation by itself can affect the conformational dynamics of the cyclin H-binding domain of hRARα in the apo form.
In conclusion, from these results and our previous experimental results (Gaillard et al. 2006; Bruck et al. 2009), one can suggest that the increase in the flexibility of loops L8–9 observed upon phosphorylation of S(LBD) might facilitate the binding of cyclin H to this domain.
In zebrafish RARα, S(LBD) Is Absent but S(NTD) Is Conserved and Phosphorylated
Given the functional importance of RARα not only in mouse and human but also in other vertebrate species such as zebrafish (Dolle 2009; Linville et al. 2009), we investigated whether the phosphorylation cascade was conserved in zebrafish (Danio rerio), which expresses two RARα genes: RARα-A and RARα-B.
Sequence alignment revealed that in zf RARα-A and RARα-B, the S(LBD) residue was not present (fig. 1A). Instead, a histidine residue was found. However, the arginine–lysine-rich motif flanking this residue was well conserved (fig. 1A). Accordingly, in vitro phosphorylation experiments indicated that the LBDs of zfRARαs were not phosphorylated by MSK1, the upstream kinase involved in the phosphorylation cascade (fig. 3A). As phosphorylation sites that are not strictly conserved at a specific position can be compensated by others, driving the same functions (Nguyen Ba and Moses 2010), we investigated whether other phosphorylation sites are present in the nearby region. However, an in silico prediction of potential phosphorylation sites in the LBDs of zfRARα-A and zfRARα-B did not reveal any other compensatory phosphorylation sites.
In contrast, in zfRARα-A and zfRARα-B, the serine residue located in the NTD [S(NTD)] was conserved as well as its flanking region, that is, the proline-rich motif (fig. 1A). Then, the question was whether, in zfRARαs, the conserved S(NTD) could be phosphorylated even in the absence of a phosphorylatable S(LBD). Most interestingly, in vitro, the S(NTD) of zfRARα was phosphorylated by the purified cdk7/cyclin H complex, as assessed by immunoblotting with antibodies specifically recognizing this phosphorylated residue (fig. 3B, lanes 6 and 7). No signal was obtained with zfRARα deleted for the NTD, confirming the specificity of the antibodies (fig. 3B, lane 8). These results were confirmed in vivo, with B10-tagged zfRARα overexpressed in zebrafish (ZF13) or mammalian (COS-1) cells and immunoprecipitated with an antibody specifically recognizing the phosphorylated receptor (fig. 3C). Collectively, these results indicate that, in zebrafish, RARα can be phosphorylated at S(NTD) even in the absence of phosphorylation of the LBD. This raises the question of how the phosphorylation cascade starting at the LBD can be bypassed in zebrafish.
In zebrafish RARα, the Cyclin H-Binding Domain Is More Flexible than in hRARα
Given that in hRARα, phosphorylation of S(NTD) by cdk7 relies on the binding of the associated cyclin H, we investigated whether zfRARα could interact with cyclin H despite the absence of phosphorylation of the LBD.
The ability of zfRARα to interact with cyclin H was compared with that of hRARα in in vitro protein–protein interaction assays. In GST-pulldown assays that use nonphosphorylated bacterially expressed fusion proteins, both zfRARα-A and zfRARα-B interacted with cyclin H, and did so more efficiently than hRARα (fig. 4A, lanes 4 and 5 and fig. 4B). Zebrafish RARαs also interacted with cyclin H in coimmunoprecipitation experiments performed with extracts from transfected COS-1 cells (supplementary fig. S1A, Supplementary Material online). In this case, the interaction was as efficient as with hRARα, in line with the fact that hRARα is phosphorylated at S369 in transfected cells (supplementary fig. S1B, Supplementary Material online). Similar results were obtained whether cyclin H was from human (supplementary fig. S1A, Supplementary Material online) or zebrafish (supplementary fig. S1C, Supplementary Material online). Altogether, these observations suggest that the conformation of zfRARα favors cyclin H binding.
Given that the efficiency of cyclin H binding to hRARα relies on the conformational features of L8–9 and the N-terminal tip of H9 (Bour et al. 2005), we compared this domain with that of zfRARα-A and zfRARα-B (fig. 1A). The overall sequence of the cyclin H-binding domain is well conserved, but the proline residue located at the N-terminal tip of H9 in hRARα (P345) is replaced by an alanine and a serine in zfRARα-A and zfRARα-B, respectively. Similarly, the methionine residue found in helix 9 at position 350 is replaced by a valine or a glutamic acid in zfRARαs. Altogether, these observations pinpoint specific substitutions that may affect the conformation of zfRARα L8–9 and thereby favor cyclin H binding.
This led us to explore the conformational dynamics of zfRARαs in molecular dynamics simulations carried out with the apo forms that best match the experimental GST-pulldown conditions (i.e., in the absence of RA and of a coactivator peptide and with H12 extended into solution). Comparison of the RMSfl calculated from these simulations showed that both zfRARα-A and zfRARα-B exhibited an increased flexibility of L8–9 compared with hRARα also in the apo form (fig. 4C). Such results indicate that in zebrafish RARα, L8–9 is natively more flexible than in the human counterpart.
Then, given that in hRARα, substitution of P345 and D346 with a glycine and an alanine respectively, significantly increases cyclin H binding (fig. 4A, lane 7; Bour et al. 2005), we analyzed the consequences of these changes in MD simulations. As shown in figure 4D, the hRARαP345G/D346A mutant in the apo form showed an increased flexibility of L8–9 compared with WT RARα. Finally, the single substitution of P345 with an alanine in hRARα (as in zfRARα-A) was sufficient to increase the flexibility of L8–9 (supplementary fig. S2, Supplementary Material online). Collectively, these results highlight the importance of the amino acid sequence in the flexibility of the cyclin H-binding domain.
During Vertebrate Evolution, S(NTD) Is Conserved but not S(LBD)
From the comparison of human and zebrafish RARαs, one can suggest that the serine located in the NTD would be conserved across vertebrates, whereas it would not be the case for the serine in the LBD. Therefore, we investigated whether there is a constraint on these two RARα phosphorylation sites during evolution. Sequence comparison and prediction of ancestral sequences at all nodes of the chordate RAR tree were performed.
Figure 5 provides a global overview of the evolution of S(NTD) and S(LBD) in all known full length and functional chordate RARαs with available complete sequences, from invertebrate chordates such as amphioxus (B. floridae) and ascidian tunicates (C. savignyi) through basal vertebrates such as lampreys (Lethenteron japonicum) and teleost fishes (e.g., D. rerio) to mammals.
It appeared that S(NTD) is strictly conserved in all available complete chordate sequences (fig. 5), even in amphioxus (B. floridae). Note, however, that for some species such as Eptatretus burgeri, Mordacia mordax, Callorhinchus callorynchus, and Lepisosteus platyrhincus, the RARα sequences are incomplete (hyphens in fig. 5), making it difficult to include these species in our evolutionary study. The situation was very similar for the RARβ and RARγ paralogs (fig. 5). Interestingly, the flanking region, that is, the proline-rich motif, was also conserved (fig. 6). This indicates that S(NTD) has been under a high selective pressure through chordate evolution.
In contrast, S(LBD) was not present in RAR from cephalochordates (B. floridae) and urochordates (C. savignyi and P. misakiensis) (fig. 5) despite the conservation of the flanking arginine–lysine-rich motif (fig. 6 and supplementary fig. S3, Supplementary Material online). It was not present either in RARα from most vertebrates (teleost fish, amphibians, and birds) (fig. 5). Instead, aspartic acid, glutamic acid, asparagine, or histidine residues were found. Most interestingly, the presence of a serine in L9–10 seems to be specific to the main clades of mammals with an exception in the case of Anolis carolinensis (fig. 5 and supplementary fig. S3, Supplementary Material online). The situation was very similar for RARγ (fig. 5). However, in the case of RARβ, a serine was present in the LBD not only in mammals but also in teleost fishes and in Lepisosteus platyrhincus (fig. 5).
Given this complex pattern, it was difficult to infer directly (using simple parsimony reasoning) which amino acid was at the position of S(LBD) before the two duplications (A1 and A2 in fig. 7) that led to RARα in gnathostomes. Thus, we reconstructed the ancestors using maximum likelihood. This allowed us to propose an evolutionary scenario (fig. 7) in which the inferred ancestor harbors an asparagine in L9–10. Then in vertebrates, this asparagine is replaced by a serine in teleost fishes RARβ as well as in mammalian RARα, RARβ, and RARγ. This suggests that, late during vertebrate evolution, a change in selective pressure allowed the acquisition of an easily phosphorylatable residue in L9–10.
Note that S(NTD) was also found in TR (fig. 7), a nuclear receptor belonging to the same NR subfamily. Though part of a distinct motif, this serine is known as a functional phosphorylation site (Glineur et al. 1990), corroborating the ancient high selection constraint acting on this residue.
Protein phosphorylation is crucial for the regulation of many cellular events, and it has been hypothesized that it would serve as a transcriptional clock, orchestrating rapid, and dynamic exchanges of coregulators so that at the end, the right proteins are present with the right activity, at the right place, and at the right time (Rochette-Egly and Germain 2009; Lalevee et al. 2010). Most interestingly, similar to changes in genes cis-regulatory modules that modify specific aspects of the expression pattern of a gene without affecting the function of the encoded protein (Hoekstra and Coyne 2007), specific changes of phosphorylation processes can modify a regulatory cascade without affecting the overall function of the protein (Basu et al. 2008). Therefore, phosphorylation sites may be important targets of evolutionary processes. Now, with the availability of high throughput data sets, it becomes possible to examine and test experimentally the evolution of large sets of proteins and phosphorylation sites.
Human and mouse RARα are typical transcriptional regulators that are modulated by RA binding and by rapid, concomitant RA-induced phosphorylation cascades starting at a serine located in the LBD (S[LBD]) and ending at another serine in the NTD (S[NTD]).
The present work indicates that S(NTD) is strongly conserved in all chordate RARα sequences known to date, comprising cephalochordates (amphioxus), urochordates (Ciona), and vertebrates. Such an evolutionary constraint correlates with the importance of this phosphorylation site for RARα binding to DNA and RARα-mediated transcription (Bruck et al. 2009; Rochette-Egly and Germain 2009). It is worth noting that S(NTD) is also highly conserved in the other RAR paralogs, RARγ and RARβ, throughout the chordate phylum, confirming the functional importance of this phosphorylation site. However, outside chordates, only a few RAR sequences have been identified: one in another deuterostome, S. purpuratus (Canestro et al. 2006; Marletaz et al. 2006), and only two in protostomes (one in a mollusk and one in an annelid) (Campo-Paysaa et al. 2008; Albalat and Canestro 2009), and functional data are still lacking. Therefore, the phylogenetic coverage is not yet sufficient to generalize our data safely beyond chordates, and to protostomes in particular. Finally, given the overall conservation of the flanking proline-rich motif, S(NTD) appears to be phosphorylated by similar kinases in all species. In line with this, cdk7 and cyclin H are highly conserved, and functional homologs have been described even in yeast (Damagnez et al. 1995).
The original aspect of human RARα phosphorylation resides in the fact that the cdk7 kinase involved in the phosphorylation of S(NTD) recognizes the receptor through the binding of cyclin H at a specific domain located in a disordered loop of the LBD (L8–9). This process is controlled by the phosphorylation of S(LBD), a nearby serine residue located in another disordered loop, L9–10 (Bour et al. 2005; Gaillard et al. 2006). Given the importance of this phosphorylation cascade, we combined evolutionary studies with molecular dynamics computer simulations and experimental analysis to predict the phosphorylation of S(LBD), the flexibility of the cyclin H-binding domain, cyclin H binding, and the evolution of these processes.
The present study demonstrates that in human RARα, phosphorylation of S(LBD) increases the flexibility of the nearby L8–9, which is involved in cyclin H binding and thereby in the phosphorylation of S(NTD), whether RARα is in its apo or holo form. Thus, one can suggest that this process cooperates with the conformational changes induced by RA binding to promote hRARα transcriptional activity (fig. 8B).
However, despite the good conservation of the flanking basic K/R-rich motif, located at the end of the highly structured helix 9, our evolutionary analysis indicates that S(LBD), located in the disordered loop L9–10, is not universally conserved throughout vertebrates. Indeed, S(LBD) is present mostly in mammalian RARα, whereas an asparagine is present at this position at the base of the chordate RAR tree, as exemplified by amphioxus. The same conclusion was reached for the paralog RARγ but not for RARβ, because S(LBD) was also found in teleost RARβ. From these observations, two main alternative mechanisms of evolution can be proposed. In the first one, S(LBD) might have appeared in the three RAR paralogs at the base of vertebrates during the second round (2R) of duplication. This residue might then have been lost independently in the three RARs of several vertebrates, and during the third round (3R) of duplication in teleost RARα and RARγ. In the second mechanism, the asparagine might have changed to a serine in each of the three mammalian RARs (α, β, and γ) after the two rounds of duplication and in teleost RARβ during the third round of duplication. This latter mechanism might be the most probable one, in line with the functional role of S(LBD). Whatever the mechanism, our observations suggest a convergent evolution of similar regulatory mechanisms that occurred four times independently. However, as we still have no evidence that the phosphorylation cascade described for RARα also occurs in the context of RARγ and RARβ, it is premature to speculate further on this convergence. Nevertheless, it is worth noting that, owing to its unstructured nature, loop L9–10 should respond rapidly and accurately to changing environmental conditions, that is, to whether or not a phosphorylation is required (see below).
In addition, the cyclin H docking site of RARα evolved subtly in parallel with S(LBD). Indeed, in zebrafish RARα, which did not acquire S(LBD), L8–9 adopts a flexible conformation that favors cyclin H binding without any requirement for a phosphorylation process in L9–10 (fig. 8A). Interestingly, amphioxus RAR also displays a mutation in the cyclin H-binding domain (fig. 6), suggesting that the flexibility of this domain might also be increased. In contrast, in mammalian RARα, the acquisition of a serine in L9–10 is associated with a drastic reduction in the dynamics of L8–9 and in its ability to interact with cyclin H. The appearance of such rigidity makes fine-tuned regulation by the phosphorylation of S(LBD) necessary (fig. 8B). It is worth noting that both loops, L8–9 and L9–10, correspond to disordered domains, which evolve faster than ordered ones (Schaefer et al. 2010). In line with this, phosphosites frequently appear in such disordered regions, thus facilitating the evolution of kinase-signaling circuits (Beltrao et al. 2009; Holt et al. 2009; Landry et al. 2009).
Thus, we believe that during evolution, selective pressure might have driven rapid changes in the disordered loops L8–9 and L9–10 of the LBD in order to maintain, in a changing environment, the phosphorylation of the NTD that is essential for RARα transcriptional activity (fig. 8). Indeed, when L8–9 lost its flexibility, there was strong pressure for compensation, that is, the appearance of a phosphorylatable serine in L9–10. Of note, MSK1, the kinase involved in the phosphorylation of S(LBD), is a vertebrate kinase, but an ortholog has been identified in Drosophila (Jin et al. 1999), suggesting that the phosphorylation machinery predates chordate RAR diversification.
In conclusion, the present work highlights the evolutionary potential of the RARα phosphorylation network, especially at the level of the kinase–substrate interaction. As the complex combinatorial control of hRARα phosphorylation by multiple kinases is a readily evolved network, one can predict that its deregulation might be at the root of disease. In support of such a hypothesis, we have shown that in xeroderma pigmentosum patients, RARα is not efficiently phosphorylated by cdk7, with characteristic downstream consequences on the expression of RAR target genes (Keriel et al. 2002). This has been correlated, at least in part, with the clinical abnormalities of the patients and also with their high risk of skin cancer in response to UV.
We thank Dr Yiping Liu (Shanghai Institute for Biological Science, China) for the gift of the zebrafish cyclin H cDNA, M. Oulad Abdelghani (Institut de Génétique et de Biologie Moléculaire et Cellulaire [IGBMC]) for the mouse monoclonal antibodies, and members of the cell culture facilities for help. Special thanks to all team members for fruitful discussions and suggestions and to L. Azzab (IGBMC), A. Rodriguez, and A. Perret (Université de Strasbourg) for help in the development of simulation protocols. This work was supported by funds from Centre National de la Recherche Scientifique (CNRS), INSERM, the Association pour la Recherche sur le Cancer (ARC 3169), the Agence Nationale pour la Recherche (ANR-05-BLAN-0390-02 and ANR-09-BLAN-0127-01), the Fondation pour la Recherche Médicale (DEQ20090515423), and the Institut National du Cancer (INCa-PL09-194). The Institut du Développement et des Ressources en Informatique Scientifique (IDRIS), the Centre Informatique National de l'Enseignement Supérieur (CINES), and the Centre d'Etude du Calcul Parallèle de Strasbourg (Université de Strasbourg) are acknowledged for generous allocations of computer time. I.A. was supported by the Ligue Nationale contre le cancer.