## Abstract

The structure of human Janus kinase 2 (JAK2) comprising the two C-terminal domains (JH1 and JH2) was predicted by application of homology modelling techniques. JH1 and JH2 represent the tyrosine kinase and tyrosine kinase-like domains, respectively, and are crucial for function and regulation of the protein. A comparison between the structures of the two domains is made and structural differences are highlighted. Prediction of the relative orientation of JH1 and JH2 was aided by a newly developed method for the detection of correlated amino acid mutations. Analysis of the interactions between the two domains led to a model for the regulatory effect of JH2 on JH1. The predictions are consistent with available experimental data on JAK2 or related proteins and provide an explanation for inhibition of JH1 tyrosine kinase activity by the adjacent JH2 domain.

## Introduction

Janus kinases (JAK proteins, JAKs) are receptor associated protein tyrosine kinases and are of critical importance for cytokine-mediated signal transduction. Cytokine binding to the extracellular parts of their receptor chains leads to the formation of homo- or heteromeric receptor complexes involving at least two receptor chains. Through the formation of such a complex the receptor-associated JAK proteins are brought together at the inside of the cell membrane, which in turn induces trans-phosphorylation of the JAK proteins (Leonard and O'Shea, 1998; Yeh and Pellegrini, 1999). Once the JAKs are activated, a cascade of further signalling events is triggered, involving phosphorylation of selected receptor chain tyrosines, binding of the so-called STAT proteins and phosphorylation of these STATs (Bazan, 1990; Cosman, 1993; Kishimoto et al., 1994). The phosphorylated STATs then dimerize, translocate to the nucleus and initiate gene activation (Briscoe et al., 1996a).

So far, four mammalian JAKs have been identified (JAK1, JAK2, JAK3 and TYK2) (Duhé and Farrar, 1998). The members of the JAK family display some remarkable features. They consist of seven regions of conserved homology, known as JAK homology (JH1–JH7) domains. The two carboxy-terminal domains are a tyrosine kinase (JH1) and a tyrosine kinase-like domain (JH2). The remaining five domains (JH3–JH7) do not show significant sequence similarity to other known proteins and are thought to interact with regulatory proteins or with the cytoplasmic domains of cytokine receptors (Gauzzi et al., 1997; Saltzman et al., 1998; Yamamoto et al., 1999). For the interferon-γ receptor it has been shown that β-chain-associated JAK2 could be exchanged for another Janus kinase without any consequences for activation or activity of the so-called STAT proteins (Kotenko et al., 1996). Similar results were obtained by Jiang et al. (1996); their studies showed that JAK2 could be substituted for JAK1 in a mitogenic response to interleukin-2. In contrast to these studies, which indicate little contribution from the JAKs to post-receptor specificity, a comparison between JAK1 and JAK3 indicated a better ability of JAK1 to phosphorylate STAT5 (Liu et al., 1997). Schaper et al. (1998) studied the response of JAK1-, JAK2- and TYK2-deficient human fibrosarcoma cell lines to stimulation with soluble interleukin-6. They found that only JAK1-deficient, and not JAK2- or TYK2-deficient, cell lines display reduced phosphorylation of the associated STAT proteins. The question thus remains of how JAK kinases confer specificity to cytokine-induced signal transduction.

A number of structure–function studies have been performed on JAK proteins. These studies involved deletion of entire domains of the JAKs, introduction of single point mutations or generation of mutants with short deletions. JAK proteins lacking the amino-terminal domains (JH3–JH7) have been shown to be unable to bind to their cognate cellular receptors (Tanner et al., 1995; Chen et al., 1997; Yan et al., 1998). Deletion of the tyrosine kinase domain resulted in inactive JAKs (see, for example, Duhé and Farrar, 1995; Velazquez et al., 1995). Deletion of the kinase-like domain was shown to have different effects on the function of the respective JAK protein. A TYK2 mutant without the JH2 domain lacked in vitro catalytic activity and in vivo function (Velazquez et al., 1995). By contrast, similar JAK2 mutants were able to function in vivo (Frank et al., 1995) and showed increased catalytic activity (Sakai and Kraft, 1997). Another study found several indications for an inhibitory role of the JH2 domain in regulation of JAK2 activity, through interaction with the JH1 domain (Saharinen et al., 2000). For JAK3 the JH2 domain seems to be required for full catalytic activity, but appears at the same time to inhibit kinase activity (Chen et al., 2000). A number of point mutations have been found to affect catalytic activity or regulation of the JAK proteins. In the tyrosine kinase domain mutation of an invariant lysine (K882 in human JAK2) has been found to generate defective kinases for JAK1/JAK2 (Briscoe et al., 1996b) and TYK2 (Gauzzi et al., 1996). Substitution of Asp by Ser in the DFG triplet at the start of the activation loop in JH1 rendered human JAK1 inactive (Briscoe et al., 1996b). A double mutation (W1020G/E1024A) in JAK2 inactivated this kinase (Zhuang et al., 1994). In JAK2 Y1007 and Y1008 have been found to be sites of trans- or autophosphorylation in vivo and in vitro (Feng et al., 1997).
Mutation of Y1007 to phenylalanine (Y1007F) reduced the kinase activity significantly, whereas mutation of Y1008 (Y1008F) had no influence on kinase activity (Feng et al., 1997). Y1007 of JAK2 has recently been identified to interact with a JAK binding protein (JAB), which negatively regulates kinase activity (Yasukawa et al., 1999). Mutations within the JH2 domain can also have a significant effect on kinase activity and indicate a regulatory role of this domain. Point mutations of a glutamate residue in Drosophila HOP (E695K) and murine JAK2 (E665K) hyperactivated the corresponding JAK-STAT pathways (Luo et al., 1997). By contrast, the corresponding mutation in JAK3 (E639K) had no effect (Chen et al., 2000). A short deletion (Δ586–592) in the kinase-like domain of JAK3 has been found to inhibit kinase activity (Chen et al., 2000).

From a mechanistic viewpoint, the importance of the JAKs stems from the fact that they are the first proteins involved in the intracellular part of signal transduction. There is also mounting evidence that JAKs would be suitable targets for therapeutic intervention in many diseases. It has been shown that selective inhibition of JAK2 blocks leukemic cell growth in vitro and in vivo (Meydan et al., 1996). Blockade of JAK3 induced apoptosis in human leukemia cell lines (Sudbeck et al., 1999). Malaviya and Uckun (1999) found that JAK3 plays a pivotal role in IgE receptor/FcεRI mediated mast cell responses and they suggest that targeting JAK3 could provide the basis for new and effective treatment for mast cell-mediated allergic reactions.

The explosive growth of sequence databases created by ongoing research in the human genome project continuously increases the gap between protein sequences and known three-dimensional protein structures. However, it has been observed that newly determined structures increasingly tend to fall into structural folds already known (Murzin, 1996, 1998). This indicates that the number of folds is finite (Chothia, 1992; Orengo et al., 1994; Govindarajan et al., 1999). The prediction of the three-dimensional structure of proteins has made significant progress over the last few years (Koehl and Levitt, 1999). Also, it has been shown that protein models should be good enough for high-level functional analysis (Wei et al., 1999). Model structures with a more moderate resolution are at least sufficient for identifying enzymatic active sites with specific residue geometry (Fetrow and Skolnick, 1998; Fetrow et al., 1998). Therefore, in the absence of experimentally determined structures, prediction of the three-dimensional structure of proteins by computer represents an important technique in order to explore structure and function of proteins.

Because there are no experimental structures of JAK proteins available to date, we decided to initiate prediction studies of these important proteins. JAK2 was chosen as a representative because of its involvement in leukemic cell growth (Meydan et al., 1996) and the fact that extensive experimental data are available for comparison with the predictions. In this paper we present the prediction of the three-dimensional structures of the two carboxy-terminal domains (JH1 and JH2) of JAK2. At first the structures of these two domains were predicted separately. This allowed for comparison of the tyrosine kinase and kinase-like domains and structural and functional differences could be highlighted. Then a model for the relative orientation of the two domains was developed. Analysis of the combined structure led to prediction of interactions between the domains and yielded a model for regulation of kinase activity through JH1–JH2 interactions. The structures were validated as much as possible through comparison with experimental studies and were found to be consistent with these studies.

## Methods

Multiple sequence alignment was performed with the programs ClustalW (Thompson et al., 1994) and Dialign (Morgenstern, 1999). For identification of folds likely to be adopted by the JH1 and JH2 domains the following fold recognition methods were used: Threader (Jones et al., 1992), Predict Protein (Rost and Sander, 1993, 1994; Rost et al., 1994), 123D (Alexandrov et al., 1996), ToPLign (Thiele et al., 1999) and UCLA-DOE (Fischer and Eisenberg, 1996). Generation of the 3D structures was accomplished by application of restraint-based homology modelling using the program MODELLER (Šali and Blundell, 1993).

For predicting interactions between amino acids, a novel technique was developed. This procedure is based on a method applied to RNA molecules, where it is referred to as 'comparative sequence analysis' (Chiu and Kolodziejczak, 1991; Gutell et al., 1992). Here the interactions between nucleotides of RNA molecules are identified by statistical analysis of a set of aligned RNA sequences that are related. Given a set of aligned sequences, the task is to identify correlated mutations at two positions x and y along the alignment. From such a correlation one can infer that in 3D the residues at positions x and y are in contact. The underlying assumption is that this correlation actually stems from the evolutionary pressure to respond to a mutation at x by a compensatory mutation at y (if the residues at x and y are in contact in the 3D structure), in order to preserve structure and/or functionality. During the analysis, the 20 amino acids can be considered as each being a class of its own or they can be grouped into different physicochemical classes, e.g. according to hydrophobicity, size, charge, etc. Calculation of the so-called mutual information M(x,y) between all possible position pairs provides a quantitative measure of the interactive strength in those pairs. The mutual information is expressed as a sum of entropy-like terms that contain the frequencies (or probabilities) of the amino acids (or the classes to which they belong) and the frequencies of the possible pairings at x and y. More specifically, the mutual information in the positions x and y is defined as

$M(x,y) = \sum_{a_x, a_y} f_{a_x a_y} \ln \frac{f_{a_x a_y}}{f_{a_x} f_{a_y}} \qquad (1)$
where $f_{a_x}$ refers to the frequency at which (the probability with which) an amino acid $a$ (or an amino acid class) can be found at position $x$ and $f_{a_x a_y}$ is the frequency at which a specific residue pair can be found at positions $x$ and $y$. Another, equivalent, way of calculating $M(x,y)$ is
$M(x,y) = H(x) + H(y) - H(x,y) \qquad (2)$
with
$H(x) = -\sum_{a_x} f_{a_x} \ln f_{a_x} \qquad (3a)$
and
$H(x,y) = -\sum_{a_x, a_y} f_{a_x a_y} \ln f_{a_x a_y} \qquad (3b)$
where H is the estimated composition entropy or Shannon entropy (Shannon and Weaver, 1963). Putative residue interactions will be identified according to the magnitude of the mutual information values. This approach is regarded as being more robust, as it does not rely on a normal distribution of the underlying data required for conventional correlation analysis. By introducing the necessary modifications, the method was ported to proteins. Before it was applied to JAK2 proteins, we tested the method by studying a set of aligned protein tyrosine kinase sequences. Most of the correlated mutations were in good agreement with the known three-dimensional structures of the corresponding proteins and included residues either in the catalytic core of the kinases or between adjacent α-helices.
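
As an illustration of the two equivalent definitions of M(x,y) given above, the following minimal Python sketch computes the mutual information of two alignment columns both directly from the pair frequencies and from the entropies. It is not the implementation used in this work: the function names and the toy columns are hypothetical, and gaps, sequence weighting and small-sample corrections are deliberately ignored.

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy H = -sum(f * ln f) over the observed symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log(c / n) for c in Counter(symbols).values())


def mutual_information(col_x, col_y):
    """M(x,y) as a sum over observed residue pairs of f_xy * ln(f_xy / (f_x * f_y))."""
    if len(col_x) != len(col_y):
        raise ValueError("columns must contain one residue per sequence")
    n = len(col_x)
    fx, fy = Counter(col_x), Counter(col_y)
    fxy = Counter(zip(col_x, col_y))
    return sum(
        (c / n) * math.log((c / n) / ((fx[a] / n) * (fy[b] / n)))
        for (a, b), c in fxy.items()
    )


def mutual_information_from_entropies(col_x, col_y):
    """M(x,y) = H(x) + H(y) - H(x,y), with H(x,y) the entropy of the residue pairs."""
    return entropy(col_x) + entropy(col_y) - entropy(list(zip(col_x, col_y)))


# Hypothetical toy columns: one residue per aligned sequence.
col_x = list("SSEESSEE")
col_y = list("KKDDKKDD")  # changes whenever col_x changes
col_z = list("AAAAAAAA")  # invariant position

print(mutual_information(col_x, col_y))                 # ln 2, about 0.693
print(mutual_information_from_entropies(col_x, col_y))  # same value
print(mutual_information(col_x, col_z))                 # 0.0
```

A perfectly covarying pair of two-state columns reaches the maximum value ln 2, whereas an invariant column carries no information about any other position.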

A detailed description of the use of this method for proteins is in preparation and an outline of its application to JAK2 follows: 495 receptor associated or integrated tyrosine kinase sequences were retrieved from the protein sequence databases SWISSPROT and SWISSNEW (Bairoch and Apweiler, 1997). The sequences were aligned with ClustalW (Thompson et al., 1994) and their phylogenetic relation was calculated (Clamp, 1998). The topology of the corresponding phylogenetic tree indicated that the sequences could be combined in three sets: Set A contained 10 mammalian JAK sequences that are closely related to the human JAK2 sequence. Set B included set A and 175 other sequences which are less related to the human JAK2 sequence than those in set A. Set A is therefore a subset of set B (A ⊂ B). Set C contained all 495 sequences (A ⊂ B ⊂ C). With these sets a comparative sequence analysis was performed. For the analysis different amino acid classification schemes were set up according to (1) hydropathy, (2) protonation state, (3) hydrogen bond donor/acceptor capabilities of the amino acid side chains and (4) a class containing 20 different members corresponding to the individual amino acids. Correlated mutations were identified by high mutual information values.
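
The analysis outlined above can be sketched as follows: residues are first mapped to classes, then all column pairs of the alignment are ranked by mutual information. This is only a sketch under stated assumptions. The three-class hydropathy grouping, the function names and the four-sequence toy alignment are hypothetical and are not the classification schemes or data used in this work.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical three-class hydropathy grouping, for illustration only.
HYDROPATHY = {}
for aa in "AVLIMFWC":
    HYDROPATHY[aa] = "hydrophobic"
for aa in "GSTPYH":
    HYDROPATHY[aa] = "neutral"
for aa in "DENQKR":
    HYDROPATHY[aa] = "hydrophilic"


def classify(column, scheme=None):
    """Map residues to class labels; scheme=None keeps all 20 residues distinct."""
    if scheme is None:
        return list(column)
    return [scheme.get(aa, "other") for aa in column]


def mutual_information(col_x, col_y):
    """Sum over observed pairs of f_xy * ln(f_xy / (f_x * f_y))."""
    n = len(col_x)
    fx, fy = Counter(col_x), Counter(col_y)
    fxy = Counter(zip(col_x, col_y))
    return sum(
        (c / n) * math.log(c * n / (fx[a] * fy[b])) for (a, b), c in fxy.items()
    )


def rank_pairs(alignment, scheme=None):
    """Return (M, x, y) for every column pair, highest mutual information first."""
    columns = [classify(col, scheme) for col in zip(*alignment)]
    scores = [
        (mutual_information(columns[x], columns[y]), x, y)
        for x, y in combinations(range(len(columns)), 2)
    ]
    return sorted(scores, key=lambda item: item[0], reverse=True)


# Hypothetical aligned sequences of equal length (no gaps).
alignment = ["LDKA", "IEKA", "SAKG", "TVKG"]

for score, x, y in rank_pairs(alignment, HYDROPATHY)[:3]:
    print(f"columns {x} and {y}: M = {score:.3f}")
```

In this toy example columns 0 and 1 change hydropathy class together and therefore rank first. Grouping residues into classes also reduces the number of distinct pairings, which matters for small sets such as set A, where treating all 20 amino acids separately leaves few observations per pair.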

The modelled structures were refined by 1000 steps of energy minimization using the Amber force field (Pearlman et al., 1994). Analysis of the final structures was carried out with the program Procheck (Laskowski et al., 1993). Checks performed included Ramachandran plots, χ1 versus χ2 plots, bump checks and evaluation of various stereochemical parameters. All parameters were found to be within the normal tolerances.
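
For readers unfamiliar with the quantities behind a Ramachandran plot, the backbone angles φ and ψ are dihedral angles defined by four consecutive backbone atoms (C(i−1), N(i), Cα(i), C(i) for φ and N(i), Cα(i), C(i), N(i+1) for ψ). The sketch below shows one common way of computing such a dihedral angle from Cartesian coordinates. It is not taken from Procheck; the function names and test coordinates are hypothetical.

```python
import math


def _sub(p, q):
    return tuple(a - b for a, b in zip(p, q))


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points (range -180 to 180)."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates: a planar cis arrangement and one rotated out of plane.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0)))  # cis, 0 degrees
print(dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, 1)))  # -90 degrees
```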

## Results

### Sequence alignment

Multiple sequence alignment of the two carboxy-terminal homology domains (JH1, JH2) for 10 JAK sequences from different species is shown in Figure 1. Comparison of the sequences indicates strong sequence homology among all members of the JAK family. More specifically, the JAK homology domains JH1, JH2 and JH4 share strong pairwise sequence identity (JH1 34%, JH2 26–28%, JH4 20–25%), whereas the other domains show lower sequence identity (<19%), but are similar to each other.
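
Pairwise sequence identities such as those quoted above depend on how aligned positions are counted. The following sketch shows one possible convention, in which positions with a gap in either sequence are excluded from the comparison. This convention, the function name and the toy sequences are assumptions for illustration and are not necessarily those used to obtain the values reported here.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned positions where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be taken from the same alignment")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not compared:
        return 0.0
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


# Hypothetical aligned fragments: five comparable positions, four identical.
print(percent_identity("ACDE-G", "ACDFHG"))  # 80.0
```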

### Structure of human JAK2 homology domain JH1

First, possible folds of this domain were evaluated by fold recognition methods. Several proteins of known three-dimensional structure were identified as possible templates by high scoring values, as shown in Table I. Interestingly, the highest scoring structures not only share the function of a tyrosine kinase with the JAK proteins, they also play a similar role in signal transduction, as they are receptor-integrated tyrosine kinases. These structures were then used as template structures in the prediction process. The alignment of the template sequences with the target sequence is shown in Figure 2. The sequences share a rather high sequence identity of 22%. Both the hydropathy profile and the secondary structure elements (assigned and predicted) align very well. Using the alignment shown, restraint based modelling as implemented in MODELLER (Šali and Blundell, 1993) was applied for calculating the three-dimensional structure of the JAK2-JH1. Several structures satisfying the input restraints were obtained and were superimposed by their Cα atoms (residues: R839 to T1049 and L1078 to S1115). The difference between these structures was marginal, as indicated by an averaged r.m.s. value of 0.56 ± 0.08 Å. The alignment shown in Figure 2 reveals a loop insertion within the JAK2-JH1 domain. This loop is located between amino acids T1049 and L1078. Secondary structure prediction (Rost and Sander, 1994) indicated a short α-helix between amino acids M1062 and M1064, but with a probability of <87%. Therefore, the region between residues T1049 and L1078 was modelled separately by employing loop database searches (Jones and Thirup, 1986; Moult and James, 1986).

### Structure of human JAK2 homology domain JH2

The JAK homology domain JH2 is known as a tyrosine kinase like domain because it lacks the functionality of a tyrosine kinase (Duhé and Farrar, 1998). Application of fold recognition methods yielded high scoring values for several proteins as summarized in Table I. Notably, the four highest scoring structures were identical with the set of proteins identified for JAK2-JH1. For prediction of JAK2-JH2 the same procedure as for JAK2-JH1 was applied. Figure 3 shows the sequence alignment of the template and the target sequences. The hydropathy profile and the secondary structure elements (assigned and predicted) aligned very well.

Different structures satisfying the input constraints were then obtained by distance constraint-based modelling. Superimposition of these structures by their Cα atoms indicated that they are very similar (average r.m.s. value: 0.60 ± 0.04 Å).
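
A spread such as "0.60 ± 0.04 Å" can be obtained by computing the Cα r.m.s. deviation for pairs of models and summarizing the values. The sketch below assumes that the models have already been superimposed (the fitting step itself is not shown) and that the average is taken over all model pairs; both points are assumptions, as are the function names and the toy coordinates.

```python
import math
from itertools import combinations
from statistics import mean, stdev


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both models must contain the same atoms")
    squared = sum(
        (a - b) ** 2 for p, q in zip(coords_a, coords_b) for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(coords_a))


def pairwise_rmsd_summary(models):
    """Mean and sample standard deviation of the RMSD over all model pairs."""
    values = [rmsd(a, b) for a, b in combinations(models, 2)]
    return mean(values), stdev(values)  # needs at least three models


# Hypothetical, already superimposed two-atom "models" (coordinates in Å).
models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
]
average, spread = pairwise_rmsd_summary(models)
print(f"{average:.2f} ± {spread:.2f} Å")
```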

### Combined structure of JAK2-JH1 and JAK2-JH2

Evidence for assembling JAK2-JH1 and JAK2-JH2 was gathered from two different sources. The first was to use the crystal structure of a dimer of the FGF receptor tyrosine kinase domain (Mohammadi et al., 1996) as a basis for the combined structure. In the paper by Mohammadi et al., the possibility of dimerization of the FGF receptor tyrosine kinase domain was discussed and the potential biological significance of three different dimers was considered. In the crystal two of the dimers had a relatively small interface (950 and 670 Å²). These values are relatively low for protein–protein interactions of biological relevance (Conte et al., 1999). The third dimer had a much larger interface area (1650 Å²). In the corresponding interface the first N-terminal α-helix of each tyrosine kinase domain is oriented almost parallel to the other one and the helices are located at the interface. The dimer can be described as axially symmetric. The symmetry axis is located between the α-helices and parallel to them. This type of dimerization has been supported also by experimental evidence, as a mutation located in the interface of the dimer suppressed the functionality of the protein (Mohammadi et al., 1996). Therefore, the two carboxy-terminal JAK2 domains were combined according to the dimer of the FGF receptor tyrosine kinase domain. This was accomplished by fitting the N-terminal α-helices of JAK2-JH1 (residues E889 to S904) and of JAK2-JH2 (residues R588 to K603) on to corresponding helices of the dimer of the FGF receptor tyrosine kinase domain. The second source of evidence was the application of comparative sequence analysis to a set of aligned JAK sequences, as described in the Methods section. This procedure indicated several correlated residue mutations in the two N-terminal helices of JH1 and JH2, thereby supporting the hypothesis that the two helices are in contact in the protein structure.

The loop between JAK2-JH1 and JAK2-JH2 domains was then modelled by employing loop database searches (Jones and Thirup, 1986; Moult and James, 1986). The final structure of the two carboxy-terminal domains of JAK2 is shown in Figure 5A. Its structural details (Figure 5B and 5C) will be discussed in the next section.

## Discussion

### Human JAK2 homology domain JH1

In the model JAK2-JH1 adopts a fold which is typical for the architecture of tyrosine kinases (Hubbard et al., 1994). The domain consists of two lobes. The N-terminal lobe contains a twisted β-sheet of five anti-parallel β-strands and one α-helix. The C-terminal lobe is mainly α-helical and three of the helices are oriented almost parallel to each other. Catalysis of phosphotransfer is thought to take place in the cleft between the two lobes. Particularly important in this respect are the nucleotide binding loop (residues G856 to G861), catalytic loop (residues K970 to N981) and the activation loop (residues D994 to E1024).

The activation loop (red in Figure 5) begins with the conserved DFG motif (residues D994 to G996) and ends with the conserved motif APE (residues A1022 to E1024) (Hanks et al., 1988). The conformation of this loop can vary significantly from one kinase to another, as shown by a comparison of the inactivated insulin receptor and FGF receptor kinases (Hubbard et al., 1994; Mohammadi et al., 1996b). In the present study this loop was modelled initially on the FGF receptor kinase, as this protein then served as a basis for the combined JH1/JH2 model. Of the residues in the DFG motif D994 points towards the active site and F995 away from it, similar to the corresponding residues (D641/F642) in the FGF receptor kinase. The activation loop contains two conserved tyrosine residues (residues Y1007 and Y1008), which are crucial for the tyrosine kinase activity of JAK2-JH1 (Songyang and Cantley, 1995). In the model both tyrosine residues point away from the active site. Y1007 is surrounded by several hydrophobic residues (I1018, L1026 and I1079), similar to Y653 in the FGF receptor kinase, which interacts with V664, L672 and F710. The OH group of Y1007 forms a hydrogen bond with D1004. This hydrogen bond is different from that formed by Y653 in the FGF receptor kinase; this tyrosine is hydrogen bonded to the backbone oxygen of L672. The corresponding residue in JAK2 JH1 is L1026 and a slight change in structure would enable it to interact with Y1007. Y1008 is more solvent exposed and exhibits hydrophobic contacts with V1075. Based on sequence homology alone it is very difficult to decide which conformation the inactivated activation loop in JH1 would take up. The hydrophobic residues that interact with Y1007 and Y1008 are conserved in the sequence alignment (Figure 2), with the exception of V1075, which is replaced by a serine in the insulin receptor kinase. 
Also, it would be possible to model the activation loop analogous to the conformation taken up in the insulin receptor kinase and Y1007 would consequently hydrogen bond with D976, similar to the Y1162–D1132 pair in the insulin receptor kinase. In both loop conformations, however, Y1008 would be significantly more solvent exposed than Y1007, which could provide an explanation for the different role of these two tyrosines in trans- or autophosphorylation (Feng et al., 1997).

In the catalytic loop (orange in Figure 5), the residues corresponding to D976 and N981 are invariant in the protein tyrosine kinase families and nearly invariant in the protein serine kinase families (Hubbard et al., 1994). Within the known structures the residue corresponding to D976 of the model is the catalytic base of the phosphotransfer reaction and hydrogen bonded to the amino acid equivalent to N981. In the JH1 model the residues D976 and N981 are close in space. Analysis of the structure with the program HBexplore (Lindauer et al., 1996) indicated that the side chains of these two residues are connected by hydrogen bonds. An arginine (R980) within the catalytic loop, four residues along the sequence from the catalytic base, provides charge neutralization and forms a hydrogen bond with the catalytic base. It has been discussed that the residues corresponding to D976, N981 and R980 are involved in coordination of Mg²⁺ ions (Hubbard et al., 1994). In the model these residues are close in space and stabilized by hydrogen bonding interactions. They are also orientated in such a way that they could interact with a putative Mg²⁺ ion.

The nucleotide binding loop (yellow in Figure 5), which is also referred to as the glycine-rich loop, is located between two β-strands in the N-terminal lobe of the tyrosine kinase. Compared with the insulin receptor tyrosine kinase (IRK), the nucleotide binding loop of JAK2-JH1 contains a substitution: the serine in IRK is replaced by an asparagine in JAK2-JH1 (N859). Members of the JAK1 and TYK2 family contain a histidine at the position of N859. The nucleotide binding loop of the lymphocyte kinase contains a glutamine at the corresponding position (Yamaguchi and Hendrickson, 1996). In IRK the backbone amide hydrogen of the serine is hydrogen bonded to the β-phosphate of ATP, indicating that the side chain is less important for ATP binding (Hubbard, 1997). This would also explain the low degree of conservation of this residue among the tyrosine kinases. Another feature known to be important for nucleotide binding is a hydrophobic pocket next to the nucleotide binding loop. The corresponding amino acids in JAK2-JH1 are F995 and L905.

### Human JAK2 homology domain JH2

A mutation (E695K) in the JH2 domain of the Hop JAK kinase has been reported to cause an increase of the kinase activity (Luo et al., 1997). The corresponding residue in JAK2 is E665. In the present model this residue displays several interactions with surrounding amino acids. In particular it forms a hydrogen bond with the δ-hydrogen of H662 and the backbone amide hydrogen of F798. Other, positively charged, side chains in the vicinity include R799 and K736. Therefore, according to the model, mutation of E665 from a negatively charged residue to a positively charged amino acid could cause severe distortion of the local conformation. This in turn could lead to a local rearrangement of the whole structure. Such a change could have an effect not only on the structure of the JAK2-JH2 domain but also on the relative orientation of the JAK2-JH1 and JAK2-JH2 domains. As it is known that the JH2 domain is responsible for negative regulation of the JAK-STAT pathway (Berchtold et al., 1997; Barahmand-Pour et al., 1998), a change in the relative orientation of these two domains would influence the kinase activity of JH1.

### Combined structure of JAK2-JH1 and JAK2-JH2

Evidence for the spatial arrangement of the two domains was gathered from two different sources. The first was the observed dimer formation in the crystals of the tyrosine kinase domain of the fibroblast growth factor receptor (Mohammadi et al., 1996). Therefore, two α-helices (one N-terminal helix from each domain) were oriented relative to each other as described for the dimer in the crystal. The resulting helix–helix interface (Interface 1 in Figure 5A) displayed a network of polar interactions, predominantly between complementary charged amino acids (Figure 7A). The second source of evidence was the detection of correlated residue mutations within the two α-helices. Two of these correlated residue pairs were found to be in contact in the model (S599–E900 and S599–S904, Figure 7A). This provided further support for assembly of the domains as described.

Deletion of the JAK2-JH2 domain led to intrinsic kinase activity of JAK2 (Berchtold et al., 1997; Barahmand-Pour et al., 1998). The interactions described above would be able to explain the observed influence of the JH2 domain on the tyrosine kinase activity of JH1. More evidence of how the JH2 domain could influence the activity of the tyrosine kinase domain is provided by a comparison of JH1 with two crystal structures of the insulin receptor kinase. Initially the activation loop of JH1 was modelled in its inactivated conformation and this structure of JH1 was then used to generate the combined JH1/JH2 model shown in Figure 5A. In this, inactivated, conformation stabilizing hydrophobic interactions between L1001 and P1002 of the activation loop and residues around C618 of JH2 can be found (Figure 7B). It has been shown that protein kinases undergo major conformational changes in the activation loop upon ligand binding. We therefore generated a second structure of JH1, in which the activation loop was modelled using the activated form of the insulin receptor kinase (Hubbard, 1997) as a basis. In this conformation there is unrestricted access to the binding sites for ATP and substrate. When this loop was inserted into the combined JH1/JH2 structure (Figure 7C), it exhibited several very close contacts or unfavourable interactions with residues in the JH2 domain (Table II). The JH2 residues participating in these contacts are located either at the end of the N-terminal α-helix or in the β-strand ending with C618. L1001 and P1002, which were interacting with C618 in the inactivated structure, now overlap with S602 and N612, respectively. Residues Q1003 to K1005 display a range of short contacts with amino acids L611 to V615 of JH2. V1010 and K1011 exhibit very close contacts with V617 and C618. 
If the proposed JH1/JH2 domain arrangement is correct, the range of unfavourable interactions with the activated form of JH1 would imply that loop motion from an inactivated to an activated conformation in JH1 is inhibited by the JH2 domain.
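
Close contacts of the kind described above can be listed by a simple distance search between the atoms of the activation loop and those of JH2. The following sketch illustrates the idea only. The 3.0 Å cutoff, the function name and the coordinates are hypothetical and are not taken from the model or from Table II; only the residue labels are borrowed from the text.

```python
import math


def close_contacts(atoms_a, atoms_b, cutoff=3.0):
    """List residue pairs with at least one atom pair closer than the cutoff.

    Each atom is given as (residue_label, (x, y, z)); the result contains
    (residue_a, residue_b, shortest_distance) sorted by residue labels.
    """
    closest = {}
    for res_a, pos_a in atoms_a:
        for res_b, pos_b in atoms_b:
            distance = math.dist(pos_a, pos_b)  # Euclidean distance (Python 3.8+)
            if distance < cutoff:
                key = (res_a, res_b)
                closest[key] = min(distance, closest.get(key, distance))
    return sorted((a, b, d) for (a, b), d in closest.items())


# Hypothetical coordinates in Å, one atom per residue for brevity.
activation_loop = [("L1001", (0.0, 0.0, 0.0)), ("P1002", (10.0, 0.0, 0.0))]
jh2 = [
    ("S602", (2.0, 0.0, 0.0)),
    ("N612", (10.0, 2.5, 0.0)),
    ("V617", (30.0, 0.0, 0.0)),
]

for res_a, res_b, distance in close_contacts(activation_loop, jh2):
    print(f"{res_a} - {res_b}: {distance:.1f} Å")
```

A fixed cutoff is a crude criterion; a more careful analysis would compare each distance with the sum of the van der Waals radii of the two atoms involved.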

That the β-strand of JH2 ending with C618 is involved in regulation of JH1 activity is also supported by a recent study on JAK3 mutants (Chen et al., 2000). A deletion mutant of JAK3 lacking residues L586 to M592 (the corresponding residues in JAK2 would be L611 to V617) was shown to inhibit JH1 activity. In the view of the present JAK2 model this finding could be interpreted in such a way that the activated conformation of the activation loop of JH1 in JAK3 is stabilized by the non-mutated β-strand of JAK3 JH2. Another possible interpretation would be that during transition to the activated conformation in the deletion mutant of JAK3 the JH1 activation loop is stabilized in a conformation which is detrimental to kinase activity.

A more general explanation for the influence of JH2 on JH1 activity is provided by the fact that phosphorylation of the tyrosine residues within the activation loop causes a rearrangement in tyrosine kinases (Hubbard, 1997). This rearrangement can be described in terms of a rotation of the N-terminal lobe with respect to the C-terminal lobe. For JAK2-JH1 this rotation could be influenced by the interaction of the JH2 domain with the N- and C-terminal lobes of the JH1 domain. Internal rotation within the JH1 domain would be hindered by the interactions with JH2. This in turn would reduce the activity of the JH1 domain.

In order to test the predictions, one could introduce point mutations in the loop of JH2 which is predicted to interact with the activation loop of JH1. One possibility would be to substitute a cysteine for L1001 or P1002. According to the model, this new residue would then be able to form a covalent bond with C616. Subsequent alteration of the activity of JAK2 would give an indication of the validity of the model described above. Conversely, within the JH2 domain, it would be interesting to evaluate the effect of mutations in the β-strand before C618. According to the present model, one could prepare either deletion mutants between L611 and C618 or introduce point mutations in this β-strand.

In summary, the structures of the two carboxy-terminal domains (JH1 and JH2) of human JAK2 were predicted. Comparison of the structures of the two domains revealed some important differences. Overall, the two isolated domains are similar in shape, but crucial differences are apparent in the activation loops. The activation loop of JH1 contains all the features necessary for function, whereas the corresponding loop of JH2 lacks some important amino acids. In JH2 the catalytic base of the catalytic loop is replaced by a neutral residue. The hydrogen bonding pattern, however, appears to be preserved. The nucleotide binding loops of JH1 and JH2 do not show any major differences.

Two different sources of evidence were used for establishing the relative orientation of the two domains. The resulting combination of JAK2-JH1 and JAK2-JH2 yielded some new insights, which could not be derived from the isolated domains. According to the prediction there are two main interactions between JH1 and JH2. The first is between two α-helices of the two domains. The second concerns a loop between two β-strands of JH2 and the activation loop of JH1. These interactions are predicted to have a significant effect on the kinase activity of JH1, which is in line with experimental evidence that JH2 has a detrimental effect on JH1 activity. Support for the model also comes from the prediction that the activation loop of JH1 in its activated conformation undergoes unfavourable interactions with a β-strand in JH2; this is consistent with the finding that the corresponding residues in JAK3-JH2 have a profound effect on kinase activity.

It may be difficult to verify every structural detail of the model, but on the whole the predictions are consistent with the available experimental data on JAK2 or related proteins. We therefore hope that the model will be used as a working hypothesis for the design of further experiments, which will help to verify or reject it. The predicted structure is intended to serve several purposes. First, it will provide the starting point for further prediction studies of JAK proteins, with the ultimate goal of generating structures for the whole proteins. Second, it should stimulate the design of experiments in order to find out more about the structure and function of these proteins. Third, it could serve as a basis for the design of molecules interacting selectively with JAK2 for therapeutic benefit.

Table I.

Likely protein folds for JAK2-JH1 and JAK2-JH2

Template structures Fold recognition methodsᵃ
PDB code Reference Predict protein 123D UCLA-DOE Threader ToPLign
1irk Hubbard et al. (1994)  Xᵇ
1ir3 A Hubbard (1997)
1fgk A Mohammadi et al. (1996)
3lck Yamaguchi and Hendrickson (1996)
ᵃFor references, see Methods section.
ᵇProtein structures with significant scoring values are indicated by X.
Table II.

Unfavourable interactions between the activated A-loop in JH1 and JH2 residues

JH1 residue JH2 residue
Within 2.0 Å (including hydrogens) Within 1.5 Å (without hydrogens)
V1000 S599 S599
S602
L1001 S602 S602
P1002 M601
S602
N612 N612
Q1003 L611 L611
N612 N612
Y613
D1004  N612
Y613 Y613
K1005 N612
Y613 Y613
G614 G614
V615 V615
L614
V1010 C618 C618
K1011 F595
V617 V617
C618 C618
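
A contact list of the kind shown in Table II can be produced by a simple distance screen between the atoms of the two domains. The following is only a minimal sketch of such a screen, not the procedure used for the table: the function name `close_contacts`, the dictionary layout, the rule that hydrogen atom names start with "H", and all coordinates are assumptions made for illustration, and the toy coordinates are not taken from the model.

```python
from itertools import product
from math import dist  # Python 3.8+


def close_contacts(domain_a, domain_b, cutoff, include_hydrogens=True):
    """Return sorted (residue_a, residue_b) pairs that have at least one
    atom pair closer than `cutoff` (in Angstrom).

    Each domain is a dict: residue label -> list of (atom_name, (x, y, z)).
    Assumption: hydrogen atoms are those whose name starts with "H".
    """
    def selected(atoms):
        return [xyz for name, xyz in atoms
                if include_hydrogens or not name.startswith("H")]

    pairs = set()
    for (res_a, atoms_a), (res_b, atoms_b) in product(domain_a.items(),
                                                      domain_b.items()):
        coords_a, coords_b = selected(atoms_a), selected(atoms_b)
        if any(dist(a, b) < cutoff for a, b in product(coords_a, coords_b)):
            pairs.add((res_a, res_b))
    return sorted(pairs)


# Made-up toy coordinates, for illustration only.
jh1 = {"P1002": [("CA", (0.0, 0.0, 0.0)), ("HA", (1.0, 0.0, 0.0))]}
jh2 = {
    "S602": [("CB", (2.5, 0.0, 0.0)), ("HB", (1.8, 0.0, 0.0))],
    "F595": [("CZ", (10.0, 0.0, 0.0))],
}

with_h = close_contacts(jh1, jh2, cutoff=2.0, include_hydrogens=True)
without_h = close_contacts(jh1, jh2, cutoff=1.5, include_hydrogens=False)
print(with_h, without_h)
```

In this toy case the pair is reported by the 2.0 Å screen with hydrogens but not by the 1.5 Å screen on heavy atoms, which mirrors the two columns of the table.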
Fig. 1.

Multiple sequence alignment of Janus kinase homology domains JH1 and JH2. The numbering scheme refers to the mature sequence of human JAK2. Boundaries of the homology domains are indicated by arrows. Sequence homology is indicated by symbols below the sequences as follows: *, amino acid identity; |, amino acid identity with one exception; X, common physicochemical properties (such as hydrophobicity, charge or aromaticity).
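
The symbol line described in this caption can be derived column by column from an alignment. The sketch below illustrates one way to do this under stated assumptions: the property classes in `PROPERTY_GROUPS` are hypothetical (the groups actually used for the figure are not listed here), gaps are treated as ordinary mismatching characters, and the four short sequences are invented.

```python
from collections import Counter

# Hypothetical physicochemical classes; an assumption, not the paper's own.
PROPERTY_GROUPS = [
    set("AVLIMC"),  # aliphatic / hydrophobic
    set("FWY"),     # aromatic
    set("KRH"),     # positively charged
    set("DE"),      # negatively charged
    set("STNQ"),    # polar, uncharged
]


def column_symbol(column):
    """Return '*', '|', 'X' or ' ' for one alignment column."""
    n = len(column)
    most_common_count = Counter(column).most_common(1)[0][1]
    if most_common_count == n:
        return "*"                      # identity in all sequences
    if n > 2 and most_common_count == n - 1:
        return "|"                      # identity with one exception
    if any(set(column) <= group for group in PROPERTY_GROUPS):
        return "X"                      # shared physicochemical class
    return " "


def annotate(alignment):
    """Build the symbol line for equal-length aligned sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return "".join(column_symbol(col) for col in zip(*alignment))


# Invented example sequences.
toy_alignment = ["LKFGD", "LRYGK", "LKWAL", "LHFGS"]
symbols = annotate(toy_alignment)
print(repr(symbols))
```

The order of the tests matters: a column is first checked for full identity, then for a single exception, and only then for a shared property class.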

Fig. 2.

Alignment of the sequences of the template structures with the JAK2-JH1 sequence. Secondary structure assignments are shown underneath each sequence in grey. The secondary structure of JAK2-JH1, as predicted by the Predict Protein Server (Rost and Sander, 1993, 1994; Rost et al., 1994), is indicated as follows: C, coil; H, helix; E, extended/β-strand. The assigned secondary structure for the known structures follows the DSSP notation (Kabsch and Sander, 1983): B, residue in isolated β-bridge; E, extended strand; G, 3₁₀-helix; H, α-helix; I, π-helix; S, bend; T, hydrogen-bonded turn. The functional loops (activation loop, A-Loop; catalytic loop, C-Loop; nucleotide-binding loop, NBL) are indicated by arrows. Amino acid homology is indicated as in Figure 1.
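
Because the predicted structure uses three states (C, H, E) and the template structures use the DSSP alphabet, comparing the two rows directly requires mapping one notation onto the other. The following sketch shows one commonly used reduction; it is an assumption made for illustration, since other reductions exist and the figure itself shows the unreduced DSSP codes. The name `reduce_dssp` and the example string are hypothetical.

```python
# One common (not universal) reduction of DSSP states to three classes.
# Assumption: helices G, H, I -> H; strands E and bridges B -> E;
# everything else (S, T, blank, unknown) -> C.
DSSP_TO_THREE = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_dssp(states):
    """Map a DSSP state string to the three-letter C/H/E alphabet."""
    return "".join(DSSP_TO_THREE.get(state, "C") for state in states)


# Invented DSSP string: two helix types, a turn, a strand, a bridge, a bend.
reduced = reduce_dssp("HHGGTTEEBS ")
print(reduced)
```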

Fig. 3.

Alignment of the sequences of the template structures with the JAK2-JH2 sequence. Details as in Figure 2.

Fig. 4.

Details of the alignment of the activation loop of JH2. (A) Alignment from previous studies; (B) alignment used in the present study. Bold residues indicate identity or homology between JH1 and JH2, after the catalytic and within the activation loop. The FGF receptor kinase sequence is shown as a reference.

Fig. 5.

Combined structure of JAK2-JH1 and JAK2-JH2. The general colouring scheme for JH1 is blue and JH2 is displayed in green. Colours for selected loops: activation loop, red; catalytic loop, orange; nucleotide-binding loop, yellow. (A) Overall structure. N-terminus, C-terminus and interfaces are annotated. (B) Catalytic core of JH1. (C) Catalytic core of JH2. Figure drawn with MOLSCRIPT (Kraulis, 1991).

Fig. 6.

Superposition of JAK2-JH1/JH2 (dark grey) with the FGF receptor kinase dimer (light grey). Orientation and domain arrangement of JAK2 as in Figure 5A. Figure drawn with MOLSCRIPT (Kraulis, 1991).

³To whom correspondence should be addressed. E-mail: r.t.kroemer@qmw.ac.uk

## References

Alexandrov,N.N., Nussinov,R. and Zimmer,R.M. (1996) In Hunter,L. and Klein,T. (eds), Biocomputing: Proceedings of the Pacific Symposium. World Scientific, Singapore, pp. 53–72.
Bairoch,A. and Apweiler,R. (1997) Nucleic Acids Res., 25, 31–36.
Barahmand-Pour,F., Meinke,A., Groner,B. and Decher,T. (1998) J. Biol. Chem., 273, 12567–12575.
Bazan,J.F. (1990) Immunol. Today, 11, 350–354.
Berchtold,S., Moriggl,R., Gouilleux,F., Silvennoinen,O., Beisenherz,C., Pfitzner,E., Wissler,M., Stocklin,E. and Groner,B. (1997) J. Biol. Chem., 272, 30237–30243.
Bernstein,F.C., Koetzle,T.F., Williams,G., Mayer,E.F., Bryce,M.D., Rodgers,J.R., Kennard,O., Simanouchi,T. and Tasumi,M. (1977) J. Mol. Biol., 112, 535–542.
Briscoe,J., Kohlhuber,F. and Müller,M. (1996) Trends Cell Biol., 6, 336–340.
Briscoe,J., Rogers,N.C., Witthuhn,B.A., Watling,D., Harpur,A.G., Wilks,A.F., Stark,G.R., Ihle,J.N. and Kerr,I.M. (1996) EMBO J., 15, 799–809.
Chen,M., Cheng,A., Chen,Y.Q., Hymel,A., Hanson,E.P., Kimmel,L., Minami,Y., Taniguchi,T., Changelian,P.S. and O'Shea,J.J. (1997) 94, 6910–6915.
Chen,M., Cheng,A., Candotti,F., Zhou,Y.-J., Hymel,A., Fasth,A., Notarangelo,L.D. and O'Shea,J.J. (2000) Mol. Cell. Biol., 20, 947–957.
Chiu,D.K.Y. and Kolodziejczak,T. (1991) CABIOS, 7, 347–352.
Chothia,C. (1992) Nature, 357, 543–544.
Clamp,M. (1998) JalView 1.3 beta. http://www.ebi.ac.uk/~michele/jalview.
Conte,L.L., Chothia,C. and Janin,J. (1999) J. Mol. Biol., 285, 2177–2198.
Cosman,D. (1993) Cytokine, 5, 95–106.
Duhé,R.J. and Farrar,W.L. (1995) J. Biol. Chem., 270, 23084–23089.
Duhé,R.J. and Farrar,W.L. (1998) J. Interferon Cytokine Res., 18, 1–15.
Feng,J., Witthuhn,B.A., Matsuda,T., Kohlhuber,F., Kerr,I.M. and Ihle,J.N. (1997) Mol. Cell. Biol., 17, 2497–2501.
Fetrow,J.S. and Skolnick,J. (1998) J. Mol. Biol., 281, 949–968.
Fetrow,J.S., Godzik,A. and Skolnick,J. (1998) J. Mol. Biol., 282, 703–711.
Fischer,D. and Eisenberg,D. (1996) Protein Sci., 5, 947–955.
Frank,S.J., Yi,W.S., Zhao,Y.M., Goldsmith,J.F., Gilliand,G. and Jiang,J. (1995) J. Biol. Chem., 270, 14776–14785.
Gauzzi,M.C., Velazquez,L., McKendry,R., Mogensen,K.E., Fellous,M. and Pellegrini,S. (1996) J. Biol. Chem., 271, 20494–20500.
Govindarajan,S., Recabarren,R. and Goldstein,R.A. (1999) Proteins: Struct. Funct. Genet., 35, 408–414.
Gutell,R.R., Power,A., Hertz,G.Z., Putz,E.J. and Stormo,G.D. (1992) Nucleic Acids Res., 20, 5785–5795.
Hanks,S.K., Quinn,A.M. and Hunter,T. (1988) Science, 241, 42–52.
Hubbard,S.R. (1997) EMBO J., 16, 5572–5581.
Hubbard,S.R., Wei,L., Ellis,L. and Hendrickson,W.A. (1994) Nature, 372, 746–754.
Jiang,N., He,T.C., Miyajima,A. and Wojchowski,D.M. (1996) J. Biol. Chem., 271, 16472–16476.
Jones,D.T., Taylor,W.R. and Thornton,J.M. (1992) Nature, 358, 86–89.
Jones,T.H. and Thirup,S. (1986) EMBO J., 5, 819–822.
Kishimoto,T., Taga,T. and Akira,S. (1994) Cell, 76, 253–262.
Koehl,P. and Levitt,M. (1999) Nature Struct. Biol., 6, 108–111.
Kotenko,S.V., Izotova,L.S., Pollack,B.P., Muthukumaran,G., Paukku,K., Silvennoinen,O., Ihle,J.N. and Pestka,S. (1996) J. Biol. Chem., 271, 17174–17182.
Kraulis,P.J. (1991) J. Appl. Crystallogr., 24, 946–950.
Laskowski,R.A., McArthur,M.W., Moss,D.S. and Thornton,J.M. (1993) J. Appl. Crystallogr., 26, 283–291.
Leonard,W.J. and O'Shea,J.J. (1998) Annu. Rev. Immunol., 16, 293–322.
Lindauer,K., Bendic,C. and Suehnel,J. (1996) CABIOS, 12, 281–289.
Liu,K.D., Gaffen,S.L., Goldsmith,M.A. and Greene,W.C. (1997) Curr. Biol., 7, 817–826.
Luo,H., Rose,P., Barber,D., Hanratty,W.P., Lee,S., Roberts,T.M., D'Andrea,A.D. and Dearolf,C.R. (1997) Mol. Cell. Biol., 17, 1562–1571.
Malaviya,R. and Uckun,F.M. (1999) Biochem. Biophys. Res. Commun., 257, 807–813.
Meydan,N. et al. (1996) Nature, 379, 645–648.
Mohammadi,M. et al. (1996) Cell, 86, 557–587.
Morgenstern,B. (1999) Bioinformatics, 15, 211–218.
Moult,J. and James,M.N. (1986) Proteins, 1, 146–163.
Murzin,A.G. (1996) Curr. Opin. Struct. Biol., 6, 386–394.
Murzin,A.G. (1998) Curr. Opin. Struct. Biol., 8, 380–387.
Orengo,C.A., Jones,D.T. and Thornton,J.M. (1994) Nature, 372, 631–634.
Pearlman,D.A. et al. (1994) AMBER 4.1. University of California, San Francisco.
Rost,B. and Sander,C. (1993) J. Mol. Biol., 232, 584–599.
Rost,B. and Sander,C. (1993) 90, 7558–7562.
Rost,B. and Sander,C. (1994) Proteins, 19, 55–72.
Rost,B., Sander,C. and Schneider,R. (1994) CABIOS, 10, 53–60.
Saharinen,P., Takaluoma,K. and Silvennoinen,O. (2000) Mol. Cell. Biol., 20, 3387–3395.
Sakai,I. and Kraft,A.S. (1997) J. Biol. Chem., 272, 12350–12358.
Šali,A. and Blundell,T.L. (1993) J. Mol. Biol., 234, 779–815.
Saltzman,A., Stone,M., Franks,C., Searfoss,G., Munro,R., Jaye,M. and Ivashchenko,Y. (1998) Biochem. Biophys. Res. Commun., 246, 627–633.
Schaper,F., Gendo,C., Eck,M., Schmitz,J., Grimm,C., Anhuf,D., Kerr,I.M. and Heinrich,P.C. (1998) Biochem. J., 335, 557–565.
Shannon,C.E. and Weaver,W. (1963) The Mathematical Theory of Communication. University of Illinois Press, Urbana, IL.
Sudbeck,E.A., Liu,X.-P., Narla,R.K., Mahajan,S., Ghosh,S., Mao,C. and Uckun,F.M. (1999) Clin. Cancer Res., 5, 1569–1582.
Songyang,Z. and Cantley,C. (1995) Trends Biochem. Sci., 20, 470–475.
Tanner,J.W., Chen,W., Young,R.L., Longmore,G.D. and Shaw,A.S. (1995) J. Biol. Chem., 270, 6523–6530.
Thiele,R., Zimmer,R. and Lengauer,T. (1999) J. Mol. Biol., 290, 757–779.
Thompson,J.D., Higgins,D.G. and Gibson,T.J. (1994) Nucleic Acids Res., 22, 4673–4680.
Thompson,J.D., Plewniak,F. and Poch,O. (1999) Nucleic Acids Res., 27, 2682–2690.
Velazquez,L., Mogensen,K.E., Barbieri,G., Fellous,M., Uze,G. and Pellegrini,S. (1995) J. Biol. Chem., 270, 3327–3334.
Wei,L., Huang,E.S. and Altman,R.B. (1999) Structure, 7, 643–650.
Wells,J.A. and DeVos,A.M. (1996) Annu. Rev. Biochem., 65, 609–634.
Wilks,A.F., Harpur,A.G., Kurban,R.R., Ralph,S.J., Zürcher,G. and Ziemiecki,A. (1991) Mol. Cell. Biol., 11, 2057–2065.
Yamaguchi,H. and Hendrickson,W.A. (1996) Nature, 384, 484–489.
Yamamoto,K., Shibata,F., Miura,O., Kamiyama,R., Hirosawa,S. and Miyasaka,N. (1999) Biochem. Biophys. Res. Commun., 257, 400–404.
Yan,H., Piazza,F., Krishnan,K., Pine,R. and Krolewski,J.J. (1998) J. Biol. Chem., 273, 4046–4051.
Yasukawa,H. et al. (1999) EMBO J., 18, 1309–1320.
Yeh,T.C. and Pellegrini,S. (1999) Cell. Mol. Life Sci., 55, 1523–1534.
Zhuang,H., Patel,S.V., He,T.-C., Sonsteby,S.K., Niu,Z. and Wojchowski,D.M. (1994) J. Biol. Chem., 269, 21411–21414.
Ziemiecki,A., Harpur,A.G. and Wilks,A.F. (1994) Trends Cell Biol., 4, 207–212.