Utrophin is a large protein that accumulates at the neuromuscular synapse and myotendinous junctions in adult skeletal muscle, and is widely expressed in several non-skeletal muscle tissues. Evidence from a variety of sources suggests that a successful strategy for treatment of Duchenne muscular dystrophy patients will be to increase expression of utrophin in muscle. There is still much to be learnt about utrophin gene regulation, in particular regarding alternative isoforms, their promoters and role in muscle and non-muscle tissues. Using 5′-RACE we have identified two novel transcripts of utrophin, Up71 and Up140, with unique first exons and promoters located in intron 62 and intron 44, respectively. These transcripts appear to be structural homologues of the short dystrophin transcripts, Dp71 and Dp140, emphasizing the high degree of structural conservation between the utrophin and dystrophin genes. RT–PCR shows that Up71 and Up140 are widely expressed in both human and mouse tissues, including skeletal muscle. We present evidence for transcript-specific differential mRNA splicing of exon 71, in both Up71 and Up140, similar to that described for dystrophin. No evidence for splicing of exon 78 of utrophin was found. This is in contrast to dystrophin and may reflect a subtle functional difference in patterns of phosphorylation between the two proteins.
Utrophin is a large protein (395 kDa) which is widely expressed and most abundant in several non-skeletal muscle tissues, for example lung, intestine, embryonic neural tube, sensory ganglia, tendons and ossifying cartilages (1–3). Utrophin is the autosomal (chromosome 6q24) homologue of dystrophin (4,5), the protein absent or abnormal in X-linked Duchenne and Becker's muscular dystrophies (DMD and BMD, respectively).
In adult muscle, utrophin accumulates at the neuromuscular junction (NMJ) and myotendinous junctions (MTJ) (6,7). At the NMJ, utrophin contributes to the maintenance of the post-synaptic membrane and clustering of acetylcholine receptors by interacting with a sub-neuronal complex of proteins; these include F-actin, several of the ‘dystrophin-associated proteins’ (DAPs; dystroglycan, dystrobrevins, syntrophins and sarcoglycans) (8–10) and the synapse-associated proteins rapsyn and agrin (11). In contrast to utrophin, dystrophin in muscle is most abundant at the sarcolemma. Here it also interacts with DAPs to form a link between the sub-membraneous network of non-muscle actin and the extracellular matrix and thereby maintains the integrity of the sarcolemma (8–10).
Utrophin is an important focus for research in DMD because of its structural and functional similarities to dystrophin and its therapeutic potential (12). The finding that in developing and regenerating muscle fibres utrophin protein locates not only to the NMJ and MTJ but also to the sarcolemma (13–15) suggests that utrophin might functionally replace dystrophin. This idea is supported by the demonstration that overexpression of utrophin in dystrophin-deficient (mdx) mice leads to the widespread appearance of utrophin at the sarcolemma and amelioration of the muscle phenotype (16). Recently, it was shown that the severe muscle phenotype exhibited by dystrophin/utrophin doubly deficient mice is also corrected by overexpression of utrophin (17). These important observations all support the view that increased expression of utrophin might be a successful therapeutic strategy for the treatment of DMD (16,18).
Compared with dystrophin there is relatively little information about utrophin gene regulation, in particular with regard to alternative isoforms and their promoters. Regulation of the dystrophin gene is complex, with at least seven promoters that determine the expression of multiple isoforms. There are three full-length (427 kDa) isoforms: muscle dystrophin confined to skeletal, cardiac and smooth muscle (19), C-dystrophin found in cortical neurons and P-dystrophin in cerebellar Purkinje cells (20,21). There are also four shorter apo-dystrophin isoforms: Dp71 (22) expressed in a wide variety of tissues, Dp140 (23) in brain glial cells, Dp116 (24) found in fetal brain and adult peripheral nerve and Dp260 (25) in the retina and regions of the CNS.
Given the structural homology between utrophin and dystrophin it seems very likely that utrophin will show similar complexity (12). At present only a single full-length utrophin isoform has been described. This is transcribed from a TATA box-less promoter associated with a CpG island at the 5′ end of the transcript and is expressed in skeletal muscle and several other tissues (26). Direct injection of promoter/gene reporter constructs into muscle demonstrated that this promoter functions in a synapse-specific way (27). One other utrophin isoform has been reported and designated G-utrophin. This 5.5 kb mRNA is transcribed from a promoter, which probably lies in intron 55, and is expressed in sensory dorsal root and cranial ganglia and is a major utrophin in the brain (28). An important point about G-utrophin is that its sequence diverges from full-length utrophin at the same point that the short dystrophin transcript, Dp116, diverges from full-length dystrophin. This suggests that the position of the Dp116 promoter is conserved between the two genes and that utrophin homologues of other dystrophin alternative promoters, Dp260, Dp71 and Dp140, may be found. This possibility is strengthened by reports of 140 and 80 kDa utrophin-immunoreactive proteins in extracts of skeletal muscle and peripheral nerve (29,30) that may represent homologues of the dystrophin transcripts Dp140 and Dp71.
In the present study we have explored this possible functional conservation further. Using 5′-RACE we have searched for structural homologues of the short dystrophin isoforms and have identified two novel utrophin transcripts. These transcripts appear to be the utrophin counterparts of Dp71 and Dp140; each shows a unique 5′ exon, wide tissue distribution and evidence for differential mRNA splicing.
The 5′-RACE procedure was used to identify utrophin transcripts that differ from the published utrophin sequence at their 5′ ends. Human fetal brain mRNA was chosen as the starting material and in the first instance, reverse transcription was carried out using primers specific for utrophin exons 63 and 45 (exon numbering as for dystrophin). These primers were designed specifically to look for the utrophin homologues of dystrophin Dp71 and Dp140.
Isolation and characterization of the novel utrophin transcript Up71
After 5′-RACE using nested primers specific for exon 63, two extension products of 360 and 250 bp were isolated and sub-cloned. The sequence of the 360 bp product corresponded to that expected for full-length utrophin extending across exons 63 and 62, while the 250 bp product contained novel sequence upstream of the exon 63 sequence (Fig. 1A). The novel sequence diverges from the known mRNA sequence at a point exactly coincident with the junction between exon 63 and intron 62. This is the same point at which the short dystrophin transcript Dp71 diverges from full-length dystrophin sequence. The novel utrophin transcript is designated Up71.
Up71 has 169 nt of unique 5′ sequence. This includes 101 nt of open reading frame but it is not certain that this sequence is translated. There is an ATG codon situated 116 nt from the 5′ end of the transcript within the open reading frame, but this is in a suboptimal sequence context. In particular, the critical purine in position -3 of the Kozak translation initiation consensus (31) is absent. Furthermore, if this ATG were functional as an initiation codon then the 5′ sequence would encode an N-terminal sequence unusually rich in Phe (shown in small caps in Fig. 1A). The next ATG is 90 bp further downstream in exon 63 and here the sequence context is more favourable for translation initiation. If translation initiates from this downstream codon the novel Up71 sequence would comprise entirely 5′-untranslated sequence.
Sequence comparisons show that in the mouse, the upstream ATG of the human sequence is not present, while the downstream ATG in exon 63 is conserved between the two species (Fig. 1B). The novel first exons of the human and mouse sequences show 56% nucleotide homology, whereas the adjacent coding exon 63 shows 91% homology across species. This supports the view that the 5′ exon of Up71 is untranslated and that the downstream ATG is the major site of translation initiation.
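The sequence-context argument for the two candidate ATG codons can be illustrated with a short Python sketch. It checks only the two positions most often emphasized in the Kozak consensus (a purine at -3 and a G at +4, counting the A of the ATG as +1). The function name `kozak_context` and both example sequences are illustrative assumptions; they are toy sequences, not the Up71 sequence shown in Figure 1A.

```python
PURINES = {"A", "G"}


def kozak_context(seq: str, atg_index: int) -> dict:
    """Report the -3 and +4 positions around a candidate start codon.

    seq       -- mRNA or cDNA sequence written 5'->3' (A/C/G/T, U tolerated)
    atg_index -- 0-based index of the A of the candidate ATG
    """
    seq = seq.upper().replace("U", "T")
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3 or atg_index + 3 >= len(seq):
        raise ValueError("not enough flanking sequence to evaluate context")

    # The A of ATG is +1 and there is no position 0, so -3 is three bases upstream.
    minus3 = seq[atg_index - 3]
    # +4 is the base immediately after the ATG.
    plus4 = seq[atg_index + 3]
    return {
        "minus3": minus3,
        "plus4": plus4,
        "purine_at_minus3": minus3 in PURINES,
        "g_at_plus4": plus4 == "G",
    }


# Hypothetical toy sequences (assumptions, not utrophin sequence):
favourable = kozak_context("GCCACCATGG", 6)   # purine at -3, G at +4
suboptimal = kozak_context("TTTTTTATGC", 6)   # pyrimidine at -3, no G at +4
print(favourable)
print(suboptimal)
```

A check of this kind only ranks candidate codons by context; as in the text, conservation between species remains the stronger argument for which ATG is used.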
Isolation and characterization of the novel utrophin transcript, Up140
After 5′-RACE using exon 45-specific primers, four extension products of different sizes were isolated and subcloned (Fig. 2). Sequence analysis showed two products, of 480 and 370 bp, containing sequence corresponding to exons 44 and 45 of the full-length utrophin transcript. In contrast, the other products, of 550 and 630 bp, contained novel sequence 5′ of exon 45 (Fig. 3A) and identical in the two clones where they overlapped. The novel 5′ sequence diverges from the known full-length utrophin transcript at a point exactly coincident with the junction between exon 45 and intron 44 and continues 5′ for a further 507 nt. This short utrophin transcript has been designated Up140, in line with the dystrophin transcript (Dp140) whose transcription is initiated in intron 44 (23). The reading frame of Up140 stays open for 86 nt 5′ of the junction with exon 45, before a TAA stop codon is encountered, but does not contain a translation initiation codon. The first ATG lies within exon 46, 709 nt downstream from the 5′ end of the transcript. The relatively long 5′-untranslated region (5′-UTR) of Up140 is a feature held in common with dystrophin Dp140, which is also translated from a downstream ATG located in exon 51. However, in the case of Dp140 the unique 5′ sequence does not lie immediately adjacent to the exon 45/intron 44 boundary but derives from a region further within intron 44 (23).
Sequence comparisons (Fig. 3B) show that the level of nucleotide homology between the novel sequence in human and mouse is 70% across 200 bp immediately upstream of exon 45. This is relatively low for coding sequence but is similar to that found when the adjacent coding exons 45 and 46 are compared (71% nucleotide identity and only 61% amino acid identity). Thus, the region around exon 45 is not highly conserved between man and mouse and, indeed, the ATG codon in the human sequence is absent from the mouse sequence. An ATG situated a further 177 nt downstream (in exon 47) is conserved between the two species and is presumably the functional translation start site in the mouse. The differences in the positions of the translation start sites mean that in man Up140 has 58 extra amino acids at its N-terminus compared with mouse (Fig. 3).
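Percentage identity figures of the kind quoted above can be computed from a pairwise alignment along the following lines. This is a minimal sketch under stated assumptions: the two sequences are already aligned and of equal length, gaps are written as `-`, and columns containing a gap are excluded from the count. The text does not state which convention was used for the published values, and the example alignment is a made-up toy, not utrophin sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical residues over the ungapped columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gap columns are not scored
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment (one mismatch, one gap column):
human = "ATGCCGTA-A"
mouse = "ATGACGTATA"
print(f"{percent_identity(human, mouse):.1f}% identity")
```

The same function works for aligned amino acid sequences, which is how a nucleotide figure and an amino acid figure for the same exons can be compared.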
Alignment of full-length utrophin and dystrophin cDNA sequences in the region adjacent to the Up140/Dp140 promoters shows that in exons 48 and 49, utrophin lacks 113 amino acids present in dystrophin. These amino acids correspond to the rod domain repeat 19 unit, encompassing the hinge 3 region which is absent in utrophin. These comparisons contribute to the view that the region close to the Up140/Dp140 promoters is susceptible to mutation and loss of sequence conservation.
Expression of Up140 and Up71 in human and rodent tissues
The patterns of tissue expression of the short utrophin transcripts were investigated by reverse transcriptase PCR (RT–PCR) using primers specific for Up140 and Up71 and mRNA prepared from human and mouse tissues. As a comparison, primers in human exons 16 and 17 and mouse exons 17–20 (marked UTRN in Figs 4 and 5) were also used in order to amplify mRNAs transcribed from the 5′ end of the utrophin gene. As a check on the quality of the mRNA/cDNA preparation the cDNAs were amplified with primers for phosphoglucomutase 1 (PGM1 in Figs 4 and 5), a glycolytic enzyme expressed in all cell types (32).
Typical findings are summarized in Figure 4. Both Up140 and Up71 mRNAs are amplified in all the adult and fetal tissues tested, including skeletal muscle, and neither shows particular tissue specificity. While the two short transcripts appear to differ in the detail of their distribution, for example compare the relative amplification of Up71 and Up140 in human adult thymus, both transcripts are relatively abundant in several human tissues, such as lung and kidney, but poorly amplified from human fetal testis, brain and adult muscle. The relative levels in mouse tissues were similar, but not identical, to those found in the human tissues (Fig. 5).
Dp71, the dystrophin homologue of Up71, is similarly transcribed in a wide range of fetal and adult tissues, including embryonic stem cells and chorionic villi (33). However, Dp140, unlike Up140, shows a restricted tissue distribution largely confined to the central nervous system and kidney (23).
The RNA/cDNAs used in these experiments were checked for contaminating genomic DNA in two ways. First, amplification in the absence of reverse transcriptase failed to produce visible products (see lung-RT in Figs 4 and 5). Second, we tried to amplify putative contaminating genomic DNA by locating forward primers in the adjacent 5′ intron sequence while keeping the reverse primer within exon sequence (either exon 63 or exon 45). This primer combination did not amplify products from tissue cDNAs but readily amplified genomic DNA, thereby confirming that the cDNA samples did not contain contaminating genomic DNA.
Differential splicing of Up140 and Up71 mRNA
There have been several reports of differential splicing of dystrophin mRNA; in particular, Dp71 transcripts are alternatively spliced for exons 71 and/or 78 in a variety of adult human tissues (34). Differential splicing of exons 71 and 78 has also been reported for Dp140. Analysis of Dp140 cDNA clones from human cerebellum and kidney found that some clones were missing exons 71 and/or 78. Other transcripts also lacked exons 71–74 (35).
We have investigated whether this pattern of differential splicing is conserved between dystrophin and utrophin. The splicing patterns of exons 71 and 78 in Up71 and Up140 were examined by PCR amplification of cDNAs in the region between the unique first exons and either exon 72 or 79. The sizes of the products for Up71 were 1.2 and 2.0 kb, and for Up140 were 3.7 and 4.5 kb (to exons 72 and 79, respectively). The presence of spliced forms was demonstrated by nested amplification of the transcript-specific PCR products. Primers placed in exons 70 and 72, or 77 and 79, were used to look for splicing of exons 71 and 78, respectively; a product of 162 bp was generated in the presence of exon 71, and of 124 bp in its absence. A product of 216 bp was generated in the presence of exon 78, and of 184 bp in its absence. No evidence for splicing out of exon 78 was found for either transcript in any tissue (data not shown); however, exon 71 was alternatively spliced in a transcript- and tissue-specific manner.
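The way nested product sizes distinguish the spliced and unspliced forms can be sketched in Python as follows. The expected sizes are those given above; the function name `call_splice_forms`, the 10 bp sizing tolerance and the example band sizes are illustrative assumptions, not values from the experiments.

```python
# Expected nested PCR product sizes (bp) as stated in the text.
EXPECTED_BP = {
    "exon71": {"included": 162, "skipped": 124},
    "exon78": {"included": 216, "skipped": 184},
}


def call_splice_forms(exon: str, band_sizes_bp, tolerance_bp: int = 10) -> set:
    """Return which forms ('included', 'skipped') are consistent with the observed bands.

    tolerance_bp is an assumed gel-sizing error; it must stay below half the
    difference between the two expected products so that calls are unambiguous.
    """
    expected = EXPECTED_BP[exon]
    calls = set()
    for band in band_sizes_bp:
        for form, size in expected.items():
            if abs(band - size) <= tolerance_bp:
                calls.add(form)
    return calls


# Hypothetical band sizes read from a gel (assumptions for illustration):
print(call_splice_forms("exon71", [160, 125]))  # both forms present
print(call_splice_forms("exon78", [215]))       # only the exon 78-containing form
```

A tissue in which both calls are returned for exon 71 would correspond to a mixture of spliced and unspliced transcripts; band intensity, which this sketch ignores, is what indicates the predominant form.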
For Up71, several fetal and adult tissues contain a mixture of transcripts spliced and unspliced for exon 71, although the proportion of spliced and unspliced forms varies from tissue to tissue. For example, fetal muscle contains predominantly the unspliced form while fetal brain and testis contain predominantly the spliced form (Fig. 6). Adult testis and thymus showed no evidence of splicing, and fetal lung contained only mRNA spliced out for exon 71 (data not shown).
For Up140 the unspliced form is the major isoform in most tissues, although splicing out of exon 71 occurs at very low levels in several tissues. In contrast, the spliced form is the major Up140 transcript in adult skeletal muscle (Fig. 6). The nucleotide sequence of spliced forms was checked by direct sequence analysis of DNA eluted from separated bands excised from agarose gels (‘isolated bands’ in Fig. 6); this also confirmed that the splicing out of exon 71 occurs exactly at the known exon/intron boundaries.
In this study we describe two novel transcripts of the utrophin gene, Up71 and Up140, whose promoters are located in the distal half of the gene. Both mRNAs are distinguished by a novel first exon comprising untranslated sequence (Up71, 206 nt; Up140, 709 nt). The Up140 and Up71 mRNA sequences diverge from that of full-length utrophin at the same positions as the short dystrophin transcripts, Dp140 and Dp71, diverge from full-length dystrophin sequence. These findings provide good evidence that the gene duplication event which gave rise to these two genes occurred after the appearance of multiple promoters in the ancestral gene.
The identification of Up71 and Up140 means that utrophin homologues for three of the four known short dystrophin transcripts have now been described. In a previous study, a short utrophin, corresponding to Dp116 and encoding a 113 kDa protein, was identified (28). This transcript was designated G-utrophin because of its relatively abundant expression in sensory ganglia, but it also appears to be the predominant utrophin isoform in the brain. It is noteworthy that the utrophin/dystrophin homologue identified in the sea urchin Strongylocentrotus purpuratus (36) transcribes an internal transcript structurally similar to G-utrophin and Dp116, suggesting that this alternative promoter has been functionally significant since before the evolution of the chordates. It remains to be seen whether a utrophin homologue of the fourth short dystrophin transcript, Dp260 (25), will be found and whether more than one promoter which transcribes full-length utrophin occurs; three, perhaps four, promoters have been described at the 5′ end of the dystrophin gene, each with a distinctive tissue specificity (19,21).
The Up71 transcript is predicted to be a 4.0 kb mRNA encoding a 71 kDa protein with the same cysteine-rich and C-terminal domains as full-length utrophin, commencing with exon 63 (dystrophin exon numbering). It seems probable that an 80 kDa utrophin isoform described by Fabbrizio et al. (37) that was detected in peripheral nerve by western blotting corresponds to Up71 protein. The Up140 transcript is predicted to be a 6.75 kb mRNA encoding a 155 kDa protein comprising the last six repeats of the distal rod domain and the cysteine-rich and C-terminal domains of utrophin. This seems likely to be the equivalent of the mouse 140 kDa utrophin isoform detected by Peters et al. (29) after immunoaffinity purification of muscle fibre extracts. In a recent paper, Lumeng et al. reported multiple small utrophin isoforms in adult mouse tissues, seen after western blotting using a C-terminal region antibody (38). Amongst these they identified a possible utrophin homologue of the short dystrophin isoform Dp71 in brain and a possible homologue of Dp140 in testis, muscle and other tissues. Taken together, these observations support the view that the Up71 and Up140 transcripts are translated into functional protein in some tissues.
What is the functional significance of Up71 and Up140? These short transcripts are predicted to encode proteins that have lost the N-terminal actin-binding domain and part or all of the long spectrin-like rod region. Both retain sequences that encode the C-terminal domains that mediate binding to membrane proteins and it seems probable that their cellular function will involve interactions with cell membranes. Utrophin and dystrophin share 74% amino acid identity in their membrane protein-binding domains and both are able to bind β-dystroglycan, syntrophin and dystrobrevin (10,39,40). It is possible that their sequence differences confer some discrete role in cellular function. This may be particularly true for Dp140 and Up140, which each contain a portion of the rod region where their protein homology is much less well conserved. However, it is also reasonable to propose that in tissues where short utrophin and dystrophin isoforms are co-expressed, they may be functionally redundant. This idea is attractive. It would help to explain why in Duchenne muscular dystrophy patients where internal promoters are lost by deletion there is not a severe clinical outcome in non-muscle tissues known to express short dystrophin isoforms. [Singularly amongst the short dystrophin isoforms, the loss of Dp260 has been linked to an ocular phenotype associated with an abnormal electroretinogram (25).] It would also help to explain why there is very little or no difference in clinical phenotype between mice knocked-out for utrophin in such a way that only full-length utrophin expression is ablated (41) and those in which the insertional mutation prevents transcription of all utrophin transcripts (42).
The importance of assigning splice patterns to particular isoforms of dystrophin was recognized once it was established that distinct mRNAs were being transcribed from alternative promoters and that differential splicing of dystrophin was transcript-specific (34,35). In the present study, we demonstrate that utrophin is also transcribed from multiple promoters and that the various transcripts show different patterns of mRNA splicing. Our findings regarding the splicing of exon 71 and the lack of differential splicing of exon 78 coincide exactly with those of Lumeng et al. (38), although in that study no information was given about which transcripts were subject to differential splicing. The alternative splicing of exon 71 in Up71 and Up140 may have functional significance since this exon is close to and upstream of the syntrophin/dystrobrevin-binding domains and may play a role in membrane-cytoskeletal interactions. It may be of significance that no evidence for alternative splicing of exon 78 was found for these short isoforms, although this is a feature of their dystrophin counterparts, Dp71 and Dp140 (34,35). In dystrophin, one of two consensus p34cdc2 phosphorylation sites is located in exon 78 and it has been proposed that alternative splicing of exon 78 may modulate phosphorylation, resulting in differential dystrophin-protein interactions. For utrophin, this mechanism would not have a functional outcome since the phosphorylation site in exon 78 is not conserved (34).
Materials and Methods
The 5′ regions of utrophin cDNAs were amplified using the 5′-RACE procedure under conditions exactly as specified by the manufacturer (5′-RACE system; Gibco BRL, Paisley, UK). In brief, human RNA (1 µg) was reverse transcribed using a utrophin-specific primer GSP1 and was C-tailed. After removal of the mRNA the cDNA was made double-stranded using the anchor primer GGCCACGCGTCGACTAGTAC-GG18; 5 µl of this product were further amplified using the anchor primer and a nested utrophin-specific primer, GSP2. In order to facilitate cloning of the amplified products into the pAMP1 vector (Gibco BRL) using the uracil DNA glycosylase method, a third round of amplification was carried out using 5 µl of a 1:100 dilution of the second round amplification product and primers with codons containing uracil [(CAU)4 or (CUA)4] incorporated at their 5′ ends; these were the universal primer and either GSP2 or GSP3.
The primers used were: hGSP1-Up71, 5′-CTGTACGGTAGGCAGAAAAACG; hGSP2-Up71, 5′-CATTATTCAGGTCAGCAAGGG; hGSP1-Up140, 5′-TCCCAGCGTTGGTTTAAACCTGCC; hGSP2-Up140, 5′-GCTTCCATCTGCCTGGGAGAG; hGSP3-Up140, 5′-TGCACAATCCCATCCCCAGTTCG.
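Basic properties of a gene-specific primer, such as its reverse complement (the sense-strand sequence it anneals to) and its GC content, can be checked with a few lines of Python. The sketch below uses the hGSP2-Up71 sequence listed above; the helper names are illustrative assumptions, and the Wallace rule, Tm = 2(A+T) + 4(G+C) °C, is only a crude rule of thumb for short oligonucleotides. No annealing temperatures are reported here, so the estimate should not be read as the condition actually used.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# hGSP2-Up71 as listed in the text, written 5'->3'.
HGSP2_UP71 = "CATTATTCAGGTCAGCAAGGG"


def normalise(primer: str) -> str:
    """Upper-case the sequence and drop hyphens or spaces used as visual separators."""
    return primer.upper().replace("-", "").replace(" ", "")


def reverse_complement(primer: str) -> str:
    """Sequence of the opposite strand, written 5'->3'."""
    return normalise(primer).translate(COMPLEMENT)[::-1]


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    seq = normalise(primer)
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature (degrees C) by the Wallace rule; an approximation only."""
    seq = normalise(primer)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


print(reverse_complement(HGSP2_UP71))
print(f"GC fraction: {gc_fraction(HGSP2_UP71):.2f}")
print(f"Wallace Tm estimate: {wallace_tm(HGSP2_UP71)} C")
```

Nearest-neighbour methods give better estimates for primers of this length; the Wallace rule is used here only because it can be verified by hand.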
Isolation of human and mouse utrophin genomic clones
Human genomic clones containing utrophin sequence were isolated from the ICRF-sorted chromosome 6 library (ICRF reference C109; 13), by hybridization with a 288 bp 32P-labelled PCR product amplified using hGSP2-Up140 located in exon 45 and hUp140-F6 (5′-CCTGATGACATTCATGAAGGTGG) located in the novel sequence. Mouse genomic clones were isolated from a λDASH library (a gift from Dr A. Reith, UCL, London, UK) by hybridization with 32P-labelled PCR products as probes; these were a 159 bp fragment amplified from exon 45 using mUp45-F2 (5′-GAACTGGAAGAGGGCCTCAGC) and mUp45-R2 (5′-TCCACCTCAGCTACAAGAGTGG) and a 62 bp fragment amplified from exon 63 using mUp63-R6 (5′-CAAGGGATTGGAAGAGCTCAGTC) and mUp63-F5 (5′-CCATCAAACACAGACAACCTG). DNA was prepared from these recombinant clones using maxi-preparation kits from Qiagen.
Human and mouse post-mortem tissues were collected and flash frozen in liquid nitrogen. The fetal mouse samples were taken between 11.5 and 15.5 days post-coitum and the human fetal samples were between 14 and 20 weeks gestation. RNA was isolated from tissue that had been powdered under liquid nitrogen and homogenized in RNAzolB reagent (Biogenesis, Poole, UK). RNA RT–PCR was carried out using random oligonucleotide primers to generate first strand cDNA. In order to ensure that in the subsequent PCR, amplification of cDNA aliquots was specific for each utrophin isoform, the forward primers contained only sequence from the unique first exon of each alternative transcript. These primers were as follows: hUp71-F4, 5′-ATTGGGCATTGGGTTTCTGAATGG; mUp71-F8, 5′-CTAGATTTCTGAGCGATCC; hUp140-F6, 5′-GCCTGATGACATTCAGAAGGTGG; mUp140-F4, 5′-AAGACATTATAGCCCTACCCCG.
Various control RT–PCRs were also carried out using the same cDNA samples. These included amplification of the ubiquitously expressed phosphoglucomutase (PGM1) gene (32) using the primers PGM1-1F (5′-AACAAGATGCCCTTGGGAGCTGTGA) and PGM1-1R (5′-GAACTGATTGGACAGAAGGCACTAG), which amplify both mouse and human PGM1. In addition, amplification across human utrophin exons 16/17 and mouse utrophin exons 17–20 was carried out using the primers: hUp-ex16/17-F, 5′-GGAAGACATGGAAATGAAGCGT; hUp-ex16/17-R, 5′-GCTTGTTCTCTTACACGAACAGTC; mUp-ex17/20-F, 5′-GAGAACAAGGGATGGTGAAGAAGC; mUp-ex17/20-R, 5′-GCTGCTGAGATATCATCTTCC.
DNA sequence analysis
DNA sequences were determined by the dideoxy chain termination method of Sanger et al. (43) using thermo-sequenase and a radiolabelled terminator cycle sequencing kit (Amersham Life Sciences, Little Chalfont, UK).
J.W. was supported by an MRC post-graduate studentship and C.J. by a research training grant from the EEC.