The metalloproteinase-like, disintegrin-like, cysteine-rich (MDC) family is a large group of sequence-related proteins, first characterized in the male reproductive tract but subsequently also identified in non-reproductive tissues. Their primary translation products are ~90 kDa and each can be divided into distinct domains which show remarkable homology to the reprolysins, snake venom haemorrhagic components possessing metalloproteinase and/or disintegrin domains. Several MDC proteins are abundantly expressed in the male reproductive tract, suggesting functions in fertility. We now describe the cloning, sequence determination and characterization of transcripts encoding the human and macaque (Macaca fascicularis) orthologues of a novel member of the MDC family (eMDC II) which is abundantly expressed in the epididymis. Unlike many MDC proteins expressed in the reproductive tract, eMDC II possesses the extended 'catalytic centre' consensus sequence characteristic of a reprolysin-like metalloproteinase. This suggests that eMDC II has proteolytic activity.
Following spermatogenesis in the testis, spermatozoa, although differentiated, are immature and incapable of fertilizing an egg. However, during their subsequent passage through the epididymis, spermatozoa undergo a variety of maturational changes during which they acquire forward motility and are subsequently able to recognize and penetrate the zona pellucida of the egg, prior to fertilization. Sperm maturation is thought to be a complex series of biochemical and physiological changes involving spermatozoa, epididymal fluid, epididymal epithelia and their interstitia.
The metalloproteinase-like, disintegrin-like, cysteine-rich (MDC) family (also known as the ADAM family) is a group of proteins whose members share striking sequence homology with a variety of snake venom haemorrhagic proteins, the reprolysins. Although MDC proteins have a wide tissue distribution, many of those best described to date are expressed exclusively, or at elevated levels, in the male reproductive tract (Wolfsberg et al., 1995; Frayne et al., 1997, 1998), suggesting roles for such proteins in male reproductive function. Indeed, one of the first MDC proteins to be cloned and completely sequenced was epididymal apical protein I (EAP I; Perry et al., 1992), the only MDC protein found to be predominantly expressed in the epididymis. In view of the importance of the epididymis in sperm maturational events, the aim of the present study was to identify further novel epididymal MDC proteins which may be involved in sperm maturation or have other reproductive roles.
Members of the MDC family, like their snake venom counterparts, share a conserved domain organization comprising an N-terminal prodomain, a metalloproteinase-like domain, a disintegrin-like domain, a cysteine-rich domain and, in most cases, a transmembrane domain (although a sub-group of secreted, soluble, MDC proteins, the ADAMTS family, have recently been reported; Kuno et al., 1997; Tang and Hong, 1999). All cysteine residues present in the disintegrin-like domains of MDC proteins are conserved, indicating a high level of structural and functional similarity. Furthermore, the disintegrin-like domains of many MDC proteins contain an XCD (often ECD) motif in a similar position to the RGD integrin-binding tripeptide found in many snake venom disintegrins. Interestingly, ECD is one of the most common alternative motifs in non-RGD-containing snake venom disintegrins. By analogy with the integrin-binding activities of snake venom disintegrins, it has been suggested that the disintegrin-like domains of MDC proteins may interact with integrins to mediate cell–cell adhesion or the attachment of cells to the extracellular matrix. Indeed meltrin α has been shown to be involved in myoblast fusion (Yagami-Hiromasa et al., 1995), whilst the disintegrin-like domain of the sperm surface protein, fertilin β, is thought to bind to an α6β1 integrin receptor on the egg oolemma (Almeida et al., 1995). Evidence is also emerging that a further MDC protein which is abundant in germ cells, tMDC I (alternatively known as cyritestin), also plays a role in sperm–egg binding (Yuan et al., 1997).
About half of the MDC proteins that have been identified to date are predicted to be active metalloproteinases, as they possess the extended 'catalytic centre' sequence (HEXGHXXGXXHD; see Figure 1) characteristic of the reprolysins. Examples of such MDC proteins are fertilin α, metargidin (Krätzschmar et al., 1996), meltrin α (Yagami-Hiromasa et al., 1995), MDC 9 (Weskamp et al., 1996; Roghani et al., 1999), a mammalian disintegrin-metalloprotease (MADM) (Howard et al., 1996), tumour necrosis factor (TNF)-α converting enzyme (TACE; Black et al., 1997) and a Drosophila metalloproteinase-disintegrin protein (KUZ) (Rooke et al., 1996; Pan and Rubin, 1997). Although catalytic activity has been demonstrated for the last five, a physiological role for this activity has only been proposed for the last two: TACE is capable of processing a number of cell surface proteins including TNF-α, L-selectin and transforming growth factor (TGF) (Peschon et al., 1998), while the Drosophila protein KUZ cleaves the signalling molecule, NOTCH (Rooke et al., 1996; Pan and Rubin, 1997). Nearly all of the MDC proteins predominantly expressed in the reproductive tract are predicted to be catalytically inactive, with the possible exception of fertilin α, which contains the HEXGHXXGXXHD consensus sequence. However, fertilin α is absent in humans and gorillas (Jury et al., 1997, 1998) and this protein cannot, therefore, have an essential role in reproduction in these species.
We now describe the cloning and sequence determination of a novel, epididymis-derived, MDC family member which possesses the zinc-binding and 'catalytic centre' residues characteristic of snake venom metalloproteinases. Much of the work on the MDC family has been carried out using rodent models, although we have previously demonstrated significant differences in the tissue distribution of a number of MDC transcripts between rats and primates (Frayne et al., 1997, 1998). Furthermore, fertilin α (Jury et al., 1997), tMDC I (Frayne and Hall, 1998) and tMDC II (Frayne et al., 1999) transcripts are abundant, but non-functional, in the human. With this in mind, we initially cloned this novel sequence (which we have called eMDC II) from a model more closely related to the human, the monkey Macaca fascicularis, and performed tissue distribution experiments in this species. In light of the absence of functional fertilin α, tMDC I and tMDC II in the human, we then cloned the human orthologue of eMDC II and found it to contain an open reading frame capable of encoding a novel, functional MDC protein.
Materials and methods
Isolation of total RNA from macaque tissues
Fresh macaque tissues (epididymis, heart, kidney, lymph node, muscle, ovary, pancreas, prostate, salivary gland, testis and uterus) were obtained, flash frozen in liquid nitrogen and stored at –70°C until required. Total RNA was then extracted as described previously (Frayne et al., 1997).
Cloning and sequence analysis of macaque eMDC II cDNA
Isolation of cDNA clones representing MDC transcripts from M. fascicularis libraries has been described in detail elsewhere (Perry et al., 1995b). Briefly, redundant primers, designed to conserved regions flanking the disintegrin-like domain of several known MDC cDNA sequences, were used in polymerase chain reactions (PCR) with macaque epididymis cDNA. The resulting PCR products were cloned and their sequences determined to identify those encoding novel disintegrin-like domains. One such clone was designated eMDC II and ~1.4×10⁵ independent M. fascicularis epididymis cDNA clones were screened with this cloned PCR product under conditions of high stringency in an attempt to obtain additional sequence information. A number of positively hybridizing clones were purified during subsequent rounds of screening.
5′ rapid amplification of cDNA ends
The 5′ end of the macaque eMDC II transcript was obtained from total epididymal RNA using a SMART™ RACE cDNA amplification kit (Clontech, Basingstoke, UK), essentially as described by the supplier. Briefly, first strand cDNA synthesis was performed using 2 μg of total epididymal RNA, the 5′-CDS primer and SMART II™ oligonucleotide (both provided in the RACE kit) and SuperScript™ II reverse transcriptase (Gibco-BRL, Paisley, UK). This cDNA was then used in a primary PCR with the upstream primer mix (included in the RACE kit) and an eMDC II-specific primer (5′-CCATTACACATTTCAGGCAGGTCGCAC-3′), using the Advantage® 2 PCR kit (Clontech) with the following touchdown parameters: 5 s at 94°C, 3 min at 72°C for 5 cycles; then 5 s at 94°C, 10 s at 70°C, 3 min at 72°C for 5 cycles; then 5 s at 94°C, 10 s at 68°C, 3 min at 72°C for 25 cycles. The resulting PCR product (1 μl) was amplified in a second round of PCR using the nested universal primer mix (supplied with the RACE kit) and a nested eMDC II primer (5′-ATCTTTTGCTGGTCTGCACACCATCCC-3′) with the following parameters: 1 min at 94°C, 1 min at 58°C, 1 min at 72°C for 30 cycles. The resulting PCR product was resolved on a low melting temperature agarose gel and the appropriate band excised and cloned into pGEM®-T Easy plasmid vector (Promega, Southampton, UK) for subsequent sequence determination.
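The primary touchdown programme listed above can be written down as a small data structure, which makes it easy to check the cycle count and the stepwise drop in the lowest (annealing) temperature. The following is a minimal sketch only: the function and variable names are hypothetical, the hold time ignores ramping between temperatures, and nothing here is taken from the kit supplier's documentation.

```python
# Primary touchdown PCR as listed in the text.
# Each stage is (number of cycles, [(hold in seconds, temperature in deg C), ...]).
TOUCHDOWN_STAGES = [
    (5, [(5, 94), (180, 72)]),             # combined annealing/extension at 72
    (5, [(5, 94), (10, 70), (180, 72)]),
    (25, [(5, 94), (10, 68), (180, 72)]),
]


def total_cycles(stages):
    """Sum the cycle counts over all stages."""
    return sum(cycles for cycles, _ in stages)


def total_hold_seconds(stages):
    """Total programmed hold time in seconds, excluding ramp times (an assumption)."""
    return sum(cycles * sum(sec for sec, _ in steps) for cycles, steps in stages)


def lowest_temperatures(stages):
    """Lowest temperature reached in each stage, i.e. the touchdown profile."""
    return [min(temp for _, temp in steps) for _, steps in stages]


print(total_cycles(TOUCHDOWN_STAGES))
print(total_hold_seconds(TOUCHDOWN_STAGES))
print(lowest_temperatures(TOUCHDOWN_STAGES))
```

Under these assumptions the programme comprises 35 cycles in total, with the lowest temperature stepping down from 72°C to 70°C and then 68°C.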
Cloning the human eMDC II orthologue
The cDNA sequence of the human orthologue of macaque eMDC II was obtained using a commercially available human 'testis' total RNA preparation (Clontech), known to be contaminated with epididymis-specific sequences. Reverse transcription of this RNA was carried out using oligo(dT)12–18 as a primer and Expand™ reverse transcriptase (Boehringer-Mannheim, Lewes, East Sussex, UK) under conditions recommended by the manufacturer. Human eMDC II cDNA was then amplified by PCR using primers based on the macaque sequence and an expressed sequence tag from the GenBank database (accession number: AA648830) representing the extreme 3′ end of the human sequence, identified on the basis of homology to the macaque eMDC II sequence. The resulting PCR products were either cloned into pGEM®-T Easy plasmid vector or sequenced directly, using an ABI 377 automated DNA sequencer.
Reverse transcription PCR (RT–PCR)
Total RNA (2 μg) was used as a template for Expand™ reverse transcriptase-directed cDNA synthesis using oligo(dT)12–18 as a primer. One-fifth of this cDNA synthesis reaction was then used in PCR reactions with either eMDC II-specific primers (5′-GCTGTGATGCTAAGACATGT-3′ and 5′-TGAACAGCCTTTACCATCTG-3′) or actin-specific primers (5′-CAACTGGGACGAYATGGAGA-3′ and 5′-AGGATCTTCATGAGGTAGTC-3′) as a control.
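One of the actin-specific primers above contains the symbol Y. Assuming this follows the standard IUPAC nucleotide code (Y = C or T), the primer represents a mixture of two sequences. The sketch below, with hypothetical names, expands a primer containing IUPAC ambiguity codes into the explicit sequences it stands for; it is an illustration, not part of the procedure used in this study.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """Return every unambiguous sequence represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Actin-specific primer from the text; Y is assumed to mean C or T.
ACTIN_PRIMER = "CAACTGGGACGAYATGGAGA"

for sequence in expand_primer(ACTIN_PRIMER):
    print(sequence)
```

The eMDC II-specific primers contain no ambiguity codes, so each expands to a single sequence.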
Cloning and sequence analysis of macaque eMDC II transcripts
A strategy for cloning novel macaque MDC sequences from epididymal RNA, which has been described previously (Perry et al., 1995b), led to the isolation of a novel MDC cDNA clone (eMDC II; 1.75 kbp). Preliminary sequence analysis identified an appropriate reading frame with homology to other members of the MDC family of proteins, but indicated that the clone was not full-length, as it lacked the 5′ end. Despite repeated screening, a full-length clone could not be obtained from the macaque epididymal cDNA library, so a 5′ rapid amplification of cDNA ends (RACE) approach was adopted using total RNA from the caput region of macaque epididymis as the template. A 1.4 kb PCR product was obtained and subsequent direct sequence analysis indicated that this RACE product encoded the N-terminal end of the eMDC II protein, as well as 82 nucleotides of 5′ non-coding sequence. The compiled full-length cDNA sequence of macaque eMDC II, after combination of cDNA clone and 5′ RACE data, and the corresponding deduced amino acid sequence, are shown in Figure 2. As can be seen from the sequence alignment with macaque EAP I (Figure 4), the deduced amino acid sequence of eMDC II exhibits the typical MDC domain organization, comprising a prodomain, a metalloproteinase-like domain, a disintegrin-like domain and a 21 residue C-terminal transmembrane domain. Interestingly, database searches with the eMDC II amino acid sequence indicate that EAP I is the most closely related MDC protein. Within its metalloproteinase-like domain, eMDC II contains a HEXGHXXGXXHD 'catalytic centre' consensus motif which includes three of the four zinc-binding histidine residues, as well as the glutamate residue critical for catalytic activity in the snake venom reprolysins (see Figure 1). Although EAP I possesses a sequence similar to this consensus, there is a glutamine residue in place of the glutamate and hence EAP I is unlikely to be proteolytically active.
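The consensus check described above, in which X denotes any residue, can be sketched as a simple pattern search. This is a minimal illustration under stated assumptions: the function name is hypothetical and the two test fragments are invented for the example; they are not the eMDC II or EAP I sequences.

```python
import re

# Consensus as written in the text; X denotes any amino acid residue.
CONSENSUS = "HEXGHXXGXXHD"
# Convert to a regular expression by turning every X into the wildcard ".".
PATTERN = re.compile(CONSENSUS.replace("X", "."))


def find_catalytic_centre(protein_seq):
    """Return the 0-based start of the first consensus match, or -1 if absent."""
    match = PATTERN.search(protein_seq.upper())
    return match.start() if match else -1


# Hypothetical fragments, invented for illustration only.
active_like = "AAHELGHNLGMNHDKK"    # glutamate (E) at the second consensus position
inactive_like = "AAHQLGHNLGMNHDKK"  # glutamine (Q) in place of the glutamate

print(find_catalytic_centre(active_like))
print(find_catalytic_centre(inactive_like))
```

A fragment carrying glutamine in place of the glutamate fails the search, mirroring the reasoning applied to EAP I; a match indicates only that the motif is present and does not demonstrate proteolytic activity.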
As well as being a putative metalloproteinase, eMDC II may also have a function mediated by its disintegrin-like domain. Like EAP I, a number of other MDC proteins, and some snake venom proteins, the disintegrin-like domain of eMDC II contains a putative integrin-binding ECD tripeptide motif, analogous to the RGD motif found in the majority of snake venom disintegrins.
Cloning the human eMDC II orthologue
The full-length human eMDC II cDNA sequence was obtained by a PCR-based strategy using cDNA prepared from commercially available human 'testis' total RNA. Previous work in our laboratory had established that this source of RNA contained epididymis-specific sequences, indicating that the epididymides had not been removed from the testes prior to RNA extraction.
The entire coding region of human eMDC II was obtained within three overlapping PCR fragments. Sequence analysis (Figure 3) of these combined fragments revealed a very high degree of sequence conservation with the macaque orthologue at both the nucleotide and deduced protein sequence levels (Figure 4). The HEXGHXXGXXHD 'catalytic centre' sequence was conserved in the metalloproteinase-like domain of the human eMDC II sequence and the potential integrin-binding ECD tripeptide motif was present in the disintegrin-like domain.
Tissue distribution of macaque eMDC II transcripts
MDC proteins are expressed in a wide range of mammalian tissues, some with a broad tissue distribution, whilst others have a restricted pattern of expression. Northern blot data originally suggested that most MDC transcripts isolated from the macaque testis were specific to this tissue (e.g. Barker et al., 1994; Perry et al., 1995a). However, subsequent studies, using the more sensitive technique of reverse transcription–PCR (RT–PCR), have shown that only fertilin β transcripts are testis-specific while fertilin α and tMDCs I–IV transcripts are present, albeit at much lower levels, in a variety of other tissues (Frayne et al., 1998). Similarly, the mouse orthologue of EAP I, although expressed principally in the caput region of the epididymis, is also detected in anterior pituitary gonadotrophes (Cornwall and Hsia, 1997). We therefore determined the tissue distribution of macaque eMDC II transcripts by RT–PCR (Figure 5). The eMDC II-specific primers used were designed to amplify a 925 bp product which spans a number of predicted exon/intron boundaries (by analogy with the genomic organization of the mouse fertilin β gene; Cho et al., 1997) thereby eliminating the possibility of amplifying fragments from contaminating genomic DNA.
Macaque eMDC II transcripts were found to be abundantly expressed in the caput region of the epididymis. Although also detected in the cauda epididymidis, eMDC II transcripts were apparently present there at a much lower level, indicating that expression is not uniform throughout the epididymis. Similar low levels of eMDC II transcripts were evident in lymph node, pancreas, ovary and uterus, even lower levels in salivary gland and kidney and virtually undetectable levels in all other tissues analysed (Figure 5). While the physiological significance of such low levels is difficult to assess, the comparatively high level of eMDC II transcripts present in the caput epididymidis strongly suggests that eMDC II is functionally important in this tissue.
Interestingly, the above RT–PCR analyses identified the presence of a slightly larger, but minor, transcript in many of the tissues examined. Whilst, without direct sequence analysis, we cannot exclude the possibility that this minor product represents an unrelated, cross-reacting, transcript, the stringent conditions used for priming suggest it is more likely to be an alternatively-spliced form of eMDC II, as observed for some other MDC transcripts.
Possible functions of eMDC II
Sperm maturation during epididymal transit involves the proteolytic modification of many sperm membrane proteins and considerable remodelling of the plasma membrane. Different sperm surface proteins are processed at different stages during transit and this is presumably achieved by the carefully controlled and co-ordinated expression of a variety of proteases and protease inhibitors. Whilst a number of epididymal protease inhibitors have been identified and partially characterized, for example acrosin trypsin inhibitor (Perry et al., 1993), little is known about the proteases involved in the processing of sperm surface proteins. However, it is difficult to envisage how soluble, secretory proteases could achieve the necessary degree of control, with different sperm proteins being processed at different times. In contrast, a membrane-bound protease, with its positional constraints, could afford a high degree of control. Such proteases might include members of the matrix metalloproteinase (MMP) family (reviewed by Hulboy et al., 1997) or the MDC family. Indeed the idea of a membrane-bound MDC proteinase acting on a membrane-bound substrate is not new; the ectodomain shedding activity of TACE has been proposed to play a role in the release of adjacent membrane proteins from the cell surface (Peschon et al., 1998). However, eMDC II, which is presumably located on the membrane of epididymal cells, must first come into close juxtaposition with the sperm surface for any proposed sperm surface processing to occur. In this respect it is interesting to note that the budding off and release of apical portions of the epididymal principal cells, as membrane-bound vesicles, is well documented and has been suggested to represent an apocrine secretion process (Fornes and De Rosas, 1991; Morales and Cavicchia, 1991). Indeed, a number of epididymally expressed, glycosylphosphatidylinositol (GPI)-anchored proteins (e.g. CD52) are acquired by spermatozoa during epididymal transit and this cell–cell transfer has been proposed to be mediated via prostasome-like vesicles (Kirchhoff and Hale, 1996). Such vesicle-mediated transfer of an active metalloproteinase, e.g. eMDC II, from the caput epididymidis to caput spermatozoa, would be an attractive way of facilitating the regulated endoproteolytic processing of sperm surface antigens at specific stages during sperm maturation. However, experimental evidence for such a role must await the availability of eMDC II-specific antisera.
This work was supported by the Medical Research Council (UK). We thank Helen Barker and Rhiannon Murray for assistance with automated DNA sequencing.