Differential display PCR (DD RT-PCR) has been extensively used for analysis of differential gene expression, but continues to be hampered by technical limitations that impair its effectiveness. In order to isolate novel genes co-expressing with human RAG1, we have developed a multi-tiered screening/purification approach which effectively complements the standard DD RT-PCR methodology. In ‘primary’ screens, standard DD RT-PCR was used, detecting 22 reproducible differentially expressed amplicons between clonally related cell variants with differential constitutive expression of RAG mRNAs. ‘Secondary’ screens used differential display (DD) amplicons as probes in low and high stringency northern blotting. Eight of 22 independent DD amplicons detected nine independent differentially expressed transcripts. ‘Tertiary’ screens used reconfirmed amplicons as probes in northern analysis of multiple RAG- and RAG+ sources. Reconfirmed DD amplicons detected six independent RAG co-expressing transcripts. All DD amplicons reconfirmed by northern blot were a heterogeneous mixture of cDNAs, necessitating further purification to isolate single cDNAs prior to subcloning and sequencing. To select the appropriate cDNAs from DD amplicons, we excised and eluted the cDNA(s) directly from regions of prior northern blots in which differentially expressed transcripts were detected. Sequences of six purified cDNA clones specifically detecting RAG co-expressing transcripts included matches to portions of the human RAG2 and BSAP regions and to four novel partial cDNAs (two with homologies to human ESTs). Overall, our results suggest that even when using clonally related variants from the same cell line in addition to all appropriate internal controls previously reported, further screening and purification steps are still required in order to efficiently and specifically isolate differentially expressed genes by DD RT-PCR.
Central to V(D)J recombination are the products of the recombination activating gene 1 (RAG1) and recombination activating gene 2 (RAG2) locus (1,2). Their necessary role in this process has been well established based on several independent lines of evidence (3–7). The expression of RAG1 and RAG2 is precisely regulated. For the most part, RAG1 and RAG2 are only expressed concordantly and stage-specifically in cells of the lymphocyte lineage (8). Briefly, a first ‘wave’ of RAG expression is detected in committed B and T progenitor lymphocytes at the point where immunoglobulin (Ig) µ and T cell receptor (TCR) β and δ loci undergo V(D)J rearrangement (9,10). The second ‘wave’ corresponds to V(J) rearrangement of the TCR-α (11) and Igκ and λ loci (12,13). A third ‘wave’ of RAG expression has been observed by several groups in Ig+ B cells (14–19) and TCR+ thymocytes (20,21). Recently, functional recombinase activity has been demonstrated to accompany this third wave of RAG expression in germinal center mature B cells (22,23).
The importance of precise regulation of RAG expression is obvious in cases where there is too little or no expression. For example, the phenotype resulting in humans or mice in which the RAG genes have been disrupted by mutation or by targeted recombination, respectively, is a lack of mature lymphocytes and a consequent severe combined immunodeficiency (4,5,24). In contrast, improper shutting off of the RAG genes may potentially lead to aberrant rearrangements and subsequent oncogenic events, since overexpression of the RAG locus in transgenic mice results in various lymphocytic abnormalities (25,26).
Genes co-expressed with RAG may be of considerable general interest because they may represent unidentified, lymphocyte-specific components of the recombinase machinery and/or novel, developmentally regulated, lymphocyte-specific genes, some of which may themselves be regulating RAG mRNAs. Our overall objective was to isolate such genes. In this report, we have used stable, constitutive differential expression in OCI LY8 to isolate by differential display PCR (DD RT-PCR) several partial cDNAs that detect RAG1 co-expressing transcripts. We report our limitations with the DD RT-PCR technique and demonstrate modifications that have enabled us to identify six partial cDNAs that co-express with RAG1 mRNA. Based on the sequences of these cDNAs and the expression pattern of transcripts that they detect, we suggest that the screening approach reported here may yield the isolation of other genes that are developmentally and lineage restricted in the same manner as the RAG genes.
Materials and Methods
Cell lines, tissue culture and RNA isolation
OCI LY8-C3P (µλ+) is a single cell RAGlow clone from the OCI LY8 human B cell line (IgM+/IgD−/CD10+/CD19+/CD20+/CD38+), a line originally established from a patient with a B lymphoid large cell lymphoma (27). The OCI LY8 system has been described in detail elsewhere (16). RAG+ variants clonally derived from OCI LY8-C3P included C3-A11N and A8-6P. The other human mature B cell lines used included the diffuse large cell lymphoma cell lines OCI LY1 and OCI LY2, the Burkitt's lymphoma EBV+ lines Raji and Daudi and the human pre-B cell lines Nal-1 and PB-697 (pre-B acute lymphoblastoid lymphoma, RAG1+, λ5+, µ+). Human T cell and non-lymphoid cell lines used in our studies include Jurkat, K562 (erythroleukemia cell line, ATCC CCL 243), HeLa (ATCC, epithelial carcinoma) and U937 (pro-monocytic cell line). All cells were routinely cultivated at 37°C, 5% CO₂ in RPMI 1640 medium (ICN, St-Laurent, Canada), supplemented with 10% fetal calf serum (Hyclone, Logan, UT), 2 mM L-glutamine, 100 U/ml penicillin and 100 µg/ml streptomycin (Gibco BRL, Grand Island, NY). For all cell lines, total RNA was extracted from cells in log phase growth (5 × 10⁵ cells/ml) by the single-step guanidinium thiocyanate phenol-chloroform extraction procedure (28).
DD RT-PCR was performed on DNA-free total RNA from OCI LY8-C3P and C3-A11N as previously described (29–31) with the following specific parameters. Forty primer combinations were employed, representing statistical coverage of ∼7000 eukaryotic mRNA species (29,31). For the first five primer combinations, DNA-free total RNA from each variant was reverse transcribed using the degenerate primer T12MG from the GenHunter RNA MAP kit (GenHunter Corp, Brookline, MA). Samples of each OCI LY8-C3P and C3-A11N reverse transcription (RT) were then amplified by low stringency PCR (with a 40°C annealing step) in the presence of [35S]dATP (Dupont NEN, Boston, MA) and employing the same T-specific primers used for RT in combination with various random 10mers: AP1, AP2, AP3, AP4 or AP5, also from GenHunter. To show reproducibility of banding patterns, RNA from two independent subclones of each variant was used in independent RT reactions and two separate PCR reactions were run for each representative RT reaction. Additionally, a control for each RT reaction, in which no reverse transcriptase was added, was subjected to the same PCR reactions and run alongside the rest of the samples (-RT) to control for contaminating chromosomal DNA. The resulting radioactive patterns of 3′ partial cDNA sequences were displayed on 6% denaturing polyacrylamide gels. Differential display with the rest of the primer combinations was similar except that the primers themselves differed from the original RNA MAP primers: three one base anchored T-specific primers, HT11A, HT11G and HT11C, were used both in the RT and in combination with random 13mers in the PCR. These newer primer combinations have been reported to confer increased sensitivity and reduced redundancy (32,33). Sizes were determined by running known sequencing ladders adjacent to the displays and cDNAs in the 150–500 bp range were selected based on reproducibility of banding across separate RT and PCR reactions.
All displays were performed using total RNA from variants in which differential expression of RAG1 and RAG2 was confirmed by northern blotting prior to display. cDNA bands differentially expressed are designated such that the first character represents the 10mer primer, the second character the T-primer and the third their order, from largest (first) to smallest (last).
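The band designation scheme described above can be illustrated with a minimal sketch. The function name `parse_band` and the field names are hypothetical, introduced here only for illustration; the sketch assumes that each of the three fields is a single character, as in designations such as 3G1 or 5C2.

```python
from typing import NamedTuple


class Band(NamedTuple):
    arbitrary_primer: str  # first character: arbitrary (random) primer
    t_primer: str          # second character: T-primer
    rank: int              # third character: order, 1 = largest band


def parse_band(name: str) -> Band:
    """Split a three-character band designation into its parts.

    Hypothetical helper; assumes one character per field.
    """
    if len(name) != 3 or not name[0].isdigit() or not name[2].isdigit():
        raise ValueError(f"unexpected band designation: {name!r}")
    if name[1] not in "ACG":
        raise ValueError(f"unexpected T-primer character in {name!r}")
    return Band(name[0], name[1], int(name[2]))


if __name__ == "__main__":
    for label in ("3G1", "5C2", "2A4"):
        print(label, parse_band(label))
```

Read this way, 2A4 would be the fourth-largest differential band obtained with arbitrary primer 2 and the A-anchored T-primer.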
For isolating candidate differentially expressed cDNAs, bands of interest were excised from the polyacrylamide gels, eluted by boiling and precipitated with the aid of glycogen as a carrier. Reamplifications were done using the same primer set and PCR conditions except that the dNTP concentrations were 10-fold higher and no [α-35S]dATP was added. cDNAs that failed to be reamplified in the first round of PCR were diluted and used as template in a subsequent round of amplification. After PCR amplification, 30 µl of each of the final reaction products were electrophoresed on high percentage (2%) EtBr-stained agarose gels run in 1× Tris/acetate/EDTA with appropriate low molecular weight markers in order to resolve the amplicons. cDNAs for subsequent radiolabeling reactions were purified from agarose gels using the Qiaex kit (Qiagen).
Northern blot analysis
Aliquots of 10–20 µg total RNA were electrophoresed under denaturing conditions and transferred to a Zeta-Probe nylon membrane (BioRad, Hercules, CA) by overnight capillary transfer. Membranes were cross-linked in a Stratalinker 2400 UV crosslinker (Stratagene, La Jolla, CA), pre-hybridized, hybridized to the appropriate probe, washed, wrapped in plastic wrap and exposed to BioMax MS X-ray film (Eastman Kodak, Rochester, NY) at −70°C. All technical procedures were according to the protocol supplied by the manufacturer of the Zeta-Probe membrane, with the exception that multiple washing regimens, from non-stringent washes (2 × 30 min 5% SDS washes at 65°C) to highly stringent washes (multiple 0.5% SDS washes at 70°C), were used so as not to miss any low abundance transcripts. As with display gels, reconfirmations were performed at least twice using total RNA from independent subclones of each variant to exclude irreproducible differentially expressed bands. Where applicable, the intensity of the hybridization signals was quantitated by the ImageQuant PhosphorImager software (Molecular Dynamics, Sunnyvale, CA).
Probes used in northern blot analysis were generated by radiolabeling cDNAs with [32P]dCTP (Dupont) using the Quickprime random hexamer labeling kit (Stratagene). The following cDNAs were used as probes: a 0.9 kb human RAG1 coding region fragment generated by XhoI and HindIII digestion of a 6.6 kb cDNA fragment supplied by Dr D. Schatz (Yale University School of Medicine, New Haven, CT); a 1.0 kb cDNA probe specific for the human β-actin gene used as a loading control, obtained from Dr N. Lassam. Because DD probes were typically <500 bp of 3′-untranslated regions, sensitivity was increased by modifying the random hexamer labeling reaction such that the T-specific primer was included in the labeling mix. Additionally, each probe was purified using Nick spin columns in order to minimize background and to verify that incorporated nucleotides for each radiolabeled differential display amplicon were at least 2 × 10⁷ c.p.m.
Isolation of radiolabeled cDNAs detecting single, differentially expressed transcripts from northern blots
To isolate cDNA clones specifically detecting differentially expressed transcripts from those detecting non-specific bands in northern blots (representing the majority of situations in our screenings), the region of the membrane in which the differentially expressed transcript was located was determined using an RNA ladder size standard (Gibco BRL) and subsequently excised. The radiolabeled cDNA probe was eluted by boiling using glycogen as a carrier. The eluted cDNA was then reamplified in the same manner in which the initial reamplification round was performed. The resulting PCR amplicon corresponding to the expected size determined from the previous round of reamplification was verified to yield specific, differential expression by using it as a radiolabeled probe through one further round of northern blot reconfirmations. Subsequently, these were cloned into the PCRII vector using the TA Cloning Kit (Invitrogen, San Diego, CA) using 10 ng quantitated cDNA for ligation purposes.
RT-PCR for assessing RAG expression
Total RNA (2 µg) from OCI LY8-C3P and C3-A11N was treated with DNase I (Pharmacia) to remove contaminating genomic DNA, reverse transcribed using Superscript II reverse transcriptase (Life Technologies), 1 µg random hexamer (Gibco BRL), 1 mM dNTPs and the supplied buffer. After RNase H digestion (Pharmacia), 1/10 of synthesized first strand cDNA was used to amplify either RAG1, RAG2 or α-tubulin transcripts by PCR using a Perkin Elmer-Cetus thermal cycler in a 100 µl final reaction volume. The PCR primers used included: RAG1A, 5′-CAG CGT TTT GCT GAG CTC CT-3′ (RAG1 sense primer); RAG1B, 5′-GGC TTT CCA GAG AGT CCT CA-3′ (RAG1 antisense primer); hR2A, 5′-TTC TTG GCA TAC CAG CAG-3′ (RAG2 sense primer located at nt 32–49 of human RAG2 cDNA); hR2C, 5′-CTA TTT GCT TCT GCA CTG-3′ (RAG2 antisense primer located at nt 207–224 of human RAG2 cDNA); Tub-5′, 5′-CAG GCT CAA TGT GGC AAC CAG ATC GGT-3′; Tub-3′, 5′-GGC GCC CTC TGT GTA GTG GCC TTT GGC CCA-3′. PCR conditions were as follows: 29 cycles of 45 s at 94°C, 45 s at 52°C and 45 s at 72°C, with the denaturation step extended to 5 min in the initial cycle and the extension step extended to 10 min in the final cycle. After PCR amplification, 15 µl of each of the final reaction products were electrophoresed on a 1.5% agarose gel run in 1× Tris/acetate/EDTA buffer. RT-PCR products were then transferred to Zeta-Probe nylon membrane (BioRad) by overnight capillary transfer. Membranes were then crosslinked using a Stratalinker 2400 UV crosslinker (Stratagene), pre-hybridized and hybridized with the following specific [γ-32P]ATP-end-labeled oligonucleotides: RAG1T, 5′-AAG TAT AGG TAT GAG GGA A-3′; Tub-P, 5′-ACC TGA GCG AAC AGA GTC CAT C-3′. After washing, membranes were exposed to BioMax MS X-ray film (Kodak). Experimental controls included amplification of RNA that had not been reverse transcribed, to control for genomic DNA amplification, and amplification of α-tubulin transcripts, to demonstrate the presence of cDNA.
Standard curves for quantitation purposes were generated through amplification of either human RAG1 cDNA cloned in the pBluescript SK− plasmid or RAG2 cDNA cloned into the pBluescript SK+ plasmid. Results were validated by performing two independent experiments.
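As a quick consistency check on the RAG primers listed above, their lengths and base composition can be tabulated with a short sketch. The function name `primer_stats` is hypothetical, and the melting temperature uses the Wallace rule (2°C per A/T, 4°C per G/C), which is only a rough approximation for short oligonucleotides and is an assumption of this sketch, not a calculation reported in this study.

```python
# Primer sequences as given in the text (written 5' to 3').
PRIMERS = {
    "RAG1A": "CAG CGT TTT GCT GAG CTC CT",
    "RAG1B": "GGC TTT CCA GAG AGT CCT CA",
    "hR2A": "TTC TTG GCA TAC CAG CAG",
    "hR2C": "CTA TTT GCT TCT GCA CTG",
}


def primer_stats(seq: str) -> dict:
    """Return length, GC fraction and a rough Wallace-rule Tm (deg C)."""
    s = seq.replace(" ", "").upper()
    gc = s.count("G") + s.count("C")
    at = s.count("A") + s.count("T")
    if gc + at != len(s):
        raise ValueError(f"non-ACGT character in {seq!r}")
    return {
        "length": len(s),
        "gc_fraction": gc / len(s),
        "wallace_tm_c": 2 * at + 4 * gc,  # rough estimate only
    }


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, primer_stats(seq))
```

The 18 nt lengths obtained for hR2A and hR2C agree with the stated coordinates (nt 32–49 and nt 207–224 of the human RAG2 cDNA), each of which spans 18 positions.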
DNA sequencing and sequence data analysis
The primers for sequencing were synthesized using a PerSeptive 8909 automated DNA oligosynthesizer (PerSeptive Biosystems Inc., Framingham, MA) and purified by oligonucleotide purification cartridge chromatography (Applied Biosystems, Weiterstadt, Germany). DNA sequencing of the differential display products was performed on separate strands and/or on the same strand by cycle sequencing, using the Thermo Cycle Sequencing Kit (Amersham, Arlington Heights, IL) on a LICOR 4000L automated sequencer (LICOR Corp.). Sequences were analyzed and verified using the Sequencher 3.0.1 DNA analysis program (Gene Codes Corp., Ann Arbor, MI). Nucleotide sequences were analyzed for homology or identity with known sequences in nucleotide databases (EMBL, GenBank, DDBJ and dbEST) using the database similarity search algorithms FASTA, BLAST-N and BLAST-P from the GCG software package v.8 (GCG, Madison, WI). DNA sequences of cDNA clones isolated in our screens have been submitted to the GenBank database (accession nos AF080573-AF080578).
Quantitation of differential RAG expression in OCI LY8 clonal cell variants and rationale for using OCI LY8-C3P and C3-A11N in DD RT-PCR
Our laboratory has previously demonstrated increased, constitutive RAG expression which accompanies spontaneous in vitro secondary rearrangements in the human mature B cell line OCI LY8-C3P (16). We quantitated the constitutive differences in RAG expression in OCI LY8-C3P and its clonally related variants, C3-A11N and A8-6P, both by northern analysis and by RT-PCR. By northern blotting, the increases in RAG1 mRNA signals in C3-A11N and A8-6P as compared with the parental clone OCI LY8-C3P were found to be 20.8- and 16.4-fold, respectively, as quantitated by PhosphorImager analysis (Fig. 1A). A similar increase in RAG1 and RAG2 transcripts in the C3-A11N variant as compared with the parental clone OCI LY8-C3P was also observed by semi-quantitative RT-PCR analysis (Fig. 1B). Based on dilutions of plasmid standards and theoretical total RNA yield calculations, we estimate that in this assay, OCI LY8-C3P contains an average of ∼100 RAG1 and ∼5 RAG2 mRNA copies/cell, whereas the C3-A11N variant contains an average of 1–2 × 10³ RAG1 and ∼50 RAG2 mRNA copies/cell. The levels of RAG transcripts in C3-A11N as calculated by this method are in the same range of expression as in other RAG-enriched tissues or pre-B cell lines that have been reported (15).
The above variants are well suited for approaches aimed at isolating differentially expressed genes for two reasons. First, the low and high RAG-expressing OCI LY8 variants are clonally related. We reasoned that using such variants would increase the efficiency of isolating RAG-associated genes by decreasing the number of differentially expressed genes associated with cell line differences. In this context, we have previously demonstrated that, along with increases in RAG gene expression, OCI LY8 variants do not undergo alterations in general phenotypic or differentiation markers (such as in the cell surface markers CD10, CD19, CD20, CD38, B7-1, MHC I and MHC II) (34,35). Furthermore, there is no measurable difference in activation of early or late general signaling parameters amongst these variants (such as Ca2+ flux, anti-phosphotyrosine profiles, proliferation, c-fos mRNA induction and alterations in CD25 and CD71 expression) (35,36).
A second important criterion for using OCI LY8-C3P and C3-A11N to isolate differentially expressed genes is the stability of RAG gene expression in these variants. The relative differences in RAG expression, both between low and high RAG-expressing variants and between variant subclones, are stable over time (34). In contrast, subclones of other RAG-expressing cell lines have been reported to express variable levels of these gene transcripts over time (37). The C3-A11N clone is particularly well suited because it has exhaustively rearranged both its alleles and therefore cannot undergo further Igλ rearrangements in culture (34).
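The RT-PCR copy number estimates given earlier in this section imply fold differences that can be compared with the northern blot quantitation. The following minimal sketch works through that arithmetic; the function name `fold_range` is hypothetical, and the inputs are the approximate copies/cell values stated above (∼100 versus 1–2 × 10³ for RAG1; ∼5 versus ∼50 for RAG2).

```python
from typing import Tuple


def fold_range(low_copies: float,
               high_copies: Tuple[float, float]) -> Tuple[float, float]:
    """Fold increase (min, max) of a high-expressing variant over a low one."""
    if low_copies <= 0:
        raise ValueError("low_copies must be positive")
    lo, hi = high_copies
    return lo / low_copies, hi / low_copies


# Approximate mRNA copies/cell: OCI LY8-C3P versus C3-A11N.
rag1_fold = fold_range(100, (1000, 2000))
rag2_fold = fold_range(5, (50, 50))

if __name__ == "__main__":
    print("RAG1 fold increase:", rag1_fold)
    print("RAG2 fold increase:", rag2_fold)
```

The upper end of the RAG1 range (about 20-fold) is close to the 20.8-fold increase measured by PhosphorImager analysis of the northern blot, although both estimates are approximate.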
Selection approach for choosing candidate RAG co-expressing partial 3′ cDNAs
As a primary screen for isolating potential genes co-expressed with RAG1, we used DD RT-PCR to comparatively fingerprint mRNA transcripts between OCI LY8-C3P and its non-rearranging variant C3-A11N. DD RT-PCR was performed on OCI LY8-C3P and C3-A11N using standard degenerate primer combinations (30,32). To reduce artifacts, for each set of primers we used several internal controls, including the use of independent subclones, reactions without reverse transcriptase and multiple PCR reactions. To control for the possibility of residual genomic DNA contamination after DNase I treatment, we also ran -RT controls adjacent to the rest of the samples. Furthermore, to reduce crowding in the upper part of the sequencing gel (which has been shown to potentially increase heterogeneity of DD amplicons; 33) we also performed short and long electrophoreses in situations in which ‘bunching’ was observed. A representative differential display PCR gel is shown in Figure 2A. As described in Table 1, 22 reproducible differences between OCI LY8-C3P and C3-A11N were found in total (eight expressed in OCI LY8-C3P and 14 expressed in C3-A11N). Of these putative differentially expressed amplicons, 14 differences were found to be expressed exclusively in one variant and not the other (Table 1).
Because up to 85% of differential display amplicons can be artifactual (33), we verified the expression pattern originally seen in DD RT-PCR by northern analysis to exclude ‘false positives’ (i.e. cDNAs from transcripts with equivalent levels in OCI LY8-C3P and C3-A11N). This ‘secondary screening’ was done by radiolabeling the purified, reamplified uncloned cDNAs isolated from the differential displays to probe total RNA from the original low and high RAG-expressing OCI LY8 variants. Eight of the 22 putative differentially expressed amplicons were reconfirmed by northern blot analyses (Fig. 3 and Table 2). These eight reconfirmed amplicons in all detected nine differentially expressed transcripts. The differentially expressed transcripts that the cDNAs hybridized to varied in size (from 400 bp to 10.6 kb), thereby pointing to the likelihood of having isolated independent differentially expressed transcripts. The exception to this was the 5C2/5C3 pair, which hybridized to the same size differentially expressed transcripts as well as being found within several base pairs of each other on the sequencing gel used for display. The expected difference in RAG1 between variants was validated by sequentially hybridizing the same membrane strips with the radiolabeled RAG1 and β-actin cDNAs (Fig. 3). Use of uncloned cDNAs as probes in all cases yielded multiple non-specific bands, almost all of which were of more abundant message classes (see Table 2).
To further extend our results from the secondary screens, a ‘tertiary screen’ was performed to identify cDNAs of transcripts with a RAG1-specific expression pattern. To do this, uncloned amplicons from the secondary screen detecting differentially expressed transcripts in OCI LY8-C3P and C3-A11N were used to probe total RNA from an array of human RAG+ and RAG− cell lines (Fig. 4). RAG+ total RNA sources included another independently derived high RAG-expressing OCI LY8 variant, A8-6P, and the pre-B cell lines 697 and Nal-1. RAG− total RNA sources included the human mature B cell lines Raji, Daudi and OCI LY1, the mature T cell line Jurkat and the non-lymphoid cell lines HeLa, U937 and K562. The expected differences in RAG1 between variants were validated by sequentially hybridizing the same membrane strips with the radiolabeled RAG1 and β-actin cDNAs. From this screen, we found that five transcripts exclusively expressed in C3-A11N were also found in other RAG-expressing cell lines assessed: a 5.0 kb mRNA detected with the 3G1 amplicon, ∼600 bp and ∼1.0 kb mRNAs detected with the 5C2 and 5C3 amplicons, an ∼10.6 kb mRNA detected with the 2A4 amplicon and an ∼2.1 kb mRNA detected with the 2G2 amplicon (Fig. 4). One of the two transcripts differentially expressed in OCI LY8-C3P, an ∼2.3 kb transcript detected with the 4A1 cDNA, was also found predominantly in other non-RAG expressing cell lines.
Purification of single cDNAs from heterogeneous amplicon mixes
Based on the multiple hybridizing bands seen in our secondary and tertiary screening results, it was evident that the 22 DD amplicons included cDNAs detecting genes with alternative splicing or, alternatively, included a mix of cDNAs detecting distinct genes (Figs 3 and 4 and Table 2). We initially attempted to clone these amplicons directly into PCRII and to subsequently use these cloned products as probes. However, with this approach, usually only one of the non-specific, more abundant transcripts was obtained, as determined by northern blot analysis (data not shown). Thus, in order to sequence the appropriate differentially expressed transcript, it was necessary to isolate the individual cDNAs detecting the differentially expressed transcripts from irrelevant contaminating cDNAs. This was done by eluting radiolabeled cDNA from the region in the original secondary screening northern blots in which the transcripts of interest were estimated to be located and reamplifying this cDNA for further manipulation. Reamplified products were then electrophoresed on agarose gels and purified. Subsequently, these products were cloned into the PCRII vector and used as radiolabeled probes in northern analysis of total RNA from OCI LY8-C3P and C3-A11N to confirm single, differentially expressed transcripts.
As an example, Figure 5 shows the purified cDNA probes 5C2/5C3-1.0 and 5C2/5C3-0.6 specifically detecting differentially expressed transcripts from the 5C2 and 5C3 DD amplicons. In each case, these DD products yielded two non-specific transcripts and two identically sized differentially expressed transcripts in secondary and tertiary screens (Figs 3 and 4 and Table 2). Therefore, four cDNAs were cut out of the appropriate region in the northern membrane (two differentially expressed transcripts/amplicon). Both the 600 bp and 1.0 kb bands were excised and eluted from these membranes and reamplified with the same primer sets as originally used and under the same conditions used for reamplifications from polyacrylamide gels in the original reconfirmation steps. In both cases, the 600 bp and 1.0 kb transcripts are from distinct genes (not alternative splicing) because each resulting single purified cDNA detects single 600 bp or 1.0 kb bands, respectively, when used as a radiolabeled probe in northern analysis (Fig. 5).
Sequence analysis of purified cDNA clones
As expected, 5C2 and 5C3 yield cDNAs with identical sequences, consistent with the same detection pattern of differentially expressed transcripts in secondary and tertiary screens (Figs 3 and 4 and Table 2) and also consistent with the fact that they were found in almost identical regions of the initial DD RT-PCR gel (Table 1). We have therefore labeled the two distinct cDNAs detecting the ∼600 bp and 1.0 kb transcripts 5C2/5C3-0.6 and 5C2/5C3-1.0, respectively. The sequences of the single cDNAs purified from the differentially expressed amplicons are shown in Table 3. The sequences include matches to: a portion of the human RAG2 exon 2 coding region (corresponding to the cDNA purified from the 2G2 amplicon; accession no. AF080577); the human BSAP (Pax-5) complete cDNA (corresponding to the cDNA purified from the 2A4 amplicon; accession no. AF080573); two human expressed sequence tags (cDNAs 4A1 and 5C2/5C3-1.0; accession nos AF080574 and AF080575, respectively). Two sequences (cDNAs 5C2/5C3-0.6 and 3G1; accession nos AF080576 and AF080578, respectively) were found to have no significant matches to any database sequences.
Based on these partial sequences, several important technical points can be made regarding the DD RT-PCR technique itself. Firstly, under the standard low stringency conditions used in our system, the primers often did not anneal to the theoretically intended target sequence, i.e. the polyadenylated 3′-region of differentially expressed transcripts (Table 3). For example, based on the sequence of the BSAP cDNA matching to the 2A4 cDNA, the arbitrary decamer AP-2 was used as both the sense and the antisense primer to amplify a portion of the BSAP coding region rather than the 3′-untranslated sequence. This finding is consistent with reports that many DD products actually amplify internal sequences rather than polyadenylated regions (38). We also found that at least four mismatches were tolerated with the arbitrary decamer in all six of our sequences and that these mismatches were predominantly clustered in the 5′-end of the decamer. The 5′-region of the decamer being preferential for mismatches is consistent with previous analysis (31), although the frequency with which mismatches were tolerated (both for different sequences and within the same sequence) was higher in our study. Finally, in 3/5 purified cDNAs (2G2, 4A1 and 5C2/5C3-0.6) the poly(A)-anchoring primer was also found to tolerate up to four mismatches and hence did not necessarily amplify the very 3′ poly(A) ends of differentially expressed transcripts. In the case of RAG2 (2G2), for example, an A-rich stretch in the RAG2 coding region resembling a polyadenylated tail was detected by the poly(A)-anchoring primer.
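The kind of primer-to-target comparison described above can be sketched as a position-by-position mismatch count. This is only an illustration: the two 10-nt sequences below are invented placeholders (they are not the GenHunter AP primers or any sequence from Table 3), the function names are hypothetical, and the target site is written in the same 5′ to 3′ orientation as the primer.

```python
from typing import List


def mismatch_positions(primer: str, site: str) -> List[int]:
    """1-based primer positions (5' to 3') at which primer and site differ."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    pairs = zip(primer.upper(), site.upper())
    return [i for i, (p, s) in enumerate(pairs, start=1) if p != s]


def five_prime_fraction(positions: List[int], primer_length: int) -> float:
    """Fraction of mismatches that fall in the 5' half of the primer."""
    if not positions:
        return 0.0
    half = primer_length // 2
    return sum(1 for p in positions if p <= half) / len(positions)


if __name__ == "__main__":
    primer = "GATTACAGGC"  # hypothetical arbitrary decamer
    site = "CTATACAGGT"    # hypothetical annealing site on a cDNA
    positions = mismatch_positions(primer, site)
    print("mismatch positions:", positions)
    print("fraction in 5' half:", five_prime_fraction(positions, len(primer)))
```

In this invented example the decamer carries four mismatches, three of them in its 5′ half, which mirrors the pattern of 5′-clustered mismatches reported for the purified cDNAs.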
Lymphocyte differentiation is a complex series of events mediated by the expression of a number of lineage- and differentiation-restricted genes. We hypothesized that using DD RT-PCR to assess a unique pair of clonally related variant cell lines with differential expression of RAG transcripts could be used to isolate developmentally and B lineage-restricted genes by virtue of their co-expression with human RAG1 in B cell lines. There are several examples of pre-B- and/or pre-lymphocyte-specific genes with a RAG-like expression pattern, including the early B cell-specific transcription factors N-myc, LEF-1 and EBF, the V(D)J-modifying enzyme TdT, the cell surface marker CD10 and components of the pre-B cell receptor VpreB and λ5.
The six partial sequences of RAG1 co-expressing genes identified in this study confirm the validity of our overall screening approach. One of the cDNAs identified, 2G2, corresponds to the 3′-end of human RAG2 exon 2 (Table 3) and the transcript size of ∼2.1 kb detected in tertiary screens (Fig. 4) is consistent with the known size of the predominant human RAG2 transcript (16). RAG2 is one of the differentially expressed genes expected to be picked up between these two variants and thus serves as a good internal control. Because we did not use all possible primer combinations, one possible reason why we did not see the more abundant RAG1 message in our screens is that the primer sets employed did not cover this transcript.
Interestingly, another of the differentially expressed cDNA sequences (2A4) has complete identity to an internal region of the human transcription factor BSAP (Pax-5) cDNA, an early B cell-specific member of the paired domain Pax family of transcription factors (Table 3). The 10.6 kb differentially expressed transcript detected by 2A4 in our secondary and tertiary screens in RAG+ sources (Figs 3 and 4) is consistent with the predicted size of the Pax-5 mRNA (39). Although it has been previously established that BSAP is expressed in pro-B, pre-B and mature B cells as well as the CNS and testis, we show here that human BSAP expression correlates with human RAG1 expression, suggesting a potential functional link between these two genes. Supporting these observations indirectly are the phenotypes of BSAP- and E2A-deficient mice. Although mice with disruptions in the Pax-5 gene were not assessed for RAG1 transcript levels, low levels of RAG transcripts are suggested by their phenotype, in which there is an arrest in B cell development very similar to that of the RAG knockout phenotype with the exception that residual heavy chain rearrangements can still be detected (40). Furthermore, the E2A knockout mouse, which is also blocked at the pro-B cell stage of differentiation, has reductions in both the RAG1 and Pax-5 transcripts (41). Because BSAP has an important role in B cell differentiation, and as such has been demonstrated to interact with several B cell-specific gene promoters (for example CD19, λ5 and mb-1) (42,43), it is therefore tempting to speculate that BSAP may be involved in inducing or enhancing the tissue-specific expression of RAG1 mRNA. 
Although the sequences of the core RAG1 and RAG2 promoter regions have no BSAP sites present (44), these cis elements have been demonstrated not to be important for developmental and tissue-specific expression of RAG transcripts (44,45), suggesting that other elements, perhaps including those containing BSAP motifs, are required. We are currently testing whether stable transfection of human BSAP cDNA affects the regulation of RAG expression in OCI LY8-C3P.
Our screens also yielded two cDNAs (4A1 and 5C2/5C3-1.0) with matches to several human ESTs. Consistent with differential expression of the transcript detected by 4A1 in the low RAG-expressing variant OCI LY8-C3P (in secondary screens) and with the inverse correlation with RAG expression in tertiary screens, it is not surprising that 4A1 cDNA matches to several ESTs apparently only derived from non-lymphoid sources (Table 3). Conversely, the cDNA (5C2/5C3-1.0) detecting a 1.0 kb transcript with a similar expression pattern to RAG has a near identical match to several ESTs from germinal center B cell and splenic-derived cDNA libraries (Table 3). Although this EST maps to chromosome 11, like the RAG locus, it is not physically proximal to the RAG genes. We are currently sequencing the EST clones corresponding to 4A1 and 5C2/5C3-1.0 to define the open reading frames of the corresponding complete cDNAs.
The other two cDNA clones that we sequenced (5C2/5C3-0.6 and 3G1) had no homology to any database sequences. As with 5C2/5C3-1.0, these two novel cDNAs detected transcripts that were found in the high RAG-expressing clone C3-A11N. We have cloned the full-length cDNA corresponding to one of these novel sequences (3G1). This cDNA (accession no. AF026477) represents a novel gene we have called hBRAG (human B cell RAG-associated gene), which encodes a type II transmembrane glycoprotein unrelated to other known type II protein-encoding multigene families. We have characterized the tissue distribution, genomic organization and chromosomal localization of this gene and have shown that stable transfection of the complete hBRAG cDNA increased levels of RAG1 transcripts in the low RAG-expressing B cell variant OCI LY8-C3P, but not in the non-lymphoid line K562, suggesting that this product is potentially involved in B cell-specific regulation of RAG1 transcripts (46). We are currently characterizing this protein at the biochemical level using recently generated antibodies to the hBRAG protein.
DD RT-PCR has two major technical limitations which we had to address. The first limitation is the high frequency at which false positive artifacts arise using the standard DD RT-PCR methodology (33). Such artifacts are problematic not only because they may mask bona fide differentially expressed transcripts in the same region of northern blots, but also because they make screening for differentially expressed genes extremely inefficient. In our DD RT-PCR screens, 14 of 22 (64%) DD amplicons were ‘false positives’, even when using clonally related cell variants in addition to all appropriate controls. While it has been reported that a large fraction of artifacts can be further reduced by increasing the stringency of the PCR annealing conditions or using more specific (less degenerate) primers (31), it has also been documented that such increased stringency parameters allow lower abundance class transcripts (to which most developmentally regulated genes belong) to be out-competed by more abundant, non-specific transcript classes (33). During the course of optimizing the technique for our system, we found that while either increasing the stringency of the standard annealing step or reducing cycle numbers reduces contaminants that produce strong non-differential bands, it also compromises the detection of reproducible differentially expressed transcripts detectable by DD RT-PCR (data not shown). By increasing the screening stringency in steps subsequent to the DD RT-PCR itself, i.e. performing two separate rounds of northern blotting, we eliminated 17 of 22 amplicons found in the original DD RT-PCR, while at the same time generalizing these results to other RAG+ sources (Figs 3 and 4). The second major limitation encountered with the DD RT-PCR technique was that almost every amplicon used as a probe in northern blot analysis was a mixture of cDNAs, most of which detected non-specific genes. 
Even when running both long and short gels in order to obtain better separation, we still found heterogeneous transcript species under low stringency northern blotting conditions (Table 2 and Fig. 3). As noted by others, our results indicate that purification of single cDNAs prior to subcloning and sequencing is necessary, at least under the standard DD RT-PCR conditions used in our assays. While we show that direct elution of the cDNA of interest from the northern membrane is an effective way of doing this (Fig. 5), several other reports have also discussed ways to purify single cDNAs from heterogeneous DD amplicons (47–53). In all, our approach for eliminating artifactual cDNA products allowed us to eliminate ∼75% of all DD amplicons and an additional ∼75% of cDNAs within each amplicon of interest. A flowchart summarizing our overall scheme for selecting RAG co-expressing cDNAs is shown in Figure 6.
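The attrition figures quoted above can be checked with simple arithmetic. The following is a minimal sketch in Python that uses only the amplicon counts stated in the text; the variable and function names are illustrative assumptions and are not part of the original analysis.

```python
# Counts taken from the text; all names here are hypothetical and illustrative only.
TOTAL_AMPLICONS = 22              # reproducible amplicons from the primary DD RT-PCR screen
FALSE_POSITIVES = 14              # amplicons scored as false positives
ELIMINATED_AFTER_NORTHERNS = 17   # amplicons eliminated after two rounds of northern blotting


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


false_positive_rate = percent(FALSE_POSITIVES, TOTAL_AMPLICONS)
eliminated_rate = percent(ELIMINATED_AFTER_NORTHERNS, TOTAL_AMPLICONS)
retained = TOTAL_AMPLICONS - ELIMINATED_AFTER_NORTHERNS

print(f"False positives: {FALSE_POSITIVES}/{TOTAL_AMPLICONS} = {false_positive_rate}%")
print(f"Eliminated after northern screens: {ELIMINATED_AFTER_NORTHERNS}/{TOTAL_AMPLICONS} = {eliminated_rate}%")
print(f"Amplicons retained: {retained}")
```

Under these counts, the false positive fraction is 63.6% (reported as 64%) and the fraction eliminated after both northern screens is 77.3%, in line with the approximate 75% figure given for DD amplicons.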
How efficient is the DD RT-PCR technique at detecting developmentally and lineage-restricted genes? One report supports the notion that DD RT-PCR is biased towards identifying a small number of intermediate and high message class transcripts (>300 and 12 000 mRNA copies/cell, respectively) (54). Because 90–95% of eukaryotic mRNA species are estimated to belong to the rare message class (≪50 copies/cell) (55), this would imply that DD RT-PCR misses the majority of mRNAs. Conversely, another study argues for the exquisite sensitivity of DD RT-PCR in detecting rare message transcripts (38). In our study, the fact that DD RT-PCR could identify RAG2, a low abundance transcript in C3-A11N (Fig. 1B), supports the latter report. Furthermore, the other differentially expressed transcripts detected in secondary and tertiary screens may also belong to the rare message class, as suggested by their expression relative to non-specific, high abundance transcripts within the same amplicon (Figs 3 and 4) and by the observation that differentially expressed transcripts were only detected under lower stringency conditions in our northern analysis (Fig. 3 and accompanying legend).
Our experiments have confirmed that DD RT-PCR can be utilized to isolate genes with a very specific pattern of expression, i.e. co-expressed either directly or indirectly with the RAG genes. As well as detecting RAG2 itself, we isolated BSAP (Pax-5), previously shown by others to be associated with lymphoid recombination, although not directly with RAG expression itself. Other novel cDNAs have also been cloned and we have recently demonstrated that one of these (hBRAG) may itself regulate the expression of human RAG1. The sequences of other cDNA clones are currently being analyzed and the relevance of the isolated genes to the RAGs and to early B cell development will be determined. Like others (56), we have shown that refinements to DD RT-PCR can yield a powerful approach to the isolation of specific and possibly novel developmentally or lineage-restricted genes.
This work was supported by the National Cancer Institute of Canada (NCIC grant 7286). L.K.V. is supported by a Medical Research Council Studentship.