Abstract

Motivation: Transcription start site selection and alternative splicing greatly contribute to diversifying gene expression. Recent studies have revealed the existence of alternative first exons, but most have involved mammalian genes, and as yet the regulation of usage of alternative first exons has not been clarified, especially in plants.

Results: We systematically identified putative alternative first exon transcripts in rice, verified the candidates using RT–PCR, and searched for the promoter elements that might regulate the alternative first exons. As a result, we detected a number of unreported alternative first exons, some of which are regulated in a tissue-specific manner.

Contact:  [email protected]

Supplementary information:  http://www.bioinfo.sfc.keio.ac.jp/research/intron

INTRODUCTION

Transcription start site (TSS) selection and alternative splicing greatly contribute to diversifying gene expression. Alternative first exons (AFEs), where the first exon of one splice variant of a gene is located within an intronic region of another variant, have been reported to make a substantial contribution to diversity. Some AFEs merely change the 5′-untranslated region (5′-UTR) (Oberbaumer et al., 1998; Hu et al., 1999; Kelner et al., 2000), and do not lead to protein differences. However, many genes have start codons in their AFEs that alter the N-termini of the translated proteins (Hugnot et al., 1992; Nakhei et al., 1998; Lee-Kirsch et al., 1999; Takano et al., 2000; Sun et al., 2001). Most of these cases have been reported in mammals and it has been suggested that these transcripts are regulated in a tissue-specific manner and/or developmental stage-specific manner.

Recent large-scale studies have provided overviews of AFEs in humans. Suzuki et al. (2001a) have systematically investigated the diversity of transcriptional initiation, using full-length human cDNAs and 5′ end expressed sequence tags (ESTs). Another computational study has shown that AFEs are frequent in the mouse (Zavolan et al., 2002).

Very few studies have looked at AFE transcripts in plants. The Arabidopsis SYN1 gene utilizes alternative promoters and splicing to produce two transcripts with different 5′ ends, one transcript (BP5) beginning within the first intron of the other (BP2) (Bai et al., 1999). SYN1 is one of the small number of reported AFE transcripts, but its regulation is yet to be analyzed.

Recently, a draft genomic sequence of rice (Oryza sativa) and the sequences of a number of full-length cDNAs have been reported. In the present study, we systematically identified novel candidates for rice AFEs and examined the tissue specificity of the candidates. We also compared our findings with those in the mouse, in order to analyze the regulation of usage of AFEs and the various types of TSSs, such as strict transcription start sites (STSSs) and multiple strict transcription start sites (MSTSSs). This is the first report on these features in rice, a model monocot.

METHODS

Systematic detection of MSTSSs, TSSs and AFEs

All of the acquired datasets and databases are listed in Table 1. Mapping of 5′ end ESTs on genomic sequences was performed with BLAST (Altschul et al., 1997) [E-value <$10^{-100}$ in rice, <$10^{-50}$ in mouse] followed by SIM4 (Florea et al., 1998). After locating each EST approximately on the genomic sequences, SIM4 (≥90% match) was used to determine its exact location and the gene structure. ESTs that did not map onto a unique genomic region were excluded from the fine mapping.

5′ end ESTs were grouped into a cluster if they mapped in the same genomic region and the distance between them was <100 bp, or they mapped on the same full-length cDNA. However, some ESTs could be mapped to incorrect regions because the length of the EST is not long enough for fine mapping. In this study, such mapping errors could have caused increased noise in the calculation of the standard deviations (SDs) of each TSS, and in the classification of TSS types into STSSs, MSTSSs and AFEs. We therefore removed the 5′ and 3′ farthest 10% of the number of ESTs as outliers for each cluster. This should not significantly affect the outcome of our analyses because we did not need a large quantity of TSS datasets but rather accurate ones. We found that the exclusion only altered the SDs slightly even when the ESTs removed were located close to the exact TSSs. In addition, a small proportion (about 10–20%) of 5′ end ESTs might not have been the exact TSSs due to the experimental methods used for determining TSSs. Since these outlier ESTs usually mapped to the 3′ farthest region of each cluster, we also removed the 3′ farthest 10% (20% in total) of the ESTs, and used the clusters containing 10 or more 5′ end ESTs for further analyses in order to exclude uncertain TSS clusters.
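The grouping and trimming rules described above can be illustrated with a minimal Python sketch. It assumes that each EST has already been reduced to a single mapped 5′ end coordinate on the plus strand of one chromosome, so that smaller coordinates are more 5′. The function names, the trimming fractions (10% from each end by default) and the order of size filtering and trimming are illustrative assumptions about one reading of the text, not the authors' implementation. Merging of ESTs through a shared full-length cDNA is omitted.

```python
from typing import List


def cluster_tss(positions: List[int], max_gap: int = 100) -> List[List[int]]:
    """Group 5' end coordinates; a new cluster starts when the distance
    to the previous (sorted) EST is max_gap bp or more."""
    clusters: List[List[int]] = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] < max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters


def trim_outliers(cluster: List[int],
                  frac_5p: float = 0.10,
                  frac_3p: float = 0.10) -> List[int]:
    """Drop the most 5' and most 3' ESTs of a cluster.
    The fractions are assumptions; adjust them to the intended reading."""
    ordered = sorted(cluster)
    n = len(ordered)
    k5 = int(n * frac_5p)
    k3 = int(n * frac_3p)
    return ordered[k5:n - k3]


def usable_clusters(positions: List[int], min_size: int = 10) -> List[List[int]]:
    """Keep clusters with at least min_size ESTs (assumed to be counted
    before trimming), then trim each retained cluster."""
    return [trim_outliers(c) for c in cluster_tss(positions) if len(c) >= min_size]
```

For example, 20 ESTs at consecutive positions plus two distant ESTs give two clusters, of which only the first passes the size filter and keeps 16 ESTs after trimming.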

A small fraction of the 5′ end ESTs have IDs corresponding to full-length cDNAs as published by Kikuchi et al. (2003). In such cases we used the full-length cDNAs to annotate our clusters and to identify the predicted start codon. The annotation data and predicted ORF information are publicly available (Table 1). The longest ORF was used to predict the ORF of each full-length cDNA. We used ORF information except in instances of already-known rice genes, which comprised ∼9% (2603) of all the full-length cDNAs (28 469).
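As an illustration of the longest-ORF rule mentioned above, the following Python sketch scans the three forward reading frames of a cDNA for the longest stretch running from an ATG to the first in-frame stop codon. The function name, the restriction to the sense strand and the requirement of an in-frame stop codon are assumptions made for this sketch; the source does not state how the ORF prediction was implemented.

```python
from typing import Optional, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(cdna: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the longest ATG..stop ORF as 0-based,
    half-open coordinates on the given strand, or None if none exists."""
    seq = cdna.upper()
    best: Optional[Tuple[int, int]] = None
    for frame in range(3):
        start = None  # index of the first ATG of the currently open ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None
    return best
```

The start coordinate returned here corresponds to the predicted start codon, whose position relative to the mapped TSSs is what matters for the analyses that follow.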

To classify each 5′ end EST cluster, we calculated the SD of the cluster as follows:

$$\mathrm{SD} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}$$

where $x_i$ is the distance from the 5′ end of each cluster (bp), $\bar{x}$ is the mean of $x_i$ (bp) and $n$ is the number of ESTs in each cluster. We then defined and classified each 5′ end EST cluster as an STSS, MSTSS or AFE type as follows: STSS clusters contain TSSs whose SD is <5, because the mode of the SDs of human TSSs is reported to be ∼10 (Suzuki et al., 2001a); MSTSS clusters (1) include multiple STSSs in a cluster, (2) contain three or more 5′ end ESTs in the STSS subcluster and (3) contain STSS subclusters of length exceeding 25 bp; AFE clusters contain a subcluster whose first exon is within the intron of another subcluster. Candidate MSTSS clusters were extracted by single linkage clustering of TSS positions and classified as MSTSS clusters when the candidates fulfilled the criteria listed above.
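The SD calculation and the STSS cutoff can be sketched in a few lines of Python. The sketch assumes the population form of the SD (division by n, not n − 1) and takes as input the distances of each EST from the 5′ end of its cluster; the function names are hypothetical, and only the STSS criterion is covered.

```python
from math import sqrt
from typing import Sequence


def tss_sd(distances: Sequence[float]) -> float:
    """Population standard deviation (assumed form) of the distances (bp)
    of each EST from the 5' end of its cluster."""
    n = len(distances)
    mean = sum(distances) / n
    return sqrt(sum((x - mean) ** 2 for x in distances) / n)


def is_stss(distances: Sequence[float], sd_cutoff: float = 5.0) -> bool:
    """A cluster is treated as an STSS when its SD is below the cutoff."""
    return tss_sd(distances) < sd_cutoff
```

A cluster whose ESTs all start within a few bases of each other passes the cutoff, whereas one with two start positions 10 bp apart in equal numbers has an SD of exactly 5 and does not.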

Identification of promoter elements

The search for eukaryotic promoter elements was performed with MatInspector Rel. 2.1 (Quandt et al., 1995), which utilizes TRANSFAC 3.1 matrices (http://www.gene-regulation.com/) as a database. The matrix ID, preferred region (relative to the TSS) and cutoff value of each element were as follows: TATA box, V$TATA_01, from −40 to −23, 0.77; Initiator, V$CAP_01, from −5 to +6, 0.87; GC box, V$GC_01, from −74 to −45, 0.78; CCAAT box, V$CAAT_01, from −105 to −70, 0.78 (Tsunoda and Takagi, 1999; Suzuki et al., 2001b). CCAAT boxes and GC boxes were searched on both the plus and minus strands.
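To make the preferred regions concrete, the following Python sketch extracts the candidate window for each element from a plus-strand sequence and a known TSS index. It does not reproduce MatInspector scoring. It assumes the common convention in which the TSS is position +1 and there is no position 0; whether the original analysis used this convention is not stated in the text. The helper names are hypothetical.

```python
from typing import Dict, Tuple

# element -> (matrix ID, window start, window end, cutoff), as listed in the text
ELEMENTS: Dict[str, Tuple[str, int, int, float]] = {
    "TATA box": ("V$TATA_01", -40, -23, 0.77),
    "Initiator": ("V$CAP_01", -5, 6, 0.87),
    "GC box": ("V$GC_01", -74, -45, 0.78),
    "CCAAT box": ("V$CAAT_01", -105, -70, 0.78),
}


def rel_to_index(pos: int, tss_idx: int) -> int:
    """Convert a TSS-relative position (+1 = TSS, no 0) to a 0-based index."""
    if pos == 0:
        raise ValueError("position 0 is not defined in this convention")
    return tss_idx + pos if pos < 0 else tss_idx + pos - 1


def window(seq: str, tss_idx: int, start: int, end: int) -> str:
    """Return the subsequence covering TSS-relative positions start..end inclusive."""
    lo = rel_to_index(start, tss_idx)
    hi = rel_to_index(end, tss_idx)
    if lo < 0 or hi >= len(seq):
        raise ValueError("window extends beyond the supplied sequence")
    return seq[lo:hi + 1]


def candidate_windows(seq: str, tss_idx: int) -> Dict[str, str]:
    """Extract the preferred region of every element for one TSS."""
    return {name: window(seq, tss_idx, s, e) for name, (_, s, e, _) in ELEMENTS.items()}
```

Each extracted window would then be scored against the corresponding matrix, with CCAAT and GC boxes also checked on the reverse complement.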

Reverse Transcription–Polymerase Chain Reaction (RT–PCR)

Total RNA was isolated from several rice tissues or cells: seedlings grown in the dark for eight days after imbibition; leaf sheath, leaf blade and panicles immediately after heading and two weeks later; and callus derived from the embryo, using RNeasy Plant Mini Kits according to the manufacturer's instructions (Qiagen, Valencia, CA). About 5 μg of DNaseI-treated RNA was reverse-transcribed with 0.5 μg of an oligo dT primer and SuperScript II reverse transcriptase (Invitrogen, Carlsbad, CA). RT–PCR was performed using 1 μl aliquots of 50 μl of the first strand cDNA products and Ex Taq polymerase (Takara, Shiga, Japan). In order to verify that we used equal amounts of cDNA template in each sample, PCR was conducted with primers specific for the 18S rRNA gene (5′-ACAATCTAAATCCCTTAACGAGGATC-3′ and 5′-ACTAGGACGGTATCTGATCGTCTTC-3′). The primers specific for the individual first exons are given in the Supplementary materials section. The reverse transcripts were used to verify the specificity of the whole rice genome sequences. PCR products were amplified for 30 cycles for Cluster 1; 36 cycles for Clusters 5, 6 and 10; 40 cycles for Cluster 11; 32 and 40 cycles for transcribed variants 1 and 2 of Cluster 7, respectively; 36 and 40 cycles for those of Cluster 8; 28 and 40 cycles for those of Cluster 9; 40 and 32 cycles for those of Cluster 2; 36 and 40 cycles for those of Cluster 3; 32 and 36 cycles for those of Cluster 4; and 36 and 40 cycles for those of Cluster 12. The amplified fragments were separated by electrophoresis on 1.5% agarose gels, stained with ethidium bromide, and analyzed with a fluorescent image analyzer, Molecular Imager FX Pro (Bio-Rad, Richmond, CA). We also performed RT–PCR to validate the accuracy of determination of 5′ ends (Supplementary materials section).

Identification of TSS

Since we used 5′ end ESTs and full-length cDNAs constructed using the oligo-capping and cap-trapper methods (Maruyama and Sugano, 1994; Carninci et al., 1996), the identification of TSSs in this study is more reliable than in studies that used general 5′ ESTs. In addition, we evaluated the 5′ end accuracy of our data by using RT–PCR to measure expression of the regions upstream of exons ‘a’ and ‘b’ (series 3.1, 3.2 and 3.3 in Supplementary Figures S.1 and S.2). The results revealed very little, if any, mRNA expression upstream of all exon ‘a’ TSSs, indicating that the locations of the TSSs in our data are reliable.

RESULTS AND DISCUSSION

TSSs of rice are less diverse than those of mouse

Our mapping procedures detected 1159 clusters among 91 425 rice 5′ end ESTs. We also carried out the same procedure in mouse to compare the features of TSSs in plants and mammals, and detected 501 clusters among 141 935 mouse 5′ end ESTs (all mapping and clustering results are summarized in Table 2). Although there were fewer rice 5′ end ESTs than mouse 5′ end ESTs, a larger number of rice clusters were detected. This probably reflects the difference between the number of genes in the rice and mouse genomes. Of all clusters, 556 STSSs and 69 MSTSSs were detected in rice, and 110 STSSs and 20 MSTSSs in mouse. We then calculated the SDs of TSSs for each cluster (Fig. 1). Note that the distribution of mouse SDs is similar to that in humans (Suzuki et al., 2001a), which suggests that high TSS diversity is a common feature in mammals. Nearly 50% of the clusters in rice had SDs <5, compared with only 22% in mouse, indicating that rice genes may have less diverse TSSs than mouse genes.
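The proportions quoted above follow directly from the cluster counts. The short Python check below uses the counts reported in the text and in Table 2, and assumes that the clusters with SD <5 are the ones counted as STSSs; the dictionary layout and function name are illustrative only.

```python
# Cluster counts as reported in the text and in Table 2
COUNTS = {
    "rice": {"clusters": 1159, "STSS": 556, "MSTSS": 69},
    "mouse": {"clusters": 501, "STSS": 110, "MSTSS": 20},
}


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, expressed as a percentage."""
    return 100.0 * part / whole


for species, c in COUNTS.items():
    stss = percent(c["STSS"], c["clusters"])
    mstss = percent(c["MSTSS"], c["clusters"])
    print(f"{species}: STSS {stss:.1f}% of clusters, MSTSS {mstss:.1f}% of clusters")
```

The STSS shares come out at about 48% for rice and 22% for mouse, in line with the figures stated in the text.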

Rice MSTSS clusters do not generate protein diversity

In rice MSTSS clusters, all the start codons of the predicted longest ORFs were located within the first exons, which have the most downstream TSSs in the clusters, suggesting that the choice of MSTSS would not affect the coding region. For example, Figure 2A shows rice glutathione S-transferase II, whose MSTSS has not been reported. A full-length cDNA (GenBank accession no. AF062403) of this gene has been identified (Wu et al., 1998) and maps 76 bp downstream of the TSS in Figure 2A, suggesting that the choice of TSS in this MSTSS cluster only affects the length of the 5′-UTR. In addition, we found another unreported MSTSS in rice calcium-dependent protein kinase (CDPK). These results show that rice MSTSSs may not create protein diversity. Hence we propose that, by altering the length of the 5′-UTR, MSTSSs may instead affect, for example, the stability of mRNA transcripts or the selection of cis-acting regulatory elements.

Tissue-specific expression of AFEs in rice

We detected 46 potential AFE clusters (4%) among 1159 clusters in rice and 69 potential AFE clusters (14%) among 501 clusters in mouse. Figure 2B gives an example of the AFE clusters found in mouse. The average length of the first introns in mouse AFE clusters was greater than in rice. In addition, the higher TSS diversity in mouse (Fig. 1) suggests that transcription initiation events (including AFE) are more dynamic in mammals than in plants.
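The criterion used to flag these candidate AFE clusters (a first exon lying within an intron of another subcluster, as defined in Methods) amounts to an interval containment test. The Python sketch below illustrates it for two transcript variants given as exon lists. It assumes 1-based inclusive coordinates on the plus strand and requires the first exon to lie entirely inside the intron; both the representation and the function names are assumptions made for illustration.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive, plus strand


def introns(exons: List[Interval]) -> List[Interval]:
    """Derive intron intervals from the gaps between consecutive exons."""
    ordered = sorted(exons)
    return [
        (a_end + 1, b_start - 1)
        for (_, a_end), (b_start, _) in zip(ordered, ordered[1:])
        if b_start - a_end > 1
    ]


def first_exon_in_intron(variant: List[Interval], other: List[Interval]) -> bool:
    """True if the first exon of `variant` lies entirely within an intron of `other`."""
    first_start, first_end = sorted(variant)[0]
    return any(s <= first_start and first_end <= e for s, e in introns(other))


def is_afe_pair(variant_a: List[Interval], variant_b: List[Interval]) -> bool:
    """Apply the containment test in both directions."""
    return first_exon_in_intron(variant_a, variant_b) or first_exon_in_intron(variant_b, variant_a)
```

Two variants that share a downstream exon but start from non-overlapping first exons, one of which falls inside the first intron of the other, are reported as an AFE pair; variants whose first exons merely differ in length are not.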

We also used the tissue type information of each 5′ end EST and extracted 12 rice AFE clusters predicted to be regulated in a tissue-specific manner. We then verified their tissue specificity by RT–PCR in six tissues (seedling, leaf blade, leaf sheath, flower just after heading, flower two weeks after heading and callus). As shown in Figure 2C, we designed a sense primer specific for each first exon, and an antisense primer in a constitutive exon (second or third exon).

As a result, we observed five AFE clusters regulated in a tissue-specific manner (Fig. 3A). In the other seven clusters, clear tissue-specific expression was not observed for the six tissues. Overall, the mRNA level of flowers just after heading (flower 1) differed significantly among transcripts with different first exons.

Structure 1 in Figure 3A shows an AFE cluster whose gene encodes a starch-branching enzyme (SBE). SBE synthesizes amylopectin from amylose, and a wheat SBE is known to generate alternative 5′ end transcripts (Baga et al., 1999). Our RT–PCR results revealed that the exon 1b transcript was expressed constitutively, while the exon 1a transcript was expressed specifically in flower 1 and leaf blade. Figure 4 shows the locations of an already-known rice SBE gene (GenBank accession no. D11082) (Mizuno et al., 1992), as well as the full-length cDNAs and their ORFs in this cluster. The start codon of the already-known gene was located in the first exon, showing that AFE events change the protein-coding region in this cluster. In addition, it has been reported that differences between N-termini affect enzymatic properties in kidney bean SBE (Hamada et al., 2002). Our free energy analysis of 100 bp downstream of the TSSs revealed no significant difference between the two AFEs in another AFE cluster (data not shown). These results indicate that AFEs may play a role in the coupling between promoter selection and protein diversity.

Structures 2–4 in Figure 3A show clusters encoding unknown genes whose expression levels were modulated in a tissue- and/or stage-specific manner. The clusters of Structures 3 and 4 have predicted ORFs in their first exons, suggesting that the AFEs of these genes affect the protein products. Structure 5 shows a cluster homologous to mitogen-activated protein kinase (MAPK). The RT–PCR results suggest that the exon 1a transcript is required for enzyme activity, especially in seedlings. The start codon of the predicted longest ORF was detected only in the exon 1a transcript. It is possible that the protein translated from the exon 1a transcript plays an important role in seedlings.

Many AFE genes have no TATA box or initiator in their promoters

Table 3 summarizes the presence of promoter elements (TATA boxes, initiators, GC boxes and CCAAT boxes) in each cluster. Many clusters had no TATA box or initiator. Moreover, CCAAT boxes were frequently observed in the TATA-less promoters. These characteristics may facilitate AFE events, since Suzuki et al. (2001a) have reported that highly variable TSSs in mouse have TATA-less promoters. In addition, CCAAT boxes may play a similar role to TATA boxes in rice, as reported in other species (Hayhurst et al., 1995).

CONCLUSION

We identified novel candidates for AFEs in rice, some of which were regulated in a tissue-specific manner. In addition, we showed that these potential rice AFEs may play an important role in regulating gene expression in a tissue- and/or stage-specific manner, and may also contribute to protein diversity. Since we detected a number of unreported MSTSSs and AFEs in known genes, there may be a large number of undiscovered variable TSSs in plants. We anticipate that further studies will reveal how each TSS is selected in the MSTSS and AFE genes and will contribute to our understanding of gene regulatory mechanisms.

Table 1

Acquired data

| Species | Sequence | Datasets | Database |
| --- | --- | --- | --- |
| O.sativa ssp. japonica cv. Nipponbare | 5′ end EST | 91 425 | (publicly unavailable) |
| | Full-length cDNA | 28 469 | http://cdna01.dna.affrc.go.jp/cDNA |
| | Genomic DNA | | National Institute of Agrobiological Sciences |
| M.musculus | 5′ end EST | 141 935 | http://fantom2.gsc.riken.go.jp |
| | Genomic DNA | | Mouse Genome Sequencing Consortium |

Table 2

Results of mapping and clustering

| | Rice | Mouse |
| --- | --- | --- |
| 5′ end ESTs | 91 425 | 141 935 |
| Mapped ESTs (>90%) | 59 961 | 56 130 |
| Mapped clusters (no. of ESTs >10) | 1159 | 501 |
| STSSs | 556 | 110 |
| MSTSSs | 69 | 20 |
| AFEs | 46 | 69 |
| Checking results | 12 | |

Fig. 1

Distribution of the SDs of TSSs in rice and mouse TSS clusters.

Fig. 2

(A) An example of a rice MSTSS cluster. This gene encodes glutathione S-transferase. Dark and light blue bars represent exons and introns of 5′ end ESTs, respectively. The pink bar represents the genomic sequence of this region. (B) An example of a mouse AFE cluster. This gene encodes calpain 1. (C) Outline of AFE structure. Exon 1 is alternatively transcribed and exon 2 is transcribed constitutively. The arrows show the directions and positions of the primers used in our RT–PCR analysis. Details of the primers are given in Supplementary materials.

Fig. 3

(A) RT–PCR results for five clusters whose genes are expressed in a tissue-specific manner. Tissue type is given above each lane. The figures on the right side of the RT–PCR results represent the corresponding AFE clusters composed of mapped 5′ end ESTs. Filled triangles give the positions of known start codons and open triangles, those of predicted start codons. The details of Structure 1 are shown in Figure 4. (B) An example of the RT–PCR results obtained for clusters whose expression level does not show clear tissue specificity in the six tissue types examined. (C) RT–PCR results with primers for 18S rRNA as a control.

Fig. 4

The complete genetic structure of the AFE cluster shown in Structure 1 in Figure 3A. The top red bar represents the already-known SBE gene and the other red bars represent the full-length cDNAs corresponding to the two 5′ end ESTs of this cluster. The yellow rectangles on the red bars represent the ORFs. The ORFs of the two full-length cDNAs are the longest ORFs predicted.

Table 3

Results of promoter element searches using MatInspector*

| Cluster | Annotation | TATA box (a, b, c); Initiator (a, b, c); GC box (a, b, c); CCAAT box (a, b, c) |
| --- | --- | --- |
| 1 | Starch branching enzyme | + + None + None None + None |
| 2 | N/A | None None None + None |
| 3 | N/A | None None None + + None |
| 4 | N/A | None + None None + + None |
| 5 | Mitogen-activated protein kinase | None + None + None + None |
| 6 | N/A | + + + |
| 7 | ATP/DTP translocator | + + None + None + None None |
| 8 | N/A | None None + None + None |
| 9 | N/A | + None + + None None + + None |
| 10 | Elongation factor | + + None + + None None + None |
| 11 | N/A | + None + + None + None None |
| 12 | N/A | + + + |

*The consensus sequence was either ‘+’ (observed) or ‘−’ (not observed).


ACKNOWLEDGEMENTS

We would like to thank Hitomi Itoh, Hiromi Komai, Goki Kashima and members at the Institute for Advanced Biosciences for their help and enlightening discussions. We also thank TMRI for providing the genomic sequence data. This work was supported by the Ministry of Agriculture, Forestry and Fisheries of Japan (Rice Genome Project SY-1104). This work was also supported by the Ministry of Education, Culture, Sports, Science and Technology of Japan, through ‘the 21st Century COE Program’ and ‘Special Coordination Funds Promoting Science and Technology’.

REFERENCES

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.

Baga, M., Glaze, S., Mallard, C.S., Chibbar, R.N. (1999) A starch-branching enzyme gene in wheat produces alternatively spliced transcripts. Plant Mol. Biol., 40, 1019–1030.

Bai, X., Peirson, B.N., Dong, F., Xue, C., Makaroff, C.A. (1999) Isolation and characterization of SYN1, a RAD21-like gene essential for meiosis in Arabidopsis. Plant Cell, 11, 417–430.

Carninci, P., Kvam, C., Kitamura, A., Ohsumi, T., Okazaki, Y., Itoh, M., Kamiya, M., Shibata, K., Sasaki, N., Izawa, M., et al. (1996) High-efficiency full-length cDNA cloning by biotinylated CAP trapper. Genomics, 37, 327–336.

Florea, L., Hartzell, G., Zhang, Z., Rubin, G.M., Miller, W. (1998) A computer program for aligning a cDNA sequence with a genomic DNA sequence. Genome Res., 8, 967–974.

Hamada, S., Ito, H., Hiraga, S., Inagaki, K., Nozaki, K., Isono, N., Yoshimoto, Y., Takeda, Y., Matsui, H. (2002) Differential characteristics and subcellular localization of two starch-branching enzyme isoforms encoded by a single gene in Phaseolus vulgaris L. J. Biol. Chem., 277, 16538–16546.

Hayhurst, G.P., Bryant, L.A., Caswell, R.C., Walker, S.M., Sinclair, J.H. (1995) CCAAT box-dependent activation of the TATA-less human DNA polymerase alpha promoter by the human cytomegalovirus 72-kilodalton major immediate–early protein. J. Virol., 69, 182–188.

Hu, Z.Z., Zhuang, L., Meng, J., Leondires, M., Dufau, M.L. (1999) The human prolactin receptor gene structure and alternative promoter utilization: the generic promoter hPIII and a novel human promoter hP(N). J. Clin. Endocrinol. Metab., 84, 1153–1156.

Hugnot, J.P., Gilgenkrantz, H., Vincent, N., Chafey, P., Morris, G.E., Monaco, A.P., Berwald-Netter, Y., Koulakoff, A., Kaplan, J.C., Kahn, A., et al. (1992) Distal transcript of the dystrophin gene initiated from an alternative first exon and encoding a 75-kDa protein widely distributed in nonmuscle tissues. Proc. Natl Acad. Sci., USA, 89, 7506–7510.

Kelner, M.J., Bagnell, R.D., Montoya, M.A., Estes, L.A., Forsberg, L., Morgenstern, R. (2000) Structural organization of the microsomal glutathione S-transferase gene (MGST1) on chromosome 12p13.1–13.2. Identification of the correct promoter region and demonstration of transcriptional regulation in response to oxidative stress. J. Biol. Chem., 275, 13000–13006.

Kikuchi, S., Satoh, K., Nagata, T., Kawagashira, N., Doi, K., Kishimoto, N., Yazaki, J., Ishikawa, M., Yamada, H., Ooka, H., et al. (2003) Collection, mapping, and annotation of over 28,000 cDNA clones from japonica rice. Science, 301, 376–379.

Lee-Kirsch, M.A., Gaudet, F., Cardoso, M.C., Lindpaintner, K. (1999) Distinct renin isoforms generated by tissue-specific transcription initiation and alternative splicing. Circ. Res., 84, 240–246.

Maruyama, K. and Sugano, S. (1994) Oligo-capping: a simple method to replace the cap structure of eukaryotic mRNAs with oligoribonucleotides. Gene, 138, 171–174.

Mizuno, K., Kimura, K., Arai, Y., Kawasaki, T., Shimada, H., Baba, T. (1992) Starch branching enzymes from immature rice seeds. J. Biochem., 112, 643–651.

Nakhei, H., Lingott, A., Lemm, I., Ryffel, G.U. (1998) An alternative splice variant of the tissue specific transcription factor HNF4alpha predominates in undifferentiated murine cell types. Nucleic Acids Res., 26, 497–504.

Oberbaumer, I., Moser, D., Bachmann, S. (1998) Nitric oxide synthase 1 mRNA: tissue-specific variants from rat with alternative first exons. Biol. Chem., 379, 913–919.

Quandt, K., Frech, K., Karas, H., Wingender, E., Werner, T. (1995) MatInd and MatInspector: new fast and versatile tools for detection of consensus matches in nucleotide sequence data. Nucleic Acids Res., 23, 4878–4884.

Sun, Q.A., Zappacosta, F., Factor, V.M., Wirth, P.J., Hatfield, D.L., Gladyshev, V.N. (2001) Heterogeneity within animal thioredoxin reductases. Evidence for alternative first exon splicing. J. Biol. Chem., 276, 3106–3114.

Suzuki, Y., Taira, H., Tsunoda, T., Mizushima-Sugano, J., Sese, J., Hata, H., Ota, T., Isogai, T., Tanaka, T., Morishita, S., et al. (2001a) Diverse transcriptional initiation revealed by fine, large-scale mapping of mRNA start sites. EMBO Rep., 2, 388–393.

Suzuki, Y., Tsunoda, T., Sese, J., Taira, H., Mizushima-Sugano, J., Hata, H., Ota, T., Isogai, T., Tanaka, T., Nakamura, Y., et al. (2001b) Identification and characterization of the potential promoter regions of 1031 kinds of human genes. Genome Res., 11, 677–684.

Takano, J., Watanabe, M., Hitomi, K., Maki, M. (2000) Four types of calpastatin isoforms with distinct amino-terminal sequences are specified by alternative first exons and differentially expressed in mouse tissues. J. Biochem., 128, 83–92.

Tsunoda, T. and Takagi, T. (1999) Estimating transcription factor bindability on DNA. Bioinformatics, 15, 622–630.

Wu, J., Cramer, C., Hatzios, K.K. (1998) Isolation of a full-length cDNA encoding the second glutathione S-Transferase from Rice (Oryza sativa) (Accession No. AF062403). (PGR98-136). Plant Physiol., 118, 329.

Zavolan, M., van Nimwegen, E., Gaasterland, T. (2002) Splice variation in mouse full-length cDNAs identified by mapping to the mouse genome. Genome Res., 12, 1377–1385.