Summary: Most genes in human, mouse and rat produce more than one transcript isoform. The Affymetrix Exon Array is a tool for studying the many processes that regulate RNA production, with separate probesets measuring RNA levels at known and putative exons. To provide insight into how exon levels vary between normal tissues, we constructed the Affy Exon Tissues track from tissue data published by Affymetrix. This track reports exon probeset intensities as log ratios relative to median values across the dataset and renders them as colored heat maps, allowing quick visual identification of exons with intensities that vary between normal tissues.
Availability: The Affy Exon Tissues track is freely available through the UCSC Genome Browser (http://genome.ucsc.edu/) for human (hg18), mouse (mm8 and mm9), and rat (rn4).
Supplementary information: Supplementary data are available at Bioinformatics online.
The mammalian transcriptome is complex. By recent estimates, as many as 94% of human genes undergo alternative splicing (Wang et al., 2008). Alternative splicing is consequential as well as frequent, with effects ranging from altering protein structure to targeting mRNA for early decay (Hartmann and Valcarcel, 2009). Furthermore, mammalian genomes contain an abundance of non-coding RNA genes (Chu and Rana, 2007). In short, to understand the consequences of transcription, one must look beyond overall expression levels of known genes.
Affymetrix exon arrays facilitate transcriptome analysis with probesets that measure RNA abundance for individual exons, conserved genomic regions, and blocks from syntenic alignments (see http://www.affymetrix.com/support/technical/technotes/exon_array_design_technote.pdf). These arrays have offered new insights on how transcript isoforms may be influenced by a myriad of factors including tissue type (Clark et al., 2007), genetic variation (Kwan et al., 2007; Zhang et al., 2009), differentiation (Yeo et al., 2007), and disease (Soreq et al., 2008; Thorsen et al., 2008).
Alternative splicing can arise through normal, regulated processes, or through abnormalities such as mutation, disease, and environmental stress (Yeo et al., 2005). Before one can understand the abnormal conditions, it is valuable to understand the scope of normal alternative splicing by comparing splicing patterns between normal tissues. To facilitate this, we have provided the Affy Exon Tissues tracks in the UCSC Genome Browser (Kuhn et al., 2009), depicting exon probeset intensities in normal tissues in human, mouse, and rat.
The Affy Exon Tissues track consists of two parts: genomic coordinates of the exon array probesets; and a heat map indicating exon probeset intensities in normal tissues, based on data available from http://www.affymetrix.com/support/technical/sample_data/exon_array_data.affx. Briefly, normal tissues were assayed in triplicate, and were analyzed with the Affymetrix Power Tools software (http://www.affymetrix.com/partners_programs/programs/developer/tools/powertools.affx) to produce normalized, background-corrected probeset intensities. For each probeset, we computed its median intensity for each tissue, and then the median of these median values. For each experiment, we calculated the log ratio between the probeset intensity and this median value. For numeric stability, we added a fixed, background-level pseudocount to each observation, which also renders probesets with no expression as constant-valued. The genome browser renders these log ratios as blue–white–red (shown), green–red, or yellow–blue heat maps, with the color selection controlled via the track's details page. Additional details are provided in the Supplementary Material.
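The log-ratio calculation described above can be illustrated with a minimal sketch for a single probeset. This is not the code used to build the track: the function name `probeset_log_ratios`, the pseudocount value of 16, and the choice of base-2 logarithms are assumptions made for illustration only, and the input is assumed to be normalized, background-corrected intensities grouped by tissue.

```python
import math
from statistics import median


def probeset_log_ratios(intensities, pseudocount=16.0):
    """Log ratios for one probeset (illustrative sketch).

    intensities: dict mapping tissue name -> list of replicate intensities.
    pseudocount: hypothetical background-level constant (assumed value).
    Returns a dict mapping tissue name -> list of log2 ratios, one per replicate.
    """
    # Add the fixed pseudocount to every observation.
    adjusted = {
        tissue: [value + pseudocount for value in replicates]
        for tissue, replicates in intensities.items()
    }
    # Median intensity within each tissue, then the median of those medians.
    tissue_medians = [median(replicates) for replicates in adjusted.values()]
    reference = median(tissue_medians)
    # Log ratio of each experiment relative to the median of medians.
    # Base 2 is an assumption; the text does not state the base.
    return {
        tissue: [math.log2(value / reference) for value in replicates]
        for tissue, replicates in adjusted.items()
    }


example = {
    "muscle": [1000.0, 1200.0, 1100.0],
    "liver": [100.0, 120.0, 110.0],
    "brain": [10.0, 12.0, 11.0],
}
ratios = probeset_log_ratios(example)

# A probeset with no signal in any tissue yields a constant log ratio.
silent = probeset_log_ratios({"muscle": [0.0] * 3, "liver": [0.0] * 3})
```

In this sketch, positive values mark tissues where the probeset exceeds its cross-tissue median and negative values mark tissues below it. Because the pseudocount is a constant, adding it before or after taking medians gives the same reference value.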
Figure 1a shows the Affy Exon Tissues track for TPM2 in mm9. The constitutive exons (those included in all transcripts) indicate that TPM2 is expressed most strongly in muscle and embryo, with some expression in ovary. TPM2 has a well-documented pattern of tissue-dependent splicing, with one isoform produced in skeletal muscle tissue and another in non-muscle tissue (Gooding and Smith, 2008). This pattern is apparent in the two mutually exclusive exons (third and fourth from the left): one is highly expressed (red) in muscle and embryo (a heterogeneous tissue), while the other is highly expressed in ovary.
For contrast, Figure 1b shows an unannotated conserved region on chromosome 1 in mm9. While it does not overlap with any known gene, its red (up-regulated) log intensity in brain suggests brain-specific expression. This illustrates how these data can offer insights into regions with no annotation but strong conservation.
The Affy Exon Tissues track displays exon probeset intensities in human, mouse, and rat tissues, including breast, cerebellum, heart, kidney, liver, muscle, pancreas, prostate, spleen, testes, and thyroid. In contrast to traditional microarray tracks such as the GNF Expression Atlas (Su et al., 2004), which provide one measure of overall expression per gene and cannot report any transcript variation, the Affy Exon Tissues track offers the ability to compare intensities of neighboring probesets and observe alternative promoter usage, polyadenylation, and splicing. Exon probeset intensities are rendered as heat maps to offer rapid visual identification of exons that vary under normal cellular conditions.
Besides the Affy Exon Tissues track, the UCSC Genome Browser currently hosts the hg18 Sestan Brain exon expression track, which contrasts exon probeset intensities between sections of the brain (Johnson et al., 2009). This set of tracks may expand further as additional datasets become available, offering further insights into transcript variation in the mammalian genomes.
Many people contributed to this track. The authors would like to thank Affymetrix in general and Alan Williams in particular for the track data. This track reflects the work of many individuals in the UCSC Genome Browser team, including Bob Kuhn, Donna Karolchik and Jim Kent. M.S.C. thanks Manuel Ares Jr. for his mentorship and encouragement.
Funding: National Human Genome Research Institute (2 P41 HG002371-06 to UCSC Center for Genomic Science, 3 P41 HG002371-06S1 ENCODE supplement to UCSC Center for Genomic Science); National Cancer Institute (Contract No. N01-CO-12400 for Mammalian Gene Collection); NIH GM-040478 (to M.S.C.)
Conflict of Interest: none declared.