Aberrant mRNA Transcripts and the Nonsense-Mediated Decay Proteins UPF2 and UPF3 Are Enriched in the Arabidopsis Nucleolus

The eukaryotic nucleolus is multifunctional and involved in the metabolism and assembly of many different RNAs and ribonucleoprotein particles as well as in cellular functions, such as cell division and transcriptional silencing in plants. We previously showed that Arabidopsis thaliana exon junction complex proteins associate with the nucleolus, suggesting a role for the nucleolus in mRNA production. Here, we report that the plant nucleolus contains mRNAs, including fully spliced, aberrantly spliced, and single exon gene transcripts. Aberrant mRNAs are much more abundant in nucleolar fractions, while fully spliced products are more abundant in nucleoplasmic fractions. The majority of the aberrant transcripts contain premature termination codons and have characteristics of nonsense-mediated decay (NMD) substrates. A direct link between NMD and the nucleolus is shown by increased levels of the same aberrant transcripts both in the nucleolus and in Up-frameshift (upf) mutants impaired in NMD. In addition, the NMD factors UPF3 and UPF2 localize to the nucleolus, suggesting that the Arabidopsis nucleolus is involved in identifying aberrant mRNAs and in NMD.

nuclei, and whole cells using the GeneRacer system (Invitrogen) according to the manufacturer's protocols. In brief, total RNA was dephosphorylated prior to decapping of 5′ ends of mRNA by tobacco acid pyrophosphatase and ligation to the RNA oligonucleotide adaptor (5′-CGACUGGAGCACGAGGACACUGACAUGGACUGAAGGAGUAGAAA-3′) with T4 RNA ligase. This procedure enriched for full-length, poly(A) oligo(dT)-adaptor [GCTGTCAACGATACGCTACGTAACGGCAT- 18-24 and 9 oligonucleotide fidelity cloning sequenced.


INTRODUCTION
The eukaryotic nucleus is highly organized and contains nuclear bodies and domains involved in a variety of aspects of regulation of gene expression (Lamond and Spector, 2003; Misteli, 2005). The most prominent nuclear subcompartment is the nucleolus, which is the site of transcription and processing of precursor rRNAs and the assembly of mature rRNAs and ribosomal proteins into ribosomal subunits (Fatica and Tollervey, 2002; Granneman and Baserga, 2004). Besides the production of ribosomal subunits, the nucleolus is involved in many other RNA metabolism processes, as well as other cellular functions. For example, the nucleolus has roles in the maturation, assembly, and export of ribonucleoprotein particles (RNPs), such as the signal recognition particle and telomerase RNPs, some tRNAs, and U6 snRNA, and it functions in cell cycle control, cell growth, aging, and cellular stress sensing (Pederson, 1998; Rubbi and Milner, 2003; Olsen, 2004; Raška et al., 2006; Boisvert et al., 2007). Recently, some microRNAs and precursor microRNAs have been found to be concentrated in the nucleolus of rat myoblasts (Politz et al., 2006, 2009). The nucleolus is also involved in many animal and plant virus infections, with specific viral proteins interacting with host nucleolar proteins and localizing to the nucleolus (Williams et al., 2005; Boyne and Whitehouse, 2006; Hiscox, 2007). In plants, nucleolar trafficking of the ORF3 protein of Groundnut Rosette Virus involves interactions with Cajal bodies and fibrillarin and is required for viral RNP production and systemic virus spread (Kim et al., 2007a, 2007b).
Finally, novel functions in RNA metabolism have been proposed for the plant nucleolus and bodies often associated with the nucleolus (Cajal and D-bodies): production of heterochromatic small interfering RNAs involved in transcriptional gene silencing (Pontes et al., 2006) and maturation of microRNAs (Fang and Spector, 2007; Fujioka et al., 2007; Song et al., 2007). Thus, the nucleolus is multifunctional and is associated with processing of many RNAs and dynamic interactions and trafficking of RNA components between nuclear compartments, the nucleoplasm, and cytoplasm (Lamond and Spector, 2003; Olsen, 2004; Gorski et al., 2006; Raška et al., 2006; Boisvert et al., 2007).
Plant nucleoli differ from animal nucleoli in their structural organization. The plant nucleolus contains a much larger proportion of dense fibrillar component (DFC) than animal nucleoli. The DFC is surrounded by the granular component, and plant nucleoli often have a nucleolar cavity (Brown and Shaw, 1998; Shaw and Brown, 2004). Nevertheless, the fundamental processes of ribosome biogenesis are very similar in all eukaryotes, as reflected by 70% of the 217 proteins identified in proteomic analyses of Arabidopsis thaliana nucleoli having orthologs in the human nucleolar proteome (Pendle et al., 2005). In addition to nucleolar and ribosome biogenesis proteins, splicing and translation factors as well as six exon-junction complex (EJC) proteins (Y14, Mago, eIF4A-III, RNPS1, UAP56, and REF/Aly) were also identified in the plant nucleolar proteome (Pendle et al., 2005). EJCs are deposited upstream of splice junctions on mRNAs following splicing. They contain proteins involved in splicing, mRNA export, and localization and nonsense-mediated decay (NMD), thereby linking transcription and splicing to export, surveillance, and translation (Maquat, 2004; Tange et al., 2004; Lejeune and Maquat, 2005; Behm-Ansmant et al., 2007b; Chang et al., 2007; Isken and Maquat, 2007). In particular, the plant EJC proteins found in the nucleolus were either orthologs of the core of the EJC that binds mRNA or were factors that interact with the TAP export factor (Maquat, 2004; Tange et al., 2004; Lejeune and Maquat, 2005; Andersen et al., 2006; Stroupe et al., 2006; Chang et al., 2007; Isken and Maquat, 2007). We further showed that green fluorescent protein (GFP) fusion proteins of these and other EJC proteins localized to the nucleolus in contrast with their animal orthologs, suggesting that messenger RNPs (mRNPs) may be present in the plant nucleolus (Pendle et al., 2005).
In this article, we show that the nucleolus and the nucleoplasm contain different profiles of mRNA transcripts and that aberrantly spliced mRNAs are more abundant in the nucleolus. In addition, at least some of the aberrantly spliced transcripts are substrates for NMD, and the Up-frameshift (UPF) NMD proteins UPF2 and UPF3 are localized in the nucleolus, suggesting that the Arabidopsis nucleolus is involved in recognition of aberrantly spliced mRNAs and the NMD pathway.

The Plant Nucleolus Is Enriched in Aberrantly Spliced mRNAs
To examine whether mRNAs are present in plant nucleoli, we constructed cDNA libraries from poly(A)+ RNA isolated from whole cells, isolated nuclei, and purified nucleoli from Arabidopsis culture cells. Around 1,000 cDNA clones were completely sequenced (approximately 300 each from the whole-cell and nuclear libraries and 400 from the nucleolar library), of which 497 were full-length (Figure 1). The cDNA sequences were grouped into three classes of mRNA: 1) transcripts from single exon genes (which do not contain introns and therefore have not undergone splicing); 2) fully spliced mRNAs where all introns in a transcript have been removed; and 3) "aberrantly" spliced mRNAs (Figure 1). The latter class consists of potentially mis-spliced mRNAs (splicing errors), incompletely spliced mRNAs (where one or more introns have not been removed), and alternatively spliced mRNA transcripts. Given our relatively limited knowledge of alternative splicing events for many plant genes, it is difficult to distinguish among mis-splicing, incomplete splicing, and alternative splicing.
The distribution of mRNA classes in the whole-cell, nuclear, and nucleolar libraries was strikingly different (Figure 1). While the proportion of single exon gene transcripts was similar in the three libraries (ranging from 16 to 20%), the proportion of fully spliced transcripts was highest in the whole-cell library (82%) and lower in the nuclear (68%) and nucleolar (42%) libraries (Figure 1). By contrast, aberrantly spliced mRNAs represented only 2% of the transcripts in the whole-cell library, and this proportion rose to 13 and 38% in the nuclear and nucleolar libraries, respectively. The abundance of aberrantly spliced transcripts in the whole-cell library (2%) is similar to that observed generally in plant cDNA libraries and EST collections. The nucleolar preparation used for nucleolar cDNA library construction was the same as that used in the proteomic analysis of Arabidopsis nucleoli (Pendle et al., 2005). The profile of identified proteins and subsequent GFP fusion protein localizations showed this fraction to be highly enriched in nucleolar proteins with ~10% contamination from nuclear/cytoplasmic proteins (Pendle et al., 2005). Although some contamination is expected in biochemical fractionation, contamination from the nucleoplasm or cytoplasm cannot account for the presence of mRNAs in the nucleolar fraction because of the significantly different composition of mRNA types. If the mRNAs isolated in the nucleolar library were due to nucleoplasmic/cytoplasmic contamination, the proportions of fully spliced, aberrantly spliced, and single exon transcripts in the nucleolus would reflect those found in the whole-cell or nuclear libraries.
This is clearly not the case because while the proportion of single exon transcripts remains constant, the level of fully spliced transcripts is halved and that of aberrantly spliced transcripts increases 20- and 3-fold in the nucleolar library over the levels found in whole-cell and nuclear libraries, respectively (Figure 1). Thus, the plant nucleolus contains mRNAs that are enriched in aberrantly spliced transcripts such that almost half of the transcripts from intron-containing genes are aberrantly spliced.

Figure 1. The distribution of fully spliced, aberrantly spliced, and single exon transcripts in whole-cell, nuclear, and nucleolar cDNA libraries expressed as percentage of total full-length cDNAs from the same library (226, 139, and 132 total cDNAs from the nucleolar, nuclear, and whole-cell libraries, respectively). Only full-length, nonredundant cDNAs were scored.
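The fold changes quoted above can be re-derived from the library percentages given in the text. The following is a minimal sketch only: the dictionary layout and function names are illustrative and not part of the study, and the single exon share is assumed to be the remainder to 100%, which the text reports only as a 16 to 20% range.

```python
# Percentages of full-length cDNAs per class, as quoted in the text for Figure 1.
LIBRARY_PERCENT = {
    "whole_cell": {"fully_spliced": 82, "aberrant": 2},
    "nuclear": {"fully_spliced": 68, "aberrant": 13},
    "nucleolar": {"fully_spliced": 42, "aberrant": 38},
}


def single_exon_percent(library):
    """Assumption: the three classes sum to 100%, so single exon is the remainder."""
    classes = LIBRARY_PERCENT[library]
    return 100 - classes["fully_spliced"] - classes["aberrant"]


def fold_change(transcript_class, library, reference):
    """Ratio of a class's percentage in one library to that in a reference library."""
    return (LIBRARY_PERCENT[library][transcript_class]
            / LIBRARY_PERCENT[reference][transcript_class])


def aberrant_share_of_spliced(library):
    """Fraction of transcripts from intron-containing genes that are aberrantly spliced."""
    classes = LIBRARY_PERCENT[library]
    return classes["aberrant"] / (classes["aberrant"] + classes["fully_spliced"])


if __name__ == "__main__":
    print(fold_change("aberrant", "nucleolar", "whole_cell"))
    print(fold_change("aberrant", "nucleolar", "nuclear"))
    print(fold_change("fully_spliced", "nucleolar", "whole_cell"))
    print(aberrant_share_of_spliced("nucleolar"))
```

With these inputs the aberrant class is 19-fold and about 2.9-fold higher in the nucleolar library than in the whole-cell and nuclear libraries (rounded in the text to 20- and 3-fold), fully spliced transcripts fall to about half of the whole-cell level, and 47.5% of the nucleolar transcripts from intron-containing genes are aberrantly spliced.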

The Plant Cell

The Majority of Aberrantly Spliced mRNAs Are Potential NMD Substrates
We next examined the nature of the aberrant mRNAs that accumulated in the nucleolus and whether there were implications for gene expression. The most common event was the presence of an unspliced intron, occurring in almost 75% of the aberrantly spliced transcripts (Figures 2A, 2G, and 3A). The remainder exhibited a range of different splicing phenotypes such as the use of cryptic 5′ or 3′ splice sites in intron or exon sequences or exon skipping (Figures 2B to 2F, 2H, 2I, and 3A; see Supplemental Table 1 online). Some transcripts contained more than one aberrant splicing event (e.g., Figure 2D).
On the basis of the presence and position of premature termination codons (PTCs), 91% of the aberrantly spliced transcripts were classified as putative NMD substrates according to current models of long 3′ untranslated region (UTR)- and intron-based PTC recognition in plants (Kertész et al., 2006; Schwartz et al., 2006; Hori and Watanabe, 2007; Kerényi et al., 2008) (Figure 3B; see Supplemental Table 1 online). All had an increased distance between the PTC and 3′ UTR, and two-thirds also contained one or more splice junctions (and therefore EJCs) downstream of the PTC (Figures 2 and 3B). The remaining aberrant transcripts (~9%) were not expected to produce NMD substrates as they caused in-frame changes (Figure 2I) or a change in reading frame in the last exon without significantly affecting the position of the stop codon (Figure 3B). Finally, some aberrant splicing events affected introns in the 5′ or 3′ UTRs (Figures 2G and 2H). In particular, those in the 5′ UTR either had no effect on the reading frame or generated a short open reading frame (ORF) upstream of the normal translation initiation codon that could affect translational efficiency (Meijer and Thomas, 2002) or trigger NMD (He et al., 2003; Mendell et al., 2004) (see Supplemental Table 1 online).
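The classification logic described above (long 3′ UTR and/or a splice junction downstream of the stop codon) can be sketched as a simple rule. This is an illustration under stated assumptions, not the procedure used in the study: the two numeric thresholds, the `Transcript` fields, and the `classify` function are hypothetical, and the text does not give the cutoffs that were applied.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical thresholds, NOT taken from the paper; adjust to the model in use.
MIN_JUNCTION_DISTANCE_NT = 50   # assumed minimum stop-to-junction distance
LONG_UTR_NT = 350               # assumed length at which a 3' UTR counts as long


@dataclass
class Transcript:
    length: int                # total mRNA length in nucleotides
    stop_end: int              # position of the last base of the first in-frame stop codon
    exon_junctions: List[int]  # positions of exon-exon junctions on the mRNA


def classify(transcript):
    """Assign one of the three categories used in Figure 3B."""
    utr_length = transcript.length - transcript.stop_end
    has_downstream_junction = any(
        junction - transcript.stop_end >= MIN_JUNCTION_DISTANCE_NT
        for junction in transcript.exon_junctions
    )
    if has_downstream_junction:
        return "NMD substrate, downstream EJC"
    if utr_length >= LONG_UTR_NT:
        return "NMD substrate, no downstream EJC"
    return "non-NMD substrate"
```

For example, under these assumed thresholds a 1,500-nucleotide transcript whose retained intron introduces a stop ending at position 400, with junctions at 600 and 900, falls in the first category, whereas one whose stop ends at position 1,350 with no junction beyond it is classed as a non-NMD substrate.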

Distribution of mRNAs in Nuclear, Nucleolar, and Nucleoplasmic Fractions
Having shown by sequencing of cDNA libraries that the nucleolus contained mRNAs and was enriched in aberrant mRNAs, we validated these results by RT-PCR of total RNA from isolated nuclei and nucleolar and nucleoplasmic fractions. Initial RT-PCR analyses showed multiple higher molecular weight products for many genes (Figures 4 and 5), making a standard quantitative RT-PCR approach impossible as multiple primer pairs would be required to assess each product separately. Therefore, we used gene-specific primers (designed to the 5′ and 3′ regions of the coding sequence) to amplify all splicing isoforms in the same reaction to compare directly the relative levels of fully spliced and aberrantly spliced transcripts from the same gene. Extensive testing of PCR conditions was performed on a number of genes with known alternatively spliced isoforms and with a range of expression levels. For the majority of genes tested, the ratios of alternatively spliced isoforms remained constant (i.e., in the logarithmic amplification range) over 22 to 24 cycles (see Supplemental Figure 1 online; see Methods). Gene-specific primers were designed to a number of genes whose transcripts had been cloned in the nucleolar cDNA library either as aberrant transcripts or normal, fully spliced transcripts and to genes that were not represented in any of the libraries (effectively randomly selected). Following an RT reaction with oligo(dT) using total RNA from nuclei, and nucleolar and nucleoplasmic fractions, PCR was performed for 24 cycles. The purity of the nucleoplasmic and nucleolar preparations was assessed using RT-PCR with primers to amplify the spliceosomal small nuclear RNAs, U1 and U2, and the small nucleolar RNA, U3 (Figure 4). U1 and U2 were enriched in the nucleoplasmic fractions, while U3 was enriched in the nucleolar fractions.
The presence of some U1 and U2 in plant nucleoli has been shown previously by in situ hybridization (Beven et al., 1995) and localization of U1 and U2 small nuclear RNPs (snRNPs) (Lorković and Barta, 2008). Thus, although some cross-contamination is expected, the fractions are enriched as expected for nucleoplasmic or nucleolar snRNAs.
For virtually all of the genes tested, fully spliced transcripts were more abundant in nucleoplasmic RNA, while the higher molecular weight transcripts (unspliced/aberrantly spliced) were more abundant in nucleolar RNA (Figures 4 and 6, lanes 1 and 2). We showed that the higher molecular weight RT-PCR products represented differently spliced transcripts by cloning and sequencing the products from some genes (see Supplemental Figure 2 online). The enrichment of aberrant, higher molecular weight products and the reduction of fully spliced products in the nucleolus was found consistently in RT-PCR reactions with RNA from numerous different nuclear, nucleoplasmic, and nucleolar preparations from culture cells and was significantly different for most genes tested (Figure 5). Thus, the RT-PCR results corroborate those obtained from cloning and sequencing from cDNA libraries (Figure 1). The relative proportions of higher molecular weight transcripts to fully spliced transcripts in the nucleus were estimated from the relative band intensities in the RT-PCR reactions of nuclear preparations (Figure 4, N lanes) and varied greatly among different genes, ranging from 20 to 50% for the genes studied (Figures 4 and 5). Higher molecular weight aberrant transcripts were detectable for most genes tested in nuclear and, in particular, nucleolar RNA preparations. They were more difficult to detect in RNA from whole cells or seedling material under the PCR conditions used. This reflects the much higher levels of spliced mRNAs derived from the cytoplasm in this material and again is consistent with the 1 to 2% of plant ESTs consisting of transcripts with, for example, unspliced introns (e.g., Figure 1, whole-cell library). Thus, production of aberrant transcripts (whether due to splicing errors or alternative splicing) is a common occurrence likely to require NMD for their turnover, and these mRNA transcripts are enriched in the nucleolus.
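Estimating relative proportions from band intensities amounts to expressing each band in a lane as a percentage of the summed signal for that lane. The sketch below shows that arithmetic only; the function names and the band labels (FS for fully spliced, AS1 and AS2 for aberrant products, following the labels used for the histograms) are illustrative, and any intensity values passed in are the user's own measurements, not data from this study.

```python
def relative_abundance(band_intensities):
    """Express each band in one lane as a percentage of the lane's total signal."""
    total = sum(band_intensities.values())
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return {band: 100.0 * value / total for band, value in band_intensities.items()}


def aberrant_share(band_intensities, fully_spliced_key="FS"):
    """Combined percentage of all bands other than the fully spliced product."""
    percentages = relative_abundance(band_intensities)
    return sum(value for band, value in percentages.items()
               if band != fully_spliced_key)
```

For a hypothetical lane with intensities of 300 (FS), 500 (AS1), and 200 (AS2) in arbitrary units, the fully spliced product accounts for 30% of the signal and the aberrant products together for 70%. Because all isoforms are amplified by the same primer pair in one reaction, such ratios are comparable within a lane only while amplification remains in the logarithmic range, as noted above.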

Figure 3. Distribution of Splicing and NMD Transcript Phenotypes.
(A) Distribution of different splicing phenotypes among 160 aberrantly spliced transcripts from the three cDNA libraries.
(B) Distribution of transcripts by potential consequence of the aberrant splicing event(s). NMD phenotypes were classified as potential NMD substrates with or without a downstream EJC, or non-NMD substrates where no PTC is generated or where the PTC remains close to the 3′ UTR due to a frame change in the last exon. ss, splice site.

Transcripts Enriched in the Nucleolus Are NMD Sensitive
The majority of aberrant mRNAs from the cDNA libraries contained PTCs and were predicted to be degraded by NMD (Figure 3B; see Supplemental Table 1 online). To demonstrate the link between nucleolar accumulation of particular transcripts and NMD, we performed RT-PCR reactions on total RNA isolated from whole seedlings of wild-type plants and plants mutant for the NMD proteins UPF1 and UPF3. We used two mutants, upf1-5 and upf3-1, which have been shown to be impaired in NMD such that mRNAs that are normally turned over by NMD accumulate in these mutants (Hori and Watanabe, 2005; Arciga-Reyes et al., 2006). RT-PCR was performed using gene-specific primers to At2g21660 and At3g61860 encoding the Gly-rich RNA binding protein, GRP7, and the SR protein, RSp31, respectively. Aberrant transcripts of both these genes were isolated in the nucleolar cDNA library and were detectable in seedling RNA using 24 cycles of PCR. Their alternatively spliced transcripts have also been characterized (Kalyna et al., 2006; Schöning et al., 2007, 2008). In particular, the levels of GRP7 transcripts are regulated by an autoregulatory loop that promotes the production of an alternatively spliced variant that is turned over by NMD (Schöning et al., 2007, 2008). RT-PCR of RNA from the upf mutants showed the accumulation of two higher molecular weight products for At2g21660 in the mutants (Figure 6A, lanes 4 and 5) compared with the wild type (Figure 6A, lane 3). These same products are enriched in the nucleolar fraction compared with the nucleoplasmic fraction (Figure 6A, lanes 1 and 2). The lower product (674 bp) is the alternatively spliced product of GRP7 that is regulated by alternative splicing and NMD; the identity of the larger product is unknown. RT-PCR of At3g61860 also showed two higher molecular weight products, the smaller of which was enriched in both the nucleolar fraction and the upf mutants and is therefore NMD sensitive (Figure 6B).
The larger of the two products represents a PTC+ transcript generated by use of alternative 59 and 39 splice sites. This transcript is stable and not turned over by NMD. Other naturally occurring NMD transcripts were analyzed in the original descriptions of the upf mutants (Hori and Watanabe, 2005;Arciga-Reyes et al., 2006), and RT-PCR analysis of one such gene, At2g45670, again showed that the PTC+/NMD transcript is enriched in the nucleolus and accumulates in the upf mutants (see Supplemental Figure 3 online). Thus, there is a correlation between the accumulation of aberrant mRNAs in the nucleolus and their turnover by NMD for at least some gene transcripts.

Plant NMD Proteins Are Associated with the Nucleolus
Since most of the aberrant mRNAs in the nucleolus were potential NMD substrates, we asked whether factors required for NMD were also present in the nucleolus. GFP fusion constructs to the NMD proteins UPF1, UPF2, and UPF3 were coexpressed with a fibrillarin-monomeric red fluorescent protein (mRFP) fusion construct (as a marker for the nucleolus) in Arabidopsis suspension culture cells. Although the fusion constructs were expressed from the cauliflower mosaic virus 35S promoter, cells that fluoresced at low levels were imaged to minimize potential problems due to overexpression of GFP fusion proteins (Pendle et al., 2005). GFP-UPF3 was localized predominantly in the nucleolus, GFP-UPF2 was found in both the nucleolus and cytoplasm, while GFP-UPF1 was largely cytoplasmic (Figures 7A to 7C, respectively). GFP-UPF1 exhibited a heterogeneous distribution in the cytoplasm (Figure 7C) with bright fluorescent signal concentrated in many relatively small foci, which might indicate the accumulation of UPF1 in specific compartments, possibly processing bodies (P-bodies) where mRNA decay factors are concentrated and where degradation of aberrant mRNA can occur (Eulalio et al., 2007; Parker and Sheth, 2007). The nucleolar localization of UPF3-GFP was confirmed in stably transformed live seedlings (see Supplemental Figure 4 online).

DISCUSSION
Here, we show the unexpected presence of mRNAs in the plant nucleolus and enhanced levels of aberrantly spliced mRNAs in the nucleolus compared with the nucleoplasm. The majority of the aberrant mRNAs in the nucleolus were potential NMD substrates by virtue of containing PTCs. Our data, using upf mutants impaired in NMD, demonstrate a direct link between aberrantly spliced mRNA transcripts that are enriched in the nucleolus and NMD such that at least some of the nucleolar aberrant mRNAs are targets for NMD. Finally, in addition to the presence of mRNAs in the nucleolus and the nucleolar association of plant EJC proteins (Pendle et al., 2005), we showed that the NMD factors UPF2 and UPF3 also localize to the nucleolus. These observations suggest that the plant nucleolus has a novel function in RNA metabolism by being involved in detection of aberrant mRNAs and the NMD pathway.

Figure 5. Histograms showing the mean and standard errors of the relative abundances (expressed as a percentage) of different higher molecular weight, aberrantly spliced products (AS1-5) and fully spliced (FS) products for the different fractions (white, nuclear; gray, nucleoplasmic; black, nucleolar). The number of different fractions used for RT-PCR is given in the inset for each histogram (N, nuclear; Np, nucleoplasmic; No, nucleolar). The combined means of the aberrantly spliced bands and mean of the fully spliced band were compared between the nuclear and nucleolar preparations by analysis of variance. Probability values for AS>FS in the nucleolus and FS>AS in the nucleoplasm are presented (sig, significant, P < 0.05; hs, highly significant, P < 0.01; ns, not significant).

The association of mRNAs with the nucleolus has been reported previously. In mammalian cells, evidence for mRNAs in the nucleolus is limited to a few spliced cellular mRNAs (e.g., c-myc), and the reason for their nucleolar localization is unknown (Bond and Wold, 1993; Pederson, 1998; Olsen, 2004). The nucleolus has also been implicated in mRNA export in eukaryotes based on the nucleolar accumulation of poly(A)+ RNA in mutants of export components (Pederson, 1998; Ideue et al., 2004; Olsen, 2004), although some of these observations may reflect polyadenylation of transcripts undergoing degradation following disruption of export (Carneiro et al., 2007). On the other hand, some viral mRNAs in mammalian cells are exported via the nucleolus. For example, herpesvirus saimiri spliced mRNA and human immunodeficiency virus type 1 unspliced and singly spliced mRNAs are bound by the viral proteins ORF57 and Rev, respectively, and transported to the nucleolus. These proteins also cause the redistribution of export factors allowing assembly of export-competent viral RNP complexes in the nucleolus (Williams et al., 2005; Boyne and Whitehouse, 2006; Hiscox, 2007). In the case of unspliced or partially spliced human immunodeficiency virus mRNAs, this export pathway has been suggested to protect the partially spliced mRNAs from the NMD machinery (Hiscox, 2007). More recently, a model has been proposed for nucleolar involvement in the formation of mRNPs that are localized to specific regions of the cytoplasm for translation. Localized mRNAs, such as yeast ASH1 mRNA, enter the nucleolus bound to specific RNA binding proteins. In the nucleolus, translation repressor proteins are loaded onto the mRNA before the mRNP is exported to the cytoplasm and its final destination, where translation is activated (Du et al., 2008; Jellbauer and Jansen, 2008).
A similar trafficking pathway may also operate in mammals for the formation of mRNPs associated with Staufen, an RNA binding protein found in the nucleolus and involved in transport of mRNAs in neurons. Thus, the eukaryotic nucleolus is involved in the assembly and export of some cellular and viral mRNAs/mRNPs. Here, the levels of aberrant mRNA transcripts that we have observed in the plant nucleolus are unprecedented. They also derive from a wide range of genes, suggesting that their presence in the nucleolus is part of a process to identify aberrant mRNAs, which is not restricted to specific genes or gene sets.
The majority of aberrant mRNA transcripts in the plant nucleolus contain PTCs and are putative or known targets of NMD. These characteristics raise questions of how PTC-containing (PTC+) aberrant mRNAs are identified in plants, how they become enriched in the nucleolus, and what the link to NMD is. NMD identifies PTC+ transcripts that encode truncated proteins that are either nonfunctional or potentially detrimental to cell function and targets them for rapid degradation (Maquat, 2004; Lejeune and Maquat, 2005; Behm-Ansmant et al., 2007b; Chang et al., 2007; Isken and Maquat, 2007; Shyu et al., 2008). NMD is a nucleus-associated and translation-dependent process and, in animal cells, the site of mRNA surveillance is thought to be on the cytoplasmic side of the nuclear pore complex in a pioneer round of translation as mRNPs exit the nucleus, with degradation occurring in the cytoplasm (Maquat, 2004; Lejeune and Maquat, 2005; Behm-Ansmant et al., 2007b; Chang et al., 2007; Isken and Maquat, 2007; Singh et al., 2007; Shyu et al., 2008). The mechanisms of PTC recognition vary among different species but can be summarized as relying on signals downstream of the PTC and, in particular, 3′ UTRs and associated proteins, or being splicing and EJC dependent (Bühler et al., 2004, 2006; Amrani et al., 2006; Rehwinkel et al., 2006; Behm-Ansmant et al., 2007a; Chang et al., 2007; Isken and Maquat, 2007; Shyu et al., 2008; Stalder and Mühlemann, 2008). In plants, NMD is also triggered by the distance between the PTC and 3′ UTR (long 3′ UTRs) and/or the presence of a downstream exon-exon junction (intron-based NMD) (Kertész et al., 2006; Schwartz et al., 2006; Hori and Watanabe, 2007; Kerényi et al., 2008). These features are evident not only in NMD-sensitive protein-coding gene transcripts, but also in mRNA-like noncoding RNAs, which are also turned over by NMD in Arabidopsis (Kurihara et al., 2009).
Plant orthologs of EJC proteins and the NMD proteins UPF1, UPF2, UPF3, and SMG-7 have been identified (Pendle et al., 2005; van Hoof and Green, 2006; Behm-Ansmant et al., 2007b), and the involvement of plant Mago, Y14, UPF1, UPF2, UPF3, and SMG-7 in the NMD pathway has been shown by the stabilization of PTC+ transcripts either in viable mutants or when the proteins are reduced by silencing (Hori and Watanabe, 2005; Arciga-Reyes et al., 2006; Yoine et al., 2006; Kertész et al., 2006; Wu et al., 2007; Kerényi et al., 2008). In mammals, UPF2 interacts with UPF1 and UPF3, and the order of association of UPF proteins with EJCs is UPF3, UPF2, and finally UPF1, which is important in activating NMD (Kashima et al., 2006). In plants, UPF2 also links UPF3 and UPF1 (Kerényi et al., 2008). Despite these functional similarities, the localization of the Arabidopsis UPF3 and UPF2 to the nucleolus contrasts with that of their mammalian orthologs in that human UPF3 is primarily nuclear, and human UPF2 and UPF1 are primarily cytoplasmic proteins (Lykke-Andersen et al., 2000; Serin et al., 2001). The fact that Arabidopsis UPF1 is cytoplasmic suggests that NMD occurs in the cytoplasm. Although we are, at present, unable to distinguish whether the localization of UPF2 and UPF3 in the nucleolus reflects functional activity in the NMD pathway or some other function, the clear difference in localization of these proteins between plants and animals is unexpected. Thus, while there is some underlying conservation in both mechanisms and components of the NMD pathway between plants and other eukaryotes, our results suggest that there are potentially fundamental differences in the organization of the NMD pathway in plants that involve a function for the nucleolus.
The distinct profiles of fully spliced and aberrantly spliced transcripts in the nucleoplasm and nucleolus suggest that, in plants, different classes of mRNAs are distinguished prior to exit from the nucleus, leading to enrichment of aberrant mRNAs in the nucleolus. Aberrant transcripts could be distinguished from normal mRNAs in a number of ways. First, aberrant mRNAs could be discriminated by an NMD-like mRNA surveillance mechanism occurring either in the nucleoplasm to target aberrant mRNAs to the nucleolus or in the nucleolus itself. There is some evidence in favor of nuclear NMD in mammals (Bühler et al., 2002, 2004, 2006). However, any mechanism involving the detection of stop codons would presumably have to involve an engaged ribosome, in a nuclear or nucleolar pioneer round of translation, since the ribosome is currently the only known factor that can interpret the reading frame and thus detect stop codons. Evidence has been published for nuclear translation, but this idea is still hotly disputed (Iborra et al., 2001; Dahlberg and Lund, 2004). On the other hand, if mRNA surveillance occurs as mRNPs exit the nuclear pore complex as in animals, the aberrant transcripts would have to be reimported into the nucleus/nucleolus. Although export and reimport of RNAs from the cytoplasm is well established in the biogenesis of small nuclear RNAs and some tRNAs (Hopper and Shaheen, 2008), it is difficult to see the advantage to the cell of exporting aberrant mRNAs to allow cytoplasmic surveillance and reimporting them to the nucleolus. A second possibility is that aberrant mRNPs are identified by virtue of differences in their mRNP composition. Plant introns are UA rich (up to 80%) with on average 20% lower GC content than exon sequences. U-rich sequences are required for efficient splicing and are thought to bind U-rich binding proteins as an early intron recognition step prior to spliceosome formation (Lorković et al., 2000). Given that the majority of the aberrant nucleolar mRNAs found in our study contained either unspliced introns or intron fragments, U-rich binding proteins could mark the aberrant transcripts and target them to the nucleolus by an unknown mechanism.
Third, normally spliced and aberrantly spliced mRNPs could be distinguished at the level of export, with the export of aberrant mRNPs inhibited through the interaction or lack of interaction between particular proteins on the mRNP and the nuclear pore complex. In yeast, spliced transcripts are discriminated by Mlp proteins associated with the nuclear pore complex, which interact with the hnRNP-like protein Nab2p, present on spliced mRNAs, allowing them to be exported (Galy et al., 2004). In animals, splicing commitment factors that recognize splice sites or intronic sequences prevent intron-containing transcripts from being exported (Stutz and Izaurralde, 2003). A similar mechanism of differential export could operate in plants, again dependent on the presence of specific proteins (e.g., U-rich binding proteins) bound to introns or intron fragments in aberrant mRNAs. The latter possibilities would provide a general mechanism for the identification of transcripts containing introns or intron fragments irrespective of which gene they are transcribed from.
How do aberrant transcripts become enriched in the nucleolus and what is the link to NMD? Current ideas of nuclear organization suggest that the dynamic distribution of molecules and complexes relies on overall diffusion, with differential accumulation in different compartments depending on the mean residence time in each compartment due to molecular interactions (Gorski et al., 2006; Raška et al., 2006). Thus, while the presence of fully spliced and single exon mRNAs in the plant nucleolus may reflect free diffusion of mRNPs between the nucleoplasm and nucleolus, the enhanced levels of aberrant mRNA transcripts indicate that these mRNAs are retained for longer in the nucleolus. Aberrant mRNPs may reach the nucleolus by targeting due to the presence of specific proteins on mRNPs or by diffusion to the nucleolus, where they interact with factors that remodel these mRNPs for degradation by the NMD pathway. The localization of UPF3 and UPF2 in the nucleolus suggests a model where the NMD proteins are loaded onto the aberrant mRNPs prior to being exported to the cytoplasm, where UPF1 (mainly cytoplasmic) is recruited to activate degradation. The assembly of UPF3/UPF2 onto the aberrant transcripts would parallel the model recently proposed for assembly of translationally repressed, localized mRNPs in yeast and animals, and for export-competent viral mRNPs, in the nucleolus before export to the cytoplasm (Hiscox, 2007; Jellbauer and Jansen, 2008).
Taken together, the presence of EJC, UPF2, and UPF3 proteins and aberrant NMD-substrate mRNAs in the nucleolus suggests that the plant nucleolus is involved in the pathway by which aberrant mRNAs are turned over by NMD. This links the nucleolus to mRNA surveillance and NMD processes and clearly differentiates plants from animals in this respect. That plants have evolved a novel system to deal with aberrant mRNA transcripts may reflect the fundamental differences in plant and animal gene structure and the nature of the transcripts produced by splicing errors. For example, plant introns are UA rich, and proteins have been proposed to bind these sequences early in spliceosome assembly (Lorković et al., 2000). In addition, plant genes differ significantly from animal genes in intron size. The average size of Arabidopsis introns is ~170 nucleotides, while that of animals is ~5.5 kb (see Barbazuk et al., 2008), with some animal introns being tens of thousands of nucleotides long. The effect of this difference is that errors in splicing in animals usually lead to skipping of an exon, requiring translation to detect changes in reading frame and PTCs in the resultant mRNAs. In plants, errors in splicing usually lead to nonremoval of introns (intron retention) and, as we have shown from sequencing, the majority (~75%) of the aberrant mRNAs contain unspliced introns or intron fragments. Thus, plants may identify intron-containing transcripts and prepare them for degradation by the NMD pathway in the nucleus and nucleolus, while other PTC+ transcripts without intron fragments would be identified on export by the pioneer round of translation, as in animals.

Plant Materials and Growth Conditions
The upf1-5 and upf3-1 mutants were obtained from Brendan Davies (Hori and Watanabe, 2005; Arciga-Reyes et al., 2006). Wild-type and mutant plants were grown at 25°C in 16-h-light/8-h-dark conditions in a controlled environment cabinet. For nuclei and nucleoli preparation, an Arabidopsis thaliana suspension culture derived from a Landsberg erecta line was used as a source of material. Cultures were grown at 25°C in the light with constant shaking on an orbital incubator at 200 rpm in AT medium (4.4% Murashige and Skoog [MS] salts, 3% sucrose, 0.05 mg/L kinetin, and 0.5 mg/L 1-naphthaleneacetic acid [NAA], pH 5.8).

Nucleolar, Nuclear, and Whole-Cell cDNA Libraries
Total RNA was extracted from whole cells, isolated nuclei, and nucleoli using an RNeasy kit (Qiagen), and mRNA was purified from total RNA using Oligotex beads (Qiagen) according to the manufacturer's instructions. cDNA libraries were generated using 100 ng of mRNA prepared from nucleoli, nuclei, and whole cells using the GeneRacer system (Invitrogen) according to the manufacturer's protocols. In brief, total RNA was dephosphorylated prior to decapping of 5′ ends of mRNA by tobacco acid pyrophosphatase and ligation to the RNA oligonucleotide adaptor (5′-CGACUGGAGCACGAGGACACUGACAUGGACUGAAGGAGUAGAAA-3′) with T4 RNA ligase. This procedure enriched for full-length, capped poly(A)+ RNA. First-strand cDNA was generated using an oligo(dT)-adaptor primer [GCTGTCAACGATACGCTACGTAACGGCATGACAGTG(T)18-24] and SuperScript III RT. The second-strand cDNA was generated using primers specific to the 3′ and 5′ oligonucleotide adaptor sequences and Platinum Taq polymerase high fidelity (Invitrogen). cDNA fragments >500 bp were enriched by Sephacryl S-500 spin column (Amersham Biosciences) chromatography, inserted into the pCR TA cloning vector (Invitrogen), and fully sequenced.

Sequence and Gene Structure Analysis of cDNA Clones
cDNA sequences were compared with the current gene models of the Arabidopsis Genome Initiative gene data set containing all Arabidopsis transcription unit sequences with intron and UTR sequences using BLASTN at TAIR (http://www.arabidopsis.org). Resulting alignments were compared with the full-length genomic sequences and gene structure models from The Institute for Genomic Research Arabidopsis thaliana database (http://www.tigr.org/tdb/e2k1/ath1/). Computer-assisted translation of aberrant transcripts with ExPASy (http://www.expasy.ch/tools/dna.html) identified the position of PTCs and ORFs. The analysis accurately identified the exact extent and nature of the aberrant or alternative splicing events.
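
The translation step can be illustrated with a minimal Python sketch. It is not the ExPASy tool used in this study; it simply scans a cDNA sequence in the reading frame set by the first ATG and reports the first in-frame stop codon. The function name `first_in_frame_stop` and the toy sequences are hypothetical and chosen only for illustration, and the use of the first ATG as the start codon is an assumption.

```python
# Standard stop codons (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(cdna, start=None):
    """Return the 0-based index of the first in-frame stop codon, or None.

    cdna  : transcript sequence (DNA or RNA letters, any case)
    start : 0-based index of the start codon; if omitted, the first ATG
            is used (an assumption made for this sketch)
    """
    seq = cdna.upper().replace("U", "T")
    if start is None:
        start = seq.find("ATG")
    if start < 0:
        return None  # no start codon found
    # Step through the sequence one codon at a time.
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Toy example: a fully spliced ORF and the same ORF with a retained
# 9-nt "intron" (GTAAGTTAG) that introduces an in-frame TAG.
spliced = "ATGGCTGAAGGTCCATAA"
retained = "ATGGCT" + "GTAAGTTAG" + "GAAGGTCCATAA"

normal_stop = first_in_frame_stop(spliced)
aberrant_stop = first_in_frame_stop(retained)
# The annotated stop shifts downstream by the length of the retained intron,
# so a stop found before that position is premature.
is_premature = aberrant_stop < normal_stop + len("GTAAGTTAG")
```

In this toy case the retained sequence terminates the ORF inside the intron, upstream of the annotated stop codon, which is the pattern described for most of the aberrant nucleolar transcripts.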

Expression Analyses of Gene Transcripts
Whole nuclei and nucleolar and nucleoplasmic fractions were extracted with phenol/chloroform, and total RNA was recovered by ethanol precipitation and treated with RNase-free DNase prior to use. For RT-PCR, 5 μg each of nuclear, nucleoplasmic, or nucleolar total RNA was used for first-strand cDNA synthesis primed by oligo(dT)25, and one-tenth volume of the mixture was amplified by PCR using gene-specific primers. PCR was performed for 24 cycles, which was in the exponential amplification range determined by measuring the levels of RT-PCR products with increasing PCR cycle number (Simpson et al., 2008). The reaction products were fractionated on a 1.2% agarose gel, and band intensities were compared using the public domain program ImageJ (Wayne Rasband; http://rsb.info.nih.gov/ij/). RT-PCR conditions were determined in control experiments using gene-specific primers to genes known to produce alternative splicing isoforms. First-strand synthesis was performed on 5 μg of total RNA of Columbia-0 seedlings using Ready-To-Go You-Prime First-Strand Beads (Amersham) following incubation at 65°C and chilling on ice for 2 min. Then, 2 μM oligo(dT)18 and beads were added and incubated for 1 min at room temperature followed by 1 h at 37°C. The reverse transcription reaction was diluted to 100 μL, and 1 μL was used in gene-specific PCR reactions (25 μL) in 1× PCR buffer (10 mM Tris-HCl, pH 8.3, 50 mM KCl, and 3 mM MgCl2), 0.2 mM each of dATP, dGTP, dCTP, and dTTP (Promega), 1.5 μM each of gene-specific primers, and Taq DNA Polymerase (Roche). The forward primer was labeled with 6-carboxyfluorescein to visualize RT-PCR products on an ABI 3730 capillary sequencing machine. PCR reactions were performed by initial incubation for 30 min at 48°C followed by 20, 22, 24, or 26 cycles of 94°C for 15 s, 50°C for 30 s, 70°C for 1 min, and finally 72°C for 10 min. Primer sequences are given in Supplemental Table 2 online.
RT-PCR reactions were purified by MinElute purification (Qiagen), and 1 μL was mixed with 10 μL Hi-Di Formamide with 0.05 μL of GeneScan 500 LIZ internal size standards (Applied Biosystems) and separated on an ABI 3730 (Applied Biosystems). Relative fluorescent peak areas for the expected alternatively spliced isoforms were extracted and analyzed using GENEMAPPER software (Applied Biosystems). The relative ratios of the two to three isoforms were calculated for each gene for three technical samples and the means plotted against PCR cycle number.
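
The isoform ratio calculation can be sketched in Python as follows. This is an illustrative reconstruction, not the GENEMAPPER workflow itself: it assumes that each isoform's relative level is its peak area divided by the summed peak areas of all isoforms of that gene, and that the three technical samples are averaged per isoform. The function names and the peak-area values are hypothetical.

```python
from statistics import mean


def isoform_proportions(peak_areas):
    """Convert peak areas for one sample into proportions of the total.

    peak_areas : dict mapping isoform name -> fluorescent peak area
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: area / total for name, area in peak_areas.items()}


def mean_proportions(samples):
    """Average the per-isoform proportions over technical samples."""
    per_sample = [isoform_proportions(s) for s in samples]
    names = per_sample[0].keys()
    return {name: mean(p[name] for p in per_sample) for name in names}


# Hypothetical peak areas for one gene at one PCR cycle number,
# measured in three technical samples.
samples_24_cycles = [
    {"fully_spliced": 300.0, "intron_retained": 100.0},
    {"fully_spliced": 600.0, "intron_retained": 200.0},
    {"fully_spliced": 150.0, "intron_retained": 50.0},
]

means = mean_proportions(samples_24_cycles)
```

Repeating the calculation at 20, 22, 24, and 26 cycles and plotting the means against cycle number would show whether the isoform proportions stay constant, which is the behavior expected while amplification remains in the exponential range.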

Construction of GFP and mRFP Fusion Proteins of Nucleolar Components
Cloning was performed using the Gateway system (Invitrogen). The UPF3 and UPF2 full-length ORFs were PCR amplified using Platinum Pfx DNA polymerase (Invitrogen), with sequence-specific primers containing Gateway attB adapters: 5′-AAAAAGCAGGCTCGATGAAGGAACCTTTGCAGAA-3′ and 5′-AGAAAGCTGGGTAACAAGTACCGGATGATGGTT-3′ for UPF3 and 5′-AAAAAGCAGGCTTCATGGATCATCCAGAAGATGAATCCCAC-3′ and 5′-AGAAAGCTGGGTCCTTTCGTCGGGCATGATACGAACCACC-3′ for UPF2. Amplification was performed using cDNA obtained by reverse transcription of RNA from Arabidopsis inflorescence material (isolated using the RNeasy kit [Qiagen]) with Omniscript polymerase (Qiagen) and oligo(dT). The UPF1 ORF region was obtained from the UPF1/pDONR20 clone (kindly provided by Masato Yoine, Nagoya University, Japan). All ORFs were then inserted via BP recombination into the Gateway entry vector pDONR207 (Invitrogen). The inserts were then recombined via LR reaction into the binary plant expression vector GFP-N-bin containing an N-terminal GFP translational fusion. As a marker for the nucleolus, the plasmid pROK2.mRFPattR, containing a translational fusion of the nucleolar protein fibrillarin with mRFP, was used (Kim et al., 2007b). The binary expression vectors containing GFP and mRFP fusions were transformed into Agrobacterium tumefaciens and transiently expressed in Arabidopsis cells as described previously (Koroleva et al., 2005).
Arabidopsis plants (Columbia-0 ecotype) were transformed by the floral dip method (Clough and Bent, 1998), and T1 seedlings were selected on Petri plates with MS-agar media containing 50 mg/L kanamycin. Kanamycin-resistant seedlings were further selected using fluorescence microscopy. The seedlings were grown vertically on MS-agar plates in a growth room under constant light at 25°C.

Microscopy and Image Analysis
For microscopy, a small volume of cell suspension was placed on a slide and covered with a cover slip. Imaging was performed using a 40× oil lens on Leica SP and SP2 laser scanning confocal microscopes, equipped with Leica software (Leica Microsystems). To overcome potential problems of overexpression of GFP fusion proteins, only cells that fluoresced at low levels were imaged. This approach was used to image >90 nucleolar and EJC proteins showing numerous distinct patterns of subnuclear localization (Pendle et al., 2005). Each image was collected as an average of three sequential scans performed using the excitation wavelengths 488 nm (80%) and 543 nm (20%); for detection of GFP and mRFP signals, channels of 500 to 550 and 630 to 680 nm, respectively, were used. Low levels of UPF3-GFP expression, including in stable transformants, were visualized by wide-field imaging using a Nikon Eclipse 600 epifluorescence microscope equipped with a Hamamatsu Orca-ER cooled CCD camera and a Prior Proscan x-z stage. Image stacks were collected using MetaMorph software (Universal Imaging) and were deconvoluted using AutoDeblur (AutoQuant Imaging). Image processing and preparation of montages were performed using Adobe Photoshop CS2.

Accession Numbers
Sequence data from this article can be found in the Arabidopsis Genome Initiative or GenBank/EMBL databases under the following accession numbers: UPF1 (At5g47010), NM_124072; UPF2 (At2g39260), 1009042019; and UPF3 (At1g33980), NM_103120, and in Supplemental Table 2 online.

Supplemental Data
The following materials are available in the online version of this article.
Supplemental Table 1. Aberrant cDNA Clones from Nucleolar, Nuclear, and Whole-Cell cDNA Libraries.

Supplemental Table 2. Primers Used in RT-PCR Reactions.