Widespread Exon Junction Complex Footprints in the RNA Degradome Mark mRNA Degradation before Steady State Translation[OPEN]

Exon junction complex footprints account for a sizable proportion of major RNA degradation intermediates and can serve as markers for mRNA degradation before steady state translation. Exon junction complexes (EJCs) are deposited on mRNAs during splicing and displaced by ribosomes during the pioneer round of translation. Nonsense-mediated mRNA decay (NMD) degrades EJC-bound mRNA, but the lack of suitable methodology has prevented the identification of other degradation pathways. Here, we show that the RNA degradomes of Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa), worm (Caenorhabditis elegans), and human (Homo sapiens) cells exhibit an enrichment of 5′ monophosphate (5′P) ends of degradation intermediates that map to the canonical EJC region. Inhibition of 5′ to 3′ exoribonuclease activity and overexpression of an EJC disassembly factor in Arabidopsis reduced the accumulation of these 5′P ends, supporting the notion that they are in vivo EJC footprints. Hundreds of Arabidopsis NMD targets possess evident EJC footprints, validating their degradation during the pioneer round of translation. In addition to premature termination codons, plant microRNAs can also direct the degradation of EJC-bound mRNAs. However, the production of EJC footprints from NMD but not microRNA targets requires the NMD factor SUPPRESSOR WITH MORPHOLOGICAL EFFECT ON GENITALIA PROTEIN7. Together, our results demonstrating in vivo EJC footprinting in Arabidopsis unravel the composition of the RNA degradome and provide a new avenue for studying NMD and other mechanisms targeting EJC-bound mRNAs for degradation before steady state translation.


INTRODUCTION
Many eukaryotic pre-mRNAs undergo splicing, in which intervening sequences (introns) are excised and the exon junction complex (EJC) is deposited in a region 20 to 24 nucleotides upstream of the exon-exon junction (Le Hir et al., 2000). During the pioneer round of translation, EJCs residing upstream of the termination codon are disassembled and removed through the ribosome-associated protein Partner of Y14 and Mago (PYM; Gehring et al., 2009). For most eukaryotic mRNAs, the termination codon is present in the last exon. Hence, after the pioneer round of translation, these mRNAs no longer carry any EJCs and subsequently enter steady state cycles of translation, which contribute to the bulk production of proteins. However, if a termination codon situated more than 50 to 55 nucleotides upstream of an exon-exon junction is encountered by a ribosome during the pioneer round of translation, it is generally recognized as a premature termination codon (PTC) and promotes nonsense-mediated mRNA decay (NMD; Nagy and Maquat, 1998).
Although many studies have demonstrated that the EJCs downstream of a termination codon are crucial for NMD, NMD can also occur in an EJC-independent manner. In the absence of EJCs, a long 39 untranslated region (UTR) sequence is able to promote NMD of mRNAs harboring a natural termination codon (Behm-Ansmant et al., 2007). Budding yeast (Saccharomyces cerevisiae) lacks the genes encoding core EJC components but possesses the NMD pathway (Conti and Izaurralde, 2005;Bannerman et al., 2018). Additionally, in the worm Caenorhabditis elegans and the fly Drosophila melanogaster, PTC definition is independent of exon-exon boundaries and EJCs are dispensable for NMD (Gatfield et al., 2003;Longman et al., 2007). While silencing of the EJC core components Mago and Y14 or overexpression of PYM inhibits NMD triggered by 39 UTR-located introns in plants (Kerényi et al., 2008;Nyikó et al., 2013), the number of studies of plant EJCs is limited and little is known about the binding sites of plant EJCs.
Translation termination at a PTC promotes the assembly of the UP FRAMESHIFT1 (UPF1) surveillance complex that actives NMD (Hug et al., 2016). In animal cells, PTC-containing mRNAs are destabilized through endonucleolytic cleavage near the PTC catalyzed by SMG6/EST1A (a member of the gene family named after C. elegans Suppressor with Morphological effect on Genitalia or yeast Ever Shorter Telomeres1; Gatfield and Izaurralde, 2004;Huntzinger et al., 2008;Eberle et al., 2009) or through deadenylation or decapping (Chen and Shyu, 2003;Lejeune et al., 2003;Couttet and Grange, 2004). Although the NMD factors UPF1, UPF2, and SMG7 are conserved in animals and plants, plants appear to lack SMG6 orthologs and presumably remove PTCcontaining mRNAs through exonucleolytic pathways (Kerényi et al., 2008). Previously, identification or validation of NMD targets often relied on the detection of changes in gene expression in NMD mutants or NMD factor-depleted cells because the specific degradation products of NMD targets were poorly defined (He et al., 2003;Kalyna et al., 2012;Drechsel et al., 2013;Colombo et al., 2017;Lloyd et al., 2018). Nevertheless, relying on changes in gene expression could be problematic because impairment of the NMD pathway can alter the expression of many genes that are indirectly regulated by this pathway.
High-throughput approaches for profiling 59 monophosphate ends of RNA degradation intermediates (hereafter referred to as 59P ends), such as Degradome sequencing (Degradome-Seq; Addo-Quaye et al., 2008), Parallel Analysis of RNA Ends (PARE; German et al., 2008), Genome-wide Mapping of Uncapped and Cleaved Transcripts (GMUCT; Gregory et al., 2008), and 59P sequencing (Pelechano et al., 2015), provide a more reliable way to detect the direct targets of RNA degradation pathways. These approaches have been applied for transcriptome-wide identification of endonucleolytic cleavage events directed by microRNA (miRNA) in diverse plant species and catalyzed by SMG6 in human (Homo sapiens) cells as well as the substrates of exoribonu-clease4 (XRN4), which catalyzes 59 to 39 RNA decay in Arabidopsis (Arabidopsis thaliana). For plant miRNA targets, 59P ends predominantly correspond to the middle of miRNA complementary sites (Addo-Quaye et al., 2008;German et al., 2008;Li et al., 2010;Pantaleo et al., 2010;Shamimuzzaman and Vodkin, 2012;Li et al., 2013a). Arabidopsis miRNA-directed cleavage products overaccumulate in the Arabidopsis xrn4 mutant (German et al., 2008). Global profiling of human SMG6-dependent 59P ends by PARE revealed a sequence motif enriched at the SMG6 cleavage site (Schmidt et al., 2015). Notably, depletion of SMG6 increased the number of 59P ends located at the mRNA cap site for many SMG6 substrates, suggesting competition between endonucleolytic cleavage and decapping in the human NMD pathway. Because SMG6 appears to be absent in plants, plant NMD targets are presumably degraded through XRNs and the exosome after decapping and deadenylation. Intriguingly, Arabidopsis xrn4 not only overaccumulates intermediates with 59P ends located at the cap sites of NMD targets but also intermediates with 59P ends located in the coding region (CDS) or 39 UTR of some targets (Nagarajan et al., 2019), suggesting that endonucleolytic cleavage might also account for the turnover of plant NMD targets.
Endonucleolytic cleavage is not the only mechanism that can result in predominant accumulation of 59P ends at specific sites. Blocking of 59 to 39 RNA decay by RNA binding proteins likely also yields 59P ends clearly delineated by the 59 edge of the RNA binding protein (Hou et al., 2014). Consistent with this speculation, previous analyses of yeast and plant 59P end data indicated that a portion of 59P ends are ribosome-protected mRNA fragments generated through cotranslational RNA decay (Pelechano et al., 2015;Hou et al., 2016;Yu et al., 2016). The in vivo ribosome footprints in the RNA degradome thus provide an alternative to ribosome profiling as way to infer ribosome dynamics. Stressinduced pauses in translation at specific codons and gene regions associated with collided ribosomes have been uncovered through analyses of RNA degradome data. As many eukaryotic mRNAs are posttranscriptionally processed by splicing, EJCs might also make a noteworthy contribution to the pool of RNA degradation intermediates if they are able to hinder XRN-mediated RNA decay and leave footprints during in vivo mRNA degradation as ribosomes do. If this is the case, EJC footprints in the RNA degradome can serve as signatures of mRNA degradation before steady state translation, as EJCs residing upstream of the termination codon are presumably removed after the pioneer round of translation. Theoretically, only mRNA degradation occurring before or during the pioneer round of translation, such as NMD, can lead to the accumulation of EJC footprints in CDSs.
Here, we provide multiple lines of evidence supporting the notion that the RNA degradome contains EJC footprints. We observed EJC footprints in some miRNA targets in addition to NMD targets, demonstrating that RNA degradome data can be applied to the study of mRNA degradation before steady state translation.

59P Ends Are Enriched in the Canonical EJC Region
If EJCs are able to block 59 to 39 decay of RNA and leave footprints in the RNA degradome as ribosomes do, it is expected that enrichment of 59P ends in the canonical EJC region will be observed. To test this possibility, we analyzed the previously reported 59P end data of five evolutionarily distant species. Consistent with this prediction, metagene analyses of both Arabidopsis PARE data (German et al., 2008) and rice (Oryza sativa) Degradome-Seq data (Li et al., 2010) revealed that the relative occurrence of 59P ends was higher in a region 25 to 30 nucleotides upstream of the 39 end of the exon ( Figure 1A). As these two data sets only contain the 59P ends of polyadenylated degradation fragments, we also examined Arabidopsis PARE data for nonpolyadenylated degradation fragments generated by Nagarajan et al. (2019). Interestingly, 59P end enrichment was also detected in the same region of nonpolyadenylated degradation fragments but to a lesser degree compared with that in polyadenylated degradation fragments (Supplemental Figure 1A). Analysis of the Arabidopsis 59P ends obtained using the GMUCT method (Willmann et al., 2014) gave an enrichment pattern in this region similar to that observed for the PARE data (German et al., 2008;Supplemental Figure 1B). Similarly, a 59P peak spanning the same region was also detected when analyzing the GMUCT data for human HEK293 cells (Willmann et al., 2014; Figure 1A). Analysis of Degradome-Seq data for worm (Park et al., 2013), in which EJCs are not required for triggering NMD (Longman et al., 2007), also revealed an enrichment of 59P ends in a similar region with a 1 nucleotide offset ( Figure 1A). However, analysis of the PARE data for budding yeast (Harigaya and Parker, 2012), which lacks EJCs and has only a few intron-containing genes (Bannerman et al., 2018), revealed no enrichment of 59P ends in this region ( Figure 1A). The common patterns observed in the 59P end data sets of four evolutionarily distant species indicate that this phenomenon is truly conserved across species possessing EJCs. Given that the canonical deposition site of human EJCs is centered at a position 24 nucleotides upstream of the exon-exon junction (Saulière et al., 2012;Singh et al., 2012) and that deposition of an EJC results in the protection of an 8-nucleotide fragment from complete RNase digestion (Le Hir et al., 2000; Figure 1A, top), the 59P peaks spanning the 225 to 230 region correspond to the 59 edges of EJCs deposited in the canonical region. Therefore, these 59P peaks are very likely to be EJC-protected 59 termini of RNA degradation intermediates naturally produced in cells.
In addition to the analysis of 59P end distribution in a 50nucleotide region upstream of the exon-exon junction, we also investigated the most abundant 59P peaks (maximum 59P peaks) occurring in the canonical EJC region (positions 226 to 230) within transcripts of intron-containing genes. Analysis of previously published 59P end data revealed that 13.1 (2411 genes), 9.4 (1650 genes), 8.2 (710 genes), and 8.1% (1142 genes) of the maximum 59P peaks within transcripts of Arabidopsis, rice, human cells, and worm, respectively, were located in the canonical EJC region ( Figure 1B). These values were significantly higher than expected (P < 10 2100 , x 2 test), as the total length of the canonical EJC regions represents less than 2% of the total length of spliced transcripts. By contrast, the percentages of maximum 59P peaks falling in the nearby 5-nucleotide regions in Arabidopsis, rice, and human cells were mostly lower than 2%, which is close to the expected value for a 5-nucleotide region calculated by dividing the total length of each 5-nucleotide region by that of spliced transcripts ( Figure 1B). Hence, EJC footprints appear to account for a noticeable portion of major mRNA degradation intermediates.

Exoribonucleases Are Involved in in Vivo EJC Footprinting
To investigate whether EJC footprints in the RNA degradome are generated through 59 to 39 decay, as was previously reported for ribosome footprints (Pelechano et al., 2015;Yu et al., 2016), we profiled 59P ends in the Arabidopsis wild type and two mutants defective in 59 to 39 RNA decay, xrn4-6 and fiery1-6 (fry1-6), and compared the abundances of 59P ends mapping to canonical EJC regions. Although 59P ends remained enriched in the canonical EJC region in the two mutants ( Figure 2A), xrn4-6 had significantly fewer 59P ends in the canonical EJC region than the wild type in a global analysis of Arabidopsis exons (P < 0.001, Kruskal-Wallis test followed by a pairwise Wilcoxon test with false discovery rate [FDR] correction; Figure 2B). The decrease in the number of 59P ends in the canonical EJC region was more evident in fry1-6 (P < 0.001, Kruskal-Wallis test followed by a pairwise Wilcoxon test with FDR correction), wherein the activities of XRN2, XRN3, and XRN4 are likely all suppressed (Gy et al., 2007). Notably, the 59P ends enriched around positions 220 to 215 are likely also fragments protected by RNA binding proteins, because the number of 59P ends mapping to this region also appeared to be reduced in xrn4-6 and fry1-6. By contrast, the abundance of 59P ends around the transcription start site (TSS) of genes harboring exons was largely increased in xrn4-6 and slightly but significantly elevated in fry1-6 compared with the wild type (P < 0.001, Kruskal-Wallis test followed by a pairwise Wilcoxon test with FDR correction; Figure 2C). This result strongly suggests that EJCs can sterically hinder XRN-mediated 59 to 39 decay, resulting in the EJC footprints observed in the RNA degradome. Moreover, fry1-6 exhibited (A) Distribution of the relative frequency of 59P end occurrences in a 50nucleotide region upstream of the exon-exon junction. PARE data for Arabidopsis wild-type inflorescences (German et al., 2008) and yeast (Harigaya and Parker 2012), Degradome-Seq data for rice wild-type seedlings (Li et al., 2010) and adult stage worms (Park et al., 2013), and GMUCT data for HEK293 T cells (Willmann et al., 2014) were used in the metagene analyses. The illustration at the top shows the canonical position of an EJC and the size of a fragment protected by an EJC. The numbers of exons $ 50 nucleotides in size and with a 59P end count $ 1 that were included in the analysis are shown in parentheses. (B) Distribution of the maximum 59P peaks within the transcripts of introncontaining genes in a 50-nucleotide region upstream of the exon-exon junction. The numbers of intron-containing genes that had the maximum 59P peak with a count $ 3 and were included in the analyses are shown in parentheses. a more pronounced reduction of EJC footprints than xrn4-6, implying that, in addition to the cytoplasm-localized XRN4 (Kastenmayer and Green, 2000), nuclear XRNs (XRN2 and XRN3) or other proteins with exoribonuclease activity may also contribute to in vivo EJC footprinting.

Prominent EJC Footprints Are Prevalent in NMD Targets
Given that ribosomal arrest by a PTC can trigger decapping, followed by 59 to 39 RNA decay, the EJCs deposited on the exons downstream of a PTC were predicted to hinder XRN-mediated (A) Distribution of 59P ends in a 50-nucleotide region upstream of the exon-exon junction in wild-type, xrn4-6, and fry1-6 seedlings. The distributions for two independent biological replicates (R1 and R2) are shown. The numbers of exons $ 50 nucleotides in size with a 59P end count $ 1 that were included in the analysis are shown in parentheses. (B) and (C) Comparison of 59P end abundance in the canonical EJC region (B) and at the TSS (C) of the genes harboring the selected exons in Arabidopsis wild-type, xrn4-6, and fry1-6 seedlings. The numbers of EJC regions (count > 10 TP40M) and TSSs (count > 5 TP40M) included in the analysis are shown in parentheses. Different letters above the box plots indicate significant differences between groups (P < 0.001) determined by the Kruskal-Wallis test followed by the pairwise Wilcoxon test with FDR correction. Box boundaries and whiskers indicate quartiles and the range of 59P end abundance, respectively. Positional distributions are shown for 59P ends in three Arabidopsis genes producing PTC-containing mRNAs due to AS. The 59P ends were identified in the PARE data generated by this study. Within the 59P end plots, red peaks indicate peaks in the canonical EJC region, black arrows indicate maximum peaks in 10-d-old (10d) wild-type seedlings, and blue arrows indicate peaks corresponding to the putative TSS. Within the gene models, light gray boxes indicate UTRs, dark gray boxes indicate CDSs, thin lines indicate introns, orange boxes indicate regions altered by AS, and red asterisks indicate PTCs. Only the 59P ends residing in exons are displayed in the plots. Data for replicate 2 are shown in Supplemental Figure 2. degradation and lead to the accumulation of degradation intermediates with 59P ends in the EJC region. Indeed, three Arabidopsis genes, TFIIIA, LPEAT2, and RS2Z33, which are known to produce PTC-containing mRNAs due to alternative splicing (AS; Yoine et al., 2006;Gloggnitzer et al., 2014), displayed prominent 59P peaks in the EJC regions of the exons 39 to the PTC (Figure 3; Supplemental Figure 2). Notably, in the wild type, the maximum 59P peaks in these three genes were all located in the second exon 39 end downstream of the PTC, although the maximum 59P peak in LPEAT2 was not in the canonical EJC region but at a position 37 nucleotides upstream of the exon-exon junction. Moreover, the abundances of these maximum peaks, including the two in the canonical EJC region, were reduced in both xrn4-6 and fry1-6 compared with the wild type, but to a greater extent in fry1-6. However, a comparable or higher number of 59P ends corresponding to the putative TSS and regions outside the canonical EJC site were observed in xrn4-6 and fry1-6 than in the wild type, which excluded the possibility that a change in gene expression accounted for the reduction of EJC footprints in the mutants.
A previous analysis of AS events suppressed in a double mutant of the NMD factors UPF1 and UPF3 of Arabidopsis revealed a large number of genes producing splicing variants with classical NMD features (Drechsel et al., 2013). We thus used this gene list to test the association between the NMD-suppressed AS events and the occurrence of maximum 59P peaks in the canonical EJC region. Using a threshold applied in the previous report (upregulated in the double mutant of lba1 upf3-1 with FDR < 0.1; lba1 is also known as upf1-1; Drechsel et al., 2013), 1082 genes with NMDsuppressed AS events were identified. Among them, 37.2% (402 genes) and 32.8% (355 genes) possessed a maximum 59P peak in the EJC region in replicates 1 and 2 of the Arabidopsis 10-d-old wild-type seedling PARE data we generated, respectively (Figures 4A and 4B). By contrast, the total incidence of maximum 59P peaks in the EJC region for all intron-containing genes in the Arabidopsis genome was 15.3% (3379 genes) and 13.8% (3064 genes) in replicates 1 and 2, respectively, which is significantly lower than the incidence of genes with NMD-suppressed AS events (P < 10 2100 , binomial test). This result implies that EJC footprints are the major and common degradation products of putative NMD targets. The predominance of EJC footprints in many putative NMD targets also validated their degradation during the pioneer round of translation. Because it has been shown that many intronretention events are not sensitive to NMD (Kalyna et al., 2012;Drechsel et al., 2013), we also compared the incidence of maximum 59P peaks in the EJC region for four types of AS events suppressed by NMD. Interestingly, a significantly higher incidence of maximum 59P peaks in the EJC region was observed for alternative 59 or 39 splice site events than for intron-retention events (P < 10 25 , binomial test), although these events were all suppressed by NMD ( Figure 4B). However, for 2306 intron-containing genes harboring a maximum 59P peak in the EJC region in both replicates of the PARE data we generated, only 293 genes (12.7%) had NMD-suppressed AS events ( Figure 4A). A large number of maximum 59P peaks in the EJC region did not overlap with NMD-suppressed AS events reported previously (Drechsel et al., 2013), implying that either the list was not complete or there are additional pathways involved in the degradation of EJCbound RNA.
AS events in CYP38, PPD6, and STR14 were suppressed by NMD (Drechsel et al., 2013) but were not annotated in The Arabidopsis Information Resource (TAIR) database. The maximum 59P peaks in these three genes were located in the EJC region, and these peaks had a 59P count $ 1000 tags per 40 million (TP40M; Figure 4C; Supplemental Figure 3). Although the 59P end profiles of these three genes highly resembled those of validated plant miRNA targets reported previously (Addo-Quaye et al., 2008;German et al., 2008), the maximum 59P peaks were dampened in xrn4-6 and fry1-6. This indicates that these extremely abundant 59P peaks do not correspond to 59P ends derived from endonucleolytic cleavage but instead the protected fragments of EJCs. Notably, these peaks were located in the first or third exon 39 end downstream of the predicted PTCs caused by AS. This again supports the notion that the dominant EJC footprints in the exon 39 ends downstream of predicted PTCs could be evidence of NMD triggered by those PTCs.

PTC-Triggered RNA Decay Leads to the Production of EJC Footprints
To verify that a PTC conditions the production of EJC footprints, we took advantage of existing nonsense mutants of PHYB and PHO2 (Reed et al., 1993;Delhaize and Randall, 1995), in which a PTC was introduced by ethyl methanesulfonate mutagenesis. The PTCs in the nonsense mutants should direct mRNA decay through the NMD pathway and are predicted to trigger the production of EJC footprints downstream of the PTC. We compared degradation intermediates of PHYB and PHO2 mRNAs in the wild type and the nonsense mutants phyB-9 and pho2 using a modified RNA ligase-mediated (RLM) 59 RACE protocol. In both nonsense mutants, the 59 ends of the most abundant amplification products of the mRNA degradation fragments were mainly mapped to positions 27 to 28 nucleotides upstream of the exon-exon junction downstream of the PTCs (Figures 5A and 5B). However, these degradation intermediates were not prominent in the wild type, although the miR159-guided 39 cleavage product of MYB65, the positive control for modified RLM 59 RACE, was detected in both the wild type and the nonsense mutants. The modified RLM 59 RACE results for phyB-9 and pho2 mimic the PARE data of the three known NMD targets with PTCs introduced by AS (Figure 3), and together these results support the conclusion that PTCs elicit the accumulation of EJC footprints in downstream exons. Moreover, PTCs caused by AS or point mutations tended to result in the generation of 59P ends mainly in the first or second EJC region downstream of PTCs (Figures 3, 4C, 5A, and 5B), suggesting that EJC footprints may serve as an immediate readout of PTC-mediated NMD irrespective of the cause of the PTC.
We further validated the contribution of XRN4 to EJC footprint production from nonsense mRNAs by generating the phyB-9 xrn4-6 and pho2 xrn4-6 double mutants. As expected, a lower number of putative EJC footprints was observed in the double mutants than in the nonsense mutants ( Figure 5C). This result again confirms the role of XRN4 in the turnover of PTC-containing mRNAs and the generation of EJC footprints.

Analysis of EJC Footprints Reveals NMD Targets
Using the common features of EJC footprints in the NMD targets shown in Figures 3, 4, and 5, we globally validated potential targets of NMD triggered by AS or introns in the 39 UTR that were annotated in the Arabidopsis and rice genome databases. For protein-coding transcripts with AS events, if the maximum 59P peak was located in the first or second canonical EJC region downstream of an AS site, they were identified as putative targets of NMD triggered by AS, which frequently results in a PTC. In (A) Venn diagram showing the overlap between genes producing NMD-suppressed AS variants (NMD 1 AS) and genes possessing a maximum 59P peak (M59P) in the EJC region. The NMD-suppressed AS variants were extracted from the lba1 upf3-1 data reported by Drechsel et al. (2013). Two replicates (R1 and R2) of PARE data for 10-d-old (10d) seedlings generated by this study were used for the identification of genes possessing a maximum 59P peak with a count $ 3 located in the EJC region. (B) Relative frequency of genes possessing a maximum 59P peak in the EJC region. Frequencies are shown for genes containing introns (intron 1 gene) and those producing transcripts derived from the following AS events: NMD 1 AS, intron retention, exon skipping, and alternative 59 and 39 splice sites (alt. 59ss and alt. 39ss). The number of genes in each category possessing a maximum 59P peak in the EJC region is shown above the bar. Asterisks indicate significant differences between categories based on the binomial test (*, P < 10 25 ; **, P < 10 215 ; ***, P < 10 2100 ). (C) Positional distribution of 59P ends (from PARE data generated by this study) in three Arabidopsis genes producing NMD-suppressed AS variants. Within the 59P end plots, red peaks indicate peaks in the canonical EJC region and black arrows indicate the maximum peaks in 10-d-old wild-type seedlings. Within the gene models, light gray boxes indicate UTRs, dark gray boxes indicate CDSs, thin lines indicate introns, red asterisks indicate PTCs, and orange boxes indicate regions altered by NMD-suppressed AS events according to Drechsel et al. (2013). Only the 59P ends residing in exons are displayed in the plots. Data for replicate 2 are shown in Supplemental Figure 3. addition, transcripts possessing an intron-containing 39 UTR were also identified as putative NMD targets if the maximum 59P peak was located in the first or second canonical EJC region of the exon 39 end at least 50 nucleotides downstream of the annotated termination codon. Analysis of the PARE data for Arabidopsis wild-type seedlings and flowers that we generated and the previously published Degradome-Seq data for rice wild-type seedlings (Li et al., 2010) revealed 254 Arabidopsis transcripts and 168 rice transcripts as putative NMD targets ( Figure 6A; Supplemental Data Sets 1 and 2). Remarkably, 19.3 and 44% of the putative NMD targets identified from the analysis of Arabidopsis and rice 59P end data, respectively, have an intron-containing 39 UTR ( Figure 6A). Compared with the large number of NMD-sensitive AS events reported by Drechsel et al. (2013), the number of Arabidopsis NMD targets we identified is small, because many AS events reported by Drechsel et al. (2013), including AS of CYP38, PPD6, and STR14, were not annotated in TAIR 10. However, 224 of the targets we identified in Arabidopsis did not show evidence of NMD-suppressed AS in the previous report (upregulated in the double mutant of lba1 upf3-1 with FDR < 0.1; Supplemental Data Set 1; Drechsel et al., 2013). Some of the novel NMD targets, such as Arabidopsis ARR8 and KCBP and rice transcripts encoding an SPA4-like protein (LOC_Os01g52640) and a transposon protein (LOC_Os04g03884), had high numbers of EJC footprints in the 39 UTR ( Figures 6B and 6C). For the Arabidopsis NMD targets we obtained, the analysis of gene Ontology (GO) terms revealed the highest enrichment for transcripts involved in "regulation of alternative mRNA splicing, via spliceosome" (22. 27-fold enrichment, P < 0.001, Fisher's exact test with Bonferroni correction; Supplemental Table 1). This enriched GO term is consistent with the previous finding that in Arabidopsis, NMD is an important mechanism regulating the splicing of genes encoding Ser/Arg proteins, which play key roles in splicing (Palusa and Reddy, 2010). Other significantly enriched GO terms unrelated to splicing included "organic cyclic compound metabolic process," "nitrogen compound metabolic process," and "chloroplast" (P < 0.05, Fisher's exact test with Bonferroni correction). For the rice NMD targets we identified, however, no enriched GO terms were recovered. The difference in GO enrichment analysis results between the two species might be due to differences in the 59P end data sets we analyzed, the frequency of introns in the 39 UTR, or the annotations of splicing variants in the two genomes. It may also suggest that the role of the NMD pathway in gene regulation could be distinct among plant species.

Dysfunction of SMG7 Reduces the Number of EJC Footprints in Putative NMD Targets
To verify that EJC footprints in NMD targets are degradation products generated by the NMD pathway, we profiled 59P ends in the Arabidopsis wild type and mutants of key NMD factors and compared the abundances of EJC footprints between them. In Arabidopsis, the crucial roles of UPF1 and SMG7 in the NMD pathway have been demonstrated (Arciga-Reyes et al., 2006;Riehs-Kearnan et al., 2012). Given that null mutants of UPF1 are seedling lethal (Arciga-Reyes et al., 2006), smg7-1, a severe but viable mutant of SMG7 with a T-DNA insertion in the CDS, was chosen for the comparison (Riehs et al., 2008). However, homozygous smg7-1 plants are sterile and morphologically different from wild-type plants when grown at 22°C due to an autoimmune response (Riehs et al., 2008;Riehs-Kearnan et al., 2012). To reduce the possibility that differences might be caused by the autoimmune response, we selected smg7-1 homozygous plants by genotyping and grew them at 28°C, under which the autoimmune response is repressed and wild-type and smg7-1 plants more closely resemble each other (Gloggnitzer et al., 2014). We then collected the inflorescences of wild-type and smg7-1 plants and isolated RNA for PARE library construction. Similar to the wild type, smg7-1 displayed an enrichment of 59P ends in the canonical EJC region ( Figure 7A). However, in smg7-1, the abundance of 59P ends was significantly lower in the canonical EJC region but higher at the TSS compared with the wild type (P < 0.01, Wilcoxon ranksum test; Figures 7B and 7C). Furthermore, the median of the log 2 fold change in maximum EJC footprint abundance between smg7-1 and the wild type was significantly lower for transcripts experiencing NMD-suppressed AS events than for other transcripts (A) and (B) Modified RLM 59 RACE assays of RNA degradation intermediates generated from wild-type and nonsense mutant mRNAs of PHYB (A) and PHO2 (B). Arrowheads indicate the RACE products excised and cloned for Sanger sequencing (left panels). The positional distribution of the 59P ends of the cloned RACE products from nonsense mutants is shown in the right panels. Within the gene models, light gray boxes indicate UTRs, dark gray boxes indicate CDSs, thin lines indicate introns, asterisks indicate PTCs, and half arrows indicate gene-specific primers used for modified RLM 59 RACE. The 39 remnant of MYB65 derived from miR159guided cleavage was used as a positive control for modified RLM 59 RACE. (C) Comparison of EJC footprints from nonsense mutant mRNAs between phyB-9 and phyB-9 xrn4-6 and between pho2 and pho2 xrn4-6 using modified RLM 59 RACE assays. Arrowheads indicate the RACE products sequenced in (A) and (B).
(P < 10 24 , Kolmogorov-Smirnov two-sided test; Figure 7D), supporting the notion that SMG7 plays a crucial role in the production of EJC footprints from NMD targets. Consistent with the analysis of PARE data (Figures 3 and 4C), the analysis of modified RLM 59 RACE data also confirmed the involvement of SMG7 in the production of EJC footprints for three NMD targets in both flowers and 14-d-old seedlings (Supplemental Figure 4). The reduction in the number of EJC footprints in smg7-1 seedlings supports the notion that the effect was due to the NMD function of SMG7 instead of its unique function in meiosis (Riehs et al., 2008). Like the 59P peaks in the EJC regions of six NMD targets shown in Figures 3  and 4C, the maximum 59P peak in LPEAT2, which was positioned outside the canonical EJC region, was also dampened in smg7-1 (Figure 3; Supplemental Figure 2), indicating that it might also be a degradation product of the SMG7-dependent NMD pathway. Moreover, for three previously reported NMD-sensitive transcripts that had xrn4-enhanced 59P peaks in the CDS or 39 UTR (Nagarajan et al., 2019), xrn4-enhanced 59P peaks in eRF1-1 (Eukaryotic Release Factor1-1, NMD factor and target) were detected in the flowers of the wild type but were absent in those of smg7-1 (Supplemental Figure 5). This result further confirms the notion that the xrn4-enhanced 59P peaks in eRF1-1 are also intermediates of degradation initiated by the NMD pathway (Nagarajan et al., 2019).

Some miRNA Targets Possess EJC Footprints
Surprisingly, 59P peaks corresponding to putative EJC footprints were also observed downstream of miRNA-guided cleavage sites in the targets of Arabidopsis miR159 (MYB33), miR160 (ARF10), and miR396 (GRF1; Figure 8A; Supplemental Figure 6). Likewise, rice miR159, miR160, and miR396 targets also harbored EJC footprints in the exon ends downstream of the miRNA target sites ( Figure 8B). The accumulation of putative EJC footprints in the three Arabidopsis miRNA targets was reduced in xrn4-6 and nearly abolished in fry1-6; however, the amplitude of miRNA-guided cleavage peaks was increased in both xrn4-6 and fry1-6 compared with the wild type. We also confirmed the decreased number of EJC footprints and the increased number of miRNA-guided cleavage 39 remnants for these three miRNA targets in xrn4-6 and fry1-6 by performing modified RLM 59 RACE assays (Supplemental Figure 7). We also assayed XRN2 and XRN3 function in the production of EJC footprints from these three miRNA targets. The levels of 39 remnants of miRNA-guided cleavage and EJC footprints were comparable between the wild type, xrn2-1, and xrn3-3 (Supplemental Figure 7). We thus further examined the level of EJC footprints in the XRN triple mutant xrn2-1 xrn3-3 xrn4-6. Like xrn4-6, the triple mutant also accumulated a higher number of miRNA-guided cleavage 39 remnants but fewer EJC footprints compared with the wild type. However, the accumulation of these two types of degradation fragments in the triple mutant was neither different from that in xrn4-6 nor comparable to that in fry1-6. This result does not completely rule out a role for XRN3 in EJC footprinting because xrn3-3 is a knockdown mutant with a T-DNA insertion in the promoter region (Gy et al., 2007). These results confirm that XRN4 degrades the 39 cleavage remnants of plant miRNA targets (Souret et al., 2004) and suggest that in addition to XRN4, other enzymes (A) Fractions of putative NMD targets where NMD is triggered by AS or an intron-containing 39 UTR (Intron 1 39 UTR). The numbers of total putative NMD targets recovered from the analysis are in parentheses. (B) and (C) Positional distribution of 59P ends for targets of NMD triggered by intron-containing 39 UTRs in Arabidopsis (B) and rice (C). 59P ends in Arabidopsis were identified using the Arabidopsis PARE data (replicate 1) generated by this study, and those in rice were identified using the Degradome-Seq data for rice wild-type seedlings reported by Li et al. (2010). Within the 59P end plots, red peaks indicate peaks in the canonical EJC region and black arrows indicate the maximum peaks. Within the gene models, light gray boxes indicate UTRs, dark gray boxes indicate CDSs, and thin lines indicate introns. Only the 59P ends residing in exons are displayed in the plots.
with 59 to 39 exonuclease activity also contribute to the turnover of miRNA cleavage remnants and EJC footprinting.
Notably, not all of the 59P peaks evident at miRNA target sites were associated with prominent EJC footprints in downstream exon ends. For instance, while a high number of 59P ends corresponding to the miR398 target site of Arabidopsis CSD1 and CSD2 were observed, no or poor 59P peaks appeared in the canonical EJC regions 39 to the miR398 target site ( Figure 9A; Supplemental Figure 8). More intriguingly, the miR398 target site is close to the canonical EJC region in CSD1 and overlaps with the canonical EJC region in CSD2. Hence, EJCs might mask the miR398 target site on the CSD1 and CSD2 mRNAs that have not been translated and prevent miRNA regulation before the pioneer round of translation. Consistent with this result, rice miR398regulated CSD1 and two miR444-regulated MADS genes, which possess a miRNA target site in or near the canonical EJC region, also lacked evident EJC footprints ( Figure 9B). In addition to the miR398 and miR444 targets, a miR408 target encoding plantacyanin (ARPN) in Arabidopsis also possessed a prominent 59P peak at the miRNA-guided cleavage site but poor peaks in the EJC region. However, unlike the miR398 and miR444 targets, the miRNA target site on ARPN is located at a position ;200 nucleotides upstream of an exon-exon junction that is unlikely to be masked by EJCs. Taken together, these results suggest that there (A) Distribution of 59P end counts in a 50-nucleotide region upstream of the exon-exon junction. Counts are shown for two replicates (R1 and R2) each from wild-type and smg7-1 flowers. The numbers of analyzed exons that are $ 50 nucleotides in size and with a 59P end count $ 1 TP40M are shown in parentheses. (B) and (C) Comparison of 59P end counts in the canonical EJC region (B) and at the TSS (C) of the genes harboring the selected exons between the wild type and smg7-1. Asterisks indicate significant differences between groups (P < 0.01, Wilcoxon rank-sum test). Box boundaries and whiskers indicate quartiles and the range of 59P end abundance, respectively. The numbers of analyzed EJC regions (count > 10 TP40M) or TSSs (count > 5 TP40M) are shown in parentheses.
(D) Scatter and density plots depicting log 2 fold change (smg7-1/wild type) and log 2 average 59P end counts for the maximum EJC peaks in genes. Each data point represents a maximum EJC peak with an average 59P end count $ 10. EJC peaks for genes producing NMD-suppressed AS variants (NMD 1 AS), miRNA targets, and other genes (other) are shown. Arabidopsis genes producing NMD-suppressed AS variants and genes that are validated miRNA targets were extracted from data sets reported by Drechsel et al. (2013) and Zheng et al. (2012), respectively. The number of analyzed genes in each category is shown in parentheses. The density plots to the right cover all data points in each category, and straight lines indicate the median of each distribution. Asterisks indicate significant differences in the comparisons of the NMD 1 AS group and the miRNA group with the group of other genes (Kolmogorov-Smirnov two-sided test). The colors and locations of asterisks indicate the associated category and the direction of the shift in the distribution. might be multiple factors controlling the interaction between plant miRNAs and EJC-bound targets.

Arabidopsis miRNAs Direct EJC-Bound mRNA Degradation
Unlike NMD targets, Arabidopsis miRNA targets did not exhibit changes in the number of EJC footprints in smg7-1 ( Figure 8A; Supplemental Figure 6). Indeed, a global analysis of maximum EJC footprint abundance between smg7-1 and the wild type revealed a median log 2 fold change value close to 0 for miRNA targets. The log 2 fold change of miRNA targets is significantly higher than that of other transcripts (P < 0.01, Kolmogorov-Smirnov twosided test; Figure 7D). This trend is opposite to that observed for NMD-suppressed AS events and suggests that most EJC footprints in miRNA targets are likely produced through an NMD-independent pathway.
Given that EJC footprints were evident downstream of miRNA target sites but rare in the upstream regions, we hypothesized that plant miRNAs are able to target EJC-bound mRNAs for degradation and trigger the production of EJC footprints. To test this hypothesis, we first assayed the role of AGO1, which mediates the function of most plant miRNAs, in the production of EJC footprints from the three miRNA targets shown in Figure 8A in the null ago1-3 mutant (Arribas-Hernández et al., 2016). In ago1-3, there were fewer miRNA-guided cleavage remnants and EJC footprints corresponding to the miR160 target ARF10 and the miR396 target GRF1 ( Figure 10A). Interestingly, for the miR159 target MYB33, which exhibited no change in mRNA level in ago1-3 (Arribas-Hernández et al., 2016), the levels of miRNA-guided cleavage remnants and EJC footprints in ago1-3 were comparable to those in the wild type. Also, the mutation of AGO1 did not affect EJC footprint accumulation in the two NMD targets CYP38 and LPEAT2. Hence, these results confirm the link between the function of AGO1 and the production of EJC footprints in some miRNA targets.
To further validate the ability of plant miRNAs to direct EJC footprint accumulation, we created an artificial miRNA target by fusing the 59 UTR of Arabidopsis NITROGEN LIMITATION AD-APTATION (NLA), which contains a target site of miR827, to an intron-containing GUS gene and introduced this construct (I-GUS827) into Arabidopsis ( Figure 10B). As miR827 is barely detectable when phosphate levels are sufficient but is highly induced upon phosphate starvation ( Figure 10C; Hsieh et al., 2009), we thus were able to test the ability of miR827 to induce Positional distribution is shown for 59P ends of selected Arabidopsis (A) and rice (B) miRNA targets with prominent EJC footprints based on the Arabidopsis PARE data generated in this study and the rice Degradome-Seq data generated by Li et al. (2010). Within the 59P end plots, red peaks indicate peaks in the canonical EJC region and red arrowheads indicate peaks corresponding to the miRNA-guided cleavage site. Within the gene models, light gray boxes indicate UTRs, dark gray boxes indicate CDSs, and thin lines indicate introns. Only 59P ends residing in exons are displayed in the plots. Data for replicate 2 of Arabidopsis are shown in Supplemental Figure 6. EJC footprint production by comparing RNA degradation intermediates produced from the I-GUS827 construct under phosphate-sufficient and -starvation conditions using modified RLM 59 RACE. As expected, distinct amplification products of the RNA degradation intermediates with 59P ends predominantly mapping to positions 26 to 27 nucleotides upstream of the exon-exon junction were detected in the plants grown under phosphate-starvation conditions but not in those grown with sufficient phosphate (Figures 10D and 10E). In contrast to the I-GUS827 construct, the control construct containing an intron-free GUS gene fused with the NLA 59 UTR (GUS827) did not result in the accumulation of RNA degradation fragments truncated at the same site under phosphate-starvation conditions ( Figure 10D). Nevertheless, the number of RNA degradation fragments with the 59 end determined by miR827-guided cleavage was comparable between plants expressing the two constructs. This result validates the ability of plant miRNAs to target the mRNAs associated with EJCs for degradation and initiate the accumulation of transcripts with 59P ends in the EJC region.
Unexpectedly, AS was observed in the I-GUS827 construct we created. In the AS form, the dominant sequence matching cloned 59P ends was shifted farther upstream along with the alternative 59 splice site compared with that in the normal splice form ( Figure 10E). However, the distance between the predominant 59P peak and the exon-exon junction remained 27 nucleotides in the AS form. As the deposition of EJC on mRNA is dependent on splicing, the shift of the 59P peak in the AS form reinforces our hypothesis that the 59P ends matching a region 26 to 30 nucleotides upstream of the exon-exon junction mostly represent in vivo EJC footprints.

Overexpression of an EJC Disassembly Factor Eliminates EJC Footprints
To provide direct evidence that the 59P ends in the EJC region are EJC-protected fragments, we cloned an EJC disassembly factor, PYM, from Arabidopsis and transiently expressed a PYM overexpression construct in Arabidopsis protoplasts to disassemble Positional distribution is shown for 59P ends of selected Arabidopsis (A) and rice (B) miRNA targets with poor EJC footprints based on the Arabidopsis PARE data generated in this study and the rice Degradome-Seq data generated by Li et al. (2010). Within the 59P end plots, red peaks indicate peaks in the canonical EJC region and red arrowheads indicate peaks corresponding to the miRNA-guided cleavage site. Within the gene models, light gray boxes indicate UTRs, dark gray boxes indicate CDSs, and thin lines indicate introns. Only 59P ends residing in exons are displayed in the plots. Data for replicate 2 of Arabidopsis are shown in Supplemental Figure 8. EJCs (Figure 11). We predicted that disassembly of EJCs by PYM overexpression would promote the degradation of existing EJCprotected degradation intermediates and inhibit the production of new EJC footprints. For NMD targets, the production of intermediates with 59P ends matching the canonical EJC regions of CYP38 and PPD6 was abolished when PYM was overexpressed ( Figure 11). Degradation fragments corresponding to the maximum 59P peak of LPEAT2, which was positioned outside the canonical EJC region and SMG7-dependent (Figure 3), were also absent in PYM-overexpressing protoplasts. This result indicates that this maximum 59P peak is an EJC footprint and supports the notion that EJCs are also deposited on noncanonical sites in Arabidopsis. In addition to investigating the SMG7-dependent 59P peaks in NMD targets (Figures 3 and 4C), we also examined the effect of PYM overexpression on the SMG7-independent 59P peaks in the EJC region of miRNA targets shown in Figure 8A. Compared with the number of degradation intermediates in control protoplasts, the number of 39 cleavage remnants directed by miRNAs in the three miRNA targets (MYB33, ARF10, and GRF1) was increased, whereas degradation intermediates with 59P ends corresponding to the EJC region were barely detectable in the PYM-overexpressing protoplasts (Figure 11). Taken together, these results strongly indicate that the 59P ends mapping to the canonical EJC regions of these transcripts are in vivo EJC footprints irrespective of their dependency on SMG7.

DISCUSSION
The RNA degradome is composed of assorted RNA degradation intermediates derived from diverse RNA degradation pathways. While endonucleolytic cleavage results in the production of cleavage remnants with well-defined 59P ends, growing evidence supports the idea that the hindrance of XRN-mediated 59 to 39 decay by RNA binding proteins such as Pumilio/fem-3 mRNA (A) RLM 59 RACE assays of degradation intermediates generated from three miRNA targets and two NMD targets in the wild type and ago1-3. Open and solid arrowheads indicate miRNA-guided cleavage sites and EJC binding sites, respectively, in gene models and the corresponding RACE products separated by gel electrophoresis. The arrow indicates the maximum 59 peak in the LPEAT2 gene model and in the corresponding RACE product on an electrophoretic gel. Within the gene models, light gray boxes indicate UTRs, dark gray boxes indicate CDSs, thin lines indicate introns, orange boxes indicate regions altered by AS, asterisks indicate PTCs, and half arrows indicate gene-specific primers. The steady state mRNA level of Arabidopsis ACT8 was used to control the input amount of total RNA and reverse transcription among samples. An equal amount of tobacco acid pyrophosphatase-treated total RNA from HeLa cells was spiked into each sample, and the TSS of human b-actin was used to control for differences in ligation efficiency among samples. binding factors and ribosomes can also precisely delineate the 59P ends of degradation products (Hou et al., 2014(Hou et al., , 2016Pelechano et al., 2015;Yu et al., 2016). Here, we demonstrate that, like ribosomes, EJCs also have the capacity to stop 59 to 39 decay and leave precise footprints in the RNA degradome ( Figure 12). Because EJCs are deposited on RNA during splicing and are displaced by ribosomes during the pioneer round of translation ( Figures 12A and 12B), EJC footprints can serve as markers for the degradation of mRNA before steady state translation. If a spliced mRNA is degraded after the pioneer round of translation, 59 to 39 mRNA decay should not yield EJC footprints upstream of the termination codon while cotranslational mRNA decay during steady state translation may yield ribosome footprints ( Figure 12B; Pelechano et al., 2015;Hou et al., 2016;Yu et al., 2016). By contrast, NMD targets are mainly degraded during the pioneer round of translation and may produce EJC footprints downstream of PTCs ( Figure 12C). Additionally, some plant miRNA targets seem to be attacked by miRNAs before translation, which may result in EJC footprints located 39 to the miRNA target sites ( Figure 12D). As some EJC footprints appear not to be linked to NMD or miRNA pathways ( Figures 4A and 7D), other posttranscriptional gene regulation mechanisms may also account for EJC-bound mRNA degradation before or during the pioneer round of translation. In summary, our study demonstrates that the RNA degradome contains EJC footprints, which account for a sizable portion of major RNA degradation intermediates. The analysis of EJC footprints in the RNA degradome confirms the degradation of plant NMD targets during the pioneer round of translation and reveals EJC-bound mRNAs as targets of some plant miRNAs.

Location and Abundance of EJC Footprints in the Plant RNA Degradome
Although cross-linked immunoprecipitation followed by nextgeneration sequencing has not been adopted to profile EJC binding regions on a genome-wide scale in Arabidopsis and rice, the identification of a conserved hot spot for 59P ends located in a region 25 to 30 nucleotides upstream of the exon-exon junction (Figure 1) implies that the deposition of EJCs on plant and animal RNAs presumably follows similar rules. Two transcriptome-wide studies of human EJC binding regions using cross-linked immunoprecipitation followed by next-generation sequencing revealed that only about half of the EJC binding sites mapped to the canonical position (Saulière et al., 2012;Singh et al., 2012). The binding of EJCs to noncanonical sites likely also occurs in plants and accounts for the 59P peaks positioned outside the canonical EJC regions. For instance, the maximum 59P peak in Arabidopsis LPEAT2 appeared to be the footprint of an EJC deposited at a noncanonical site, as it was located immediately downstream of the PTC, and the peak was dampened in smg7-1 and largely reduced when the EJC disassembly factor PYM was overexpressed (Figures 3 and 11). If half of EJCs are deposited at noncanonical sites in plants as in humans, the contribution of EJC footprints to maximum 59P peaks located within transcripts is likely to have been underestimated, as we only considered the maximum 59P peaks positioned in the canonical EJC region. Given that EJC footprints in the RNA degradome can be prominent, abundant, and prevalent, as we demonstrated in this study, investigations of inference by endonucleolytic cleavage directed by miRNAs or other mechanisms using RNA degradome data should be undertaken cautiously. Dependency on XRN activity could serve as a key to differentiate between protected fragments derived from RNA binding proteins and endonucleolytic cleavage products.
The number of 59P ends mapping to EJC regions varied dramatically within PTC-containing mRNAs and miRNA targets (Figures 3 and 8). Intriguingly, the maximum EJC-protected 59P peaks tended to reside in the first or second exon end downstream of a PTC or an miRNA-directed cleavage site. This result implies that in both cases 59 to 39 decay is more frequently blocked by the first EJC met by the XRN. However, in some cases, the exons farther downstream also had 59P ends mapping predominantly to Figure 11. Overexpression of PYM Abolishes EJC Footprints in NMD and miRNA Targets.
Arabidopsis PYM was cloned and transiently overexpressed in Arabidopsis protoplasts. The steady state mRNA levels of PYM and a housekeeping gene, ACT8, in control (2) and PYM-transfected (1) protoplasts were measured using the cDNA from the modified RLM 59 RACE assays. The amounts of RNA degradation intermediates from three NMD targets and three miRNA targets in control and PYM-overexpressing protoplasts were compared using modified RLM 59 RACE assays. An equal amount of tobacco acid pyrophosphatase-treated total RNA from HeLa cells was spiked into each sample, and the TSS of human b-actin was used to infer ligation efficiency. Open and solid arrowheads indicate the miRNA-guided cleavage sites and EJC binding sites, respectively, in gene models and the corresponding RACE products separated by gel electrophoresis. The arrow indicates the maximum 59 peak in the LPEAT2 gene model and the corresponding RACE product on an electrophoretic gel. The identity of all marked RACE products was confirmed by Sanger sequencing. Within the gene models, light gray boxes indicate UTRs, dark gray boxes indicate CDSs, thin lines indicate introns, orange boxes indicate regions altered by AS, asterisks indicate PTCs, and half arrows indicate gene-specific primers.
the EJC region, although to a lesser degree, implying that sometimes XRN-mediated decay might be able to overcome the hindrance of EJCs. Notably, in PTC-containing transcripts such as Arabidopsis LPEAT2 and RS2Z33, some exons possessed poor 59P peaks in the canonical EJC region while both the 59 and 39 neighboring exons had prominent EJC peaks (Figure 3). Since no other evident 59P peaks were present in these exons, which lacked canonical EJC footprints, it appears that EJCs are less frequently deposited on these exons than their neighboring exons. This possibility is supported by the previous finding that the loading of human EJCs varies among exons even within the same transcript (Saulière et al., 2012;Singh et al., 2012). Differential EJC loading probably also accounts for the poor accumulation of 59P ends in some EJC regions downstream of a PTC or miRNA target site within the same transcripts in plants.

Application of EJC Footprints to the Study of NMD
Previously, the identification or the validation of endogenous NMD substrates required NMD mutants or treatment with NMD inhibitors such as cycloheximide, which is not feasible or problematic in some species or tissues. Moreover, the upregulation of some assayed RNAs is an indirect consequence of NMD pathway impairment. Our study shows that EJC footprints can be a direct readout of NMD; these footprints will be useful in the validation of potential NMD substrates and inference of NMD activity. The presence of a predominant 59P peak in the EJC region located at the first or second exon end downstream of the predicted PTC is strong evidence of NMD-mediated degradation. Using 59P end data for wild-type species alone, we have validated hundreds of NMD targets in Arabidopsis and rice, where NMD is potentially triggered by AS or intron-containing 39 UTRs (Supplemental Data Sets 1 and 2). As RNA degradome data have been generated in many plant species for miRNA target identification, the analysis of publicly available RNA degradome data sets may advance our understanding of gene regulation mediated by the NMD pathway in different species. Nevertheless, this application might be limited to NMD targets with at least one intron after the PTC; NMD events triggered by long intron-free 39 UTRs are unlikely to be confirmed by this approach. Furthermore, RNA degradome data may allow assessment of the impact of stress on the transcriptome through the NMD pathway. In plants, abiotic stress conditions such as high temperature, drought, and salt stress have been shown to largely alter the accumulation of splicing variants, which mostly have features associated with NMD (Chang et al., 2014;Feng et al., 2015;Thatcher et al., 2016;Jiang et al., 2017). However, whether NMD activity is regulated by these stresses has not been addressed. Measuring EJC footprints from well-characterized NMD targets provides a straightforward approach to investigate the regulation of NMD activity by stress.

Targeting of EJC-Bound mRNAs by Plant miRNAs
Here, we showed that the cleavage events guided by some, but not all, Arabidopsis and rice miRNAs were notably associated with the production of EJC footprints (Figure 8; Supplemental Figure 6). If plant miRNAs only direct cleavage of the mRNAs that have finished the pioneer round of translation, the mRNA decay initiated by plant miRNAs would not lead to the accumulation of EJC footprints in the CDS. This result therefore suggests that at least a fraction of some plant miRNAs such as miR159, miR160, and miR396 target EJC-bound mRNAs that have not completed the pioneer round of translation ( Figure 12D). Although prominent EJC peaks are only evident downstream of the miRNA-guided cleavage site in some targets (Figure 8), EJCs upstream of the miRNA target site likely remain associated with the mRNA during miRNA-target interaction ( Figure 12D). After cleavage initiated by miRNAs, if these upstream EJCs can stop 39 to 59 RNA decay by the exosome, they may also leave footprints in the RNA degradome. However, 39 to 59 RNA decay of nonpolyadenylated 59 cleavage remnants will generate degradation intermediates with a 39 end delineated by the 39 edge of the EJC. Hence, even if the EJC footprints upstream of the miRNA-guided cleavage sites are present in the RNA degradome, they cannot be captured by PARE or similar methods, which profile the 59 ends but not 39 ends of degradation fragments with a polyadenylated tail.
While several previous studies have demonstrated that AGO1 is associated with polysomes (Lanet et al., 2009;Li et al., 2013bLi et al., , 2016, the miRNA-mediated regulation of EJC-bound mRNAs appears not to require prior translation. There are two scenarios for the targeting of EJC-bound mRNAs by plant miRNAs. Since the loading of mature plant miRNAs to AGO1 can occur in the nucleus (Bologna et al., 2018), it is possible that some miRNA-induced silencing complexes can function before being exported from the nucleus, which is rich in EJC-bound mRNAs. Alternatively, cytoplasmic miRNA-induced silencing complexes might be able to recognize their targets, which remain associated with EJCs, and direct cleavage before translation. Considering these two scenarios, the miRNA targets with a lower efficiency of nuclear export or translation during the pioneer round may have a higher chance of being cleaved before translation and accumulating EJC footprints. The analysis of the 59P ends of miRNA targets in nuclear and cytoplasmic fractions will help to answer this question. By contrast, for miRNAs such as miR398 and miR444, which pair with target sites close to the EJC region and direct cleavage without triggering EJC footprint production (Figure 9), prior translation of targeted mRNAs is likely to be essential for their regulation. Other miRNA targets that possess target sites outside the EJC region but lack prominent EJC footprints, such as Arabidopsis ARPN (Figure 9), may have few EJCs deposited on the transcripts. Alternatively, their cleavage remnants might be mainly degraded through the 39 to 59 RNA decay pathway, which is not able to yield EJC-protected 59 RNA ends.

Interplay between the miRNA and NMD Pathways
In human cells, it was demonstrated that miRNAs are able to be loaded on EJC-bound mRNAs, thereby repressing the translation and decay of some NMD targets (Choe et al., 2010). As our study also demonstrated that some plant miRNAs can target EJC-bound mRNAs, this raises the possibility that some plant transcripts are targeted by both RNA degradation pathways. Interestingly, a previous analysis of miRNA and NMD targets in Arabidopsis showed that there was little overlap between the two groups and that these groups were enriched in distinct biological processes (Zhang et al., 2013). In spite of these findings, interplay between these two RNA degradation pathways might occur in other plant species. In Solanum and Fabaceae species, 22nucleotide miRNAs repress the expression of a large number of nucleotide binding site leucine-rich repeat (NBS-LRR) resistance genes (Zhai et al., 2011;Shivaprasad et al., 2012). Interestingly, normal transcripts of some NBS-LRR genes, which harbor long or intron-containing 39 UTRs, are NMD targets in Arabidopsis (Gloggnitzer et al., 2014). If NMD also regulates NBS-LRR genes in Solanum or Fabaceae species, the miRNA and NMD pathways might be interchangeable or cooperate to inhibit the plant immune system.
As total RNA is generally used for genome-wide profiling of 59P ends, the existence of EJC footprints in the RNA degradome allows us to infer that mRNA degradation occurs before steady state translation in the cytoplasm and nucleus by known and unknown mechanisms. Our work leads to new possibilities for obtaining a greater understanding of mRNA degradation before steady state translation, which is currently underappreciated and has many unknowns.
All plants except for smg7-1 and its wild-type controls were grown at 22°C with a 16-h-light (110-140 mmol m 22 s 21 , PAR)/8-h-dark photoperiod before harvesting for RNA extraction. For smg7-1 and its wild-type control, the seedlings were first grown at 22°C, but after genotyping, smg7-1 homozygous individuals and wild-type control plants were transferred to 28°C in order to inhibit autoimmune responses in smg7-1. For PARE library construction and modified RLM 59 RACE assays, RNA was extracted from the inflorescences of soil-grown plants or from seedlings grown on 0.8% (w/v) Bacto-agar plates containing half-strength Murashige and Skoog medium (pH 5.7) and 1% (w/v) Suc. For phosphate-starvation treatment of wild-type plants and Arabidopsis transgenic lines carrying artificial miRNA targets, seedlings were grown on Pi-sufficient agar plates for 7 d and then transferred to Pi-deficient or Pi-sufficient agar plates for an additional 7 d before harvesting for RNA extraction. The Pi-sufficient and Pi-deficient agar plates contained 1% (w/v) Bacto-agar, half-strength modified Hoagland nutrient solution, and 1% (w/v) Suc supplemented with 250 mM and 10 mM KH 2 PO 4 , respectively.

PARE Library Construction and Sequencing
For each genotype, ;80 mg of total RNA isolated from two separate biological replicates using PureLink Plant RNA Reagent (Thermo Fisher Scientific) was used for PARE library construction following the protocol described previously (Zhai et al., 2014). PARE libraries were sequenced on the Illumina HiSeq 2500 platform.

Preprocessing and Mapping of PARE and GMUCT Reads
Previously published PARE data for Arabidopsis and budding yeast (Saccharomyces cerevisiae), Degradome-Seq data for rice (Oryza sativa) and worm (Caenorhabditis elegans), and GMUCT data for Arabidopsis and human (Homo sapiens) HEK293 cells were downloaded from the Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/). Accession numbers and details for the publicly available 59P end data analyzed are given in Supplemental Data Set 3. For PARE and Degradome-Seq data, trimmed reads of 20 to 21 nucleotides with a quality score $ 30 were mapped to the corresponding genome and cDNA sequences downloaded from TAIR 10 (https://www.arabidopsis.org/), the MSU Rice Genome Annotation Project (http://rice.plantbiology.msu.edu/; RGAP 7), the Saccharomyces Genome Database (https://www.yeastgenome.org/; S288C), and the WormBase database (https://www.wormbase.org/; WS269) using Bowtie 1.2.1.1 with zero mismatches (Langmead et al., 2009). PARE and Degradome-Seq reads with more than 10 genomic hits or that mapped to the chloroplast genome, mitochondria genome, rRNAs, tRNAs, small nuclear RNAs, or small nucleolar RNAs were discarded. For the human GMUCT data, trimmed reads of 30 to 50 nucleotides with a quality score $ 30 were mapped to the human genome and cDNA sequences downloaded from the Ensembl database (http://www.ensembl. org/; GRCh38p12) using Bowtie 1.2.1.1 with two mismatches. Human GMUCT reads with more than 10 genomic hits or that mapped to the mitochondria genomes, rRNAs, tRNAs, small nuclear RNAs, or small nucleolar RNAs were discarded. The processing and mapping of the Arabidopsis GMUCT data was similar to that of the human GMUCT data, but the Arabidopsis sequences were used for mapping and filtering. The abundance of PARE, Degradome-Seq, or GMUCT sequences was first normalized to the total remaining reads to TP40M and then assigned to the genomic or gene position corresponding to the first nucleotide of the sequence. As many genes have multiple isoforms, only exons with a size $ 50 nucleotides in the longest isoform of a gene were included in the following exon analyses.

Metagene Analysis of 59P Ends
For metagene analysis of 59P ends mapping to the 50-nucleotide region upstream of the exon-exon junction, the normalized occurrence of 59P ends (P i ) starting at each position was calculated using the formula where C i,j is the 59P end count starting at position i of exon j and n is the number of analyzed exons with a size $ 50 nucleotides and a 59P end raw count $ 1. Then, the relative frequency of normalized 59P end occurrences in the 50-nucleotide region was plotted.
For comparison of 59P ends mapping to the canonical EJC region in the wild type and mutants, exons possessing a total 59P end count # 10 TP40M in the canonical EJC region (26 to 30 nucleotides upstream of the exonexon junction) in all of the samples were excluded. The abundance of 59P ends at the TSS of genes harboring the exons included in the comparison of 59P ends in the EJC region was calculated as the total 59P end count in the 21-nucleotide region symmetrically flanking the TSS based on the TAIR 10 annotation. The TSS regions with a total 59P end count # 5 TP40M in all of the samples were excluded from the comparison.

Identification and GO Analysis of NMD Targets with 59P End Data
The wild-type Arabidopsis PARE data generated in this study and the Degradome-Seq data for wild-type rice produced by Li et al. (2010) were used in the identification of putative NMD targets. These targets were identified through the analysis of EJC footprints using a custom R script (https://github.com/LabHMChenABRC/EJC-DegradomeAnalysis). We first selected protein-coding genes with at least one intron-containing gene model and a maximal 59P peak with a count $ 10 TP40M in representative gene models. If the position of the maximum 59P peak matched one of the following two criteria, the transcript was identified as a putative NMD target.
(1) The maximum 59P peak was located in the first or second canonical EJC region downstream of an AS site based on the existence of a nonrepresentative gene model with an altered translation start or termination site compared with that of the representative gene model. NMD of this type of target was classified as being triggered by AS.
(2) The maximum 59P peak was located in the first or second canonical EJC region of the exon 39 end at least 50 nucleotides downstream of the annotated stop codon. NMD of this type of target was classified as being triggered by an intron-containing 39 UTR. The analysis of enriched GO terms for putative NMD targets was performed using PANTHER version 14 (Mi et al., 2019).

Artificial miRNA Target Construction and Arabidopsis Transformation
The Cauliflower mosaic virus 35S promoter, a 330-bp fragment of the Arabidopsis NLA 59 UTR, and the intron-free GUS from the pBI121 vector or the intron-containing GUS gene from the pBISN1 vector were fused and then cloned into the pCAMBIA1390 vector for Arabidopsis transformation. Homozygous T3 transgenic lines were selected and used for phosphate treatments. The PCR primers used for cloning artificial miRNA targets are listed in Supplemental Table 2.

Modified RLM 59 RACE Assay and Quantification of Steady State mRNA Levels
The GeneRacer kit (Thermo Fisher Scientific) was used to detect the 59P ends of specific genes. Total RNA (2-3 mg) isolated using the PureLink Plant RNA Reagent (Thermo Fisher Scientific) or mirVana miRNA Isolation Kit (Thermo Fisher Scientific) was used as a template in modified RLM 59 RACE assays that were performed according to the manual of the Gen-eRacer kit, except that the calf intestinal phosphatase and tobacco acid pyrophosphatase treatments were skipped. After RNA was ligated to the 59 RNA adapter, an oligo(dT) primer was used to synthesize cDNA, which served as the template for PCR analysis with a GeneRacer 59 primer and a gene-specific primer. For the modified RLM 59 RACE of PHO2, nested PCR was performed with a GeneRacer 59 nested primer and a genespecific nested primer. PCR products were gel purified and cloned into the pJET1.2 vector or pCR4-TOPOTA vector (Thermo Fisher Scientific) for sequencing. Control HeLa total RNA included in the GeneRacer kit was treated with tobacco acid pyrophosphatase for a spike-in control. The same cDNA was used for the quantification of steady state mRNA levels of PYM and ACT8 with pairs of gene-specific primers. The PCR primers are listed in Supplemental Table 2.

RNA Gel Blot Analysis of Small RNAs
Total RNA (5 mg) isolated using the PureLink Plant RNA Reagent (Thermo Fisher Scientific) was used for small RNA gel blot analysis following the procedures described previously by Lee et al. (2015). The probes used for the detection of miR827 and U6 are listed in Supplemental Table 2.

PYM Overexpression in Protoplasts
The Arabidopsis PYM coding region amplified from cDNA was cloned into pJD301 to replace the firefly luciferase sequence using the NEBuilder HiFi DNA Assembly Master Mix according to the manufacturer's recommendations (New England Biolabs). The PCR primers used for PYM cloning are listed in Supplemental Table 2. Protoplast isolation and transfection were performed as described previously with minor modifications (Hou et al., 2016). Arabidopsis mesophyll protoplasts were isolated from 17-dold rosette leaves. Together with 5 mg of transfection control plasmids, 30 mg of pJD301 or pJD-PYM was transfected into 4 3 10 4 protoplasts using polyethylene glycol solution. The transfected protoplasts were incubated at 22°C in the dark for 19 to 20 h. Then, 2 3 10 5 transfected protoplasts were pooled and lysed for RNA extraction with the mirVana miRNA Isolation Kit (Thermo Fisher Scientific).

Accession Numbers
The PARE data generated in this study are available in the GEO database under the accession number GSE118215. The previously published PARE data, Degradome-Seq data, and GMUCT data used in this study are available in the GEO database under the accession numbers shown in Supplemental Data Set 3. Sequences of individual genes included in 59P end analysis using PARE data or modified RLM 59 RACE assays can be found in TAIR or the MSU Rice Genome Annotation Project database under the locus numbers indicated. The Arabidopsis mutants analyzed and genes cloned can be found in the TAIR database under the following accession numbers: AT5G42540 for xrn2-1, AT1G75660 for xrn3-3, AT1G54490 for xrn4-6, AT5G63980 for fry1-6, AT2G18790 for phyB-9, AT2G33770 for pho2, AT5G19400 for smg7-1, AT1G02860 for NLA, AT1G48410 for ago1-3, and AT1G11400 for PYM.

Supplemental Data
Supplemental Figure 1. 59P end analyses of polyadenylated and deadenylated degradation fragments, using PARE and GMUCT data, all show an enrichment in the canonical EJC region.
Supplemental Figure 2. The EJC regions downstream of PTCs harbor SMG7-dependent 59P peaks in Arabidopsis. Figure 3. Genes producing NMD-suppressed AS variants possess a maximum 59P peak in the EJC region downstream of the PTC.

Supplemental
Supplemental Figure 4. Dysfunction of SMG7 reduces the number of EJC footprints in putative NMD targets.
Supplemental Figure 6. Some Arabidopsis miRNA targets possess dominant EJC footprints downstream of miRNA-guided cleavage sites. Figure 7. XRN4 contributes to the turnover of 39 cleavage remnants and production of EJC footprints in miRNA targets.

Supplemental
Supplemental Figure 8. Some Arabidopsis miRNA targets possess poor EJC footprints downstream of miRNA-guided cleavage sites.
Supplemental Table 1. GO terms enriched in putative Arabidopsis NMD targets identified from the analysis of EJC footprints in the RNA degradome.
Supplemental Table 2. Sequences of primers for modified RLM 59 RACE, quantification of steady state mRNA, cloning, and RNA gel blot analysis.
Supplemental Data Set 1. Putative Arabidopsis NMD targets identified from the analysis of EJC footprints in the RNA degradome.
Supplemental Data Set 2. Putative rice NMD targets identified from the analysis of EJC footprints in the RNA degradome.
Supplemental Data Set 3. Summary of 59P end data sets used in this study.