Identification of U2AF(35)-dependent exons by RNA-Seq reveals a link between 3′ splice-site organization and activity of U2AF-related proteins

The auxiliary factor of U2 small nuclear RNA (U2AF) is a heterodimer consisting of 65- and 35-kD proteins that bind the polypyrimidine tract (PPT) and AG dinucleotides at the 3′ splice site (3′ss). The gene encoding U2AF35 (U2AF1) is alternatively spliced, giving rise to two isoforms U2AF35a and U2AF35b. Here, we knocked down U2AF35 and each isoform and characterized transcriptomes of HEK293 cells with varying U2AF35/U2AF65 and U2AF35a/b ratios. Depletion of both isoforms preferentially modified alternative RNA processing events without widespread failure to recognize 3′ss or constitutive exons. Over a third of differentially used exons were terminal, resulting largely from the use of known alternative polyadenylation (APA) sites. Intronic APA sites activated in depleted cultures were mostly proximal whereas tandem 3′UTR APA was biased toward distal sites. Exons upregulated in depleted cells were preceded by longer AG exclusion zones and PPTs than downregulated or control exons and were largely activated by PUF60 and repressed by CAPERα. The U2AF(35) repression and activation was associated with a significant interchange in the average probabilities to form single-stranded RNA in the optimal PPT and branch site locations and sequences further upstream. Although most differentially used exons were responsive to both U2AF subunits and their inclusion correlated with U2AF levels, a small number of transcripts exhibited distinct responses to U2AF35a and U2AF35b, supporting the existence of isoform-specific interactions. These results provide new insights into function of U2AF and U2AF35 in alternative RNA processing.


INTRODUCTION
Eukaryotic genes contain intervening sequences or introns that are removed from mRNA precursors by a large and highly dynamic RNA-protein complex, termed the spliceosome (1). The spliceosome consists of small nuclear ribonucleoproteins (snRNPs), including the U1, U2, U4/U5/U6 of the major U2 spliceosome and the U11, U12, U4atac / U6atac/U5 of the less abundant U12 spliceosome, and a large number of non-snRNP proteins (1). Spliceosomes assemble on each intron in an ordered manner, starting with recognition of the 5 splice site (5 ss) by U1 snRNP or the 3 ss by the U2 pathway (2), which involves binding of the U2 auxiliary factor (U2AF) to the 3 ss region to facilitate U2 snRNP recruitment to the branch point (BP) (3,4). U2AF is a stable heterodimer composed of a U2AF2-encoded 65-kD subunit (U2AF65), which contacts the polypyrimidine tract (PPT), and a U2AF1-encoded 35-kD subunit (U2AF35). U2AF35 interacts with almost invariant AG dinucleotides at 3 ss and stabilizes U2AF65 binding to RNA (5)(6)(7)(8).
The U2AF1 gene is alternatively spliced, giving rise to conserved mRNA isoforms termed U2AF1a, U2AF1b and U2AF1c (9). In mice, U2AF1a is more abundant than U2AF1b and contains a highly conserved 67-bp exon 3 in the mRNA whereas U2AF1b incorporates exon Ab of the same size (9). The U2AF1c isoform includes both exons that introduce a premature termination codon (PTC) in the mRNA, which is targeted by RNA surveillance (9). Both U2AF35a and U2AF35b contain a central U2AF65 recognition domain of the UHM (U2AF homology motif) type flanked by two C3H-type zinc finger (ZF) domains and a C-terminal arginine/serine-rich (RS) region (10)(11)(12)(13). Both U2AF35a and U2AF35b interact with U2AF65 and could stimulate its binding to a PPT (9). During evolution, the two U2AF35 proteins have been under high selection pressure, suggesting that they may play specific functions in vertebrates and plants (9,14,15), but their putative functional differences in 3 ss selection are not known.
3748 Nucleic Acids Research, 2015, Vol. 43, No. 7 Although U2AF35 is essential for viability of both yeast and higher eukaryotes (16)(17)(18)(19), the 3 ss AG was found to be dispensable for the in vitro splicing of pre-mRNAs with strong PPTs (20). Binding of U2AF65 and U2AF35 to weak 3 ss was promoted by splicing activators under splicing conditions (21,22), however, several splicing events assumed to depend critically on U2AF35 did not show any defect under conditions of limited U2AF35 availability in vivo (14,23) and some alternative 3 ss responsive to U2AF35 depletion were intrinsically stronger than their nonresponsive counterparts (24). Thus, the distinction between U2AF35dependent and -independent introns has remained obscure. In addition, overexpression of U2AF65 and depletion of U2AF35 resulted in activation of the same cryptic 3 ss, suggesting that their balance is important for 3 ss selection (24). However, global RNA processing changes in response to U2AF35 depletion or varying ratios of U2AF35/U2AF65 have not been examined.
In this study, we have characterized the transcriptome of human embryonic kidney (HEK) 293 cells lacking U2AF35 and each U2AF35 isoform.

Cell cultures, transfections and library preparations
HEK293 cells were grown under standard conditions in DMEM supplemented with 10% (v/v) bovine calf serum (Life Technologies). For depletion experiments (Supplementary Figure S1), the cells were treated with small interfering RNAs (siRNAs) or splice-switching oligonucleotides (SSOs) targeting splice sites of mutually exclusive U2AF1 exons 24 h after seeding. Transfections were carried out in 6-or 12-well plates using jetPRIME (Polyplus) according to manufacturer's recommendations. The cells were harvested after 24 and 48 h or received the second hit after 48 h when splitting the cells into new plates. The remaining cultures were harvested 24 and 48 h after the second hit. For RNA sequencing (RNA-Seq), total RNA was extracted using RNeasy Plus (Qiagen) from cells harvested 72 h after the first hit. The NEBNext poly(A) mRNA magnetic isolation module (E7490L) and the Human/mouse/rat Ribo-Zero TM rRNA Removal Kit (Cambio/Epicentre) for RNA were employed according to manufacturers' recommendations. The libraries were prepared using the NEBNext R Ultra DNA Library Prep Kit for Illumina R (E7370L), sizeselected and multiplexed before paired-end sequencing on the HiSeq 2500 Ultra-High-Throughput Sequencing System (Illumina). siRNAs to downregulate the remaining proteins were as previously described (24 and references therein); the siRNA duplex to heterogeneous nuclear ribonucleoprotein C (hn-RNP C) was reported earlier (25).

Detection of spliced products
Total RNA was transcribed using the Moloney murine leukemia virus reverse transcriptase (Promega) and oligo(dT) primers according to the manufacturer's recommendations. Alternatively spliced U2AF35 exons were visualized by complete HinfI digests of amplified polymerase chain reaction (PCR) products, as described previously (9). Signal intensities of amplified fragments were measured as described (24).

RNA-Seq analysis
Apart from the knockdown of U2AF35 and its isoforms (Supplementary Table S1 (30) using default stringencies and parameters, except for modification of the UCSC reference (hg19) (31) by introducing the U2AF1 isoform s, which lacks exons Ab and 3. Sequences recognized as originating from mtRNA, rRNA and tRNA were subsequently removed. Analysis of differential exon usage was performed using DEXSeq (1.12.1) (32) and MISO (mixture-of-isoforms) (26). DEXSeq-detected exons were selected based on statistical significance of differential usage (q < 0.05). Unlike DEXSeq, MISO (v. 0.4.9; hg19) (26) computes percentage of spliced in (psi, ) splice junction-spanning reads and examines their significance using Bayesian factors. The filtering cut-offs were set to default parameters, on the basis of difference and event significance ( > 0.2 and K > 10). Statistically significant events were individually verified in genome browsers to exclude false positives as a result of misannotated transcripts, low expression, overlapping transcripts and apparent misclassifications.
Differential gene and isoform expression between sample sets was analyzed with Cufflinks (v. 2.1.1) (33), which normalizes the reads using a fragments per kilobase of exon model per million reads (FPKM) measure. Gene and isoform expression assessment was aided by the transcriptome reference (hg19, UCSC) with no novel transcript discovery and was followed by CummeRbund (v. 2.2.0) (33) analysis of differential gene and isoform expression (in R environment; v. 3.0.2). Selection of significantly differentially expressed genes was made on the basis of FDR-adjusted Pvalues (q < 0.05). RNA-Seq data for U2AF35 depletion experiments are available at ArrayExpress under the accession number E-MTAB-2682. Finally, gene-and exon-level functional enrichment analyses of differentially expressed events were performed using DAVID (34,35).
Nucleic Acids Research, 2015, Vol. 43, No. 7 3749 Validation of U2AF35-dependent events RNA was extracted using TRI reagent, treated with DNase I (LifeTechnologies) and reverse-transcribed as described above. Target transcripts were chosen based on P-and FPKM-values that are shown in full in Supplementary Tables S2 and S3, respectively. PCR primers (Supplementary  Table S4) were designed to amplify two or more isoforms with different sizes. Exogeneous transcripts were amplified using RT-PCR with vector primers PL3 and PL4 (36) or their combinations with transcript-specific primers.

Plasmid constructs
Splicing reporter minigenes were cloned into pCR3.1 (Invitrogen) using primers shown in Supplementary Table S4. IgM minigenes were a generous gift of Martha Peterson, University of Kentucky. Plasmid DNA was extracted using Wizard R Plus SV Minipreps (Promega). Expression constructs of U2AF35 isoforms were described previously (24). PUF60 was subcloned into pCI-neo (Promega) with the Xpress tag at the N terminus, employing the pET28a-PUF60-His construct (24) as a template. All constructs were sequenced prior to transfections to exclude undesired mutations.

Sequence features of U2AF(35)-dependent exons and introns
Sequences individually validated in genome browsers were examined using algorithms that predict both traditional and auxiliary splice-site recognition motifs. Intrinsic splicesite strength was computed using both maximum entropy and frequency matrix scores (37)(38)(39)(40). Prediction of BPs, PPTs and AG exclusion zones (AGEZs) was carried out using a support vector machine (SVM) algorithm (41). De novo motif discovery, motif enrichment and motif location analyses were performed using the MEME suite of programmes (42) with sequences flanking differentially used 3 ss, 5 ss and internal exons as the input.

Measurements of single-strandedness across 3 ss
We computed the PU (probability of unpaired) values for all substrings of high-confidence upregulated and downregulated sequences in their natural context, extending input sequences by 30 nt in each direction. The PU values employ the equilibrium partition function of RNAfold (43) and were as defined previously (44). Input sequences were fixed relative to the position of upregulated and downregulated 3 ss. Their PU values were averaged for each intron position. The means were compared by the Wilcoxon-Mann-Whitney test. Delta PU values are defined here as the difference between mean PU values of upregulated and downregulated exons at the indicated positions.

Identification of U2AF(35)-dependent exons
We knocked down U2AF35 and each alternatively spliced U2AF35 isoform in HEK293 cells ( Figure 1A and B) and examined genome-wide exon/transcript usage by RNA-Seq. We achieved ∼90% depletion with siRNAs targeting both isoforms and a reversal in the relative abundance of U2AF35a and U2AF35b using isoform-specific siRNAs, with a maximum at 72 and 96 h post-transfection (Figure 1C and Supplementary Figure S1). In addition to siR-NAs, we employed SSOs targeting 3 splice sites of alternatively spliced U2AF1 exons, resulting in a less robust and more delayed response in knockdown levels (Supplementary Figure S1). Using the Illumina HiSeq 2500 platform, we obtained a total of 546 390 339 reads mapped to the annotated human transcriptome and genome, averaging to 68M per RNA sample (Supplementary Table S1). The fraction of U2AF1 mRNAs ranged between ∼0.001% for depleted samples and ∼0.01% for controls, with isoformspecific knockdowns giving intermediate levels. The range of U2AF1/U2AF2 ratios varied between 0.1 and 2.3 (Figure 1D and E). Surprisingly, the U2AF1 depletion was associated with a 2-fold increase of each alternatively spliced U2AF2 isoform ( Figure 1F). The U2AF1a-specific depletion led to a lower overall U2AF1 expression than the U2AF1b knockdown while expression of isoforms recognized by RNA surveillance remained low ( Figure 1E). The relative abundance of U2AF1b slightly increased in cells treated with siRNAs targeting both isoforms compared to untreated cells ( Figure 1E and G).
As U2AF35 contacts the 3 ss AG dinucleotide as a part of complexes assembled ad hoc on each intron (5-7), we used DEXSeq and MISO algorithms to identify exons differentially used in depleted cells. DEXSeq is based on generalized linear models and relies on biological controls to identify differential exon usage (32), whereas MISO employs a Bayesian approach and splice junction-spanning reads to detect specific alternative splicing events (26). Altogether, DEXSeq identified a total of 484 upregulated and 575 downregulated exons in siRNA U2AF35ab-depleted cultures (termed ab-), with no bias toward either (P > 0.05, binomial test), whereas the number of MISO-detected events was ∼60% higher (Supplementary Table S2 and data not shown). Gene-level expression analyses with Cufflinks (33) revealed 1507 upregulated and 2011 downregulated genes (Supplementary Table S3). Overlap of genes with differentially used exons and Cufflinks was highly significant (Figure 1H), suggesting that these exons may contribute to overall gene expression alterations observed in depleted cultures.

Characterization of global alternative polyadenylation changes induced by U2AF35 depletion
Among differentially used exons, start and terminal exons were more frequent than expected while internal exons were significantly more common among downregulated than upregulated events (Figure 2A). We observed a similar bias of start/end exons for published RNA-Seq data of cell cultures depleted of other RNA-binding proteins, including hnRNP C, a U2AF65 competitor (25), but not for cells depleted of a DNA-binding factor (Supplementary Table S5). Almost a half of differentially used exons were represented more than once per transcript, consistent with the presence of multiexon segments upregulated or downregulated in depleted cells ( Figure 2B). Browser verification of individual events revealed that activation or repression of terminal exons was largely due to the altered usage of previously annotated   3752 Nucleic Acids Research, 2015, Vol. 43, No. 7 alternative polyadenylation (APA) sites (45), with intronic APA sites as the most common APA category (Figure 2C-F and Supplementary Table S6). Categorization of 155 APA sites individually confirmed in a genome browser as affected by U2AF(35) depletion and verified against APA repositories revealed that while intron-proximal and -distal APA sites were about equally represented when APA was associated with alternative 3 ss, intronic APA sites promoted in ab-cultures were largely proximal whereas tandem 3 UTR APA sites were biased toward distal sites ( Figure 2C-E). Unexpectedly, breakdown of 211 DEXSeq-positive exons in genes that were either upregulated or downregulated in depleted cultures (i.e. Cufflinks-positive) showed preferential involvement of the first exons while terminal and internal exons were about equally represented ( Figure 2G). Their individual browser inspection revealed that the excess was attributable to APA and 5 ss of the first introns while altered usage of annotated alternative transcription initiation sites was rare, further supporting a prominent impact of U2AF35 depletion on APA.
Together, these data indicated that APA was a major contributor to the differential exon usage in depleted cells and revealed APA category-dependent shifts of proximal and distal APA sites conferred by a lack of U2AF(35) and/or an increase of U2AF65. They also suggested that a simple distribution of differentially used start, internal and end exons in RNA-Seq depletion experiments could be indicative of the relative importance of a depleted factor for each RNA processing step.

A high validation rate of DEXSeq-detected alternative RNA processing events
Extensive experimental validation of 82 high-confidence DEXSeq-detected events from the same and independent depletion experiments with the same cell line confirmed 76 exons (Supplementary Table S4, Figure S2), including alternative 3 ss APA in PHF14 and intronic APA in PHF19 (Figure 2H). Apart from endogenous RNAs, altered exon inclusion as a result of U2AF(35) depletion was found also for exogenous transcripts (see below), including murine IgM transcripts without the RNA polymerase II (polII) pause site located between proximal and distal APA sites (Figure 2I). As browser verification of MISO-detected events showed a higher false positivity than for DEXSeq (data not shown), their systematic validation was not carried out. However, a high stringency of default DEXSeq settings and a high sensitivity of MISO may provide a higher accuracy in identifying genuine alternative RNA processing events affected by U2AF35 depletion, complementing each other.

U2AF35 dependency can be explained largely by a lack of the U2AF heterodimer
Apart from altered exon usage seen only in ab-cultures, most transcripts showed a gradient in RNA processing defects, with a hierarchy ab-> a-> b-> controls, mirroring total levels of U2AF35 or U2AF (cf. Supplementary Figure  S2A and Figure 1C-E). Individual depletion of each U2AF subunit in HEK293 cells showed that U2AF65 depletion, which reduces U2AF35 levels (14) (Figure 2I), shifted us-age of most exons in the same direction as U2AF35 depletion (Supplementary Figure S2B). In MAPK8IP3 transcripts, however, depletion of U2AF35 promoted inclusion of an 18-nt exon; in contrast, U2AF65 depletion and depletion of U2AF35a failed to activate this exon and led to skipping of the preceding 12-nt exon instead. To understand this phenomenon, we examined these pre-mRNAs in a dose-dependent transfection experiment shown in Supplementary Figure S3A. Interestingly, U2AF65 depletion increased the relative abundance of U2AF35b, suggesting that skipping of the 12-nt MAPK8IP3 exon in a-and U2AF65cells was due to the excess of U2AF35b. As U2AF65 depletion reduced U2AF65 more than U2AF35, potentially limiting the amount of the available heterodimer, we estimated residual levels of U2AF in each sample by measuring signal intensity from immunoblots from the same transfection (Supplementary Figure S3A). U2AF levels correlated significantly with the usage of most exons, particularly with those excluded from pre-mRNAs in depleted cells. In contrast, many exons upregulated in U2AF35 depleted cells were not activated in cells lacking U2AF65, and several very small exons, including a 12-nt MAPK8IP3 exon, did not correlate with U2AF (Supplementary Figure S3B-D and below). Thus, most but not all differential exon usage induced by U2AF35 depletion could be attributed to a lack of the U2AF heterodimer and sequences of these exons should therefore reveal binding signatures of both U2AF subunits.

Identification of 3 splice sites altered by U2AF35 depletion
Combined MISO (26) and DEXSeq (32) analyses identified a total of 231 differentially used alternative 3 ss pairs. Their individual inspection in genome browsers confirmed 138 pairs of 3 ss, with 93 intron-proximal sites promoted and 45 repressed in ab-cultures (Supplementary Table S7). Only 51/138 sites (37%) promoted in ab-cultures were intrinsically stronger than their competing counterparts (binomial test, two-tail P = 0.003; Supplementary Table S9). We also observed a significant lack of canonical cytosine at position −3 and guanine at position +1 relative to upregulated proximal 3 ss (Supplementary Figure S4A and B). Both positions contribute substantially to the 3 ss strength (37).
Since over a third of APA sites altered in depleted cells was associated with annotated alternative 3 ss usage (Figure 2E and 2F, Supplementary Table S6), we tested if their intrinsic strength contributes to APA selection. Analysis of browser-verified 57 pairs of alternative APA 3 ss showed that unlike alternative 3 ss not associated with APA, intrinsically weaker sites were not preferred (28 stronger versus 29 weaker in ab-cultures). Likewise, we observed no enrichment of weaker sites when comparing proximal upregulated and downregulated APA 3 ss with APA 3 ss of terminal exons. Although the number of APA-associated alternative 3 ss was smaller than the number of non-APA 3 ss pairs, this result is consistent with an additional function of U2AF in APA control that is independent of interactions with alternative 3 ss of internal exons.

U2AF(35) depletion can influence 5 splice-site usage
U2AF35 contacts 3 ss AG (5-7), but we identified at least 32 alternative 5 ss pairs influenced by U2AF35 depletion. Eleven of them were tested using RT-PCR and 10 were confirmed (Supplementary Tables S4 and S8; Figure S5A-G). Over a third of these 5 ss (n = 11) were located in the first introns. As for 3 ss, proximal (26 versus 6) and weaker (19 versus 13) 5 ss were activated more often in depleted cultures than their distal and stronger counterparts. Activation of cryptic 3 ss was occasionally accompanied by cryptic 5 ss activation of the same exon ( Supplementary Figure S5H). Because intronic APA sites compete with upstream 5 ss (46), we also determined the intrinsic strength of the first 5 ss upstream of 67 intronic APA sites affected by U2AF35 depletion. Their average scores tended to be lower (P = 0.12; Supplementary Table S9) than those of authentic counterparts of aberrant 5 ss, which were used as controls (40).

U2AF35 depletion activates exons with longer AGEZs and PPTs
Examination of 204 browser-verified internal exons that excluded multiple-exon regions associated with APA sites revealed that U2AF35 depletion promoted both exon inclusion and skipping as well as intron splicing and, sporadically, intronization within large exons (Supplementary Figure S6A and B). Interestingly, exon activation in depleted cells often occurred in weak, incompletely removed introns (Supplementary Figure S6C). U2AF(35)-sensitive exons were largely alternatively spliced and had sequence features typical of alternative exons (Supplementary Figure  S7 and Table S10). Importantly, upregulated exons had a significantly longer AGEZ than downregulated or control exons ( Figure 3A), but did not show a comparable decrease of other purine dinucleotides upstream of 3 ss (Supplementary Figure S8A). The AGEZ length correlated with the expression change of upregulated but not downregulated exons ( Figure 3B and C). Upregulated exons had also longer PPTs for the best predicted BP although the associated Pvalues were lower than for the AGEZ ( Figure 3D). The PPT length also correlated with the change in usage of upregulated exons ( Figure 3E and F). A specific lack of AGs and longer AGEZs/PPTs was found also upstream of alternative 3 ss upregulated in ab-cultures (insets in Figure 3A and D and Supplementary Figure S8B) and this tendency was observed also for as few as 57 alternative 3 ss leading to APA (Supplementary Figure S9, Figure 2E).
Interestingly, the lack of AGs upstream of upregulated exons was associated with adenine depletion between position −17 and −38, which approximately corresponds to the optimal BP location (20,47), whereas guanine depletion was more widely distributed (Supplementary Figure S10A and  B). Instead, upregulated exons showed enrichment for adenine at positions −3 to −9 while uridine tended to show the opposite, with enrichment between −17 and −38 and depletion closer to the 3 ss. Conversely, downregulated exons showed adenine enrichment in the optimal BP region, particularly in smaller exons (Supplementary Figure S10C-E), and uridine enrichment closer to 3 ss. We also searched for additional known and unknown motifs in sequences flanking U2AF(35)-sensitive junctions using the MEME suite of programs (42), however, no significant hits were found.
We conclude that 3 ss upregulated in ab-cells have longer AGEZs/PPTs, adenine depletion in the optimal BP location and enrichment closer to 3 ss, with uridines showing the opposite pattern. This arrangement moves PPTs further upstream of upregulated exons as compared to their downregulated counterparts. The observed widespread repression by U2AF(35) could thus reflect a lack of AG dinucleotides upstream of 3 ss, which are generally repressive when located downstream of BPs (20,48,49), and/or longer, more upstream PPTs, which may bind exon-repressive PPTbinding proteins (25,50). Because these sequence characteristics are likely to influence secondary structure formation across 3 ss, this group of exons should provide a powerful tool to study regulation of 3 ss by RNA-binding proteins and RNA folding.

Exons repressed by U2AF(35) are stimulated by PUF60 and inhibited by CAPER␣
To examine the role of U2AF-related proteins in usage of U2AF-dependent exons, we measured inclusion of 44 exons in HEK293 cells depleted of PUF60, CAPER␣ (RBM39), CAPER␤ (RBM23) and two other Y-binding proteins (Figure 3G). PUF60 and U2AF were individually capable of protecting the 3 ss AG in footprinting experiments (51), but the exact function of CAPER␣/␤ in recognition of 3 ss or U2AF-dependent exons is unknown. We also depleted hnRNP A1, which allows U2AF to discriminate between pyrimidine-rich RNA sequences followed or not by a 3 splice-site AG (52), and DEK, which facilitates the U2AF35-AG interaction and prevents binding of U2AF65 to pyrimidine tracts not followed by AG (53) (data not shown). Remarkably, the majority of exons upregulated in cells depleted of U2AF35 were downregulated in cells depleted of PUF60 ( Figure 3H and Supplementary Figure  S11). In contrast, CAPER␣, and to a lesser degree PTB, showed synergism with U2AF(35) for this group of exons ( Figure 3H). A significant directionality of hnRNP C, DEK and hnRNP A1 could not be established with this sample size ( Figure 3H and data not shown). To confirm that PUF60 stimulates exons repressed by U2AF, we measured exon inclusion of a minigene reporter transfected into HEK293 cells lacking or overexpressing PUF60 (Figure 3I). We found that cells lacking PUF60 showed increased skipping of this exon, whereas the PUF60 overexpression increased its inclusion in the mRNA. Thus, U2AF(35)-induced exon usage was predictive of responses to other Y-binding proteins, revealing the connection between antagonism and synergism of U2AF-related proteins and characteristic 3 ss organization described above.

Unpaired regions upstream of U2AF(35)-activated andrepressed exons
Because characteristic changes in nucleotide frequencies upstream of U2AF(35)-dependent exons (Supplementary Figure S10) are likely to affect formation of RNA secondary structure, which can influence 3 ss utilization (54,55), we  computed position-specific probabilities of RNA singlestrandedness (44) for high-confidence upregulated and downregulated internal exons. Remarkably, upregulated exons showed on average significantly higher unpaired probabilities at most positions between −25 and −50 than downregulated exons while a lower single-strandedness was observed closer to their 3 ss ( Figure 3J). This finding suggests that intramolecular base-pairing interactions over relatively long distances upstream of 3 ss control exon repression and activation by U2AF and, most likely, by U2AF-related proteins that showed functional antagonism and synergism with U2AF and bind single-stranded RNA ( Figure 3H).

U2AF(35) preferentially regulates nuclear proteins involved in RNA binding
Functional enrichment analysis (35) of exons/genes differentially used in U2AF35 depleted cells showed that they were enriched in proteins involved in RNA/nucleotide binding, respectively ( Figure 4A). Figure 4B shows a single example of alternative 3 ss in a known RNA-binding factor SF1. These 3 ss are responsible for production of BP-binding SF1 proteins with variable C-termini (56) and are associated with tissue-specific APA sites (Supplementary Figure S12). These proline-rich regions interact with PRPF40A (57), a component of the splicesomal E complex (58). Depletion of U2AF35 was associated with upregulation of SF1 mRNA and promotion of the distal 3 ss (Figure 4B and C). Transfection of the SF1 splicing reporter constructs into ab-cells confirmed repression of the proximal site ( Figure 4D and E). Thus, U2AF(35) regulates the length of SF1 3 UTR and, potentially, its PRPF40A interactions.
In addition to RNA binding, differentially expressed exons/genes were enriched in proteins involved in cytoskeleton organization, chromatin modification and proteins found in the organelle lumen ( Figure 4A). Figure 5A gives a summary of exons in transcripts involved in actin dynamics. Tropomyosin genes (TPMs), which control function of actin filaments in a tissue-specific manner (59), serve as the most prominent examples. In ab-cultures, exon 6a of TPM1 and TPM2 was repressed and exon 6b, which has a long PPT in both genes, was activated ( Figure 5B). Isoforms containing exon 6b have lower calcium sensitivity than isoforms with exon 6a, which may be required for a specific interaction with troponin (60). TNNT1, a gene coding for a slow skeletal muscle troponin T, sustained a cryptic 3 ss activation upon U2AF35 depletion (data not shown). Besides TPM1/2, exon 9a of TPM3 was upregulated in depleted cells as well as the TPM4 transcripts ( Figure 5B and Supplementary Table S3). TPM3 isoforms expressing exon 9a are more widely distributed in tissues than isoforms containing exon 9c (61), suggesting that U2AF(35) restricts tissue expression of ␥ -tropomyosin.
Finally, Figure 5C shows U2AF(35)-induced alterations of exon usage in eight genes coding for the histone modifying SAGA (Spt-Ada-Gcn5 Acetyltransferase) complex components, including KAT2A and KAT2B. The U2AF35 depletion promoted the use of a distal 5 ss in KAT2A (Supplementary Figure S5F), modifying the balance of alternatively spliced GCN5 isoforms and, most likely, histone acetyltransferase activity. 5 ss selection was influenced also in a GCN5 paralog KAT2B (PCAF; Supplementary Figure  S5G). The TADA3 gene, which encodes the GCN5 interaction partner, sustained activation of the proximal APA site upon depletion. In contrast, a proximal APA was repressed in TAF5L, a component of the SAGA architecture module ( Figure 5D). Activation of proximal APA was found also for TAF9 while a proximal 3 ss was promoted in ATXN7L3, a component of the deubiquitination module.

Exon-, isoform-and gene-level control of U2AF35 interaction partners
Over 25% (14/51) genes encoding high-confidence interaction partners of U2AF35 (62) were significantly upregulated or downregulated in depleted cells, which was more than expected (hypergeometric test, P < 0.05, Table 1). U2AF2 showed the highest increase, followed by CDC5L, which encodes a key component of the PRPF19-CDC5L complex required for the catalytic step of splicing (63). Upon U2AF35 depletion, the CDC5L interaction partner PLRG1 showed 3 UTR lengthening but CTNNBL1, CCAP1, CCAP3 and CCAP6 mRNAs were not noticeably altered. SNRPA1, which binds U2 snRNA (64), had reduced mRNA levels in depleted cultures. Transcripts encoding U2AF35-related protein U2AF26, which can interact with U2AF65 and functionally substitute U2AF35 in constitutive and enhancer-dependent splicing (65), were also downregulated while SF3 components SF3B1 and SF3B3 showed alterations in APA selection.
Supplementary Figure S13 shows examples of exon usage dependencies of high-confidence U2AF35 interaction partners. In depleted cells, PTC-containing cryptic exons in U2AF2 and CAPERα were downregulated and both genes were upregulated. Expression of CAPER␤ exon 4 was also increased while PUF60 exon 5 was downregulated. Alternative 5 ss of U2AF2 exon 10, which controls the inclusion of four amino acids in U2AF65, was not altered by U2AF35 depletion and the two U2AF2 isoforms were upregulated in depleted cells to the same extent ( Figure 1F). Together, these data identify high-confidence interaction partners of U2AF35 whose expression was altered upon U2AF(35) depletion and reveal exon-centric regulation of closely related U2AF genes.

Evidence for a distinct function of U2AF35 isoforms
U2AF35a and b differ from each other at seven aminoacid positions located in the RNP2 motif of the UHM, a proximal part of a long ␣-helix A and a disordered segment between the two folded regions (13). Isoform-specific depletions identified transcripts exhibiting a gradient in RNA processing defects reflecting total levels of U2AF (35) but also exons activated only upon ab-depletion where U2AF levels were the lowest (Supplementary Figures S2  and S3A,B). However, we also found events that occurred only in b-(Supplementary Figure S14)    varying siRNA concentrations ( Figure 6D) and with exogenous transcripts ( Figure 6E and F). For example, U2AF35a depletion activated a proximal 3 ss of PFN2 intron 2 and a distal cryptic 5 ss of the same intron whereas U2AF35b depletion was associated with the opposite effect in a dosedependent manner ( Figure 6F). The proximal 3 ss was promoted by U2AF35b and repressed by U2AF35a also in reconstitution experiments in which we individually added plasmids expressing each isoform to ab-cells ( Figure 6G). Repression and activation of the PFN2 3 ss was confirmed in cultures depleted with SSOa and SSOb (Supplementary Figure S1B and data not shown). Alternative 3 ss of PFN2 generate isoforms with distinct C-termini of profilin 2, a key actin-monomer binding protein, that have distinct binding affinities for proline-rich sequences and show tissue-specific expression (66,67). Interestingly, exogenous U2AF35b expression was higher compared to U2AF35a ( Figure 6G). The existence of isoform-specific effects was also supported by the number of differentially expressed genes/exons in isoform-specific depletions, with a significant overlap in each category and a low ratio of exon-level versus genelevel events in b-samples ( Figure 6H). In contrast to most transcripts, correlation between U2AF levels and exon usage was absent or decreased for genes with isoform-specific responses (cf. Figure 6I and Supplementary Figure S3). To reinforce these findings further, we carried out independent isoform-specific depletion experiments with total RNA depleted of rRNA. These samples contain a higher fraction of unprocessed RNA than polyA-selected RNA, giving more information about intron splicing (68). DEXSeq analysis followed by browser-assisted verification revealed that the bias toward start and terminal exons was even greater than for poly(A) samples and reconfirmed isoform-specific effects validated experimentally (Supplementary Table S5 and Figure 6 and Supplementary Figure  S14).
Taken together, identification of transcripts with distinct responses to U2AF35a and U2AF35b argues for the existence of isoform-specific interactions that may confer even opposite effects on splice-site selection and provide additional level of exon regulation.

DISCUSSION
We have shown the first global characterization of RNA processing alterations associated with depletion of U2AF35 isoforms. Our data reveal that U2AF function is not limited to 3 ss recognition but involves extensive APA control (Figure 2), particularly through intronic APA sites, suggesting that U2AF contributes to the tight regulation of tissuespecific expression. We have also described characteristic 3 ss organization of U2AF(35)-repressed and -activated exons and functional antagonism and synergism of U2AFrelated proteins PUF60 and CAPER␣ (Figure 3). The exon repression and activation was associated with significant shifts of average unpaired probabilities in their canonical BP and PPT regions and further upstream. Finally, we describe exon-centric regulations of genes encoding U2AF35 interaction partners and identify transcripts with distinct responses to U2AF35a and U2AF35b.
Our results indicate that most but not all changes in APA and exon usage in cells depleted of U2AF35 were attributable to the lack of U2AF heterodimer (Supplementary Figure S3 and Figure 6I). They were replicated in cells lacking U2AF65 ( Figure 3H, Supplementary Figures S2 and S11), in agreement with RASL-Seq data submitted during review of this manuscript and showing significant overlap of U2AF35-and U2AF65-induced events (69). As U2AF65 interacts with the 3 end processing complex, the association of U2AF with the phosphorylated C-terminal domain (CTD) of polII and PRPF19 (70,71) could be important for the U2AF-dependent APA control. U2AF35 depletion was associated with upregulation of CDC5L (Table 1), a key component of the PRPF19-CDC5L complex required for promotion of co-transcriptional splicing (72). The impaired balance between post-and cotranscriptional splicing in cells lacking U2AF is supported by frequent alterations of weak introns that contained cryptic or alternative exons, often near 3 gene ends (Supplementary Figures S5C and S13). These 'detained' introns are spliced post-transcriptionally in humans (73). In Drosophila and the mouse, alternative introns are less efficiently spliced co-transcriptionally than constitutive introns and co-transcriptional splicing is less efficient toward tion rate (75), which can alter APA choice (76), is further supported by a biased distribution of differentially used start and terminal exons in cells depleted of elongation factors (Supplementary Table S5) and significant overlap of genes differentially expressed in U2AF35-, hnRNP C-and AFF4-depleted cells (Supplementary Figure S15). AFF4 is a key elongation factor while hnRNP C directly competes with U2AF65 at authentic and cryptic 3 ss (25), in agreement with the observed tendency to antagonize U2AF(35)repressed exons ( Figure 3H). Altered elongation rates influenced inclusion of ∼15-40% cassette exons on a genomewide scale and the slow elongation has been associated with shorter introns (77), which were prevalent among U2AFsensitive events (Supplementary Figure S7).
The APA selection was influenced by U2AF in an APA category-dependent manner ( Figure 2C-E). U2AF appears to activate proximal APA sites if there is no intron between competing APA sites and distal APA sites if recognition of the 5 ss is productive ( Figure 2C and D). U2AF65 contacts CFIm59 but not CFIm68 (78), which promoted distal APA sites (79). PolII CTD also interacts with the CFIm subunit PCF11 (80) and several splicing factors, including PRP40, CA150 and PSF (81)(82)(83). Significant shifts in cleavage site usage, largely toward proximal sites, were observed for a knockdown of PABPN1 (84), a nuclear protein with high affinity to poly(A) tails, with 43 genes shared between PABPN1-and U2AF35-depleted samples (P < 10 −8 , data not shown). Finally, U2AF-dependent intronic APA sites represented the most frequent APA category ( Figure 2E), whereas in normal cells intronic APA sites are less frequent than tandem 3 UTR and alternative terminal exons (85), which could be due to widespread binding of U2AF(65) to introns, with >80% tags in intronic sequences (69).
Similar to other RNA-binding proteins (50,51,86,87), U2AF (35) can both inhibit and activate splicing (Supplementary Figures S2 and S3). Activated and repressed exons had a distinct 3 ss organization and functional regulation by Y-binding proteins ( Figure 3, Supplementary Figures S10-S12). The U2AF(35)-repressed exons were largely stimulated by PUF60 (Supplementary Figure S11A) and inhibited by CAPER␣ and also by PTB, in line with longer AGEZs/PPTs previously found for PTB-repressed cassettes (50). Because PUF60 preferentially contacts uridines (88), the PUF60-U2AF antagonism might be explained by competition for binding to uridine-rich sequences. However, uridine was enriched upstream of both upregulated and downregulated exons (Supplementary Figure S10) and also within these exons (Supplementary Figure S7B), arguing for the importance of adenine/uridine frequency shifts between optimally located BP and PPT regions (Supplementary Figure S10A) and between regions showing higher single-strandedness ( Figure 3J). These changes are likely to alter not only protein binding but also tertiary contacts and folding transitions by helicases and other RNA chaperones. Structural requirements across 3 ss may also help explain atypical responses of some transcripts, such as GSK3B to PUF60 depletion (Supplementary Figure S11A), and position-dependent alternative splicing activity of a growing number of proteins and their targets (50,(89)(90)(91).
PUF60 contains two RNA recognition motifs (RRMs) and a C-terminal UHM (88). This UHM binds the N-terminus of SF3b155 at the UHM-ligand motif (ULM) around W200 (92). The functional PUF60-U2AF antagonism might also result from competition of U2AF35 and PUF60 for the U2AF65 ULM because binding of PUF60 to the U2AF65 ULM can only occur if this motif is not already bound by the U2AF35 UHM (92). In contrast, PUF60 and U2AF65 can bind to the N terminus of SF3b155 simultaneously and noncompetitively (92). In addition, U2AF and PUF60 had the opposite effect on BP accessibility and U2AF was not strictly required for splicing of some pre-mRNAs in vitro when PUF60 was present (51). Finally, anti-PUF60 antibodies co-precipitated polII CTD and three components of the general transcription factor TFIIH (93), linking PUF60 to transcription. Unlike PUF60 on U2AF-repressed exons ( Figure 3H), however, slow and fast polII elongation do not usually have opposite effects on the inclusion of a given alternative exon (77).
Our study revealed activation and inhibition of many cryptic exons, both in U2AF-related and unrelated genes. The elevated U2AF2 mRNA ( Figure 1D) can be explained by repression of a PTC-containing exon in intron 5 in depleted cells (Supplementary Figure S13). This exon is highly conserved between mouse and man, has a GC 5 ss and is surrounded by Y-rich sequences. Such cryptic exon activation appeared to be common in other U2AF35 partners (Supplementary Figure S13, Table 1). Similar to U2AF2, CAPERα was upregulated upon U2AF depletion, most likely through elimination of a PTC-introducing cryptic exon (Supplementary Figure S13). CAPER␣ showed a strong synergism with U2AF on upregulated exons (Figure 3H) and has the same domain structure as U2AF65, except for the lack of ULM (94). In contrast, a 48-nt CAPERβ exon activated in ab-cells (Supplementary Figure S13) does not present PTCs but is translated only if upstream alternative exons are included in the mRNA. The mRNA expression of PUF60, which lacks both the ULM and the RS domain (94), was unaffected, although PUF60 exon 5 appeared to respond positively to the excess of U2AF35b (Supplementary Figure S13). This 51-nt cassette exon introduces extra 17 amino acids close to the first RRM of PUF60 and encodes two serines that are phosphorylated (95).
Although positions −3 and +1 relative to U2AF(35)dependent 3 ss may participate in U2AF35-RNA interactions, with binding preferences favoring cytosine at position −3 and guanine at position +1 (Supplementary Figure  S4), possibly via ZFs (24), experimental evidence for this interaction is missing. The higher dependency of weaker and proximal alternative 3 ss on U2AF35 was finally seen at the genome-wide level (Supplementary Figure S4, Table S9), resolving previous uncertainties (14,24). However, U2AF(35)-dependent exons and 3 ss were largely alternatively spliced, which makes it difficult to distinguish characteristic sequence features of cassette exons from direct effects of U2AF35 depletion. For example, alternative splicing and exon skipping was found to correlate positively with the BP-3 ss distance (41). The distance between 3 ss and the best predicted BP of exons/3 ss upregulated in ab-cells was marginally higher than in controls and these 3 ss had also a higher number of BPs with positive SVM scores (data not shown). Thus, it remains to be tested how BP choice is affected by a lack of U2AF(35) and by the observed posi-Nucleic Acids Research, 2015, Vol. 43, No. 7 3761 tional changes in single-strandedness ( Figure 3J). Interestingly, cooperative interactions of U2AF and SF1 increased the SF1 binding repertoire and SF1 binding was significantly biased toward terminal exons (87).
The higher exogenous expression of U2AF35b over U2AF35a ( Figure 6G) and elevated relative abundance of U2AF35b upon U2AF65 depletion (Supplementary Figure  S3) suggest that interactions of each isoform with U2AF65 could be important for U2AF stability, thus contributing to tight regulation of U2AF levels in vivo and accurate exon/APA usage. Description of exons with isoformspecific responses to U2AF35 should facilitate characterization of physical interactions of highly conserved U2AF35a and U2AF35b isoforms in future studies. These interactions may involve the extraordinarily long ␣ helix A (96) and are likely to be affected by post-translational modifications of residues encoded by exon 3/Ab as the size difference between U2AF35b and U2AF35a appears to be larger than predicted ( Figure 1C and Supplementary Figure S1).
Although U2AF35 is believed to be a 3 ss recognition factor, we found many examples of 5 ss usage alterations upon depletion (Supplementary Figure S5). This may be explained by altered U2AF65-promoted recruitment of U1 snRNP to weak 5 ss (97) but also elongation kinetics of polII. For example, Rsd1, a yeast homologue of CAPER␣, bridges interactions between U1 and U2 snRNPs through the RS domain of Prp5 (DDX46) (98), which was repressed in ab-cultures (Supplementary Table S2). Prp5 is required for a transcription elongation checkpoint and a release of stalled polII (99). Differentially used alternative 5 ss were also found in cells lacking other RNA-binding proteins, including PTB (50), nevertheless a large excess of U2AF(35)dependent 3 ss over 5 ss in our data set is consistent with the predominant role of this factor in 3 ss recognition.
Finally, genome-wide identification of U2AF(35)dependent events and our validation panel will provide an important resource for more detailed biochemical and structural studies of 3 ss and APA sites, expanding not only the number of U2AF35-dependent exons but also exons sensitive to other factors involved in 3 ss/BP/APA selection.