Qiumin Tan, Hari Krishna Yalamanchili, Jeehye Park, Antonia De Maio, Hsiang-Chih Lu, Ying-Wooi Wan, Joshua J. White, Vitaliy V. Bondar, Layal S. Sayegh, Xiuyun Liu, Yan Gao, Roy V. Sillitoe, Harry T. Orr, Zhandong Liu, Huda Y. Zoghbi, Extensive cryptic splicing upon loss of RBM17 and TDP43 in neurodegeneration models, Human Molecular Genetics, Volume 25, Issue 23, 1 December 2016, Pages 5083–5093, https://doi.org/10.1093/hmg/ddw337
Abstract
Splicing regulation is an important step of post-transcriptional gene regulation. It is a highly dynamic process orchestrated by RNA-binding proteins (RBPs). RBP dysfunction and global splicing dysregulation have been implicated in many human diseases, but the in vivo functions of most RBPs and the splicing outcome upon their loss remain largely unexplored. Here we report that constitutive deletion of Rbm17, which encodes an RBP with a putative role in splicing, causes early embryonic lethality in mice and that its loss in Purkinje neurons leads to rapid degeneration. Transcriptome profiling of Rbm17-deficient and control neurons and subsequent splicing analyses using CrypSplice, a new computational method that we developed, revealed that more than half of RBM17-dependent splicing changes are cryptic. Importantly, RBM17 represses cryptic splicing of genes that likely contribute to motor coordination and cell survival. This finding prompted us to re-analyze published datasets from a recent report on TDP-43, an RBP implicated in amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD), as it was demonstrated that TDP-43 represses cryptic exon splicing to promote cell survival. We uncovered a large number of TDP-43-dependent splicing defects that were not previously discovered, revealing that TDP-43 extensively regulates cryptic splicing. Moreover, we found a significant overlap in genes that undergo both RBM17- and TDP-43-dependent cryptic splicing repression, many of which are associated with survival. We propose that repression of cryptic splicing by RBPs is critical for neuronal health and survival. CrypSplice is available at www.liuzlab.org/CrypSplice.
Introduction
To produce functional gene products in eukaryotes, nascent RNA must undergo multiple steps of processing, including pre-mRNA splicing. A large ribonucleoprotein complex called the spliceosome assembles at the intron–exon junctions and carries out splicing reactions to remove introns and join exons, while many regulatory proteins, such as RNA-binding proteins (RBPs), modulate the splicing outcome for particular mRNAs (1,2). The splicing process is tightly regulated and is crucial for proper gene expression. It is thus not surprising that dysfunctional spliceosome or other RBPs have been implicated in human diseases ranging from cancer to neurodegeneration. For example, SF3B1 and U2AF35, both encoding core spliceosome components, are frequently mutated in chronic lymphocytic leukemia and myelodysplasia (3,4). Mutations in TDP-43, FUS, ATXN2 and MATR3, all of which encode RBPs, are implicated in familial amyotrophic lateral sclerosis (ALS); TDP-43 has also been linked to frontotemporal dementia (FTD) (5–9).
The relevance of spliceosomes and regulatory RBPs to human disease has spurred tremendous interest in studying their pathogenic mechanisms. Accordingly, many studies focus on uncovering genome-wide splicing changes caused by dysfunctional RNA-processing proteins that may affect splicing. One challenge in doing so is that many of the established computational tools used for splicing analysis rely on isoform definition or exon annotations (10). However, the eukaryotic genome has a large number of cryptic splice sites, which are suboptimal splice sites that are rarely used under normal conditions. These dormant sites can be activated when the nearby strong splice site is mutated or when there are defects in RNA-processing proteins (11–14). For example, when TDP-43 is removed from mouse embryonic stem cells, the splicing of many cryptic exons was activated. This phenomenon was also observed in cells from human patients with ALS and FTD (13). Splicing analysis based on known exon annotation will miss such cryptic changes. In this case, had the authors not performed an exhaustive manual search, the role of TDP-43 in cryptic exon repression would not have been revealed. Thus there is a biological and computational need to systematically interrogate cryptic splicing events when studying RBP functions and disease pathogenesis.
One of the RBPs that has been implicated in cancer and a neurodegenerative disease, but has not been studied in great detail, is RNA-binding motif protein 17 (RBM17) (15,16). Originally identified as a component of the spliceosome complex, it interacts with splicing factors U2AF2, SF1 and SF3B1 (17–19); our lab has also discovered that RBM17 binds Ataxin-1 in a polyglutamine- and phosphorylation-dependent manner (15). Limited studies on the splicing functions of RBM17 so far have suggested a role in alternative splicing. RBM17 promotes the usage of the upstream cryptic 3’ splice site AG of the alternative exon 3 in Drosophila sex-lethal (Sxl). Loss of RBM17 leads to exon 3 inclusion, compromising the autoregulation loop of Sxl (20). RBM17 has also been shown to regulate the alternative splicing of FAS as its overexpression promotes exon 6 skipping, whereas its depletion favors exon 6 inclusion (18,21). To date, whether and how RBM17 regulates genome-wide splicing remains elusive.
Here, we report that constitutive loss of Rbm17 in mice caused embryonic lethality, and that hindbrain-specific deletion led to cerebellar and midbrain abnormalities and early death. To circumvent the premature death and to investigate the potential in vivo effects of RBM17 on splicing regulation, we generated a conditional mouse mutant that lacks Rbm17 in cerebellar Purkinje neurons and studied the transcriptome-wide expression and splicing patterns. In contrast to its proposed function in alternative splicing, we found that RBM17 regulates isoform splicing in a limited number of genes. This raised the possibility that RBM17 might function beyond alternative splicing and affect splicing events that are unannotated (cryptic). We developed an algorithm called CrypSplice, a novel cryptic splice site detection method, and discovered that RBM17 plays a role in repressing cryptic splicing junctions. Given the similarities between RBM17 and TDP-43 in cryptic splice site repression, we revisited the published TDP-43 RNA-seq data (13). CrypSplice identified substantially more cryptic splicing changes than reported, and revealed a common set of genes with cryptic splicing de-repression between Rbm17 and Tdp-43 null cells.
Results
Loss of Rbm17 leads to abnormal development

RBM17 is widely expressed and important for development and survival. (A) Rbm17 expression in mouse brain of Rbm17 +/- revealed by X-gal staining (blue). BS, brain stem; Cbl, cerebellum; Ctx, cortex; Hp, hippocampus; Mdb, mid-brain; Str, striatum; Th, thalamus. Scale bar, 2 mm. (B) RBM17 immunoreactivity is found throughout the brain. Scale bar, 2 mm. A region in the cerebellum is shown under higher magnification in the inset. Arrowheads indicate the nuclear staining in Purkinje neurons. (C) En1-Cre; Rbm17 f/- mice had smaller midbrains (Mdb) and cerebella (Cbl) compared to control littermates at P0. Scale bar, 2 mm.
Purkinje cell-specific Rbm17 knockout mice develop ataxia and neurodegeneration

Loss of Rbm17 in Purkinje cells leads to ataxia and neurodegeneration. (A and B) Pcp2-Cre; Rbm17 f/- knockout mice showed reduced activity in the open field assay. (C) Pcp2-Cre; Rbm17 f/- knockout mice had impaired motor function on the dowel test. (D) Pcp2-Cre; Rbm17 f/- knockout mice performed worse on the rotarod test. Values were plotted with the box and whisker plots showing all data points with n = 9–11 per genotype at 8–9 weeks of age. (***P < 0.001; **P < 0.01) (E) Histopathological defects revealing Purkinje cell degeneration. Calbindin staining showing Purkinje cells in the Pcp2-Cre; Rbm17 f/- mice when animals were at 4 and 8 weeks of age. Upper panel scale bars, 500 µm. Lower panel scale bars, 100 µm. (F) Electrophysiological defects in Purkinje cells in Pcp2-Cre; Rbm17 f/- mice. Purkinje cells were identified by the presence of a unique action potential waveform called the complex spike (black arrowheads). Representative in vivo spike recordings of Purkinje cells from Pcp2-Cre; Rbm17 f/+ (i and iii) and Pcp2-Cre; Rbm17 f/- (ii and iv) mice are shown. Quantifications of firing rate and coefficient of variation for Purkinje cell simple spikes (SS CV) are shown in v and vi. (Animal age: 4-weeks old, n = 18–26; ***P < 0.001; *P < 0.05).
Histology and immunostaining with a Calbindin antibody showed rapid and progressive reduction of Purkinje cells in the conditional knockout mice (Fig. 2E, Supplementary Material, Fig. S3B). Of note, Purkinje cell loss was not evident at 4 weeks of age, but by 8 weeks of age only 10% of the expected Purkinje cells remained. Prior to this period of cell death, in vivo electrophysiological recordings showed reduced and irregular firing of mutant Purkinje cells in 4-week-old mice (Fig. 2F).
Loss of Rbm17 causes upregulation of apoptotic genes
Next we investigated the molecular mechanism by which loss of RBM17 leads to rapid cell death. Because RBM17 regulates the splicing of Sxl in Drosophila and FAS in cultured mammalian cells (18,20,21,25), we sought to assess global gene expression and mRNA splicing changes in Purkinje cells upon deletion of Rbm17. To this end, we used the TRAP (translating ribosome affinity purification) approach, which allows the isolation and subsequent deep sequencing of cell type-specific mRNAs that are bound to the ribosomes (26,27).

Gene expression changes in Rbm17-deficient Purkinje cells revealed by BacTRAP profiling. (A) Heatmap showing the number of differentially expressed genes (DEGs) using a cut-off of fold change > 2.0 and FDR < 0.05. (B) GO analyses of functional categories enriched with DEGs. Cell death and apoptotic pathways are the predominant pathways. (C) Quantitative PCR (qPCR) validating the upregulated apoptotic genes. n = 3–6 animals per genotype. (***P < 0.001; **P < 0.01; *P < 0.05).
CrypSplice algorithm finds previously unknown splicing changes and reveals RBM17 represses splicing
To investigate the proposed role of RBM17 in alternative splicing of isoforms, we evaluated the splicing patterns in our RNA-seq data using Cufflinks (28) and found 134 differentially expressed isoforms (Supplementary Material, Table S3). Though this finding indicates that RBM17 regulates the alternative splicing of these genes, the magnitude of the effect does not seem impressive. We then used DEXSeq (29) to search for differentially expressed annotated exons and found 708 genes where at least one of the exons was significantly different between the two genotypes (Supplementary Material, Table S3). Although this seems to suggest that RBM17 influences exon selection, it is worth noting that differentially expressed exons might not conform to differential splicing. Changes in overall gene expression can also lead to differential exon expression.

Workflow of CrypSplice. (A) Read alignment and junction quantification. (B) Filtering out known junctions and collapsing overlapping junctions. (C) Computing junction scores. (D) Performing beta-binomial test. (E) Multiple testing corrections.
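Steps (B) and (C) of this workflow can be illustrated with a minimal Python sketch. It is not the CrypSplice source code: the tuple layout, the function names, the greedy single-pass clustering and the choice to keep the first member's coordinates for a collapsed junction are all assumptions made for illustration. Only the 95% reciprocal-overlap cut-off, the averaging of coverage and the minimum coverage of 10 come from the Materials and Methods.

```python
from typing import List, Set, Tuple

# (chromosome, start, end, read coverage); this layout is an assumption
Junction = Tuple[str, int, int, float]


def reciprocal_overlap(a: Junction, b: Junction) -> float:
    """Smaller of the two fractions of each junction covered by the other."""
    if a[0] != b[0]:
        return 0.0
    shared = min(a[2], b[2]) - max(a[1], b[1])
    if shared <= 0:
        return 0.0
    return min(shared / (a[2] - a[1]), shared / (b[2] - b[1]))


def drop_known(junctions: List[Junction],
               annotated: Set[Tuple[str, int, int]]) -> List[Junction]:
    """Remove junctions whose coordinates appear in the annotation."""
    return [j for j in junctions if (j[0], j[1], j[2]) not in annotated]


def collapse(junctions: List[Junction], min_overlap: float = 0.95) -> List[Junction]:
    """Greedy merge: a junction joins the first cluster whose seed it overlaps."""
    clusters: List[List[Junction]] = []
    for junction in sorted(junctions):
        for cluster in clusters:
            if reciprocal_overlap(cluster[0], junction) >= min_overlap:
                cluster.append(junction)
                break
        else:
            clusters.append([junction])
    collapsed = []
    for cluster in clusters:
        chrom, start, end, _ = cluster[0]  # keep seed coordinates (assumption)
        mean_cov = sum(j[3] for j in cluster) / len(cluster)
        collapsed.append((chrom, start, end, mean_cov))
    return collapsed


def candidate_junctions(junctions: List[Junction],
                        annotated: Set[Tuple[str, int, int]],
                        min_overlap: float = 0.95,
                        min_coverage: float = 10.0) -> List[Junction]:
    """Filter known junctions, collapse near-duplicates, drop low coverage."""
    novel = drop_known(junctions, annotated)
    return [j for j in collapse(novel, min_overlap) if j[3] >= min_coverage]
```

A real implementation would probably use an interval index rather than the quadratic scan above; the sketch favours readability over speed.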

Cryptic splicing changes in Rbm17-deficient Purkinje cells revealed by BacTRAP profiling and CrypSplice analyses. (A) PCR validation of a premature 3′ UTR event in Cd99l2. The genome browser view is shown in (i) and the respective Sashimi plots depicting the connection between reads are shown in (ii). PCR gel image is shown in (iii) with the expected product size in the knockouts indicated by an asterisk. (B) PCR validation of a cryptic exon inclusion event in Magohb. (C) PCR validation of an exon extension event in Tmem5. Sashimi plots depicting the connection between reads are shown in (i). PCR gel images are shown in (ii) with the expected products in the controls indicated with red arrows and those in the knockouts indicated by asterisks. (D) Enrichment analyses of phenotypes significantly associated with genes displaying RBM17-dependent cryptic splicing repression. Phenotypes with significant enrichment are shown in colored boxes. The number of genes associated with each significantly enriched phenotype and their respective adjusted P values are also indicated.
It was previously reported that RBM17 binds to the upstream cryptic 3’ splice site AG in the presence of an intact downstream AG (20). We next examined whether there exist consensus motifs near RBM17-dependent cryptic splice sites. We took relatively strong cryptic junctions, which were incorporated into at least 10% of total transcripts, and scanned for consensus motifs in the proximity of their 5′ (upstream -100 bp to downstream 400 bp) and 3′ (upstream -400 bp to downstream 100 bp) splice sites. Motif analyses predicted more than 10 consensus motifs proximal to each site (Supplementary Material, Table S6). Some of the identified motifs were established motifs for splicing reaction. For example, the (C/A)GGUA motif is a 5′ splice donor site, and CCT(G/U)(U/C)CUC could be the pyrimidine tract. Interestingly, the top motifs surrounding both the 5′ and 3′ splice sites are rich in A, and there are three motifs with consensus AG near the 3′ splice sites. These motifs might provide the cis-acting RNA sequence context for RBM17 binding.
To gain further insight into how splicing dysregulation might contribute to the observed phenotypes in the Rbm17 knockout mice, we searched for phenotypes associated with genes displaying splicing defects, both annotated and cryptic, and found a significant enrichment in genes associated with abnormal synaptic transmission, abnormal motor coordination and premature lethality (Supplementary Material, Fig. S7), consistent with the phenotypes that we observed in mutant mice (Fig. 1 and 2). Intriguingly, the majority of the genes associated with these phenotypes gained cryptic junctions in mutant neurons (Fig. 5D, Supplementary Material, Table S7). These findings suggest that splicing defects, especially de-repression of cryptic splicing junctions leading to the inclusion of intronic elements in mature transcripts, contribute to the observed mutant phenotypes.
Genes associated with premature death display RBM17- and TDP-43-dependent cryptic splicing repression

Cryptic splicing changes commonly regulated by RBM17 and TDP-43 are essential to cell survival. CrypSplice analyses reveal substantially more TDP-43-dependent cryptic splicing repression events than previously reported. Some of these predicted changes were validated in human HeLa cells with TDP-43 knocked down using siRNA. The splicing validation PCRs of a previously reported hit, EPB41L4A, and a novel hit identified by CrypSplice, ARHGAP32, are shown in (A) and (B) respectively. Sashimi plots depicting the connection between reads are shown in (i). PCR gel images are shown in (ii) with the expected products in the controls (if present) indicated with red arrows and those in the knockouts indicated by asterisks. (C) Significant overlap is observed for genes with TDP-43- or RBM17-dependent cryptic splicing repression. Significance value was calculated using the hypergeometric test with 14 000 genes as the gene universe. The common set of 203 genes was further analyzed for phenotype enrichment. Phenotypes with significant enrichment are shown in bright colored boxes with bold fonts. The number of genes associated with each significantly enriched phenotype and their respective adjusted P values are also indicated.
Although a parallel comparison between two different species is likely to reveal some of the common direct targets of TDP-43-dependent cryptic splicing modulation, we reasoned that a comparison between two RBPs sharing some common phenotypes (e.g. embryonic lethality, cell death upon RBP depletion) and splicing outcome (e.g. widespread cryptic splicing dysregulation) might uncover converging molecular themes. To this end, we performed phenotype enrichment analyses on the 203 genes sharing cryptic splicing de-repression when either Rbm17 or Tdp-43 was depleted in mouse cells. We found significant enrichment for genes associated with abnormal survival (40 genes), prenatal lethality (29 genes) and complete embryonic lethality during organogenesis (11 genes) (Fig. 6C, Supplementary Material, Table S12). These genes could be key contributors to the embryonic lethality and cell death due to the loss of RBM17 or TDP-43.
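The overlap significance reported in Fig. 6C rests on a hypergeometric test. The sketch below shows how such an upper-tail probability can be computed with the Python standard library alone. The function names are hypothetical, and the two gene-list sizes in the usage note are placeholders; only the universe of 14 000 genes and the overlap of 203 genes are taken from the text.

```python
from math import exp, lgamma


def log_choose(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def hypergeom_upper_tail(universe: int, set_a: int, set_b: int, overlap: int) -> float:
    """P(X >= overlap) when set_b genes are drawn without replacement
    from a universe that contains set_a genes of interest."""
    log_denominator = log_choose(universe, set_b)
    total = 0.0
    for k in range(overlap, min(set_a, set_b) + 1):
        if set_b - k > universe - set_a:
            continue  # not enough genes outside set_a to fill the draw
        total += exp(log_choose(set_a, k)
                     + log_choose(universe - set_a, set_b - k)
                     - log_denominator)
    return min(total, 1.0)
```

For example, `hypergeom_upper_tail(14000, 1200, 1500, 203)` would give the probability of seeing at least 203 shared genes by chance if the two lists contained 1200 and 1500 genes; those two sizes are illustrative assumptions, not values from the study.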
Discussion
We have shown that RBM17 is crucial for cell survival and that its loss leads to widespread disruption in splicing, especially the splicing of cryptic junctions. Previous studies demonstrated that RBM17 affects alternative splicing of two genes, Sxl and FAS (18,20,21,25), but the splicing function of RBM17 had not been established at the whole transcriptome level. Knockdown of RBM17 in cultured mammalian cells led to exon 6 inclusion in FAS, producing the pro-apoptotic form of the protein (18,25). In neurons lacking RBM17, we did not identify any alternative splicing changes in Fas (Supplementary Material, Table S3 and S4). Upregulation of apoptotic pathways upon the loss of RBM17 in mutant neurons was thus not driven by the generation of pro-apoptotic FAS but by other mechanisms (discussed below). The observation that RBM17 regulates isoform splicing of Sxl and FAS raised the possibility that global regulation of isoform splicing could be a general function of RBM17. Our genome-wide splicing analyses, however, suggest that RBM17 regulates alternative isoform splicing for only relatively few (134) genes, and rather predominantly regulates the splicing of cryptic junctions. Interestingly, more than 90% of the genes with RBM17-dependent cryptic splicing changes gained additional junctions in the mutants, suggesting RBM17 normally represses the splicing of cryptic junctions and its loss leads to the inclusion of intronic elements in mature transcripts. How RBM17 executes such repression is unclear at this point. One possibility is that RBM17 and its interacting partners could block the utilization of these intronic elements or cryptic exons by directly binding to the cryptic junctions or adjacent regions. This scenario would be similar to what has been described for the interaction of SXL and RBM17 blocking the selection of Sxl exon 3 (20). 
We attempted to identify consensus motifs surrounding RBM17-dependent cryptic splice sites (Supplementary Material, Table S6), but further experiments with crosslinking and immunoprecipitation followed by deep sequencing (CLIPseq) will be necessary to identify RBM17 binding sites, and to address the mechanistic functions of RBM17 in cryptic splice site repression.
Exonization of intronic coding cassettes is predicted to create frameshifts or introduce premature termination codons (PTCs) in nearly 80% of the cases (35,36). Transcripts harboring PTCs are targeted for rapid degradation through a quality-control mechanism called nonsense-mediated decay (NMD), which requires the recruitment of up-frameshift proteins and other NMD-activating components during inefficient translation termination at the PTC (37,38). In our study, of the 22 validated splicing changes, 16 were cryptic exon inclusion, of which 15 were predicted to produce a PTC (Supplementary Material, Table S5). We further tested whether five of these genes represent bona fide NMD targets in an in vitro cell assay system. Our results showed that four out of the five genes with RBM17-dependent cryptic exon-inclusion events generated isoforms that were sensitive to cycloheximide, an indirect inhibitor of NMD, suggesting that these isoforms were NMD targets (Supplementary Material, Fig. S11). In several validated cryptic exon inclusion events, cryptic exons were incorporated into >20% of the total transcripts (Supplementary Material, Table S5). If these were bona fide NMD targets, one would expect a concomitant decrease in gene expression. However, our analysis did not find a correlation between incorporation of cryptic events and downregulation in gene expression. We initially speculated that the TRAP approach, compared with the conventional total RNA extraction method, might have captured more NMD targets that were bound to the ribosomes during their pioneer rounds of translation before the recruitment of NMD-activating machinery. But we found this unlikely as the relative abundance of cryptic exon inclusion events was similar with or without the use of the TRAP method (Supplementary Material, Fig. S12). The lack of global gene downregulation could be largely due to the fact that the majority of the aberrantly spliced transcripts contribute to <20% of the transcripts. 
Therefore, even though the use of cryptic exons creates targets for NMD, it would only affect a small portion of transcripts and might not decrease gene expression significantly. In addition, the relatively small sample size (n = 3) in our BacTRAP-RNAseq studies might also have led to reduced sensitivity in detecting small gene expression changes. Indeed, when we increased the sample size (n = 6) and re-analyzed the expression of four genes with a high percentage (∼20%) of cryptic exon incorporation using qRT-PCR, we found that we could now detect a significant downregulation in one of the genes (Supplementary Material, Fig. S13). Nonetheless, our enrichment analyses suggest that genes with abnormal cryptic splicing changes are highly associated with the phenotypes observed in Rbm17 knockouts, arguing for a strong contribution from these aberrant transcripts to the mutant phenotypes. We speculate that the abnormal transcripts in mutant cells lacking Rbm17 can produce altered, in many cases truncated (Supplementary Material, Table S5), proteins with dominant-negative or toxic gain-of-function activities, thereby leading to cellular stress, upregulation of apoptotic pathways and ultimately cell death.
Both RBM17 and TDP-43 mouse knockouts die at an early embryonic stage before E8.5 (Supplementary Material, Table S1) (33,34), probably due to upregulation of apoptosis (Purkinje neurons lacking Rbm17 upregulate apoptotic pathways (Fig. 3), while mouse embryonic stem cells depleted for Tdp-43 undergo apoptosis (13)). At the molecular level, we found that both RBPs repress cryptic splicing for over a thousand genes. Comparing the genes displaying RBM17- and TDP-43-dependent cryptic splicing repression to identify potential common mechanisms underlying the apoptosis, we found a set of overlapping genes that are significantly associated with cell death and early lethality. Correcting the cryptic splicing defects in TDP-43 deficient cells can rescue apoptosis, providing a direct link between abnormal cryptic splicing repression and cell death (13). Since we have greatly expanded the list of genes with TDP-43-dependent cryptic splicing, it remains to be determined whether the rescue effect is true for all the targets or only a subset of genes.
Advances in high-throughput sequencing technology, especially in RNA-seq, have made systematic investigation of alternative splicing possible, but the lack of suitable tools has made detection of novel cryptic splicing events challenging. The method developed in this study, CrypSplice, complements the existing methods that focus on gene isoforms and reference exons and should thus be an invaluable tool for the research community. In light of the two recent reports demonstrating that abnormal cryptic splicing activities by RBPs underlie disease pathogenesis and abnormal neuronal function (13,14), revisiting existing RNA-seq data using CrypSplice might yield novel biological insights that would have been missed by manual or heuristic search. In this study, re-analyzing the published TDP-43 data using CrypSplice markedly extended the original findings. Altogether, our study reveals a key in vivo function of RBM17 and suggests that extensive cryptic splicing repression mediated by RBM17 and TDP-43 is essential for cell survival. CrypSplice provides a new tool to take a fresh look at past RNA-seq datasets to uncover novel biological insights. This is particularly exciting given the amount of available RNA-seq data on disease-causing RBPs, for which cryptic splicing effects are completely unexplored for the majority of the proteins.
Overall our study demonstrates for the first time that the physiological function of RBM17 is crucial for cell survival and neuronal health, and that defects in RBM17 cause global splicing disruption and de-repression of cryptic splice sites. The list of genes sharing both RBM17- and TDP-43-dependent cryptic splicing de-repression provides an exciting opportunity to further dissect the molecular mechanism underlying neuronal death in the two neurodegenerative disease models.
Materials and Methods
Mouse handling
All procedures for mouse animal use were approved by the Institutional Animal Care and Use Committee for Baylor College of Medicine and Affiliates. Primers used for genotyping in the study are listed in Supplementary Material, Table S13.
Generation of RBM17 antibody
A DNA sequence encoding amino acid residues 100–200 of RBM17 was cloned into the pGEX 4T-1 vector using EcoRI and XhoI sites. GST-RBM17 100-200 was then transformed into BL21, expressed and purified with GST antibody. This antigen was sent to NeuroMab for the generation of the anti-RBM17/SPF45 antibody (UC Davis/NIH NeuroMab facility Cat no. 73-234).
Immunohistochemistry and Purkinje cell pathology analysis
Mouse brains were dissected and immersed in 4% paraformaldehyde in PBS overnight and then prepared for paraffin embedding using standard protocol. The brain sections were cut to a thickness of 5 µm and were immunostained with anti-RBM17 antibody (Neuromab) or anti-calbindin antibody (Sigma) and imaged using an Axio Scan Z1 microscope. X-gal staining was done as previously described in (39).
Mouse behavioral studies
Rotarod analysis, open field assay, and dowel test were performed using 8-week-old mice as previously described in (40–43). Statistical analyses were performed using GraphPad Prism software.
BacTRAP profiling and data analyses, quantitative RT-PCR, and splicing RT-PCR analyses
BacTRAP profiling was performed as previously described (44) with the following modifications: One cerebellum was homogenized in 1.5 ml of homogenization buffer and 60 µg each of 19C8 and 19F7 was used for each IP. Detailed protocol and the subsequent bioinformatic analyses are described in Supplementary Material. Quantitative RT-PCR was performed using Perfecta SYBR Green FastMix (Quanta Biosciences). Relative fold change in gene expression was normalized to Rps16 and/or Hprt and calculated using the 2^(−ΔΔCT) method (45). Splicing RT-PCR reactions were performed using KOD Hot Start Polymerase (EMD Millipore) or EconoTaq PLUS GREEN 2X Master Mixes (Lucigen) and standard PCR protocols. PCR products were run on MetaPhor agarose (Lonza) gels. Primers used for PCRs are listed in Supplementary Material, Table S9 and S13.
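The comparative CT calculation used for relative quantification is simple arithmetic. The sketch below shows it for a single target gene and a single reference gene such as Rps16 or Hprt. The function name and the example CT values are hypothetical, and the sketch assumes roughly 100% amplification efficiency, as the 2^(−ΔΔCT) formula itself does.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene in a sample relative to a control,
    normalised to a reference gene, by the comparative CT method."""
    delta_sample = ct_target_sample - ct_ref_sample      # ΔCT in the sample
    delta_control = ct_target_control - ct_ref_control   # ΔCT in the control
    delta_delta = delta_sample - delta_control           # ΔΔCT
    return 2.0 ** (-delta_delta)


# Hypothetical CT values: the target crosses threshold two cycles earlier in
# the knockout than in the control, with an unchanged reference gene.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
```

With these made-up inputs the target is four-fold more abundant in the sample than in the control, because each cycle corresponds to a doubling.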
TDP-43 siRNA knockdown in human HeLa cells
HeLa cells grown in a 6-well plate were transfected with 40 nM of TDP-43 siRNA (Sigma EHU109221) or control siRNA (Qiagen 1027280) in triplicate and cultured for 72 hours. Total RNA isolation was performed using TRIzol (Invitrogen) and Aurum Total RNA Fatty and Fibrous Tissue Kit (BioRad). 2.5 µg of total RNA was used for cDNA synthesis using the M-MLV RT Kit (Invitrogen). qPCRs were performed using the 2X GeneAmp Fast PCR Master Mix (BioRad) and PCRs were performed using the KOD Hot Start DNA Polymerase (EMD Millipore). PCR products were analyzed using MetaPhor agarose gels (Lonza).
CrypSplice
Processing splice junctions. CrypSplice avoids remapping of reads and relies on junction counts from any efficient read-mapping algorithm. In this study, we used junction counts from TopHat v2.0.9 (30) as input to CrypSplice. TopHat is advantageous in controlling true negatives (46), and the beta-binomial modelling (31) adopted by CrypSplice complements this by controlling false positives. In agreement with the definition of cryptic splicing given in the Results section, CrypSplice first filters out all known (database-reported) splicing junctions from the observed junctions (TopHat output). Errors in sequencing may result in ambiguous junction boundaries (47,48). To account for any such errors, CrypSplice collapses junctions with a reciprocal overlap greater than or equal to an overlap cut-off (here 95%). Accepted and rejected reciprocal overlaps are illustrated in Supplementary Material, Fig. S14. Coverage of the new collapsed junction is computed as the average coverage of the respective junctions. Since no alignment is perfect, junctions with coverage less than a minimal expression threshold (here 10) are ignored (49,50) to minimize alignment noise. However, users can tailor the expression noise threshold to the sequencing depth. The expression score of every resultant junction is computed as Si = Ji / Ui, where Si is the expression score of the ith junction, Ji is the number of reads spanning junction i and Ui is the respective 5′ coverage. Ji can then be abstracted as junction counts and Ui as total sample counts for beta-binomial modelling. The beta-binomial test is advantageous for modelling count data compared with non-parametric methods such as the Mann–Whitney test or the Kruskal–Wallis test, especially when the number of samples per condition is small (typically three for biological samples). It differentiates intra- and inter-sample variations and minimizes large artificial test statistics for the junctions in highly expressed genes. 
Intra-sample variation is modelled with a binomial distribution and inter sample variation is modelled by treating the binomial distribution parameter as a random variable following beta distribution. Model parameters in beta-binomial distribution are inferred by maximum likelihood estimation (MLE) procedure. Finally, likelihood ratio test is used to infer the significance of the group differences.
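The test described above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions rather than the CrypSplice implementation: the mean/dispersion parameterisation, the shared dispersion across groups, the coarse grid search standing in for a proper maximum likelihood optimiser, and the one-degree-of-freedom chi-square reference distribution are all choices made here for brevity.

```python
from math import erfc, lgamma, sqrt
from typing import List, Tuple


def log_beta(a: float, b: float) -> float:
    """Natural log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def group_loglik(counts: List[int], totals: List[int], mu: float, rho: float) -> float:
    """Beta-binomial log-likelihood (binomial coefficients omitted, since
    they cancel in the likelihood ratio). mu is the mean junction usage
    and rho the between-sample dispersion, both in (0, 1)."""
    scale = (1.0 - rho) / rho
    alpha, beta = mu * scale, (1.0 - mu) * scale
    return sum(log_beta(j + alpha, u - j + beta) - log_beta(alpha, beta)
               for j, u in zip(counts, totals))


# Coarse parameter grids; a real analysis would use a numerical optimiser.
MU_GRID = [i / 200 for i in range(1, 200)]
RHO_GRID = [10 ** (-4 + 0.125 * i) for i in range(30)]


def best_over_mu(counts: List[int], totals: List[int], rho: float) -> float:
    """Maximum log-likelihood over the mean grid for a fixed dispersion."""
    return max(group_loglik(counts, totals, mu, rho) for mu in MU_GRID)


def beta_binomial_lrt(case_j: List[int], case_u: List[int],
                      ctrl_j: List[int], ctrl_u: List[int]) -> Tuple[float, float]:
    """Likelihood ratio test: one shared mean (null) versus one mean per
    group (alternative). Returns (test statistic, P value)."""
    null_ll = max(best_over_mu(case_j + ctrl_j, case_u + ctrl_u, rho)
                  for rho in RHO_GRID)
    alt_ll = max(best_over_mu(case_j, case_u, rho) + best_over_mu(ctrl_j, ctrl_u, rho)
                 for rho in RHO_GRID)
    stat = max(0.0, 2.0 * (alt_ll - null_ll))
    # Upper tail of a chi-square distribution with one degree of freedom.
    return stat, erfc(sqrt(stat / 2.0))
```

Here `case_j` and `ctrl_j` hold the junction-spanning read counts (Ji) for each sample and `case_u` and `ctrl_u` the corresponding 5′ coverage (Ui). Because both hypotheses are maximised over the same grids, the statistic cannot be negative apart from rounding.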
P values for all junctions are corrected for multiple testing using the Benjamini–Hochberg procedure (51). Significant junctions with an adjusted P value < 0.01 are predicted as cryptic splicing changes. We further classified cryptic events into three groups based on the minimum expression threshold (Supplementary Material, Fig. S5): junction gains (observed only in case samples), junction losses (observed only in control samples), and differential junctions, which may not meet the full definition of cryptic. Junctions spanning more than one gene are marked as overlaps.
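A short sketch of the correction and classification steps follows. The Benjamini–Hochberg adjustment is the standard step-up procedure. The classification rule, however, is an assumption: the exact criterion is given in Supplementary Material, Fig. S5, which is not reproduced here, so the sketch simply treats a junction as "observed" in a group when its mean coverage reaches the minimum expression threshold. Function names are hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def classify(case_cov: List[float], ctrl_cov: List[float],
             min_cov: float = 10.0) -> str:
    """Label a significant junction as gain, loss or differential.
    Assumed rule: a group 'observes' the junction when its mean
    coverage is at least min_cov."""
    in_case = sum(case_cov) / len(case_cov) >= min_cov
    in_ctrl = sum(ctrl_cov) / len(ctrl_cov) >= min_cov
    if in_case and not in_ctrl:
        return "gain"
    if in_ctrl and not in_case:
        return "loss"
    return "differential"


def significant(p_values: List[float], alpha: float = 0.01) -> List[bool]:
    """Flag junctions whose adjusted P value falls below alpha."""
    return [q < alpha for q in benjamini_hochberg(p_values)]
```

The default `alpha` of 0.01 and `min_cov` of 10 mirror the thresholds stated in the text; both would normally be exposed as user options.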
Accession number
Data are available at Gene Expression Omnibus GSE79020.
Supplementary Material
Supplementary Material is available at HMG online.
Acknowledgements
We thank Dr. Nathaniel Heintz for the generous gift of the Pcp2-BacTRAP line (Tg (Pcp2-EGFP/Rpl10a) DR166 Htz), and members of the Zoghbi and Liu laboratories for helpful suggestions and discussions. The project was supported in part by IDDRC grant number 1U54 HD083092 from the Eunice Kennedy Shriver National Institute of Child Health & Human Development and the cores used were the RNA In Situ Hybridization Core Facility, Mouse Behavioral Core and the Genomic and RNA Profiling Core at Baylor College of Medicine. Development of the GFP monoclonal antibodies used in this study was supported in part through the NIH/NCI Cancer Center Support Grant P30 CA008748, which funds the Antibody and Bioresource Core Facility at Memorial Sloan Kettering Cancer Center.
Conflict of Interest Statement. None declared.
Funding
This work was supported by the National Institutes of Health (F32 NS083091 to Q. T., F31 NS092264 to J. J. W., R01 NS089664 to R. V. S., R37 NS22920 to H. T. O., and R37 NS027699 to H. Y. Z.). This work was also supported by the National Science Foundation (DMS-1263932 to Z. L.).
References
Author notes
†The authors wish it to be known that, in their opinion, the first three authors should be regarded as joint First Authors.
‡Present address: Program in Genetics and Genome Biology, The Hospital for Sick Children; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.