Abstract

The spinocerebellar ataxias (SCAs) are a genetically heterogeneous group of rare, dominantly inherited neurodegenerative diseases characterized by progressive ataxia. The most common mutation seen across the SCAs is a CAG repeat expansion, causative for SCA1, 2, 3, 6, 7, 12 and 17. We recently identified dysregulation of alternative splicing as a novel, presymptomatic transcriptomic hallmark in mouse models of SCAs 1, 3 and 7. To understand whether dysregulation of alternative splicing is also a transcriptomic feature of patient-derived cell models of CAG SCAs, we performed RNA sequencing and transcriptomic analysis in patient-derived fibroblast cell lines of SCAs 1, 3 and 7. We identified widespread and robust dysregulation of alternative splicing across all CAG expansion SCA lines investigated, with disease-relevant pathways affected, such as microtubule-based processes, transcriptional regulation, and DNA damage and repair. Novel disease-relevant alternative splicing events were validated across patient-derived fibroblast lines from multiple CAG SCAs and CAG-containing reporter cell lines. Together, this study demonstrates that dysregulation of alternative splicing represents a novel and shared pathogenic process in CAG expansion SCA1, 3 and 7 and can potentially be used as a biomarker across patient models of this group of devastating neurodegenerative diseases.

Introduction

The spinocerebellar ataxias (SCAs) are a genetically heterogeneous group of rare, dominantly inherited neurological disorders characterized by progressive ataxia [1]. Despite different genetic backgrounds, SCAs share the hallmark feature of cerebellar degeneration. In most cases, the degeneration centers on loss of Purkinje neurons; in other SCAs, however, these cells are comparatively spared [2]. Damage to other parts of the nervous system, such as the spinal cord, pontine nuclei and medulla in the brainstem, also occurs to a greater or lesser extent in each SCA subtype [1]. This pattern of degeneration causes the hallmark symptoms of SCAs, such as loss of balance and coordination and slurred speech [1].

There are more than 40 types of SCAs, with some of the most prevalent subtypes being those caused by CAG repeat expansion mutations, including SCA1, SCA2, SCA3, SCA6, SCA7, SCA12 and SCA17 [3, 4]. In most of these cases, the CAG expansion is located within the coding region of the disease-associated gene and encodes an expanded polyglutamine (polyQ) tract in the respective disease protein. The exception is SCA12, where the CAG expansion is located within the 5′ UTR of the disease-associated gene, PPP2R2B. The CAG SCAs all share the production of CAG expansion RNAs. A common hallmark of the protein-coding CAG SCAs is the accumulation of intranuclear inclusions composed of aggregates of the polyQ expansion disease protein. This accumulation is driven by conformational changes in the protein due to the expanded polyQ tract, which can alter the normal protein structure, impair its function, and lead to the sequestration of additional proteins into the intranuclear inclusions [1, 4].

Across CAG expansion SCAs, disruption of a variety of cellular pathways has been implicated in the disease process, including DNA damage and repair, transcriptional regulation, cytoskeletal structure and function, and ion-channel function [1]. For example, accumulation of DNA damage and impairment of DNA repair pathways have been seen in models of SCA7, with a slower response to DNA damage stimuli and a reduced efficiency of DNA repair seen for SCA1 and SCA3, respectively [1, 4–6]. Likewise, transcriptional dysregulation and alterations to gene expression have been observed across multiple CAG SCAs and are associated with alterations to specific transcription factors in SCA1 and dysregulation of histone acetylation in SCA7 [5, 6]. Additionally, disorganization of the extracellular matrix and cytoskeletal structures has been seen for SCA3, with disruption to the cytoskeleton being a hallmark of neurodegeneration in the CAG SCAs [6–8]. Despite the current understanding of the cellular pathways disrupted in CAG SCAs, the drivers of these disruptions and their individual contributions to disease pathogenesis remain unclear.

Alternative splicing dysregulation has recently been implicated as a potential presymptomatic driver of neuronal dysfunction in SCAs caused by CAG repeat expansions [9, 10]. Alternative splicing is a highly regulated mechanism that significantly increases genetic diversity by processing RNA transcripts into distinct combinations of exons [11]. This process is a critical step in the regulation of gene expression, and disruption of alternative splicing leads to mis-splicing events resulting in proteins with altered function, stability, and expression [12]. While the role of widespread presymptomatic dysregulation of alternative splicing in disease pathogenesis of CAG SCAs [9, 10] is unclear, other well-studied CTG/CAG expansion diseases provide clear insight into how alternative splicing-dependent disease mechanisms can drive patient symptoms [13]. For example, in myotonic dystrophy type 1 (DM1), the muscleblind-like (MBNL) family of alternative splicing-regulating RNA binding proteins (RBPs) is sequestered by RNAs containing the CUG repeat expansion, leading to disruption of splicing for MBNL-dependent events that drive key symptoms of the disease, such as myotonia [13, 14]. Likewise, in Huntington disease (HD), alternative splicing has been identified as a highly impacted molecular process, and mis-splicing of genes linked to neurodegeneration and movement has been reported [15–19]. Overall, evidence is building to suggest that alternative splicing might be a common driver of disease pathogenesis in repeat expansion diseases.

In the context of CAG repeat expansions in spinocerebellar ataxias, we and others recently provided transcriptome-wide evidence for dysregulation of alternative splicing in CAG SCA mouse models [9, 10]. Through a comprehensive analysis of 29 RNA sequencing (RNASeq) datasets, we identified widespread presymptomatic dysregulation of alternative splicing that was repeat length dependent and occurred in affected brain regions, including the cerebellum and brainstem, across mouse models of SCAs 1, 3 and 7. These alternative splicing changes are known to alter the function of the resulting gene products, which are involved in pathways known to be disrupted in SCAs, including synaptic signaling, transcriptional regulation, and the cytoskeleton. Furthermore, these changes in alternative splicing were responsive to therapeutic intervention [9]. Similarly, a separate study focusing on the cerebellum of two SCA1 mouse models also identified widespread dysregulation of alternative splicing affecting disease-relevant cellular pathways from presymptomatic disease stages onwards and demonstrated that alternative splicing dysregulation is a core, cell-autonomous transcriptomic feature of SCA1 [10].

These recent studies all focused on mouse models of CAG expansion SCAs, raising the question of whether alternative splicing dysregulation represents a marker of disease in patient-based model systems. One study previously demonstrated mis-splicing of SERCA1 (ATP2A1) in SCA3 patient-derived fibroblasts [20], providing evidence that alternative splicing may be dysregulated in patient model systems; however, this study focused on individual splicing events and lacked transcriptome-wide analysis. Taken together, this evidence highlights the need for a comprehensive transcriptomic study to understand the extent of alternative splicing dysregulation in patient fibroblasts from SCA3 and other CAG expansion SCAs. To investigate alternative splicing dysregulation as part of the patient disease process in CAG expansion SCAs, we performed RNASeq to characterize transcriptomic changes across CAG expansion SCA patient-derived fibroblast cell lines from SCAs 1, 3 and 7. We performed differential gene expression and alternative splicing analysis to understand the relative effects of both forms of transcriptional dysregulation, and validated novel mis-spliced events that are associated with pathways known to be impaired in SCAs. Together, this study demonstrates that dysregulation of alternative splicing represents a novel transcriptomic signature and potential pathogenic mechanism across patient-derived model systems of multiple CAG expansion SCAs.

Results

Widespread alternative splicing dysregulation is a transcriptomic hallmark of CAG expansion SCA patient-derived fibroblasts

Before investigating disease associated changes in alternative splicing and gene expression to understand the complete picture of transcriptomic dysregulation in these cell lines, we wanted to characterize the expression of disease associated genes in patient-derived fibroblast lines. We found that gene expression for ATXN1, ATXN3 and ATXN7 ranged from 29 to 83, 11 to 32 and 16 to 33 transcripts per million (TPM), respectively, across all fibroblast lines investigated (Fig. S1A).

To understand if the changes in alternative splicing observed in mouse model systems [9, 10] also represent a transcriptomic feature in CAG expansion SCA patient-derived model systems, we performed RNASeq analysis using three SCA1, two SCA3 and one SCA7 patient-derived fibroblast cell lines (Supplemental Table 1). Alternative splicing analysis was performed for each SCA cell line versus five control fibroblast lines, with three biological replicates per cell line and thresholds for significance based on previous studies [9, 21]. Dysregulation of alternative splicing was identified across all CAG expansion SCA fibroblast lines analyzed (ΔPSI > 10%, FDR < 0.1; Fig. 1A). Consistent with analyses in mouse models [9, 10], skipped exon (SE) or cassette exon events were the most frequently dysregulated class across the majority of the cell lines and accounted for more than 50% of total mis-spliced events in four out of six cell lines (Fig. 1A). Alternative splicing of retained introns (RI), alternative 5′ splice sites (A5SS), alternative 3′ splice sites (A3SS) and mutually exclusive exons (MXE) was also detected in all analyzed cell lines, with MXE events being the most frequently dysregulated class in the SCA1.2 and SCA1.3 fibroblast cell lines (Fig. 1A, Supplemental Table 2).
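The significance filter and the per-class proportions described above can be illustrated with a minimal sketch. The event records, field names and helper functions below are hypothetical and are not taken from the study's pipeline; the sketch also assumes that the ΔPSI threshold applies to the absolute value, so that both inclusion and exclusion events pass.

```python
from collections import Counter

# Thresholds quoted in the text; everything else here is illustrative.
DPSI_MIN = 0.10  # |ΔPSI| > 10% (absolute value is an assumption)
FDR_MAX = 0.10   # FDR < 0.1


def significant_events(events, dpsi_min=DPSI_MIN, fdr_max=FDR_MAX):
    """Keep events whose |ΔPSI| exceeds dpsi_min and whose FDR is below fdr_max."""
    return [e for e in events if abs(e["dpsi"]) > dpsi_min and e["fdr"] < fdr_max]


def class_percentages(events):
    """Percentage of events falling in each class (SE, RI, MXE, A5SS, A3SS)."""
    counts = Counter(e["type"] for e in events)
    total = sum(counts.values())
    return {t: 100.0 * n / total for t, n in counts.items()} if total else {}


# Invented toy records (ΔPSI expressed as a fraction).
toy_events = [
    {"id": "ev1", "type": "SE", "dpsi": 0.25, "fdr": 0.01},
    {"id": "ev2", "type": "SE", "dpsi": -0.40, "fdr": 0.05},
    {"id": "ev3", "type": "RI", "dpsi": 0.12, "fdr": 0.09},
    {"id": "ev4", "type": "MXE", "dpsi": 0.30, "fdr": 0.20},    # fails the FDR cut
    {"id": "ev5", "type": "A3SS", "dpsi": 0.05, "fdr": 0.001},  # fails the ΔPSI cut
    {"id": "ev6", "type": "SE", "dpsi": 0.11, "fdr": 0.02},
]

sig = significant_events(toy_events)
pct = class_percentages(sig)
print(len(sig), pct)
```

In this toy input, four of the six events pass both thresholds, and three of those four are skipped exon events.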

Widespread dysregulation of alternative splicing in CAG expansion SCA patient-derived fibroblast cell lines. (A) Percentage of significantly mis-spliced skipped exon (SE), retained intron (RI), mutually exclusive exon (MXE), alternative 5′ splice site (A5SS) and alternative 3′ splice site (A3SS) events as a proportion of total splicing events in SCA vs control; the number of each event is shown on the bar, FDR < 0.1, ΔPSI > 10%. (B) Percentage of exon inclusion (positive) or exclusion (negative) for significantly alternatively spliced SE events per dataset, FDR < 0.1, ΔPSI > 10%. (C) Total number of SE events dysregulated in more than one dataset, FDR < 0.1, ΔPSI > 10%. These shared events were used for the generation of PCA plots. (D) Number of SE events per dataset with the proportion of events dysregulated in one to six datasets shown, FDR < 0.1, ΔPSI > 10%. (E) Principal component analysis of shared skipped exon events dysregulated in two or more fibroblast cell lines. (F) Enrichment of summary gene ontology terms identified using metascape for SE events dysregulated in each dataset. (G) Functional classification of significantly mis-spliced skipped exon events, FDR < 0.1, ΔPSI > 10%.
Figure 1


As skipped exon events were overall the most frequently dysregulated event class, we further analyzed these events to understand the effect of their dysregulation. We identified dysregulation of both inclusion and exclusion events, with both being equally frequent (Fig. 1B). The mean ΔPSI for inclusion events ranged from 19.9% to 22.2% and the mean ΔPSI ranged from 20.2% to 22% for exclusion events. The maximum inclusion ΔPSI per cell line ranged from 57% to 99% with the maximum ΔPSI for exon exclusion ranging from 72% to 91%. Interestingly, for both inclusion and exclusion events, SCA7 showed the maximal ΔPSI among the cell lines (Fig. 1B).
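The per-direction summary reported above (event counts, mean and maximum ΔPSI for inclusion and exclusion events) could be computed along the following lines. This is a sketch under assumptions: the function name and the toy ΔPSI values are invented, and the sign convention (positive for inclusion, negative for exclusion) follows the Fig. 1B legend.

```python
from statistics import mean


def summarize_direction(dpsi_values):
    """Split ΔPSI values (in %) by sign and report n, mean and max magnitude."""
    inclusion = [d for d in dpsi_values if d > 0]
    exclusion = [-d for d in dpsi_values if d < 0]  # store magnitudes

    def stats(vals):
        return {
            "n": len(vals),
            "mean": mean(vals) if vals else None,
            "max": max(vals) if vals else None,
        }

    return {"inclusion": stats(inclusion), "exclusion": stats(exclusion)}


# Invented ΔPSI values for one hypothetical cell line.
toy_dpsi = [25.0, 15.0, -30.0, -10.5, 57.0]
summary = summarize_direction(toy_dpsi)
print(summary)
```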

Next, we sought to understand if the dysregulation of specific skipped exon events was shared between the different cell lines. A total of 7427 skipped exon events (ΔPSI > 10%, FDR < 0.1) were identified, with 1556 events shared between two or more cell lines (Figs 1C and S1B). While events unique to each cell line were identified, across all lines approximately 50% of all significant skipped exon events were shared with at least one other cell line (Fig. 1D). We identified 20 mis-spliced events shared across all six cell lines analyzed (Fig. 1C and D, Fig. S1B). Our pairwise analysis showed the highest number of shared events between the SCA1.2 and SCA1.3 cell lines, with 340 shared events, whereas the lowest number, 198 shared events, was observed between the SCA1.1 and SCA7 cell lines (Fig. S1C). In a principal component analysis (PCA) of significant skipped exon events shared between two or more fibroblast cell lines, we saw a distinct separation along the x-axis of SCA cell lines versus a tight cluster of the five control cell lines, explaining 16.1% of the variation in the data (Fig. 1E).
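The sharing analysis above amounts to set operations on per-cell-line lists of significant events. A minimal sketch follows, assuming each event can be reduced to a stable identifier; the identifiers, the three-line toy input and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def sharing_counts(events_by_line):
    """Map each event ID to the number of cell lines in which it is significant."""
    return Counter(ev for events in events_by_line.values() for ev in set(events))


def shared_in_at_least(events_by_line, k=2):
    """Event IDs that are significant in at least k cell lines."""
    counts = sharing_counts(events_by_line)
    return {ev for ev, n in counts.items() if n >= k}


def pairwise_overlap(events_by_line):
    """Number of shared events for every pair of cell lines."""
    return {
        (a, b): len(set(events_by_line[a]) & set(events_by_line[b]))
        for a, b in combinations(sorted(events_by_line), 2)
    }


# Invented event IDs for three hypothetical lines.
toy_lines = {
    "SCA1.1": {"e1", "e2", "e3"},
    "SCA3.1": {"e2", "e3", "e4"},
    "SCA7": {"e3", "e5"},
}

shared_two_plus = shared_in_at_least(toy_lines, k=2)
shared_all = shared_in_at_least(toy_lines, k=len(toy_lines))
pairs = pairwise_overlap(toy_lines)
print(sorted(shared_two_plus), sorted(shared_all), pairs)
```

The same counts, grouped by the number of lines per event, would give the proportions of events dysregulated in one to six datasets.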

To identify if mis-splicing occurred in disease relevant pathways previously implicated in CAG expansion SCAs, we performed gene ontology enrichment analysis of the genes with skipped exon events dysregulated in two or more cell lines using metascape [22]. Enriched gene ontology terms for analysis of shared skipped exon events included membrane trafficking, microtubule cytoskeleton organization, DNA damage response and chromatin organization (Fig. 1F). These pathways have all previously been shown to be affected or implicated in SCA diseases [1, 3–7, 23–25]. Using two distinct analysis platforms, DAVID [26] coupled to cytoscape [27] (Fig. 1G) and metascape [22] (Fig. S1D), we confirmed the enrichment of these pathways at the level of individual cell lines with enriched terms demonstrating that skipped exon events occurred in genes implicated in cytoskeletal based processes, DNA damage and repair and cilium organization (Fig. 1G; Fig. S1D). These broader analyses also highlighted enrichment of GO terms implicated in transcription such as zinc-fingers, metal ion binding and transcriptional regulation (Fig. 1G, Fig. S2). Together these data demonstrate that dysregulation of alternative splicing is a shared transcriptomic hallmark of patient-derived SCA fibroblast cell lines, and that alternative splicing occurs in genes involved in cellular pathways previously implicated in SCAs.

To understand the possible contribution of CAG repeat length to alternative splicing dysregulation, CAG repeat lengths (Fig. S3A), total TPMs for the relevant ATXN gene (Fig. S3B) and approximate CAG repeat load (Fig. S3C) per sample were correlated with the total number of splicing changes, the number of significantly mis-spliced skipped exon events, and the dysregulation score for significantly mis-spliced skipped exon events. Assuming an equal contribution to gene expression from each allele, we approximated CAG load as the TPM for the ATXN gene divided by 2 and multiplied by the expansion repeat length of the ATXN gene for each cell line. Our results did not reveal any significant correlations between splicing and CAG repeat size, approximate CAG load or disease gene expression (Fig. S3). This is consistent with patient-derived fibroblast cell lines from myotonic dystrophy with variable CTG and CCTG repeat lengths, in which correlations between repeat length and splicing dysregulation were not identified [21].
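The CAG load approximation described above can be written out explicitly. The sketch below implements that formula as stated (TPM divided by 2, multiplied by the expanded repeat length) and pairs it with a plain Pearson correlation; the choice of Pearson is an assumption, since the correlation method is not specified here, and all input values are invented.

```python
from math import sqrt


def approximate_cag_load(atxn_tpm, expanded_repeat_length):
    """CAG load = (TPM / 2) * expanded repeat length, assuming equal allelic expression."""
    return atxn_tpm / 2 * expanded_repeat_length


def pearson_r(xs, ys):
    """Pearson correlation coefficient (assumed method; not confirmed by the text)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Invented example: a gene at 40 TPM with a 50-repeat expanded allele.
toy_load = approximate_cag_load(40.0, 50)

# Invented per-line loads and event counts, only to show the call pattern.
toy_loads = [500.0, 800.0, 1000.0, 650.0]
toy_event_counts = [1200, 1100, 1500, 1300]
r = pearson_r(toy_loads, toy_event_counts)
print(toy_load, round(r, 3))
```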

Gene expression changes show less consistency across SCA patient-derived fibroblast cell lines compared to alternative splicing changes

Having confirmed expression of the relevant disease-associated genes in these cell lines (Fig. S1A), we next wanted to understand the relative contributions of differential gene expression and alternative splicing dysregulation to the transcriptomic hallmark of CAG SCAs. Using DESeq2, we identified 477 (SCA1.1), 563 (SCA1.2), 902 (SCA1.3), 486 (SCA3.1), 553 (SCA3.2) and 1179 (SCA7) differentially expressed genes (DEGs; |log2FC| > 1.5 and padj < 0.05) per cell line, with 851 genes showing differential expression across two or more cell lines (Fig. 2A). Globally, we observed both up- and downregulated genes, with an overall trend of more genes showing downregulation than upregulation (2897 total downregulated genes versus 1264 upregulated), even after removal of confounding Y-linked genes reported as downregulated in female cell lines (Figs 2B and S4A, Supplemental Table 3). This trend of more downregulated than upregulated genes is consistent with expression changes in mouse models of multiple CAG SCAs [28–33] and in CAG repeat expressing reporter cell lines [34].
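The DEG call and the up/down tally can be sketched as follows. This is an illustration only: it reads the stated threshold as |log2FC| > 1.5 with padj < 0.05, treats a missing adjusted P value as not significant (an assumption about how missing values are handled), and uses invented gene names and values.

```python
from collections import Counter


def classify_deg(log2fc, padj, lfc_min=1.5, padj_max=0.05):
    """Return 'up', 'down', or None for a gene given its log2FC and adjusted P value."""
    if padj is None or padj >= padj_max or abs(log2fc) <= lfc_min:
        return None  # not called as differentially expressed
    return "up" if log2fc > 0 else "down"


# Invented results table: (gene, log2FC, padj).
toy_results = [
    ("G1", 2.0, 0.01),
    ("G2", -3.1, 0.001),
    ("G3", 1.2, 0.0001),  # fold change too small
    ("G4", -1.8, None),   # no adjusted P value available
    ("G5", -1.6, 0.04),
]

calls = {gene: classify_deg(lfc, padj) for gene, lfc, padj in toy_results}
direction_counts = Counter(c for c in calls.values() if c is not None)
print(calls, dict(direction_counts))
```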

Differential gene expression analysis for CAG repeat expansion SCA fibroblasts revealed fewer disease-relevant terms. (A) Number of differentially expressed genes in each cell line with the proportion of DEGs shared between two to five cell lines shown, |log2FC| > 1.5 and padj < 0.05. (B) Scatter plot of all significantly upregulated (positive log2FC) and downregulated (negative log2FC) genes per cell line (sex-linked genes excluded), |log2FC| > 1.5 and padj < 0.05. (C) Total number of genes differentially expressed in more than one cell line, |log2FC| > 1.5 and padj < 0.05. These shared genes were used to generate the PCA plot. (D) Principal component analysis plot of genes based on transcripts per million (TPM) from highly expressed isoforms for each gene, plotted per sample per condition, |log2FC| > 1.5 and padj < 0.05. (E) Enrichment of summary gene ontology terms identified using metascape for genes significantly differentially expressed in two or more cell lines, |log2FC| > 1.5 and padj < 0.05.
Figure 2


In contrast to the alternative splicing analysis, we did not identify any differentially expressed genes that were shared across all six cell lines (Fig. 2C). Indeed, we identified only 10 differentially expressed genes that were shared across five cell lines (Fig. 2C), compared to 38 skipped exon events shared across five cell lines (Fig. 1C). Pairwise analysis showed the highest number of shared DEGs (164) between the SCA1.3 and SCA7 cell lines, whereas the lowest number of shared DEGs (56) was observed between the SCA1.3 and SCA3.1 cell lines (Fig. S4B). In a PCA of differentially expressed genes shared across two or more cell lines, based on transcripts per million (TPM) from highly expressed isoforms for each gene, we observed no distinct separation or clustering of healthy controls versus SCA patient-derived fibroblast cell lines (Fig. 2D), in contrast to our PCA of alternative splicing events (Fig. 1E).

To better understand the pathways affected by differentially expressed genes, we performed gene ontology enrichment analysis using metascape of DEGs shared across two or more cell lines and found pathways including skeletal system and muscle structure development, cell morphogenesis and cell differentiation to be enriched (Fig. 2E). We also performed gene ontology enrichment analysis at the level of individual cell lines using DAVID coupled to cytoscape [26, 27, 35] which, in contrast to the alternative splicing analysis, did not show disease relevant functional clusters enriched in multiple cell lines. In fact, we primarily saw clusters of terms enriched for one or two cell lines, such as a developmental protein cluster (Fig. S5), with little overlap of enriched terms with our metascape analysis (Fig. 2E). Compared to enriched pathways identified in our alternative splicing analysis, enriched GO terms and functional cluster overlap for differentially expressed genes were less closely linked to pathways known to be affected in CAG expansion SCAs.

Finally, we wanted to understand the overlap between mis-spliced and differentially expressed genes. When comparing shared significantly mis-spliced skipped exon events (ΔPSI > 10%, FDR < 0.1) to the set of significant DEGs (|log2FC| > 1.5 and padj < 0.05), we found that fewer than 1.5% of mis-spliced events occurred in genes that were also differentially expressed (Fig. S4C). Previously, it was demonstrated that up to 25% of all alternatively spliced genes were also differentially expressed in the cerebellum of an SCA1 mouse model [10]. Using similar filtering thresholds (FDR < 0.1; padj < 0.05), we identified that between 13.2% (SCA7) and 47.6% (SCA1.3) of the differentially spliced genes in our cell lines were also differentially expressed (Fig. S4D). Overall, our analyses identified more skipped exon events than differentially expressed genes, with the skipped exon events affecting genes in disease-relevant pathways while the differentially expressed genes are implicated in pathways with weaker relevance to ataxia. These findings might indicate that transcriptomic dysregulation in CAG expansion SCA patient-derived fibroblasts is characterized by disease-relevant defects in alternative splicing, and suggest that alternative splicing dysregulation could contribute more than differential gene expression to the global transcriptional dysregulation and disease burden seen in SCAs.
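The overlap percentages discussed above reduce to a set intersection over gene identifiers. The sketch below assumes that mis-spliced events have already been collapsed to their host genes; the gene symbols and the function name are placeholders.

```python
def percent_overlap(spliced_genes, deg_genes):
    """Percentage of mis-spliced genes that are also differentially expressed."""
    spliced = set(spliced_genes)
    if not spliced:
        return 0.0
    return 100.0 * len(spliced & set(deg_genes)) / len(spliced)


# Placeholder gene symbols, not taken from the study.
toy_spliced = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
toy_degs = ["GENE_B", "GENE_X"]

overlap_pct = percent_overlap(toy_spliced, toy_degs)
print(overlap_pct)
```

Note that the denominator matters: the same intersection expressed as a fraction of DEGs rather than of mis-spliced genes gives a different percentage.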

Patient-derived fibroblast cell lines of CAG expansion SCAs exhibit dysregulation of alternative splicing in genes involved in pathways disrupted in SCAs

Given the widespread alternative splicing dysregulation, compared to the limited differential gene expression changes detected using stringent thresholds for both analyses (Figs 1A and 2A), we sought to understand the possible contribution of alternative splicing to disease in CAG expansion SCAs. We identified and validated specific alternative splicing events dysregulated across multiple patient-derived cell lines. We report four novel mis-spliced events significantly dysregulated across two to six patient-derived fibroblast cell lines (FDR < 0.1, ΔPSI > 10%; Figs 3 and S6).

Genes involved in pathways impaired in CAG expansion SCAs are significantly mis-spliced across SCA1, 3 and 7 patient-derived fibroblast cell lines. (A–D) RT-PCR analysis of SLC35E2A exon 2, ZNF880 exon 4, LRRC15 exon 2, and EXOSC10 exon 13 respectively, in SCA1, SCA3 and SCA7 patient-derived fibroblast cell lines; ns—not significant, *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, mean ± SD.
Figure 3


Exon 2 of solute carrier family 35 member E2A (SLC35E2A), a gene predicted to be involved in membrane trafficking, was identified by RNASeq to be less included in SCA1.1, SCA3.1, SCA3.2 and SCA7 fibroblast lines compared to healthy controls (Fig. S6A). Using RT-PCR, we confirmed a significant decrease in exon 2 inclusion across all six patient-derived cell lines, with PSI ranging from 45.9% in SCA7 (P < 0.0001) to 71.8% in SCA1.3 (P < 0.01; Ctrl average PSI: 77.8%; Fig. 3A). Similarly, by RNASeq we observed an increase in exon 4 inclusion in zinc finger protein 880 (ZNF880), which is involved in transcriptional regulation (Fig. S6B). RT-PCR demonstrated an increase in ZNF880 exon 4 inclusion, with PSIs ranging from 25.1% for SCA7 (P < 0.0001) to 38.3% for SCA3.1 (P < 0.0001) patient-derived fibroblast cell lines, compared to an average PSI of 8.5% for control cell lines (Fig. 3B). Interestingly, SCA1.1 and SCA1.2 did not show any significant changes compared to control cell lines (Fig. 3B). We also validated increased inclusion of exon 2 of leucine rich repeat containing protein 15 (LRRC15), a gene predicted to be involved in membrane trafficking and regulation of cell migration. LRRC15 exon 2 showed increased inclusion for SCA3.1, SCA3.2 and SCA7 compared to control cell lines via RNASeq (Fig. S6C). RT-PCR analysis confirmed an increased inclusion of exon 2 from an average control PSI of 7.7% to 25.9%, 32.7% and 38.5% for SCA7, SCA3.2 and SCA3.1, respectively (P < 0.0001; Fig. 3C). Finally, we also report RNASeq analysis showing dysregulation of exon 13 splicing in SCA3 fibroblasts for exosome component 10 (EXOSC10) (Fig. S6D), which is known to be an important factor in various RNA processing events, including rRNA processing, and is a putative catalytic component of the RNA exosome. Validation via RT-PCR showed 58.9% and 56.9% inclusion of EXOSC10 exon 13 in SCA3.1 and SCA3.2 (P < 0.0001), respectively, compared to an average control PSI of 100% (Fig. 3D). Interestingly, SCA1.1, SCA1.2, SCA1.3 and SCA7 also showed an average PSI of 100% (Fig. 3D), indicating that the mis-splicing of EXOSC10 exon 13 is specific to SCA3.
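For readers less familiar with the PSI values quoted above, the following sketch shows one common way to derive percent spliced in from RT-PCR product signals: the inclusion product signal divided by the sum of inclusion and exclusion signals. This is an assumption about the quantification, which is not described in this section, and the signal values are invented.

```python
def psi_percent(inclusion_signal, exclusion_signal):
    """Percent spliced in from inclusion and exclusion product signals (assumed formula)."""
    total = inclusion_signal + exclusion_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return 100.0 * inclusion_signal / total


# Invented band intensities in arbitrary units.
toy_psi = psi_percent(3.0, 1.0)
fully_included = psi_percent(5.0, 0.0)
print(toy_psi, fully_included)
```

Under this convention, a PSI of 100% corresponds to no detectable exclusion product.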

Together, these data demonstrate that specific events are mis-spliced across multiple CAG expansion SCA patient-derived fibroblast cell lines and that the genes affected by these alternative splicing events are relevant to pathological hallmarks of SCAs and cellular pathways disrupted in SCA disease pathogenesis, such as transcriptional regulation, membrane trafficking and RNA processing.

To assess the responsiveness of identified and validated mis-spliced events to therapeutic interventions, we treated patient-derived fibroblast cell lines with Hit 2, a novel CAG regulating compound that has been identified by our group [34]. Hit 2 has been shown to reduce expression of CAG expansion transcripts across multiple patient-derived fibroblast lines and in the Atxn1^154Q/2Q SCA1 mouse model, and successfully rescued dysregulation of alternative splicing in this SCA1 mouse model [34]. We treated SCA1.1 and SCA3.1 patient-derived fibroblast cell lines with Hit 2 for 48 h. In the SCA1.1 patient-derived cell line, 10 nM Hit 2 reduced expression of ATXN1 by 46.6% (P = 0.0339), and 100 nM Hit 2 reduced ATXN1 expression by 47% (P = 0.0373; Fig. S7A). For SCA3.1, a 33.3% reduction in ATXN3 expression was seen at 1 μM (P = 0.0195; Fig. S7B). Hit 2 rescued two validated mis-spliced events. For SLC35E2A exon 2 in SCA1.1 cells, treatment with 10 nM and 100 nM Hit 2 shifted PSI to 76.3% and 76.9%, respectively, compared to a PSI of 67% in DMSO-treated SCA1.1 cells, corresponding to increases in inclusion of 9.3 percentage points (P = 0.0139) and 9.9 percentage points (P = 0.0108), respectively (Fig. S7C). LRRC15 exon 2 showed a 7.1% increase in inclusion at 100 nM (P = 0.0078) and a 7.2% increase at 1 μM (P = 0.00111) in the SCA3.1 patient-derived fibroblasts compared to DMSO-treated SCA3.1 (Fig. S7D). Our rescue results are consistent with small molecule treatments from DM and SCA studies showing partial rescue of mis-spliced events [21, 34], suggesting that these patient-derived SCA cell models can be used to study therapeutic approaches.
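The arithmetic behind the reported SLC35E2A PSI shifts can be checked with a short sketch. The PSI values are those quoted in the text; the function names are illustrative, and the expression values passed to the knockdown helper are invented purely to show the calculation.

```python
def psi_shift(treated_psi, vehicle_psi):
    """Change in PSI, in percentage points, relative to vehicle-treated cells."""
    return treated_psi - vehicle_psi


def percent_reduction(treated_expr, vehicle_expr):
    """Percentage drop in expression relative to vehicle (positive means knockdown)."""
    if vehicle_expr <= 0:
        raise ValueError("vehicle expression must be positive")
    return 100.0 * (1.0 - treated_expr / vehicle_expr)


# SLC35E2A exon 2 in SCA1.1: PSI values quoted in the text.
shift_10nm = round(psi_shift(76.3, 67.0), 1)
shift_100nm = round(psi_shift(76.9, 67.0), 1)

# Invented relative expression values, only to illustrate the helper.
toy_reduction = percent_reduction(0.5, 1.0)
print(shift_10nm, shift_100nm, toy_reduction)
```

The shifts are differences in PSI (percentage points), not relative changes.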

Dysregulation of alternative splicing occurs independently of the genetic context of SCA disease genes

To assess the CAG dependence of alternative splicing in SCAs, we performed alternative splicing analysis of RNASeq data from (CAG)60 repeat containing reporter cell lines in which the repeat expansion is out of context of any SCA disease causing gene [34].

Alternative splicing analysis was performed for three biological replicates per cell line for non-repeat containing parental controls versus two independent (CAG)60 repeat containing reporter cell lines, referred to as Clone 15 and Clone 37. Dysregulation of alternative splicing was identified across both CAG repeat containing cell lines, with skipped exon events accounting for more than 50% of all mis-spliced events (ΔPSI > 10%, FDR < 0.1; Fig. 4A). Both Clone 15 and Clone 37 showed dysregulation of both inclusion and exclusion events, with no predominance of either type (Fig. 4B). The maximum ΔPSI per cell line for inclusion events ranged from 84% to 100%, with the mean ΔPSI between 26.7% and 28.9%. The maximum ΔPSI per cell line for exclusion events ranged from 80% to 85%, with mean ΔPSI values of 23.7% and 26.9% (Fig. 4B). Across both cell lines, a total of 293 shared skipped exon events (ΔPSI > 10%, FDR < 0.1) were detected; a PCA of these events showed a distinct separation and clustering of non-repeat containing parental cell lines versus repeat containing reporter cell lines (Fig. 4C).

Widespread dysregulation of alternative splicing in CAG repeat containing reporter cell lines. (A) Percentage of significantly mis-spliced skipped exon (SE), retained intron (RI), mutually exclusive exon (MXE), alternative 5′ splice site (A5SS) and alternative 3′ splice site (A3SS) events as a proportion of total splicing events in CAG containing reporter cell lines vs non-repeat control; the number of each event is shown on the bar, FDR < 0.1, ΔPSI > 10%. (B) Percentage of exon inclusion (positive) or exclusion (negative) for significantly alternatively spliced skipped exon events per cell line, FDR < 0.1, ΔPSI > 10%. (C) Principal component analysis of shared skipped exon events dysregulated in the Clone 15 and Clone 37 datasets. (D) Enrichment of summary gene ontology terms identified using metascape for skipped exon events dysregulated in each cell line. (E) Functional classification of significantly mis-spliced skipped exon events, P < 0.05, ΔPSI > 10%.
Figure 4

To understand if mis-splicing occurred in SCA disease relevant pathways despite the CAG expansion not being located in a SCA disease causing gene, we performed gene ontology enrichment analysis of the genes with shared skipped exon events using metascape [22]. Actin filament-based process, membrane organization, regulation of intracellular transport and regulation of transmembrane activity were identified as enriched gene ontology summary terms in our analysis (Fig. 4D) and reflect pathways that have all previously been shown to be affected or impaired in SCA diseases [1, 3–7, 23–25]. Our broader cell line specific analysis using metascape [22] and DAVID [26, 27, 35] for significant skipped exon events (FDR < 0.1, ΔPSI > 10%) revealed DNA damage and repair, transcriptional regulation, zinc finger proteins, cilium biogenesis and assembly, and regulation of cytoskeleton organization as enriched GO terms (Figs 4E and S8A). Some of these GO terms, such as DNA damage and repair and cilium biogenesis, were similar to the terms we observed in our patient-derived fibroblast cell lines (Fig. 1F and G; Figs S1D and S2). Together, these data demonstrate that CAG repeat expansions can induce dysregulation of alternative splicing independent of the genetic context of any SCA causing gene, with dysregulated skipped exon events affecting pathways known to be disrupted in CAG expansion SCAs.

CAG repeat containing reporter cell lines show dysregulation of alternative splicing events involved in pathways known to be disrupted in SCAs

We next used our reporter cell lines to understand whether mis-splicing affecting individual disease relevant genes can be detected and validated in a system out of genetic context of any SCA disease causing gene. To do this, we generated a heatmap of the 75 mis-spliced events with the largest PSI differences between clone 15 and 37 compared to parentals (FDR < 0.1, ΔPSI > 10%; Fig. 5A) and selected six disease relevant events for validation in clone 37 (Fig. 5B–G, Fig. S8B–G).

Significantly mis-spliced events in CAG repeat containing reporter cell lines are involved in disrupted pathways in SCAs. (A) Heatmap of shared skipped exon events across Clones 15 and 37. (B–G) RT-PCR analysis of STK33 exon 2, SLC29A3 exon 3, ZNF573 exon 2, TMEM116 exon 7, PAX3 exon 2 and GMDS-DT exon 4 respectively, in CAG repeat containing reporter cell lines; ns, not significant, *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, mean ± SD.
Figure 5

Serine/Threonine Kinase 33 (STK33), which is predicted to be involved in mitotic DNA damage checkpoint signaling, showed a significant reduction in inclusion of exon 2 compared to parental controls based on RNASeq data (Fig. S8B). Using RT-PCR, we confirmed 63.4% inclusion in Clone 37 compared to an average PSI of 98.4% for parental cells (P < 0.0001; Fig. 5B). RNASeq for solute carrier family 29 member 3 (SLC29A3), which is known to be involved in membrane trafficking, identified a significant reduction in inclusion of exon 3 (Fig. S8C). RT-PCR data confirmed 78.1% inclusion in Clone 37 compared to an average PSI of 86.5% for parental controls (P = 0.0017; Fig. 5C). RNASeq data for zinc finger protein 573 (ZNF573), which has a role in transcription regulation, identified reduced inclusion of exon 2 in CAG repeat containing Clone 37 compared to parental controls (Fig. S8D). Using RT-PCR, we confirmed 89% inclusion in Clone 37 compared to an average PSI of 93.2% for parental controls (P = 0.0022; Fig. 5D). Transmembrane protein 116 (TMEM116), which is predicted to be an integral component of the membrane, showed a significant reduction in inclusion of exon 7 compared to parental controls based on RNASeq data (Fig. S8E). Validation via RT-PCR confirmed 82.8% inclusion compared to an average PSI of 88.9% for parental controls (P = 0.0095; Fig. 5E). Paired Box 3 (PAX3), which is involved in transcriptional regulation and may regulate cell proliferation, migration, apoptosis and neural development, showed reduced inclusion of exon 2 via RNASeq (Fig. S8F). RT-PCR analysis confirmed a reduced inclusion of ~33.1% for PAX3 exon 2 compared to an average PSI of 57% for parental controls (P = 0.0108; Fig. 5F). Finally, RNASeq data for GMDS divergent transcript (GMDS-DT), a long non-coding RNA, detected increased inclusion of exon 4 in Clone 37 (Fig. S8G). We confirmed 49.7% inclusion of exon 4 in Clone 37 compared to an average PSI of 27.2% for parental controls (P = 0.0066; Fig. 5G).

Together, our data demonstrate that identified mis-spliced events in our reporter cell lines, where the repeat is out of context of any SCA disease causing gene, occur in pathways known to be disrupted in SCAs and that the genes affected by the mis-spliced events, which can be validated, are relevant to SCA disease pathogenesis and pathological hallmarks.

Discussion

Disruption of alternative splicing is a well-studied key driver of disease pathogenesis in DM1, a CTG repeat expansion disorder, where dysregulation of various splicing events has been directly linked to disease symptoms [13, 36]. Recent studies have identified widespread, presymptomatic dysregulation of alternative splicing across mouse models of SCA1, 3 and 7 [9]. While these studies have implicated a possible role of alternative splicing as a driver of neuronal dysfunction, its role in patient-derived model systems remains unknown. Here, we identified widespread dysregulation of alternative splicing across CAG expansion patient-derived fibroblast cell lines of SCA1, 3 and 7, with skipped exon events being the most frequently dysregulated class. Interestingly, and consistent with presymptomatic data from CAG SCA mouse models [9], we identified a greater proportion of genes affected by alternative splicing dysregulation than differential gene expression in patient-derived fibroblasts. We also found that alternative splicing dysregulation affected genes that function in pathways known to be impaired in SCAs, whereas the differentially expressed genes are associated with pathways that have less relevance to disease pathogenesis. Finally, we demonstrate that the presence of CAG repeat expansion, independent of the genetic context of any SCA associated gene, is sufficient to induce widespread alternative splicing dysregulation. Together these data suggest that alternative splicing is a key and predominant disease relevant transcriptomic phenotype in CAG expansion SCAs.

We report for the first time dysregulation of alternative splicing events involved in disease relevant pathways across multiple CAG expansion SCA patient-derived fibroblasts. Although further studies on these events will be needed to understand possible functional links and relevance to impaired pathways, current evidence implicates these genes in disease relevant cellular processes. For instance, the exon in EXOSC10 that was mis-spliced in both SCA3 cell lines is located within the helicase and RNaseD C-terminal (HRDC) domain, and EXOSC10 is required for replication protein A (RPA) assembly and contributes to DNA repair by homologous recombination. The absence or depletion of EXOSC10 can cause various RNA processing defects, increase sensitivity to DNA damage and severely impair DNA repair [37, 38]. Interestingly, it has recently been demonstrated that EXOSC10 overexpression is protective in mouse models of SCA1 and improves phenotypic and pathological hallmarks of the disease [39]. Furthermore, in our CAG repeat containing reporter cell lines we identified mis-splicing of STK33, which is also implicated in DNA damage and repair. DNA damage and repair pathways are disrupted across multiple SCAs [1, 3–6], and the importance of this pathway to ataxia symptoms is highlighted by the fact that mutations in DNA repair genes directly cause several recessive ataxias [1], including ataxia telangiectasia [40], ataxia with ocular apraxia type 1 (AOA1) [41, 42], type 2 (AOA2) [43] and spinocerebellar ataxia with axonal neuropathy 1 (SCAN1) [42]. Other pathways that we observed in our GO term analysis were membrane, membrane trafficking and transcriptional regulation, pathways known to be disrupted in CAG expansion SCAs [1, 3, 31]. Further studies are needed to understand the consequences of skipped exon dysregulation and their role in disruption of specific pathways.

One of the aims of this study was to probe the relevance and contribution of differential gene expression and alternative splicing dysregulation to the overall transcriptomic phenotype and disease burden in patient-based model systems of CAG expansion SCAs. Previous studies in SCA1 and 3 mouse models have not been successful in identifying key and consistent shared drivers of disease pathogenesis [29, 30, 44]. However, our recent study identified the presence of widespread repeat-dependent alternative splicing dysregulation prior to symptom onset and prior to global changes in differential gene expression [9]. Similarly, CAG repeat length dependent alternative splicing changes were identified at early stages in brain regions of HD knock-in mouse models prior to behavioral phenotypic changes [19]. Here we found that a greater proportion of genes involved in disease relevant cellular pathways were affected by alternative splicing dysregulation than by differential expression. Overall, these findings might indicate that transcriptomic dysregulation in CAG expansion SCA patient-derived fibroblasts is characterized by disease-relevant defects in alternative splicing, and that alternative splicing dysregulation could contribute more than differential gene expression to the global transcriptional dysregulation seen in SCAs.

Although the mechanism of alternative splicing dysregulation in CAG expansion SCAs is not well understood, insights can be gathered from other well studied repeat expansion diseases with a clear link between mis-splicing and disease symptoms. Similar to DM1, the sequestration of a specific RNA binding protein (RBP) or multiple RBPs might be one possible mechanism for widespread mis-splicing in CAG expansion SCAs. This process could be mediated by any one of, or a combination of, sense CAG and antisense CUG RNAs, polyglutamine expansion proteins or proteins produced through repeat-associated non-AUG translation [45]. Evidence for CUG RNA mediated splicing disruption comes from previous studies in both SCA8 [46] and SCA2 [47], while in SCA3 and HD it has previously been proposed that CAG expansion RNAs drive dysregulation of specific alternative splicing events [20]. Studies in HD have also shown reduced expression of multiple RBPs [15], although changes in RBP expression were not detected at early and pre-symptomatic time points where alternative splicing dysregulation was already widespread in mouse models of CAG SCAs [9, 10]. In this study we did not detect changes in RBP expression despite widespread splicing dysregulation; thus, changes in RBP transcript levels are unlikely to be an early or robust driver of alternative splicing dysregulation in CAG SCAs. Further studies accounting for each of these possible mechanisms, including assessing protein levels and changes in cellular localization of RBPs, are necessary to understand which RBPs may be involved in driving splicing dysregulation and through which mechanisms they drive this dysregulation in CAG expansion SCAs.

Here we identified dysregulation of alternative splicing as a novel transcriptomic hallmark in patient-derived model systems of CAG expansion SCAs. Our data indicate that patient-derived fibroblast cell lines are a good working model to study transcriptomic changes in CAG expansion SCAs. However, the use of more disease relevant patient-based model systems such as iPSC-derived neurons, including neuronal subtypes with differential vulnerability, is essential for further characterization of transcriptomic changes and to understand the possible contribution of specific mis-spliced events to neuronal vulnerability, dysfunction and disease pathogenesis. Although we did not see any clear correlations between CAG repeat size, approximate CAG repeat load or expression of the disease genes and the splicing changes (Fig. S3), investigating alternative splicing across a broader range of cell lines with a wide range of repeat lengths, including lines from additional SCAs such as SCA6 for shorter repeats and SCA12, may reveal if or how repeat length, disease gene expression and coding versus non-coding CAG expansions affect alternative splicing. Furthermore, this would also enable investigation of shared, or disease specific, mechanisms of alternative splicing dysregulation in a system with an expression profile of RBPs and a disease phenotype, including polyglutamine protein behavior, more consistent with that of affected tissues in patients. The demonstration that treatment with a small molecule can partially rescue mis-splicing in SCA1 and SCA3 patient-derived fibroblast cell lines is consistent with our previous study with this small molecule, which showed reduction of CAG RNA expression levels and partial rescue of splicing dysregulation in a SCA1 mouse model [34]. Together, these findings suggest that CAG SCA patient-derived fibroblast cell lines can be used to study therapeutic approaches with splicing dysregulation as a therapeutic readout.
Our study provides the first demonstration that alternative splicing dysregulation might be the predominant transcriptomic phenotype in patient-derived fibroblast cell lines of multiple CAG expansion SCAs and provides a basis to investigate mechanisms of splicing dysregulation across this group of diseases.

Materials and methods

Data and material availability

All datasets used in this study are available through the database of Genotypes and Phenotypes (dbGaP) using the accession number phs003759.v1.p1. Further information and requests for resources and reagents should be directed to and will be fulfilled by the corresponding authors: J Andrew Berglund ([email protected]), Hannah K Shorrock ([email protected]).

Culture of stable CAG repeat expressing HEK293T screening cell line and CAG SCA patient-derived fibroblast cell lines

Parental HEK293T cells and HEK293T cells containing stably integrated CAG repeat expansions (referred to as clone 15 and clone 37) were cultured in DMEM supplemented with 10% fetal bovine serum (FBS) and 1× penicillin/streptomycin in a humidified atmosphere at 37°C and 5% CO2. RNA extractions were performed using the Aurum total RNA mini kit (Bio-Rad) with on-column DNase I treatment, following the manufacturer’s protocol.

All CAG SCA patient-derived fibroblast cell lines (Supplemental Table 1) were cultured in Eagle’s Minimum Essential Medium (EMEM; Corning) containing 15% fetal bovine serum (FBS) and 1× penicillin–streptomycin (P/S) (Thermo Fisher Scientific) at 37°C and 5% CO2. The cells were expanded and plated into 6-well plates until they reached 80% confluency. Cell pellets were harvested for RNA extraction. RNA extractions were performed using the Quick-RNA Miniprep kit (Zymo Research) with on-column DNase I treatment, following the manufacturer’s protocol.

For the Hit 2 treatment, the cells were expanded and seeded in 12-well tissue culture plates at a density of approximately 3 × 10^4 cells/ml. Once the cells reached ~70% confluency, media was removed and replaced with fresh media containing Hit 2 at the specified concentrations or with a matched percentage of DMSO (vehicle; always ≤ 0.01% DMSO). Following 48 h treatment, media was removed, and RNA was extracted using the Aurum total RNA mini kit (Bio-Rad) with on-column DNase I treatment, following the manufacturer’s protocol.

RNA sequencing

Library preparation was performed using the NEBNext Ultra II Directional RNA library preparation kit for Illumina with rRNA depletion and 500 ng of starting RNA (n = 3 biological replicates were used to generate libraries for each cell line). The Qubit RNA high sensitivity assay (Invitrogen) was used to obtain RNA concentration. The Agilent Bioanalyzer (Fragment Analyzer) was used to determine RNA quality as a RIN score. Equal quantities of libraries were loaded into P2 flow cell 1000/2000 by Illumina. FASTQ file quality was assessed using FastQC (version 0.11.9) and datasets with an average read depth of > 40 million paired end reads were included in this study (Supplemental Table 4). FASTQ files were then aligned to the GRCh38/hg38 human reference genome using STAR (version 2.7.10a) [48]. Differential gene expression (DGE) analysis was performed in RStudio (2023.09.0; R 4.2.2) using DESeq2 (version 1.36.0) [49] and genes that passed a threshold of padj < 0.05 and |log2FC| > 1.5 were considered significantly differentially expressed. Quantification of transcript abundances as transcripts per million (TPM) from highly expressed isoforms was performed using kallisto [50] and plotted with RStudio (2023.09.0; R 4.2.2). Alternative splicing analysis was performed using rMATS (version 4.1.2) [51] and events were considered significantly mis-spliced if they passed the threshold of false-discovery rate (FDR) < 0.1 and |ΔPSI| > 0.1. All ΔPSI values are converted from a ratio to a percentage with the threshold adjusted accordingly: |ΔPSI| > 10%. Exon numbers are based on counting the first exon in a relevant transcript as exon 1 (Supplemental Table 5). Upset plots were generated using the ComplexHeatmap (2.10.0) R package. PCA plots were created using ggplot2 (version 3.4.2) in RStudio.
Although all events shared across two or more datasets (Figs 1C and 2C) were used to generate PCA plots for alternative splicing and DGE analysis, only events that were detected across all samples, with PSI values for each sample, were taken into consideration for plotting the PCA graphs.
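
As an illustration only, the significance thresholds and the completeness filter described above can be sketched in Python. This is not the code used in the study: the function names are hypothetical, events are represented as plain dictionaries, and the keys `FDR` and `IncLevelDifference` are assumed to mirror the corresponding columns of an rMATS output table.

```python
def significant_events(events, fdr_cutoff=0.1, dpsi_cutoff=0.1):
    """Keep events with FDR < fdr_cutoff and |dPSI| > dpsi_cutoff.

    `events` is a list of dicts with an "FDR" value and a dPSI ratio under
    "IncLevelDifference" (assumed field names). The dPSI is additionally
    reported as a percentage, matching the |dPSI| > 10% convention.
    """
    kept = []
    for event in events:
        dpsi = event["IncLevelDifference"]
        if event["FDR"] < fdr_cutoff and abs(dpsi) > dpsi_cutoff:
            kept.append({**event, "dPSI_percent": dpsi * 100.0})
    return kept


def events_with_complete_psi(psi_by_event):
    """Keep only events that have a PSI value in every sample.

    `psi_by_event` maps an event identifier to a list of per-sample PSI
    values, with None marking a sample in which the event was not detected.
    """
    return {
        name: values
        for name, values in psi_by_event.items()
        if all(value is not None for value in values)
    }
```

Applying the absolute value to ΔPSI means that both increased inclusion and increased exclusion pass the same 10% threshold, and dropping events with any missing PSI value leaves a complete matrix for PCA.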

Gene ontology enrichment analysis

Gene ontology enrichment analysis was performed using metascape (version v3.5.20230101) [22] and the Database for Annotation, Visualization, and Integrated Discovery (DAVID) [26, 35] coupled to Cytoscape (version 3.9.1) [27] with a node Q-value of 0.05 and an Edge Cutoff of 0.375 using the Edge weighted spring embedded layout based on overlap size.

RT-PCR splicing analysis

RNA concentrations were measured using a NanoDrop and 500 ng total RNA was reverse transcribed using SuperScript IV reverse transcriptase (Invitrogen) with random hexamers (IDT). PCR for selected splicing events was performed using the Taq 2× master mix (NEB) with 2 µl cDNA under the following conditions: 95°C 30 s, followed by 26 cycles for patient-derived fibroblasts and 28 cycles for CAG containing reporter cell lines of 95°C 30 s, primer specific annealing temperature (Ta) 30 s, 68°C 30 s, followed by 68°C 5 min (Supplemental Table 6). PCR products were resolved through capillary electrophoresis in a 5300 Fragment Analyzer system using the DNF-905 kit for 1–500 bp fragments (Agilent Technologies), following the manufacturer’s protocol. The relative fluorescence unit (RFU) values for the inclusion and exclusion bands, obtained from the ProSize 4.00 software (Agilent Technologies), were used to calculate percent spliced in (PSI) for each exon of interest using the following formula: (Inclusion band RFU)/(Inclusion band RFU + Exclusion band RFU) × 100.
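
The PSI formula above can be written as a short Python function. This is a minimal sketch for illustration; the function name and the input check are assumptions and are not part of the ProSize software or of the study's analysis.

```python
def percent_spliced_in(inclusion_rfu, exclusion_rfu):
    """Return PSI (%) = inclusion RFU / (inclusion RFU + exclusion RFU) * 100."""
    total = inclusion_rfu + exclusion_rfu
    if total <= 0:
        # No signal in either band: PSI is undefined, so fail loudly.
        raise ValueError("Combined band RFU must be positive")
    return inclusion_rfu / total * 100.0
```

For example, an inclusion band of 75 RFU and an exclusion band of 25 RFU give a PSI of 75%.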

Quantitative PCR (RT-qPCR) analysis

cDNA was subjected to qPCR for 39 cycles with PowerUp SYBR Master Mix (Applied Biosystems), according to the manufacturer’s instructions, with ATXN1 and ATXN3 forward and reverse qPCR primers (IDT; Supplemental Table 7). All qPCRs were performed in a CFX384 or CFX96 Real-Time System (Bio-Rad) in technical triplicates using 1–2 µl cDNA. Ct values were obtained via CFX Maestro software (Bio-Rad) and RT-qPCR data were analyzed using the 2^−ΔΔCt method [52]. ATXN1 and ATXN3 levels following Hit 2 treatment are presented as mRNA levels relative to DMSO control treatments. GAPDH (IDT; Supplemental Table 7) was used as the housekeeping gene. To confirm specificity of primers, qPCR was performed on RT reactions in which the RT enzyme was replaced with H2O (RT-) under the same conditions.
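
For reference, the 2^−ΔΔCt calculation [52] can be sketched as follows. The function and argument names are hypothetical and the Ct values in the usage note are invented for illustration; here the target gene would be ATXN1 or ATXN3, the reference gene GAPDH, and the control condition the DMSO vehicle.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Relative mRNA level of a target gene by the 2^-ddCt method.

    dCt   = Ct(target) - Ct(reference gene), computed per condition
    ddCt  = dCt(treated) - dCt(control)
    level = 2 ** (-ddCt), so 1.0 means no change versus control.
    """
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)
```

With made-up values, a target Ct of 25 against a reference Ct of 18 in treated cells and a target Ct of 24 against a reference Ct of 18 in control cells give ΔΔCt = 1, that is, a relative mRNA level of 0.5. In practice the Ct values passed in would be means of the technical triplicates.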

Data analysis

Data are represented as mean with standard deviation (SD) and statistical analyses were performed using two-tailed Student’s unpaired t-test for CAG containing reporter cell lines and one-way analysis of variance (ANOVA) followed by Tukey’s multiple comparisons test for patient-derived fibroblast cell lines. Figures were generated using GraphPad Prism 10.3.0 and Adobe Illustrator 27.0.

Acknowledgements

The authors thank The RNA Institute and members of the Berglund laboratory for their valuable input and discussion, and the CFG Core Facility at the University at Albany for performing RNASeq analysis. The authors would like to thank Ronald A.M. Bujisen and Willeke van Roon-Mom, Leiden University Medical Centre, for providing SCA1 and control fibroblast cell lines (SCA1.2, SCA1.3, Cntrl4 and Cntrl5) and Laura P.W. Ranum for providing the control 1 fibroblast line.

Conflict of interest statement: J.A.B. serves on the Scientific Advisory Committee for the Myotonic Dystrophy Foundation, has consulted or currently consults for Entrada Therapeutics, Juvena Therapeutics, Kate Therapeutics, D.E. Shaw Research, Dyne Therapeutics, Syros Pharmaceuticals, Wayfinder Biosciences and received research funding from Agios Pharmaceuticals, Biomarin Pharmaceuticals, PepGen, Syros Pharmaceuticals and Vertex Pharmaceuticals. J.A.B. has received licensing royalties from the University of Florida. J.A.B. and J.D.C. are co-founders and have financial interest in Repeat RNA Therapeutics Inc. J.D.C. is a part-time employee of the Center for NeuroGenetics at the University of Florida. J.A.B., H.K.S. and J.D.C. have a patent pending on the CAG repeat selective screening approach. All other authors report no competing interests.

Funding

This work was supported by the National Ataxia Foundation and the National Institutes of Health [K99 NS124994 to H.K.S., P50 NS04843 to J.A.B., R01 NS135254 to J.A.B. and H.K.S.]. The content of this article is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or other funding agencies.

References

1. Klockgether T, Mariotti C, Paulson HL. Spinocerebellar ataxia. Nat Rev Dis Primer 2019;5:24.

2. Hekman KE, Gomez CM. The autosomal dominant spinocerebellar ataxias: emerging mechanistic themes suggest pervasive Purkinje cell vulnerability. J Neurol Neurosurg Psychiatry 2015;86:554–561.

3. Ashizawa T, Öz G, Paulson HL. Spinocerebellar ataxias: prospects and challenges for therapy development. Nat Rev Neurol 2018;14:590–605.

4. Paulson HL, Shakkottai VG, Clark HB et al. Polyglutamine spinocerebellar ataxias - from genes to potential treatments. Nat Rev Neurosci 2017;18:613–626.

5. Switonski PM, Delaney JR, Bartelt LC et al. Altered H3 histone acetylation impairs high-fidelity DNA repair to promote cerebellar degeneration in spinocerebellar ataxia type 7. Cell Rep 2021;37:110062.

6. McLoughlin HS, Moore LR, Paulson HL. Pathogenesis of SCA3 and implications for other polyglutamine diseases. Neurobiol Dis 2020;134:104635.

7. Zeng L, Zhang D, McLoughlin HS et al. Loss of the spinocerebellar ataxia type 3 disease protein ATXN3 alters transcription of multiple signal transduction pathways. PLoS One 2018;13:e0204438.

8. Ayhan F, Perez BA, Shorrock HK et al. SCA8 RAN polySer protein preferentially accumulates in white matter regions and is regulated by eIF3F. EMBO J 2018;37:e99023.

9. Shorrock HK, Lennon CD, Aliyeva A et al. Widespread alternative splicing dysregulation occurs presymptomatically in CAG expansion spinocerebellar ataxias. Brain 2024.

10. Olmos V, Thompson EN, Gogia N et al. Dysregulation of alternative splicing in spinocerebellar ataxia type 1. Hum Mol Genet 2023;33:138–149.

11. Wang Y, Liu J, Huang B et al. Mechanism of alternative splicing and its regulation. Biomed Rep 2015;3:152–158.

12. Titus MB, Chang AW, Olesnicky EC. Exploring the diverse functional and regulatory consequences of alternative splicing in development and disease. Front Genet 2021;12:775395.

13. Hale MA, Johnson NE, Berglund JA. Repeat-associated RNA structure and aberrant splicing. Biochim Biophys Acta Gene Regul Mech 2019;1862:194405.

14. Charlet-B N, Savkur RS, Singh G et al. Loss of the muscle-specific chloride channel in type 1 myotonic dystrophy due to misregulated alternative splicing. Mol Cell 2002;10:45–53.

15. Elorza A, Márquez Y, Cabrera JR et al. Huntington’s disease-specific mis-splicing unveils key effector genes and altered splicing factors. Brain J Neurol 2021;144:2009–2023.

16. Lin L, Park JW, Ramachandran S et al. Transcriptome sequencing reveals aberrant alternative splicing in Huntington’s disease. Hum Mol Genet 2016;25:3454–3466.

17. Schilling J, Broemer M, Atanassov I et al. Deregulated splicing is a major mechanism of RNA-induced toxicity in Huntington’s disease. J Mol Biol 2019;431:1869–1877.

18. Tano V, Utami KH, Yusof NABM et al. Widespread dysregulation of mRNA splicing implicates RNA processing in the development and progression of Huntington’s disease. EBioMedicine 2023;94:104720.

19. Ayyildiz D, Bergonzoni G, Monziani A et al. CAG repeat expansion in the Huntington’s disease gene shapes linear and circular RNAs biogenesis. PLoS Genet 2023;19:e1010988.

20. Mykowska A, Sobczak K, Wojciechowska M et al. CAG repeats mimic CUG repeats in the misregulation of alternative splicing. Nucleic Acids Res 2011;39:8938–8951.

21. Jenquin JR, O’Brien AP, Poukalov K et al. Molecular characterization of myotonic dystrophy fibroblast cell lines for use in small molecule screening. iScience 2022;25:104198.

22. Zhou Y, Zhou B, Pache L et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat Commun 2019;10:1523.

23. Bushart DD, Shakkottai VG. Ion channel dysfunction in cerebellar ataxia. Neurosci Lett 2019;688:41–48.

24. Fogel BL, Hanson SM, Becker EBE. Do mutations in the murine ataxia gene TRPC3 cause cerebellar ataxia in humans? Mov Disord 2015;30:284–286.

25. Huang L, Warman-Chardon J, Carter MT et al. Missense mutations in ITPR1 cause autosomal dominant congenital nonprogressive spinocerebellar ataxia. Orphanet J Rare Dis 2012;7:67.

26. Sherman BT, Hao M, Qiu J et al. DAVID: a web server for functional enrichment analysis and functional annotation of gene lists (2021 update). Nucleic Acids Res 2022;50:W216–W221.

27. Shannon P, Markiel A, Ozier O et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 2003;13:2498–2504.

28. Aikawa T, Mogushi K, Iijima-Tsutsui K et al. Loss of MyD88 alters neuroinflammatory response and attenuates early Purkinje cell loss in a spinocerebellar ataxia type 6 mouse model. Hum Mol Genet 2015;24:4780–4791.

29. Driessen TM, Lee PJ, Lim J. Molecular pathway analysis towards understanding tissue vulnerability in spinocerebellar ataxia type 1. elife 2018;7:e39981.

30. Haas E, Incebacak RD, Hentrich T et al. A novel SCA3 knock-in mouse model mimics the human SCA3 disease phenotype including neuropathological, Behavioral, and transcriptional abnormalities especially in oligodendrocytes. Mol Neurobiol 2022;59:495–522.

31. Niewiadomska-Cimicka A, Hache A, Trottier Y. Gene deregulation and underlying mechanisms in spinocerebellar ataxias with Polyglutamine expansion. Front Neurosci 2020;14:571.

32. Pflieger LT, Dansithong W, Paul S et al. Gene co-expression network analysis for identifying modules and functionally enriched pathways in SCA2. Hum Mol Genet 2017;26:3069–3080.

33. Schuster KH, Zalon AJ, Zhang H et al. Impaired oligodendrocyte maturation is an early feature in SCA3 disease pathogenesis. J Neurosci 2022;42:1604–1617.

34. Shorrock HK, Aliyeva A, Frias JA et al. CAG repeat-selective compounds reduce abundance of expanded CAG RNAs in patient cell and murine models of SCAs. bioRxiv 2024.08.17.608349.

35. Huang DW, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc 2009;4:44–57.

36. Scotti MM, Swanson MS. RNA mis-splicing in disease. Nat Rev Genet 2016;17:19–32.

37. Marin-Vicente C, Domingo-Prim J, Eberle AB et al. RRP6/EXOSC10 is required for the repair of DNA double-strand breaks by homologous recombination. J Cell Sci 2015;128:1097–1107.

38. Domingo-Prim J, Endara-Coll M, Bonath F et al. EXOSC10 is required for RPA assembly and controlled DNA end resection at DNA double-strand breaks. Nat Commun 2019;10:2135.

39. Gall-Duncan T, Luo J, Jurkovic C-M et al. Antagonistic roles of canonical and alternative-RPA in disease-associated tandem CAG repeat instability. Cell 2023;186:4898–4919.e25.

40. Senior K. DNA damage mechanisms in ataxia telangiectasia. Lancet Neurol 2003;2:139.

41. Albaradie R, Alharbi A, Alsaffar G et al. Ataxia with oculomotor apraxia type 1 associated with mutation in the APTX gene: a case study and literature review. Exp Ther Med 2022;24:709.

42. Rass U, Ahel I, West SC. Defective DNA repair and neurodegenerative disease. Cell 2007;130:991–1004.

43. Suraweera A, Becherel OJ, Chen P et al. Senataxin, defective in ataxia oculomotor apraxia type 2, is involved in the defense against oxidative DNA damage. J Cell Biol 2007;177:969–979.

44. Ingram M, Wozniak EAL, Duvick L et al. Cerebellar transcriptome profiles of ATXN1 transgenic mice reveal SCA1 disease progression and protection pathways. Neuron 2016;89:1194–1207.

45. Banez-Coronel M, Ranum LPW. Repeat-associated non-AUG (RAN) translation: insights from pathology. Lab Investig 2019;99:929–942.

46. Daughters RS, Tuttle DL, Gao W et al. RNA gain-of-function in spinocerebellar ataxia type 8. PLoS Genet 2009;5:e1000600.

47. Li PP, Sun X, Xia G et al. ATXN2-AS, a gene antisense to ATXN2, is associated with spinocerebellar ataxia type 2 and amyotrophic lateral sclerosis. Ann Neurol 2016;80:600–615.

48. Dobin A, Davis CA, Schlesinger F et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 2013;29:15–21.

49. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 2014;15:550.

50. Bray NL, Pimentel H, Melsted P et al. Near-optimal probabilistic RNA-seq quantification. Nat Biotechnol 2016;34:525–527.

51. Shen S, Park JW, Lu Z et al. rMATS: robust and flexible detection of differential alternative splicing from replicate RNA-Seq data. Proc Natl Acad Sci 2014;111:E5593–E5601.

52. Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2−ΔΔCT method. Methods 2001;25:402–408.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]