Lein N H Dofash, Gavin V Monahan, Emilia Servián-Morilla, Eloy Rivas, Fathimath Faiz, Patricia Sullivan, Emily Oates, Joshua Clayton, Rhonda L Taylor, Mark R Davis, Traude Beilharz, Nigel G Laing, Macarena Cabrera-Serrano, Gianina Ravenscroft, A KLHL40 3’ UTR splice-altering variant causes milder NEM8, an under-appreciated disease mechanism, Human Molecular Genetics, Volume 32, Issue 7, 1 April 2023, Pages 1127–1136, https://doi.org/10.1093/hmg/ddac272
Abstract
Nemaline myopathy 8 (NEM8) is typically a severe autosomal recessive disorder associated with variants in the kelch-like family member 40 gene (KLHL40). Common features include fetal akinesia, fractures, contractures, dysphagia, respiratory failure and neonatal death. Here, we describe a 26-year-old man with relatively mild NEM8. He presented with hypotonia and bilateral femur fractures at birth, later developing bilateral Achilles’ contractures, scoliosis, and elbow and knee contractures. He had walking difficulties throughout childhood and became wheelchair bound from age 13 after prolonged immobilization. Muscle magnetic resonance imaging at age 13 indicated prominent fat replacement in his pelvic girdle, posterior compartments of thighs and vastus intermedius. Muscle biopsy revealed nemaline bodies and intranuclear rods. RNA sequencing and western blotting of patient skeletal muscle indicated significant reduction in KLHL40 mRNA and protein, respectively. Using gene panel screening, exome sequencing and RNA sequencing, we identified compound heterozygous variants in KLHL40: a truncating 10.9 kb deletion in trans with a likely pathogenic variant (c.*152G > T) in the 3’ untranslated region (UTR). Computational tools SpliceAI and Introme predicted that the c.*152G > T variant creates a cryptic donor splice site. RNA-seq and in vitro analyses indicated that the c.*152G > T variant induces multiple de novo splicing events that likely provoke nonsense mediated decay of KLHL40 mRNA, explaining the loss of mRNA expression and protein abundance in the patient. Analysis of 3’ UTR variants in ClinVar suggests that variants introducing aberrant 3’ UTR splicing may be underrecognized in Mendelian disease. We encourage consideration of this mechanism during variant curation.
Introduction
Nemaline myopathies (NEM) are a clinically and genetically heterogeneous group of congenital myopathies characterized by the presence of nemaline bodies within muscle fibres (1). Clinical symptoms include muscular hypotonia and weakness, often accompanied by respiratory insufficiency (2). The age of onset, disease progression and severity are variable and largely depend on the underlying genetic cause (2). To date, 14 genes have been associated with NEM (3) and are routinely screened in diagnostic laboratories (4). Generally in diagnostic laboratories, analysis of known disease genes is restricted to rare coding and structural variants, while variants in regulatory regions are often uncaptured or overlooked (5).
One common severe subtype of NEM is nemaline myopathy 8 (NEM8; OMIM# 615348), an autosomal recessive disorder associated with biallelic variants in the kelch-like family member 40 gene (KLHL40) (6). Almost all patients with NEM8 present with severe disease (6,7). Characteristic features include fetal akinesia or hypokinesia, fractures, contractures, facial involvement, dysphagia, respiratory failure and neonatal death (the average age at death in a NEM8 cohort was 5 months) (6). To date, only two patients have been reported with relatively mild NEM8, one of whom responded beneficially to acetylcholinesterase inhibitors (8,9).
KLHL40 is a member of the kelch-repeat-containing protein superfamily and is involved in a range of interactions that regulate skeletal muscle myogenesis and promote skeletal muscle maintenance (10,11). KLHL40 is reported to bind the E2F1-DP1 complex to regulate myogenesis (11). In addition, KLHL40 binds NEB and LMOD3 to stabilize the thin filament (10). To date, pathogenic variants that change the KLHL40 open reading frame have been implicated in NEM8 (6,7). These include protein truncating variants such as frameshift, essential splice site and nonsense variants (6). Pathogenic missense variants have also been reported and are predicted to destabilize KLHL40 by disrupting intramolecular interactions (6). KLHL40 deficiency consequently destabilizes the thin filament and causes NEM (10). NEM8 patients typically show reduced or absent KLHL40 in skeletal muscle by western blotting (6–8).
In this study, we describe a Spanish patient with mild NEM8 who had diminished KLHL40 transcript and protein abundance. He harbours a multi-exon KLHL40 deletion on one allele and a splicing-altering variant in the 3’ UTR of KLHL40 on the other allele. The 3’ UTR variant creates a cryptic donor splice site 152 bp downstream of the termination codon, likely inducing nonsense mediated decay. Our analysis of ClinVar Class 3–5 variants suggests that this 3’ UTR mechanism may be underrecognized in Mendelian disease.
Results
Clinical features and pathology
The patient is a male born to non-consanguineous parents, with no relevant family history of disease. He was born hypotonic with bilateral femur fractures. He walked at the age of 30 months, but never jumped or ran. He lost independent ambulation at age 13, following prolonged immobilization after Achilles tendon lengthening. From childhood he had scoliosis, a myopathic face and ogival palate, and weakness of bilateral orbicularis oculi muscles. He was last examined at age 24, when he was able to stand with bilateral support. At last examination, he had weakness of proximal and distal muscles of upper and lower limbs and showed obvious atrophy of intrinsic hand muscles. Deep tendon reflexes were absent, and he had bilateral Achilles tendon contractures, elbow contractures, and a rigid cervical and dorsal spine.
Muscle magnetic resonance imaging (MRI) showed widespread fat replacement of pelvic girdle muscles, thighs and lower legs (Fig. 1A–C). At the thigh level, the fat replacement was particularly severe, involving posterior compartments as well as rectus anterioris.

Figure 1. Muscle pathology in the patient with a milder form of NEM8. (A–C) Magnetic resonance imaging shows extensive fatty replacement of muscles of pelvic girdle (A), thighs (B), with most prominent involvement of posterior compartment (B) and legs (C). (D–G) Muscle biopsy sections: H&E stain showing clusters of nemaline bodies (D), ATPase 4.3 stain demonstrating all myofibres in the sample are type 1 (E), semithin slide stained with toluidine blue showing an intranuclear rod (F, arrow), electron micrograph displaying electron-dense rod-shaped structures that correspond to nemaline bodies (G). Scale bar: 20 μm (D–F); 2 μm (G).
Light microscopy examination revealed moderate and patchy fat replacement, without relevant fibrosis, with mild variability in fibre size. Necrotic or regenerating fibres were not seen. Numerous eosinophilic birefringent structures, rod or round shaped, were seen forming clusters in the centre of the sarcoplasm or subsarcolemmal regions (Fig. 1D). There were internal nuclei. Some intranuclear rods were identified on the semithin sections (Fig. 1F). Gomori trichrome confirmed the presence of abundant rods with irregular distribution in the sarcoplasm (data not shown). Histoenzymatic techniques showed an irregular intermyofibrillar pattern, with presence of pseudo-lobulated fibres and some core-like areas devoid of oxidative activity harbouring the clusters of rods. ATPase stains showed 100% of the fibres were type 1 (Fig. 1E).
Electron microscopy confirmed the presence of numerous electron-dense rod-shaped structures corresponding to nemaline rods in most muscle fibres (Fig. 1G). Rods were irregularly distributed throughout the fibre, clustered centrally and formed subsarcolemmal clusters in the periphery. They often ran parallel to the long axis of the sarcomere and showed continuity with Z-disks.
Genomic, transcriptomic and protein investigations
Gene panel sequencing and whole exome sequencing (WES) identified a heterozygous 10.9 kb deletion (hg19; chr3:42729537–42740458) spanning KLHL40 (ENSG00000157119) and HHATL (ENSG00000010282) in the patient. The deletion encompassed exons 2–6 of the six exons of KLHL40 (Fig. 2A), resulting in complete loss of a functional allele. The deletion was absent in gnomAD (v2.1) and was present only in this patient among the 10 000 alleles in the broad structural variant callset within the seqr analysis platform (12).

Figure 2. Alterations of KLHL40 at the DNA, mRNA and protein level in the patient. (A) Schematic representation of the KLHL40 gene with the 5’ and 3’ UTRs (light blue), six coding exons (cyan) and the compound heterozygous variants in the patient. Figure designed with BioRender.com. (B) Western blotting for KLHL40 (top panel) from patient (P) and control (Ctrl 1–2) muscle biopsies. Western blotting for GAPDH (bottom panel) as a loading control. (C) Sanger confirmation of the c.*152G > T variant in patient (bottom) compared to an unaffected control (top); the variant appears homozygous due to deletion on the second allele. (D–F) RNA-seq data of muscle RNA from the patient and controls C1–C3 aligning to KLHL40. (D) Sashimi plots for patient and controls. (E) Close up of Sashimi plots at the 3’ UTR region of KLHL40. (F) Patient RNA-seq reads spanning the 3’ UTR region with the c.*152G > T variant. Sequences viewed using the IGV.
Western blotting indicated very low abundance of KLHL40 in patient skeletal muscle compared to healthy controls (Fig. 2B). Although deficiency of KLHL40 is sufficient for diagnosis of NEM8, a second variant in trans would be expected given the recessive nature of KLHL40-related NEM (6). Following re-analysis of WES data to include rare non-coding variants, a single nucleotide variant (c.*152G > T) was identified in the 3’ UTR of KLHL40 in trans with the deletion (Fig. 2A). The variant appeared homozygous given the complete absence of the 3’ UTR on the other allele harbouring the 10.9 kb deletion. The c.*152G > T variant was present in gnomAD (v3.1.2) at a frequency of 3.94 × 10⁻⁵ (6/152 204 alleles; no homozygotes). The c.*152G > T variant was confirmed by bidirectional Sanger sequencing (Fig. 2C). Parental DNA was not available.
The SpliceAI tool (13) predicted that the c.*152G > T variant creates a weak donor splice site which results in a 255 bp cryptic intron in the 3’ UTR of KLHL40 (positions c.*151 to c.*405). The donor gain prediction score (0.17) was below the Δ score threshold (0.2) suggested for SpliceAI (13). Nevertheless, this prediction was supported by the Introme tool (14) which had a score above the threshold (Introme score 0.76, threshold 0.54).
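The arithmetic behind this paragraph can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the class and function names are hypothetical, and treating a score equal to the threshold as passing is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UtrInterval:
    """Closed interval in 3' UTR coordinates (c.*start to c.*end)."""
    start: int
    end: int

    @property
    def length(self) -> int:
        # Both endpoints are part of the interval, hence the +1
        return self.end - self.start + 1


def meets_threshold(score: float, threshold: float) -> bool:
    """Assumption: a score equal to the threshold counts as passing."""
    return score >= threshold


# Predicted cryptic intron, positions c.*151 to c.*405 (from the text)
predicted_intron = UtrInterval(151, 405)
print(predicted_intron.length)      # 255

# Scores and thresholds as reported in the text
print(meets_threshold(0.17, 0.2))   # SpliceAI donor gain: False
print(meets_threshold(0.76, 0.54))  # Introme: True
```

The two tools disagree at their suggested thresholds, which is why the prediction was followed up experimentally rather than taken at face value.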
Analysis of patient skeletal muscle RNA-seq data showed significantly lower levels of KLHL40 transcripts compared to non-NEM patient skeletal muscle (control, Fig. 2D). Patient RNA-seq coverage ranged between 1 and 38 reads (Fig. 2D). Control RNA-seq coverage ranged between 5 and 927 reads (Fig. 2D). In comparison, coverage of 10 670 other genes in the patient sample appeared comparable to controls, suggesting that the low coverage of KLHL40 detected in the patient sample was specific to this transcript. Normal splicing of KLHL40 exons 1–6 was observed at low frequency (Fig. 2D). Splicing of the predicted 255 bp cryptic intron was not detected by RNA-seq, although a distinct drop in coverage over 78 bp neighbouring the 3’ UTR variant site was observed (Fig. 2E and F) suggesting the formation of a 78 bp cryptic intron which would result in a secondary variant in the 3’ UTR of KLHL40; c.*151_*228del. MINTIE (15) alignment of the RNA-seq reads confirmed the 78 bp deletion (hg19; chr3:42733635–42733714) and indicated the presence of a new exon junction with a variant allele frequency of 0.94. This implies that 94% of the reads contained the cryptic 78 bp deletion. Some RNA-seq reads also suggested the formation of other cryptic deletions; however, these were not detected by MINTIE alignment.
In vitro KLHL40 expression analyses
To functionally demonstrate the effect of the c.*152G > T variant on KLHL40 expression and abundance, we transiently expressed KLHL40 with either the wild-type (WT) or the mutant (c.*152G > T) 3’ UTR in HEK293FT cells. We also generated a 78 bp deletion mutant (c.*151_*228del) to mimic the cryptic intron identified by RNA-seq.
Quantitative PCR analysis indicated significantly lower expression of the KLHL40c.*152G > T construct (1.7 ± 0.1) relative to KLHL40WT (2.4 ± 0.1) and KLHL40c.*151_*228del (2.5 ± 0.1), confirming the association of the c.*152G > T variant with reduced KLHL40 transcript levels (Fig. 3A). The difference in expression between the KLHL40WT and KLHL40c.*152G > T constructs was statistically significant by a two-way analysis of variance (ANOVA) (P < 0.0001; Fig. 3A). The difference in expression between the KLHL40c.*152G > T and KLHL40c.*151_*228del constructs was also significant (P < 0.0001; Fig. 3A). There was no significant difference between KLHL40c.*151_*228del and KLHL40WT expression suggesting the KLHL40c.*151_*228del transcript was relatively stable despite harbouring the 78 bp 3’ UTR deletion, arguing against loss of specific stabilizing elements in this region.

Figure 3. In vitro KLHL40 expression studies. (A) Log10 transformed qPCR data for KLHL40 wildtype (WT), c.*152G > T and c.*151_*228del pcDNA3.1 transcripts expressed in HEK293FT cells. Reactions performed in biological quadruplicates (n = 4). Expression normalized to EEF2 and TBP. Significance was determined by a two-way ANOVA followed by Tukey’s multiple comparison test; *P < 0.05, **P < 0.01, ***P < 0.001. Error bars indicate ± SEM. (B) Representative western blot for KLHL40 from HEK293FT cells expressing WT and mutant (c.*152G > T, and c.*151_*228del) KLHL40 and untreated HEK293FT cells (U/T). Western blotting for GAPDH as a loading control. (C) Relative abundance of KLHL40 WT and mutant proteins quantified by ImageJ from KLHL40 western blots (n = 3). Relative abundance normalized to GAPDH. Significance was determined by a one-way ANOVA followed by Tukey’s multiple comparison test; *P < 0.05. Error bars indicate ± SEM. (D) RT-PCR and agarose gel electrophoresis of KLHL40 3’ UTR products from cDNA of patient (P) and control (C) skeletal muscle, HEK293FT expressing the KLHL40 pcDNA3.1 transcripts, untreated (U/T) HEK293FT and a no template control (NTC).
To investigate whether the reduced expression of KLHL40c.*152G > T was a consequence of nonsense mediated decay, we expressed the WT and mutant KLHL40 constructs in the presence of puromycin, a known nonsense mediated decay inhibitor (16). Analyses of the log10-transformed expression data indicated an increase in expression of puromycin-treated KLHL40c.*152G > T (2.2 ± 0.1) compared to untreated KLHL40c.*152G > T (1.7 ± 0.1; Fig. 3A). The increased expression of KLHL40c.*152G > T was statistically significant by a two-way ANOVA (P = 0.005; Fig. 3A), suggesting that the splice-altering transcript generated from the c.*152G > T variant underwent nonsense mediated decay.
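Because the qPCR values are reported on a log10 scale, differences between them translate into fold changes on the linear scale. The sketch below shows that conversion under the assumption that the quoted numbers are mean log10 relative expression values. It ignores the reported SEM, so the results are rough point estimates only, and the function name is hypothetical.

```python
def fold_change_from_log10(log_a: float, log_b: float) -> float:
    """Linear-scale ratio a/b given log10(a) and log10(b)."""
    return 10 ** (log_a - log_b)


# Mean log10 values quoted in the text
wt = 2.4
mutant = 1.7        # c.*152G>T, untreated
mutant_puro = 2.2   # c.*152G>T, puromycin-treated

# WT relative to untreated mutant
print(round(fold_change_from_log10(wt, mutant), 1))           # 5.0
# Puromycin-treated mutant relative to untreated mutant
print(round(fold_change_from_log10(mutant_puro, mutant), 1))  # 3.2
```

On this reading, the mutant construct is expressed at roughly one fifth of the WT level, and puromycin treatment recovers a substantial part of that difference.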
Western blotting analyses indicated significantly lower abundance of KLHL40c.*152G > T corresponding to ~15% of the abundance of KLHL40WT (Fig. 3B and C). The difference was statistically significant by a one-way ANOVA (P = 0.036) and further supported an association of the c.*152G > T variant with reduced KLHL40 abundance (Fig. 3C). While each experiment indicated lower abundance of KLHL40c.*152G > T compared to KLHL40c.*151_*228del (Fig. 3B and C), the difference was not statistically significant by a one-way ANOVA.
RT-PCR analysis of the KLHL40 3’ UTR from patient skeletal muscle cDNA and KLHL40c.*152G > T cDNA revealed three distinct products (Fig. 3D). The largest and most prominent product (a) was of comparable size to KLHL40WT (551 bp; Fig. 3D) and corresponded to the unspliced transcript with the single nucleotide variant (Supplementary Material, Fig. S1). Products b and c were relatively less prominent in the patient and KLHL40c.*152G > T samples and were undetected in control skeletal muscle cDNA and KLHL40WT cDNA (Fig. 3D). Product b was of comparable size to KLHL40c.*151_*228del (473 bp; Fig. 3D) and likely corresponded to a spliced transcript containing the 78 bp deletion (Supplementary Material, Fig. S1). The smallest product (c; ~ 300 bp) appeared to correspond to the cryptic transcript predicted by SpliceAI and Introme (Fig. 3D). Sequencing alignments confirmed this product contained a 255 bp deletion spanning positions c.*151 to c.*405 (Supplementary Material, Fig. S1). This suggested that in addition to the 78 bp deletion detected by RNA-seq (c.*151_*228del), the primary c.*152G > T variant also induced splicing of a 255 bp cryptic intron which resulted in a second deletion (c.*151_*405del) in the 3’ UTR of KLHL40 in the patient.
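The expected sizes of the three RT-PCR products follow from subtracting each cryptic intron from the 551 bp WT amplicon. A minimal sketch of that check is given below. The function name is hypothetical, and it assumes both primers lie outside the removed interval.

```python
WT_AMPLICON_BP = 551  # size of the unspliced product reported in the text


def spliced_amplicon_size(wt_size: int, intron_start: int, intron_end: int) -> int:
    """Amplicon size after removing a closed interval c.*start to c.*end.

    Assumes both primers bind outside the removed interval.
    """
    removed = intron_end - intron_start + 1
    if removed <= 0 or removed >= wt_size:
        raise ValueError("interval must be non-empty and shorter than the amplicon")
    return wt_size - removed


# Product b: 78 bp cryptic intron (c.*151_*228del)
print(spliced_amplicon_size(WT_AMPLICON_BP, 151, 228))  # 473
# Product c: 255 bp cryptic intron (c.*151_*405del)
print(spliced_amplicon_size(WT_AMPLICON_BP, 151, 405))  # 296
```

The computed 473 bp matches the size reported for product b, and 296 bp is consistent with the approximately 300 bp estimate for product c.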
Analysis of 3’ UTR variants in ClinVar
To estimate the occurrence of 3’ UTR splicing as a mechanism of rare disease, we interrogated the ClinVar (17) database (n = 1 113 674 variants, GRCh38 VCF dated 22/01/2022) for non-benign 3’ UTR variants with permissive SpliceAI scores (≥0.1) similar to that of KLHL40 c.*152G > T. Our analyses indicated that 400 of 33 237 (i.e. ~1.2%) 3’ UTR variants in canonical transcripts have SpliceAI scores ≥ 0.1, as presented in Supplementary Material, Table S1. Some of these variants have been reported in the literature (Table 1) and support that 3’ UTR splicing may be an underappreciated mechanism of Mendelian disease.
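The filtering step described above can be illustrated with a small sketch. It is not the authors' code: the record layout, the field names (`region`, `splice_score`) and the placeholder record are hypothetical, and each variant is assumed to have already been annotated with a single summary splice score.

```python
from typing import Any, Iterable, Mapping, Tuple


def summarize_splice_candidates(
    variants: Iterable[Mapping[str, Any]],
    threshold: float = 0.1,
) -> Tuple[int, int, float]:
    """Count 3' UTR records whose splice score is at or above the threshold.

    Returns (hits, total 3' UTR records, fraction of hits).
    """
    utr_variants = [v for v in variants if v["region"] == "3_prime_UTR"]
    hits = [v for v in utr_variants if float(v["splice_score"]) >= threshold]
    total = len(utr_variants)
    fraction = len(hits) / total if total else 0.0
    return len(hits), total, fraction


# Toy input: two scores taken from Table 1 plus one hypothetical record
toy_records = [
    {"gene": "F8", "variant": "c.*56G>T", "region": "3_prime_UTR", "splice_score": 0.62},
    {"gene": "VMA21", "variant": "c.*6A>G", "region": "3_prime_UTR", "splice_score": 0.88},
    # Hypothetical low-scoring record, included only to exercise the filter
    {"gene": "GENE_X", "variant": "c.*10C>T", "region": "3_prime_UTR", "splice_score": 0.02},
]
hits, total, fraction = summarize_splice_candidates(toy_records)
print(hits, total)  # 2 3

# Proportion reported in the text: 400 of 33 237 variants
print(f"{400 / 33237:.1%}")  # 1.2%
```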
Table 1. Highlighted ClinVar variants that may be associated with 3’ UTR splicing-provoked nonsense mediated decay
Chr | Pos | Ref | Alt | Gene | Variant | Clinical significance | Disease association | SpliceAI |
---|---|---|---|---|---|---|---|---|
X | 154837541 | C | A | F8 | c.*56G > T | Likely pathogenic | Haemophilia A (18) | 0.62 |
X | 151405064 | A | G | VMA21 | c.*6A > G | Likely pathogenic | X-linked myopathy with excessive autophagy (19) | 0.88 |
19 | 45770204 | C | (CAG)n | DMPK | c.*283_*284ins(CTG)n | Pathogenic | Steinert myotonic dystrophy syndrome (20) | 0.1–0.14 |
A list of all 3’ UTR variants in ClinVar with predicted SpliceAI scores ≥ 0.1 is presented in Supplementary Material, Table S1.
Discussion
We describe a patient with NEM8 with two novel variants, a 10.9 kb deletion including exons 2–6 of KLHL40 and a single nucleotide 3’ UTR variant (c.*152G > T) in KLHL40. Together, these biallelic variants fit with the recessive inheritance pattern of NEM8 (6). Notably, the patient’s clinical phenotype was milder than most NEM8 cases described in the literature (6,7,21). He is one of very few patients who have survived into adolescence (6), and he remained ambulant, albeit with walking difficulties, until 13 years of age. The findings in his skeletal muscle biopsy, including nemaline bodies, supported his NEM8 diagnosis. It is noteworthy that intranuclear rods were also observed on biopsy, given that this feature has not previously been reported in NEM8 cases. The patient’s skeletal muscle showed markedly reduced levels of KLHL40 compared to controls (Fig. 2B), thus supporting KLHL40 as the underlying genetic cause of his disease.
The exon 2–6 deletion was expected to generate a non-functional transcript and therefore no functional KLHL40 protein would be produced from that allele. Given the biallelic nature of KLHL40-related disease (6), we suspected that the c.*152G > T variant in KLHL40 was the second underlying cause of NEM in the patient based on the following evidence. This variant was detected in trans with the pathogenic deletion and was predicted by SpliceAI and Introme to create a cryptic donor splice site in the 3’ UTR. Splicing within the 3’ UTR has been suggested to reduce transcript stability and abundance (22,23). Consistently, our data from RNA-seq and western blotting of patient skeletal muscle respectively indicated a significant reduction in KLHL40 mRNA and protein compared to controls. These findings were reproduced by our in vitro studies of KLHL40c.*152G > T. Taken together, and considered alongside recent non-coding variant curation criteria (5), our data suggest that the c.*152G > T variant is likely pathogenic. Of note, the presence of the c.*152G > T variant in gnomAD, albeit at low frequency, suggests that this variant could be a recurrent cause of NEM8 and should be considered in patients presenting with NEM.
Interestingly, the c.*152G > T variant appeared to induce multiple splicing events in the 3’ UTR. Splicing of the 78 bp cryptic intron generates a deletion, c.*151_*228del, that was detected in patient RNA-seq data (Fig. 2F). The 78 bp deletion was also apparent in RT-PCR studies of both patient skeletal muscle cDNA and KLHL40c.*152G > T expression construct cDNA and was undetected in controls (Fig. 3D). Although the 255 bp deletion predicted by SpliceAI and Introme was undetected by RNA-seq, it was apparent by RT-PCR of both patient skeletal muscle cDNA and KLHL40c.*152G > T construct cDNA and was undetected in controls (Fig. 3D). We predict that the 255 bp deletion may prevent maturation of the resulting transcript, which may explain why it was undetected by RNA-seq.
We initially postulated that the 3’ UTR deletions would reduce KLHL40 stability by removing recognition sites for regulatory elements such as polyadenylation proteins (20,24), or by introducing illegitimate microRNA sites (25,26). However, there were no remarkable annotations for the deleted regions in 3’ UTR databases. Moreover, our in vitro expression analyses suggested that the 78 bp 3’ UTR deletion itself did not significantly affect expression of KLHL40 (Fig. 3A–C). While mRNA and protein abundance following expression of KLHL40c.*152G > T were significantly lower than WT KLHL40, mRNA and protein abundance following expression of KLHL40c.*151_*228del appeared comparable to the WT despite containing the 78 bp deletion (Fig. 3A–C). Given that the 255 bp deletion was not experimentally investigated, it is uncertain whether it would recapitulate the effect of the 78 bp deletion or of the c.*152G > T variant. Nonetheless, our data suggest that the c.*152G > T splice variant does cause reduced transcript levels and protein abundance. Notably, attenuation of nonsense mediated decay by puromycin treatment led to significantly increased expression of KLHL40c.*152G > T (Fig. 3A). Taken together, we propose that the pathomechanism of c.*152G > T may be attributed to a rare form of nonsense mediated decay provoked by 3’ UTR intron splicing (22,27). Our data suggest that introns in UTRs are significant modulators of gene expression warranting further recognition (22).
Nonsense mediated decay is implicated in various genetic disorders (28). In most reports, this RNA surveillance mechanism is triggered by variants that result in premature termination of translation, including frameshift, nonsense and splice variants. A rarer mechanism of nonsense mediated decay, involving splicing in the 3’ UTR, has been proposed in domain-specific reviews (22). It has been suggested that intronic insertions into most human 3’ UTRs at a distance >50–55 bp downstream of the termination codon will stimulate nonsense mediated decay (23). This was functionally demonstrated by introduction of a spliceable intron 62 bp into the 3’ UTR of WT β-Globin mRNA, which reduced protein abundance to 24% of the WT (29). Notably, while this mechanism has been demonstrated by in vitro studies (29–31), to our knowledge, nonsense mediated decay provoked by 3’ UTR splicing has not been reported as a mechanism of Mendelian disease to date.
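The distance rule cited above can be expressed as a simple check. This is a sketch under stated assumptions: the function name is hypothetical, the conservative end of the 50–55 bp range is used as the default cut-off, and the KLHL40 junction is taken to lie about 150 bp downstream of the termination codon because the cryptic introns begin at c.*151. Real nonsense mediated decay susceptibility depends on more than this single distance.

```python
def junction_may_trigger_nmd(distance_bp: int, min_distance_bp: int = 55) -> bool:
    """Apply the >50-55 bp rule to a 3' UTR exon junction.

    distance_bp: distance of the exon junction downstream of the
    termination codon. The default cut-off of 55 is the conservative
    end of the quoted range (an assumption of this sketch).
    """
    if distance_bp < 0:
        raise ValueError("distance_bp must be non-negative")
    return distance_bp > min_distance_bp


# β-Globin experiment cited in the text: intron placed 62 bp into the 3' UTR
print(junction_may_trigger_nmd(62))   # True
# KLHL40 cryptic introns start at c.*151, so the junction is ~150 bp downstream
print(junction_may_trigger_nmd(150))  # True
# Hypothetical junction close to the termination codon
print(junction_may_trigger_nmd(40))   # False
```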
3’ UTR-splice activated nonsense mediated decay may explain the patient’s relatively milder disease presentation. Given the c.*152G > T cryptic donor splice site is weakly spliced, a proportion of transcripts are expected to evade aberrant splicing and proceed to translation of functional KLHL40 in the patient. Indeed, a proportion of the canonical KLHL40 transcript was detected by RNA-seq of patient skeletal muscle (Fig. 2F) as well as in our in vitro studies of KLHL40c.*152G > T (Fig. 3).
Mechanistically, 3’ UTR splice-activated nonsense mediated decay involves deposition of RNA binding complexes, such as the human UPF protein complex and the exon junction complex, at the 3’ UTR (27,30). Following translation termination, interactions between the 3’ UTR-bound protein complexes and the terminating ribosome activate nonsense mediated decay (22,28,32). We postulate that the cryptic introns introduced by the c.*152G > T variant may influence recruitment of such complexes to the 3’ UTR and thus trigger nonsense mediated decay (27).
Despite the crucial roles of 3’ UTRs in mRNA metabolism and surveillance (22,30,33), variants identified in 3’ UTRs are often overlooked during disease gene screening (25). This may be due to limited guidelines available for the clinical interpretation and functional validation of 3’ UTR variants (5). While there are some reports of pathogenic 3’ UTR variants mediating aberrant splicing (18,25,34), the mechanism of nonsense mediated decay provoked by 3’ UTR splicing appears to be underrecognized. Notably, a 3’ UTR variant in F8 (c.*56G > T) has been similarly reported to create a cryptic donor splice site that results in a 159 bp deletion in the 3’ UTR (Table 1) (18). This deletion was associated with significantly reduced F8 mRNA levels and reported to cause a milder form of haemophilia A (18). It is tempting to suggest that the underlying mechanism of this F8 c.*56G > T variant also involves 3’ UTR splice-activated nonsense mediated decay, particularly given that it conforms with the rule requiring a distance of >50–55 bp downstream of the termination codon (23). Additionally, 3’ UTR (CTG)n expansions in the DM1 protein kinase gene (DMPK) are a cause of DMPK deficiency in patients with myotonic dystrophy (OMIM# 160900). There are several hypotheses as to how the expansions reduce DMPK mRNA and protein abundance, one of which involves aberrant splicing (20,35). Overall, such reports indicate that aberrant 3’ UTR splicing may underlie genetically unresolved rare diseases.
Our analysis of the ClinVar database (n = 1 113 674 variants, GRCh38 VCF dated 22/01/2022) suggested that a small percentage (~1.2%) of 3’ UTR variants could induce splicing (Supplementary Material, Table S1). We predict that the prevalence of this mechanism may be underrepresented given that 3’ UTRs are not usually interrogated during variant curation (5) and thus 3’ UTR variants are less likely to be submitted to ClinVar. Of the 3’ UTR variants that have been submitted, few appear to have been functionally investigated. Moreover, the mechanism for some of the 3’ UTR variants that have been investigated remains unknown (19,36). This includes a c.*6A > G variant in VMA21 associated with X-linked myopathy with excessive autophagy (OMIM# 310440, Table 1). This variant was reported twice in ClinVar (VCV000208803.4) and in 20 patients from three families (19). While Ramachandran et al. (19) functionally showed that the c.*6A > G variant was associated with reduced VMA21 transcript and protein abundance, the underlying mechanism remains unconfirmed. Our SpliceAI analyses suggest that the c.*6A > G substitution may lead to the creation of a new donor splice site at position c.*1 with a high donor gain prediction score of 0.88 (Δ ≥ 0.8) (13). Although Ramachandran et al. (19) did not detect aberrant splicing at this position by RT-PCR of patient lymphoblasts, it would be interesting to investigate the VMA21 3’ UTR by an alternative method such as RNA-seq to determine whether the c.*6A > G variant is another example of pathogenic 3’ UTR splicing (37). Overall, it is apparent that 3’ UTR variants warrant further recognition during variant curation (22,26). We encourage submission of 3’ UTR variants to disease gene databases to facilitate improved curation and classification of 3’ UTR variants in Mendelian disease.
Conclusions
This study identified a likely pathogenic splice variant in the 3’ UTR of KLHL40 (c.*152G > T) in a patient with a milder presentation of NEM8. This finding expands the genotypic and phenotypic spectrum of NEM8. Using next generation sequencing, RNA-seq, and in vitro studies, we provide the first evidence of a pathogenic variant destabilizing the KLHL40 mRNA by 3’ UTR splice-activated nonsense mediated decay. Aberrant 3’ UTR splicing may be an underrecognized mechanism of Mendelian disease. Our study highlights the utility of RNA-seq in (1) detecting expression outliers and (2) identifying cryptic splice variants in non-coding genomic regions that may evade detection and/or curation in data generated from targeted gene panel and WES (37,38). Inclusion of RNA-seq into the diagnostic toolkit for rare disorders can thus enable identification of pathogenic variants that may contribute to the number of cases remaining undiagnosed following genetic screening (37). Overall, this study highlights the necessity of looking beyond the coding regions when investigating the molecular basis of rare disease (18,26).
Materials and Methods
This project was approved by the University of Western Australia Human Research Ethics Committee (RA/4/20/1008) and the Curtin University Human Research Ethics Office (HRE2019-0566). Written informed consent was provided by the family.
Patient details
The patient was diagnosed with NEM on the basis of clinical findings and histopathological findings on skeletal muscle biopsy. Clinical assessments, MRI, muscle biopsy analysis and western blotting of patient muscle biopsy were performed at the Biomedicine Institute of Sevilla and the Department of Neurology at Virgen del Rocío University Hospital in Sevilla, Spain.
Methods for histology and electron microscopy
Muscle biopsy was obtained from the quadriceps muscle, immediately frozen by standard methods, and processed following the standard procedures (39). Routine histochemical techniques were performed on 7 μm transverse sections of frozen muscle, including haematoxylin and eosin, modified Gomori trichrome, oil red O, periodic acid-Schiff, nicotinamide adenine dinucleotide-tetrazolium reductase (NADH-TR), succinate dehydrogenase (SDH), cytochrome C oxidase (COX) and adenosine triphosphatase (ATPase) pH 9.4, 4.6 and 4.3.
Ultrastructural studies were performed following standard methods (39). A small fragment of the muscle was fixed in 2.5% w/v glutaraldehyde solution, postfixed in 1% w/v osmium tetroxide and embedded in epoxy resin. Semithin sections were stained with 1% toluidine blue. Ultrathin sections were mounted on copper grids and examined with a Zeiss Libra 120 transmission electron microscope (Carl Zeiss NTS GmbH, Oberkochen, Germany).
Western blotting
Frozen muscle samples were homogenized in RIPA buffer (20 mM Tris HCl pH 7.4, 150 mM NaCl, 1 mM EDTA, 1% v/v IGEPAL, 0.1% w/v SDS) containing protease inhibitor cocktail (Roche). The lysates were centrifuged at 15 781 g at 4°C for 20 min. The supernatant was collected. The protein lysates were separated on 10% w/v sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) gels and transferred onto polyvinylidene difluoride (PVDF) membranes (Millipore). Western blot analysis of equal-protein loading was performed with the following primary antibodies: rabbit polyclonal anti-KLHL40 (1:500; HPA024463) and rabbit polyclonal anti-GAPDH (1:2000; Sigma-Aldrich). Immunoreactivity was detected with secondary antibodies conjugated to horseradish peroxidase (1:5000; Jackson Immuno Research) and developed with SuperSignal West Femto (Thermo Fisher Scientific) using an ImageQuant LAS 4000 MiniGold System (GE Healthcare Life Sciences).
Genetic analyses
Neuromuscular disease gene panel
DNA from the proband was sequenced on the neuromuscular disease gene panel at PathWest using Illumina sequencing chemistry as previously described (4,40). Base calling, mapping and variant calling were performed as previously described (4,40).
Whole exome sequencing
Illumina WES was performed at the Broad Institute of Harvard and MIT as previously described (4). Exome data were analysed in seqr (12) (https://seqr.broadinstitute.org).
Variant interpretation and confirmation
Variants were analysed and interpreted following standard bioinformatic approaches (41,42). Candidate single nucleotide variants were validated by bidirectional Sanger sequencing (Australian Genome Research Facility (AGRF), Perth) (40). Sanger chromatograms were aligned to a reference sequence in Benchling (benchling.com) using the MAFFT algorithm.
In silico variant prediction analysis
Introme (14) (https://github.com/CCICB/introme; manuscript in preparation) and SpliceAI (13) (https://spliceailookup.broadinstitute.org/) were used to predict the effect of variants on splicing.
Analysis of 3’ UTR variants in ClinVar
Variants were downloaded from ClinVar (17) (GRCh38, 22/01/2022; https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/archive_2.0/2022/clinvar_20220122.vcf.gz) in VCF format (n = 1 113 674 variants). BCFtools (43) query was used to extract class 3–5 variants (VUS, likely pathogenic and pathogenic; n = 623 975 variants). BCFtools view was used to extract variants with a Molecular Consequence of ‘3_prime_UTR_VARIANT’ (n = 43 775 variants). These variants were run on SpliceAI (13). Eleven variants had reference alleles that were deemed too long to annotate by SpliceAI and variants with no annotation were removed, leaving 38 515 variants. Next, the VCF was annotated using VEP (44) to determine the canonical transcript for each variant. Variants in the 3’ UTR of the respective canonical transcript were retained, leaving 33 237 variants in the 3’ UTR of the canonical transcript. We chose a SpliceAI threshold of 0.10 to keep variants, based on the KLHL40 c.*152G > T score (0.17) and the optimal threshold (0.11) reported by Riepe et al. (45) for their MYBPC3 dataset, in contrast to the SpliceAI suggested threshold of 0.20 (13). These data are presented in Supplementary Material, Table S1.
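The filtering logic described above can be sketched in Python. This is a minimal illustration only, not the pipeline used in the study (which relied on BCFtools, SpliceAI and VEP). It assumes the ClinVar INFO keys `CLNSIG` and `MC`, and the SpliceAI INFO layout `ALLELE|SYMBOL|DS_AG|DS_AL|DS_DG|DS_DL|DP_AG|DP_AL|DP_DG|DP_DL`. The function names, the exact set of `CLNSIG` labels treated as class 3–5, and the demo records are hypothetical, and the canonical-transcript step performed with VEP is omitted.

```python
# Illustrative sketch only: the study used BCFtools, SpliceAI and VEP.
# Function names, the label set and the demo records are hypothetical.

SPLICEAI_THRESHOLD = 0.10  # threshold chosen in the text

# Assumed CLNSIG labels for class 3-5 (VUS, likely pathogenic, pathogenic).
CLASS_3_TO_5 = {
    "Uncertain_significance",
    "Likely_pathogenic",
    "Pathogenic",
    "Pathogenic/Likely_pathogenic",
}


def parse_info(info: str) -> dict:
    """Split a VCF INFO string into a key -> value dict (flags map to '')."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def max_delta_score(spliceai: str) -> float:
    """Return the highest of the four delta scores over all annotations."""
    best = 0.0
    for annotation in spliceai.split(","):
        parts = annotation.split("|")
        # parts[2:6] hold DS_AG, DS_AL, DS_DG, DS_DL
        scores = [float(s) for s in parts[2:6] if s not in ("", ".")]
        if scores:
            best = max(best, max(scores))
    return best


def keep_variant(info: str) -> bool:
    """Apply the class, 3' UTR consequence and SpliceAI score filters in turn."""
    fields = parse_info(info)
    if fields.get("CLNSIG") not in CLASS_3_TO_5:
        return False
    if "3_prime_utr_variant" not in fields.get("MC", "").lower():
        return False
    if not fields.get("SpliceAI"):
        return False  # variants without a SpliceAI annotation are dropped
    return max_delta_score(fields["SpliceAI"]) >= SPLICEAI_THRESHOLD


# Hypothetical INFO strings, for demonstration only.
demo_records = [
    "CLNSIG=Likely_pathogenic;MC=SO:0001624|3_prime_UTR_variant;"
    "SpliceAI=T|GENE1|0.00|0.00|0.17|0.00|5|-20|1|30",
    "CLNSIG=Benign;MC=SO:0001624|3_prime_UTR_variant;"
    "SpliceAI=T|GENE2|0.00|0.00|0.50|0.00|5|-20|1|30",
    "CLNSIG=Pathogenic;MC=SO:0001624|3_prime_UTR_variant;"
    "SpliceAI=C|GENE4|0.02|0.00|0.05|0.01|2|4|6|8",
]
kept = [record for record in demo_records if keep_variant(record)]
```

Taking the maximum of the four delta scores is one common way to reduce a SpliceAI annotation to a single value; whether the study applied the 0.10 threshold to the maximum or to a specific score is not stated, so this choice is an assumption.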
RNA sequencing
RNA extraction from patient skeletal muscle and RNA sequencing were performed at the Department of Diagnostic Genomics (PathWest, Perth) as previously described (46). RNA enrichment was performed using an Illumina TruSeq mRNA library preparation kit (Illumina, USA). The library was sequenced on a HiSeq4000 (151 bp read length, paired-end run, ~100 million reads). RNA-seq data were visualized with the Integrative Genomics Viewer (IGV) (47). Data were also assembled and analysed using MINTIE (15), a reference-free RNA-seq analysis pipeline that maps reads de novo to identify novel structural and splice variants. Muscle RNA-seq FASTQ data from five additional non-NEM patients were used as controls for MINTIE.
In vitro expression analyses
Expression constructs
Mammalian expression constructs (pcDNA3.1) were synthesized by Genscript. These constructs contained the KLHL40 coding sequence (ENST00000287777) with either the WT 3’ UTR or a mutant 3’ UTR carrying one of two variants: c.*152G > T or c.*151_*228del.
Cell culture
Human embryonic kidney cell lines (HEK293FT) were maintained in Dulbecco’s Modified Eagle Medium (DMEM; Gibco) supplemented with 10% fetal bovine serum (FBS) and 1% Penicillin/Streptomycin (Gibco). At 50–60% confluency, expression constructs were transfected following the Viafect kit protocol (Promega) at a 4:1 ratio (Viafect transfection reagent: DNA). Cultures were maintained at 37°C with 5% CO2. A pmaxGFP vector was transfected alongside to estimate transfection efficiency. Cells were harvested at 72 h post-transfection and pellets snap frozen and stored at −80°C until required for RNA and protein extraction.
Puromycin assays
HEK293FT cells were maintained in DMEM + 10% FBS and transfected as above. At 21 h post-transfection, cells were incubated with media containing either 100 μg/ml puromycin dihydrochloride (Thermo Fisher [# A1113803]) or no puromycin for 3 h at 37°C with 5% CO2 before harvesting at 24 h post-transfection.
RNA extraction and quantitative PCR
Total RNA was extracted from HEK293FT pellets using the RNeasy mini kit protocol (Qiagen). Complementary DNA (cDNA) was synthesized from 1 μg RNA using the SuperScript III First-Strand Synthesis System (Thermo Fisher Scientific, Waltham, MA, USA). To investigate relative transcript expression levels of WT KLHL40, KLHL40 c.*152G > T and KLHL40 c.*151_*228del, qPCR was performed as previously described (48). Reactions were performed in 10 μL volumes containing a 2X Rotor-Gene SYBR Green PCR master mix (Qiagen; 204076), 1 μL diluted cDNA and 0.8 μM of the following forward and reverse primers: KLHL40_FD5_6 (5′-GGAGGTATAACGAGGAGGAGAA-3′, bridges the junction of exon 5 and exon 6), and KLHL40_RV6 (5′-CTGAGCTGGTCACATCTTAGTC-3′). Reactions were performed in technical duplicates for each biological replicate (n = 4) and Ct values were averaged. Expression was normalized using the delta Ct method (49), compared to the geometric mean of two reference genes (TBP and EEF2) (48). Data were log10 transformed and analysed by a two-way ANOVA followed by Tukey’s multiple comparison test to measure statistical significance using GraphPad Prism 9.0.1 (La Jolla, USA).
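The delta Ct normalization can be illustrated with a short Python sketch. It is a worked example under stated assumptions, not the authors' analysis script: it assumes 100% amplification efficiency (a two-fold change per cycle), so relative expression is 2^(−ΔCt), and it uses the fact that the arithmetic mean of reference-gene Ct values corresponds to the geometric mean of their relative quantities. The function name and all Ct values are hypothetical.

```python
import math
from statistics import mean


def relative_expression(target_cts, reference_cts_by_gene):
    """Return 2^-(delta Ct) for one biological replicate.

    target_cts: technical-replicate Ct values for the target transcript.
    reference_cts_by_gene: dict of gene name -> technical-replicate Ct values.
    Assumes 100% amplification efficiency (an assumption, not from the text).
    """
    ct_target = mean(target_cts)  # average the technical duplicates
    ref_means = [mean(cts) for cts in reference_cts_by_gene.values()]
    # Arithmetic mean of Ct values == geometric mean of relative quantities.
    ct_reference = mean(ref_means)
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values for a single biological replicate.
expression = relative_expression(
    target_cts=[24.25, 24.75],
    reference_cts_by_gene={"TBP": [26.25, 26.75], "EEF2": [20.25, 20.75]},
)
log10_expression = math.log10(expression)  # value passed on to the ANOVA
```

In this example the target Ct (24.5) sits one cycle above the mean reference Ct (23.5), so the relative expression is 0.5 before log10 transformation.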
Reverse transcriptase PCR
RT-PCR was performed to confirm the occurrence of mis-splicing events in the 3’ UTR region. Reactions were performed in 25 μL volumes containing a 1X GoTaq G2 master mix (Promega, Madison, WI [#M7832]), 0.4 μM forward primer (KLHL40_UTR_F; 5′-CCAGCTCAGGCAGACTGAAC-3′), 0.4 μM reverse primer (KLHL40_UTR_R; 5′-GACACCAGATGGAGAGCAGAG-3′) and 7.5 ng cDNA. Touchdown PCR cycling conditions were as follows: 5 min at 98°C; 15 cycles of 15 s at 98°C, 10 s at 65–57.5°C (−0.5°C per cycle), and 15 s at 72°C; 30 cycles of 15 s at 98°C, 10 s at 57°C and 15 s at 72°C; and a final cycle of 5 min at 72°C. Products were resolved on 2% w/v agarose gel in 1X TAE at 95 V for 1 h. PCR products were gel excised and purified using the QIAquick gel and PCR purification kit. DNA was quantified on a NanoDrop One spectrophotometer (Thermo Scientific) and Sanger sequenced at AGRF (Perth, Australia). Sanger chromatograms were aligned to a reference sequence in Benchling (benchling.com) using the MAFFT algorithm.
Protein extraction and western blotting
Frozen cell pellets were suspended in lysis buffer containing 2% w/v SDS, 125 mM Tris (pH 6.8) and 1x PIC (Halt™ Protease Inhibitor Cocktail; Thermo Scientific) and homogenized by sonication. Protein concentrations were determined by a BCA protein assay (Pierce; #23227). Lysates were prepared in loading buffer containing 10% w/v SDS, 312.5 mM Tris (pH 6.8), 50% v/v glycerol, bromophenol blue saturated solution, 50 mM DTT and 1x PIC. Western blotting was performed as previously described (6). Lysates (500 ng) were separated on 4–12% w/v Bis-Tris NuPAGE Novex polyacrylamide gels and transferred to PVDF transfer membranes (Thermo Fisher Scientific). After 1 h blocking with PBS + 0.1% v/v Tween-20 and 5% w/v skim milk, membranes were incubated with a Human Protein Atlas rabbit polyclonal KLHL40 (KBTBD5) antibody (Sigma-Aldrich; #HPA024463; [1:2500]) overnight at 4°C and a secondary goat anti-rabbit horseradish peroxidase antibody (Sigma; #A0545) at room temperature for 1 h. Membranes were imaged by chemiluminescent detection (Pierce ECL Plus kit; #32132) using the Invitrogen iBright FL1000 imaging system (Thermo Fisher Scientific). Membranes were also blotted with an anti-GAPDH antibody (Sigma; #G8795; [1:10 000]) as a loading control. Relative KLHL40 abundance was quantified using ImageJ (50). Statistical significance was measured using a one-way ANOVA followed by Tukey’s multiple comparisons test with a single pooled variance (n = 3) on GraphPad Prism 9.0.1 (La Jolla, USA).
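One common way to derive relative abundance from band intensities is sketched below in Python. The text does not state how the ImageJ measurements were normalized, so the two-step scheme here (KLHL40 signal divided by the GAPDH signal of the same lane, then expressed relative to the mean WT ratio) is an assumption, as are the function names and the intensity values.

```python
from statistics import mean


def loading_normalized(target_intensities, loading_intensities):
    """Divide each target band by the loading-control band of the same lane."""
    return [t / l for t, l in zip(target_intensities, loading_intensities)]


def relative_to_control(sample_ratios, control_ratios):
    """Express normalized ratios relative to the mean of the control group."""
    reference = mean(control_ratios)
    return [ratio / reference for ratio in sample_ratios]


# Hypothetical densitometry values (arbitrary units), n = 3 per construct.
wt_ratios = loading_normalized([100.0, 120.0, 80.0], [50.0, 60.0, 40.0])
mutant_ratios = loading_normalized([20.0, 30.0, 10.0], [40.0, 60.0, 20.0])

wt_relative = relative_to_control(wt_ratios, wt_ratios)
mutant_relative = relative_to_control(mutant_ratios, wt_ratios)
```

With these made-up values each WT lane has a ratio of 2.0 and each mutant lane a ratio of 0.5, so the mutant abundance is 0.25 relative to WT.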
Acknowledgements
We would like to express our gratitude to the patient and family for their involvement in this study. We thank Associate Professor David Groth and Dr Danielle Dye for their inputs into this study.
Conflict of Interest statement. The authors have no conflicts to declare.
Funding
LD is supported by an Australian Government Research Training Program (RTP) Scholarship. GR (Investigator Grant, APP2007769) and NGL (Fellowship APP1117510) are supported by the Australian National Health and Medical Research Council (NHMRC). This work is funded by NHMRC Ideas Grant (APP2002640). This work was supported by resources provided by the Pawsey Supercomputing Centre with funding from the Australian Government and the Government of Western Australia.
References