Zhidong Cen, Zhengwen Jiang, You Chen, Xiaosheng Zheng, Fei Xie, Xiaodong Yang, Xingjiao Lu, Zhiyuan Ouyang, Hongwei Wu, Si Chen, Houmin Yin, Xia Qiu, Shuang Wang, Meiping Ding, Yelei Tang, Feng Yu, Caihua Li, Tao Wang, Hiroyuki Ishiura, Shoji Tsuji, Chuan Jiao, Chunyu Liu, Jianfeng Xiao, Wei Luo, Intronic pentanucleotide TTTCA repeat insertion in the SAMD12 gene causes familial cortical myoclonic tremor with epilepsy type 1, Brain, Volume 141, Issue 8, August 2018, Pages 2280–2288, https://doi.org/10.1093/brain/awy160
Abstract
Familial cortical myoclonic tremor with epilepsy is an autosomal dominant neurodegenerative disease, characterized by cortical tremor and epileptic seizures. Although four subtypes (types 1–4) mapped to different chromosomes (8q24, 2p11.1-q12.2, 5p15.31-p15.1 and 3q26.32-3q28) have been reported, the causative gene has not yet been identified. Here, we report a genetic study in a cohort of 20 Chinese pedigrees with familial cortical myoclonic tremor with epilepsy. Linkage and haplotype analysis in 11 pedigrees revealed maximum two-point logarithm of the odds (LOD) scores from 1.64 to 3.77 (LOD scores in five pedigrees were >3.0) in chromosomal region 8q24 and narrowed the candidate region to an interval of 4.9 Mb. Using whole-genome sequencing, long-range polymerase chain reaction and repeat-primed polymerase chain reaction, we identified an intronic pentanucleotide (TTTCA)n insertion in the SAMD12 gene as the cause, which co-segregated with the disease among the 11 pedigrees mapped to 8q24 and seven additional unmapped pedigrees. Only two pedigrees did not contain the (TTTCA)n insertion. Repeat-primed polymerase chain reaction revealed that the sizes of the (TTTCA)n insertion in all affected members were larger than 105 repeats. The same pentanucleotide insertion, (ATTTC)58, has been reported to form RNA foci resulting in neurotoxicity in spinocerebellar ataxia type 37, which suggests a similar pathogenic process in familial cortical myoclonic tremor with epilepsy type 1.
See Guerrini and Mei (doi:10.1093/brain/awy196) for a scientific commentary on this article.
Introduction
Familial cortical myoclonic tremor with epilepsy (FCMTE), widely known as benign adult familial myoclonic epilepsy in Japan and as autosomal dominant cortical myoclonus and epilepsy in Europe, was first reported in the 1980s in the Japanese population (Kudo et al., 1984; van Rootselaar et al., 2005). Its major clinical features are autosomal dominant inheritance, adult onset, and cortical tremor with or without epileptic seizures, which mainly manifest as generalized tonic-clonic seizures. Giant somatosensory evoked potentials and a long-latency cortical reflex can be detected by electromyography in patients with FCMTE, which collectively support the cortical origin of the cortical tremor (van Rootselaar et al., 2005). Four subtypes have been reported previously: FCMTE1 (8q24, OMIM: 601068) (Mori et al., 2011; Cen et al., 2015); FCMTE2 (2p11.1-q12.2, OMIM: 607876) (Guerrini et al., 2001; Striano et al., 2004; De Fusco et al., 2014); FCMTE3 (5p15.31-p15.1, OMIM: 613608) (Depienne et al., 2010); and FCMTE4 (3q26.32–3q28, OMIM: 615127) (Yeetong et al., 2013). Although mutations in candidate genes (UBR5 8q22.3, ACMSD 2q21.3, ADRA2B 2q11.2, PLA2G6 22q13.1 and CTNND2 5p15.2) have been reported in FCMTE, no additional mutations in these genes were found in other pedigrees (Kato et al., 2012; Marti-Masso et al., 2013; De Fusco et al., 2014; Gao et al., 2016; van Rootselaar et al., 2017).
FCMTE1 has been reported only in Chinese and Japanese populations (Mori et al., 2011; Cen et al., 2015). Clinical anticipation was observed in Chinese and Japanese FCMTE pedigrees, which raised the possibility that a repeat expansion was the underlying cause (Hitomi et al., 2012; Cen et al., 2016a, b). In this study, we identified an intronic pentanucleotide (TTTCA)n insertion in the SAMD12 gene (in 8q24) as the probable causative mutation of FCMTE1.
Materials and methods
Subjects
Twenty FCMTE pedigrees were enrolled from 2011 to 2017 in the Second Affiliated Hospital, School of Medicine, Zhejiang University. All were Han Chinese, from eight different provinces of China. We collected clinical data (Supplementary Table 1) and blood samples after obtaining written informed consent from all participants involved. The diagnosis of FCMTE was based on the criteria we previously used (Cen et al., 2016a). Genomic DNA of the participants was extracted from peripheral blood leucocytes via standard methods. The study conformed to the tenets of the Declaration of Helsinki, and ethical approval was granted by the Second Affiliated Hospital, School of Medicine, Zhejiang University.
Linkage analysis and haplotype analysis
DNA samples from five FCMTE pedigrees [Pedigrees A, B (previously reported by Cen et al., 2015), C, K and O] were hybridized to the HumanOmniZhongHua-8 BeadChip from Illumina® according to the manufacturer’s recommendations. Short tandem repeat markers from chr8: 104 863 833–134 036 685 (UCSC Genome Browser build hg19) were genotyped in another six FCMTE pedigrees (Pedigrees H–J and L–N). Two-point linkage analysis was performed in Merlin with a parametric model assuming an autosomal-dominant mode of inheritance, a disease-allele frequency of 0.0001, and 90% penetrance. Haplotype analysis was performed in all pedigrees using single nucleotide polymorphism (SNP) or short tandem repeat genotypes.
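As a rough illustration of what a two-point LOD score measures, the sketch below computes it for the simplest textbook case: phase-known, fully informative meioses with complete penetrance. This is not the Merlin parametric model used in the study (which assumes 90% penetrance and a disease-allele frequency of 0.0001); the function name and the example counts are hypothetical.

```python
import math


def two_point_lod(n_meioses, n_recombinants, theta):
    """LOD = log10[ L(theta) / L(0.5) ] for phase-known informative meioses.

    Simplifying assumptions (not the study's model): every meiosis is
    informative, phase is known, and penetrance is complete.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    n, k = n_meioses, n_recombinants
    if theta == 0.0:
        # With theta = 0, a single recombinant makes linkage impossible.
        return n * math.log10(2) if k == 0 else float("-inf")
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null


# Hypothetical example: ten informative meioses, no recombinants.
lod_at_zero = two_point_lod(10, 0, 0.0)
print(round(lod_at_zero, 2))
```

Under these simplifying assumptions, ten non-recombinant meioses are just enough to pass the conventional significance threshold of 3, which is why smaller pedigrees tend to give lower maximum LOD scores.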
Whole-genome sequencing
Whole-genome sequencing (WGS) was performed in eight definitely affected members (Patients P-A-IV2 and P-A-V2 from Pedigree A, Patients P-B-III4 and P-B-IV6 from Pedigree B, Patients P-K-III7 and P-K-III11 from Pedigree K, and Patients P-O-III1 and P-O-III4 from Pedigree O). Paired-end DNA libraries were prepared according to the manufacturer’s instructions (Illumina Truseq Library Construction). DNA libraries were sequenced on Illumina HiSeq X according to the manufacturer’s instructions for paired-end 150 bp reads. The average sequencing depth ranged from 31.35× to 57.77×, and 90.1% to 99.2% of the whole genome was covered at a depth of at least 20×.
Routine whole-genome sequencing analysis
Reads (without barcode) were aligned to hg19 using SpeedSeq (Chiang et al., 2015). Calling of single nucleotide variants and insertions/deletions (indels) was performed using both Genome Analysis Toolkit v2.1 and VarScan (McKenna et al., 2010; Koboldt et al., 2012). Structural variants and copy number variants were analysed in SpeedSeq (Chiang et al., 2015). Annotation of single nucleotide variants, indels, structural variants and copy number variants was performed with ANNOVAR (Wang et al., 2010).
Whole-genome sequencing data analysis with ExpansionHunter
Two hundred and thirty-three tri-, tetra-, penta-, hexa-, and heptanucleotide repeats (UCSC Genome Bioinformatics, hg19, Simple repeats database) in the FCMTE1 candidate region were analysed by ExpansionHunter as the repeats of interest according to the instructions (Dolzhenko et al., 2017). Unmapped reads containing the candidate repeat expansion were called by the following steps: (i) reads in the target region (chr8: 119 378 852–119 379 357) were retrieved from the BAM file by SAMtools; (ii) paired reads mapped to different positions were selected as candidates; and (iii) reads with short tandem repeats (motifs of three to seven nucleotides, repeat number >5) were checked.
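Step (iii) above can be sketched in a few lines of Python. This is a minimal illustration rather than the script used in the study: the function name is hypothetical, the read is synthetic, and "repeat number >5" is interpreted here as at least six consecutive copies of a motif.

```python
import re


def find_tandem_repeats(read, min_unit=3, max_unit=7, min_copies=6):
    """Return (start, motif, copies) for perfect tandem repeats in a read.

    Motif lengths of 3-7 nt and more than five copies follow the criteria
    in the text; requiring perfect copies is an assumption of this sketch.
    """
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # One motif of unit_len bases followed by at least min_copies - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for match in pattern.finditer(read):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA...
            copies = len(match.group(0)) // unit_len
            hits.append((match.start(), motif, copies))
    return hits


# Synthetic read: three flanking bases, eight TTTCA units, two trailing bases.
synthetic_read = "GGC" + "TTTCA" * 8 + "AT"
repeat_hits = find_tandem_repeats(synthetic_read)
print(repeat_hits)
```

A read flagged this way would still need manual review, since sequencing errors inside a repeat break the perfect-copy pattern that the regular expression requires.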
PCR amplification and sequencing of pentanucleotide repeat alleles
Short pentanucleotide repeat alleles were analysed by standard PCR with primers SAMD12LF and SAMD12LR (Supplementary material) with HotStarTaq® polymerase (Takara Bio). PCR products were purified and repeat sizes were assessed by Sanger sequencing. Long pentanucleotide repeat alleles were analysed by long-range PCR with 100–500 ng genomic DNA, 0.2 µM primers SAMD12LF and SAMD12LR (Supplementary material), 200 μM dNTP Mixture, 1× PrimeSTAR GXL Buffer, 1.25 U PrimeSTAR GXL DNA Polymerase (Takara Bio) in 50 µl. After 1 min at 98°C, DNA samples underwent 30 cycles (98°C for 10 s and 68°C for 15 min). PCR products were separated by electrophoresis in a 1% agarose gel, and DNA was extracted from the gel. Using sequencing primers (SAMD12SeqF and SAMD12SeqR) complementary to the 5′ or 3′ unique sequences flanking the repeat, the purified PCR products were sequenced by Sanger sequencing.
Repeat-primed PCR
The repeat sequence was amplified by repeat-primed PCR with primers [Fam-S-F, R1S-(AAAAT)6-R and R1S for (TTTTA)n expansion and R1S-(TTTCA)6-F, Fam-S-R and R1S for (TTTCA)n insertion] (Supplementary material). Fam-S-F and Fam-S-R were FAM-labelled locus-specific primers, R1S-(AAAAT)6-R and R1S-(TTTCA)6-F were repeat-specific primers with a DNA tail sequence at the 5′ end, and primer R1S contained the same 5′ tail sequence as R1S-(AAAAT)6-R and R1S-(TTTCA)6-F. PCR was performed with 10–20 ng genomic DNA, 0.2 µM primer Fam-S-F (or Fam-S-R) and primer R1S, and 0.02 µM primer R1S-(AAAAT)6-R [or R1S-(TTTCA)6-F] with HotStarTaq® polymerase (Takara Bio). The initial repeat-primed PCR step was at 95°C for 2 min and was followed by 11 cycles [94°C for 20 s, (56°C − 0.5°C)/cycle for 40 s, and 72°C for 3 min] and 25 cycles (94°C for 20 s, 57°C for 30 s, and 72°C for 3 min). Products of repeat-primed PCR were detected on an ABI3730xl DNA Analyzer. The minimum repeat size of the (TTTTA)n expansion and (TTTCA)n insertion was estimated by counting the number of peaks (one peak for one repeat) in GeneMapper.
Analysis of gene expression
We downloaded the RNA-seq data from the BrainSpan project (http://www.brainspan.org/). Five hundred and ninety samples from 41 individuals were included. The samples’ ages ranged from embryonic stages to late adulthood, involving 27 brain regions. The locally-weighted scatterplot smoothing method was used to fit the data. The Genotype-Tissue Expression data (GTEx_Analysis_2016-01-15_v7_RSEMv1.2.22_transcript_tpm.txt.gz) were downloaded from https://www.gtexportal.org/home/datasets. We obtained 11 688 samples from 715 individuals, involving 52 tissues.
Results
Narrowing of the FCMTE1 candidate region
Linkage analysis performed in 11 FCMTE pedigrees revealed that all 11 pedigrees reached maximum two-point logarithm of the odds (LOD) scores from 1.64 to 3.77 (Fig. 1A, B and Table 1) in chromosomal region 8q24. Haplotype analysis revealed the minimum candidate region of FCMTE1 flanked by rs800532 and D8S1112, a 4.9 Mb interval containing 21 genes (Table 1 and Supplementary Fig. 1) (Cen et al., 2015).
Table 1 Maximum LOD scores and candidate regions of 11 pedigrees
| Pedigree ID | Maximum LOD score | Candidate region markers | Candidate region chromosomal position |
|---|---|---|---|
| Pedigree A | 3.15 | rs2432736–rs17253999 | chr8:113859094–124591925 |
| Pedigree B | 3.77 | rs7001897–rs10093411 | chr8:104160509–124518530 |
| Pedigree C | 3.15 | rs17710204–rs11774172 | chr8:97414597–129225502 |
| Pedigree H | 3.54 | D8S1047–D8S198 | chr8:111042906–123475786 |
| Pedigree I | 1.81 | NA–D8S1801 | chr8:NA–130468552 |
| Pedigree J | 2.09 | D8S1008–NA | chr8:113345875–NA |
| Pedigree K | 3.12 | rs2513399–rs998018 | chr8:98080225–129278791 |
| Pedigree L | 1.64 | D8S1008–D8S1804 | chr8:113345875–124864876 |
| Pedigree M | 2.10 | NA–D8S1720 | chr8:NA–128949884 |
| Pedigree N | 1.81 | D8S556–D8S1112 | chr8:106110324–121750868 |
| Pedigree O | 2.70 | rs800532–rs16900666 | chr8:116842110–126526686 |
NA = not available.
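The minimum candidate region can be recovered from the table as the overlap of the per-pedigree intervals, that is, the largest start coordinate and the smallest end coordinate. A minimal Python sketch follows, with coordinates copied from the table. Treating an NA boundary as uninformative is an assumption made here for illustration, and the variable names are hypothetical.

```python
# Candidate regions on chr8 (hg19) per pedigree; None stands for NA.
candidate_regions = {
    "A": (113859094, 124591925),
    "B": (104160509, 124518530),
    "C": (97414597, 129225502),
    "H": (111042906, 123475786),
    "I": (None, 130468552),
    "J": (113345875, None),
    "K": (98080225, 129278791),
    "L": (113345875, 124864876),
    "M": (None, 128949884),
    "N": (106110324, 121750868),
    "O": (116842110, 126526686),
}

# The shared interval starts at the largest known start and ends at the
# smallest known end; NA boundaries are skipped (assumption of this sketch).
shared_start = max(s for s, _ in candidate_regions.values() if s is not None)
shared_end = min(e for _, e in candidate_regions.values() if e is not None)
shared_mb = (shared_end - shared_start) / 1e6

print(f"chr8:{shared_start}-{shared_end} ({shared_mb:.1f} Mb)")
```

The proximal boundary comes from Pedigree O (rs800532) and the distal boundary from Pedigree N (D8S1112), which matches the flanking markers and the 4.9 Mb interval given in the text.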
Figure 1 Pedigree B linked to FCMTE1 and long-range PCR in some members from Pedigrees B and J. (A) Pedigree structure of Pedigree B and the corresponding individual genotypes and pentanucleotide repeat configuration. The (TTTCA)n insertion co-segregated with disease. The red blocks represent the (TTTCA)n insertion and the grey blocks represent the (TTTTA) repeats. ins = insertion. (B) Linkage analysis using genotypes of 20 members from Pedigree B showed a maximum two-point LOD score of 3.77 in chromosomal region 8q24. (C and D) Long-range PCR in some members of Pedigree B (C) and Pedigree J (D) showed a long allele band (5–6 kb, white arrows), which co-segregated with disease in the pedigrees.
Routine whole-genome sequencing analysis failed to detect any candidate mutation
Single nucleotide variants and indels were filtered using the following criteria: (i) heterozygous; (ii) absent or with a minor allele frequency value <0.01 in public databases [including dbSNP147, 1000 Genomes Project, the NHLBI Exome Sequencing Project (ESP), the Exome Aggregation Consortium (ExAC) and KaViar database]; (iii) located in the FCMTE1 candidate region; and (iv) shared among different pedigrees. Structural and copy number variants were filtered using the following criteria: (i) heterozygous; (ii) in the FCMTE1 candidate region; and (iii) shared among different pedigrees. No candidate mutation was found in the FCMTE1 locus after routine WGS analysis.
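The four filtering criteria for single nucleotide variants and indels can be expressed as a simple predicate. The sketch below is illustrative only: the record fields (`genotype`, `maf`, `pedigrees`), the example variants, and the reading of "shared among different pedigrees" as "seen in at least two pedigrees" are all assumptions, and the region boundaries are the positions of the flanking markers listed in Table 1.

```python
# FCMTE1 candidate region (hg19), taken from the flanking marker positions.
CANDIDATE_REGION = ("chr8", 116_842_110, 121_750_868)


def passes_filters(variant, max_maf=0.01, min_pedigrees=2):
    """Apply criteria (i)-(iv) to one hypothetical variant record."""
    chrom, start, end = CANDIDATE_REGION
    is_het = variant["genotype"] == "het"                      # (i)
    maf = variant.get("maf")                                   # None = absent
    is_rare = maf is None or maf < max_maf                     # (ii)
    in_region = variant["chrom"] == chrom and start <= variant["pos"] <= end  # (iii)
    is_shared = len(set(variant["pedigrees"])) >= min_pedigrees              # (iv)
    return is_het and is_rare and in_region and is_shared


# Made-up records, one failing each criterion except the first.
variants = [
    {"id": "v1", "chrom": "chr8", "pos": 119_379_000, "genotype": "het",
     "maf": None, "pedigrees": ["A", "B", "K"]},
    {"id": "v2", "chrom": "chr8", "pos": 119_400_000, "genotype": "het",
     "maf": 0.12, "pedigrees": ["A", "B"]},
    {"id": "v3", "chrom": "chr8", "pos": 130_000_000, "genotype": "het",
     "maf": None, "pedigrees": ["A", "K"]},
    {"id": "v4", "chrom": "chr8", "pos": 118_000_000, "genotype": "hom",
     "maf": 0.001, "pedigrees": ["A", "O"]},
    {"id": "v5", "chrom": "chr8", "pos": 117_500_000, "genotype": "het",
     "maf": 0.002, "pedigrees": ["B"]},
]

kept = [v["id"] for v in variants if passes_filters(v)]
print(kept)
```

A filter of this kind only sees variants that the caller reported, so a large repeat insertion that short reads cannot span would not appear among the candidates at all.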
Discovery of the (TTTCA)n insertion
We then chose 233 repetitive motifs of tri-, tetra-, penta-, hexa-, and heptanucleotide repeats in the 4.9 Mb candidate region as the repeats of interest (UCSC Genome Bioinformatics) and input them into ExpansionHunter to search for potential repeat expansions (Dolzhenko et al., 2017). Comparing results of eight FCMTE1 affected members against those of nine controls, a (TTTTA)n expansion in intron 4 of the SAMD12 gene (GenBank: NM_001101676) was indicated in seven of eight FCMTE1 affected members but none of the controls (Supplementary Table 2). The (TTTTA)n expansion was tagged as ‘inrepeat’, indicating that the repeat expansion length was longer than the read length (∼150 bp). To investigate the configuration of the (TTTTA)n expansion further, we checked the unmapped reads from the affected members’ WGS data. Interestingly, a pentanucleotide (TTTCA)n insertion, which was not recorded in the database (UCSC Genome Bioinformatics), was found between an Alu element and the TTTTA repeats (Fig. 2A and B).
Figure 2 Identification of the (TTTCA)n insertion in intron 4 of the SAMD12 gene. (A) The (TTTCA)n insertion was located in intron 4 of the SAMD12 gene. On the normal allele, the TTTTA/TAAAA repeats were on the 3′ side of AluSq2. On the mutant allele, the pathogenic (TTTCA/TGAAA)n insertion was located between the TTTTA/TAAAA repeats and AluSq2. (B) The unmapped sequences from WGS data of Patient P-K-III7 were mapped back to the genome in UCSC Genome Browser (hg19), which supported the postulated configuration. (C) Direct sequencing of the mutant allele of Patient P-I-III2 showed (TTTTA)36 repeats (top track to middle track with green line) on the 5′ side of the gene; (TTTTA)36 repeats followed by (TTTCA)n repeats in the middle track and a (TTTCA)n insertion (∼2 kb) (middle track to bottom track with red line) on the 3′ side. (D) Repeat-primed PCR for the (TTTTA)n expansion and (TTTCA)n insertion. The primers P1, P2 and P2 anchor are designed to detect the TTTTA repeats, while P3, P4 and P3 anchor are designed to detect the TTTCA repeats, with the assumption that the TTTTA repeats are located upstream of the TTTCA repeat insertion in intron 4 of SAMD12. Saw-like peaks were detected in the affected Patient P-K-III7 and not detected in the unaffected Subject P-K-III16, which indicated that Patient P-K-III7 carried both the (TTTTA)n expansion and the (TTTCA)n insertion while Subject P-K-III16 had neither of them.
Identifying the (TTTCA)n insertion as the probable causative mutation of FCMTE1
To confirm the existence of the repeat expansion and repeat insertion, long-range PCR was performed in 11 probands of FCMTE1 pedigrees (Pedigrees A–C and H–O) and 10 of them (all except Pedigree O) demonstrated long allele bands (3–6 kb) (Fig. 1C, D and Supplementary Fig. 2). Direct sequencing of the 5′ and 3′ end of the long alleles revealed TTTTA repeats on the 5′ side of the gene and TTTCA repeats on the 3′ side, which supported the configuration that the (TTTCA)n insertion was located between the Alu element and TTTTA repeats (Fig. 2B and C). The (TTTCA)n insertion was detected in all 11 probands of the FCMTE1 pedigrees by repeat-primed PCR, which confirmed the result of long-range PCR and also indicated that the proband (Patient P-O-III3) without a long allele band in the long-range PCR might carry a larger repeat insertion that could not be detected in long-range PCR (Fig. 2D and Supplementary Fig. 2). In another nine FCMTE pedigrees that did not have linkage mapping information (Pedigrees D–G and P–T), the (TTTCA)n insertion was detected by repeat-primed PCR in seven probands (from Pedigrees D, F, G and P–S). Four of the seven (Pedigrees F, P, Q and S) showed long allele bands (5–6 kb) in long-range PCR (Supplementary Fig. 2). Repeat-primed and long-range PCRs were performed in all members available from these 18 pedigrees. The (TTTCA)n insertion co-segregated completely with the disease by repeat-primed PCR (Fig. 1A and Supplementary Fig. 3). The sizes of the (TTTCA)n insertion were estimated to be at least 105 repeats in all affected members by repeat-primed PCR. No (TTTCA)n insertion was detected in 119 Chinese controls by repeat-primed PCR and Sanger sequencing. For TTTTA repeats, some affected members had only ∼25–44 repeats on the allele containing the (TTTCA)n insertion (Fig. 2C) while some unaffected members had alleles with ∼800 repeats (4 kb) in long-range PCR. The sizes of TTTTA repeats in 238 chromosomes of Chinese controls ranged from 11 to 80+ repeats, and a TTTTA repeat size >80 was detected in 3.36% (8/238) of control chromosomes (Supplementary Fig. 4). Sanger sequencing of coding regions and splice-sites of the SAMD12 gene in all 20 probands did not detect any candidate mutation. These results suggested that the heterozygous (TTTCA)n insertion between the Alu element and polymorphic TTTTA repeats was the probable causative mutation of FCMTE1.
Founder effect analysis in FCMTE1 pedigrees
In the five pedigrees (Pedigrees A–C, K and O) with HumanOmniZhongHua-8 BeadChip data, we used the genotypes of 92 SNPs covering five linkage disequilibrium (LD) blocks (LD blocks 1–5) around the (TTTCA)n insertion for founder effect analysis. A core haplotype (Supplementary Table 3) containing the (TTTCA)n insertion was shared among all five pedigrees. This haplotype has a frequency of 0.226 (47/208) in the 1000 Genomes database for the Chinese population. A chi-square test between our pedigrees and the control population gave a P-value of 0.00055, which supported a founder effect in these five FCMTE1 pedigrees. To see whether the Chinese and Japanese FCMTE1 pedigrees share a founder, SNP data from three representative Japanese FCMTE1 pedigrees with a founder effect were compared with our pedigrees’ data (Ishiura et al., 2018). Fifty-nine SNPs (Supplementary Table 4) shared between the two datasets (Chinese pedigree data from the HumanOmniZhongHua-8 BeadChip and Japanese pedigree data from the Affymetrix Genome-Wide Human SNP array 6.0) were used for the founder effect analysis. A core haplotype containing the (TTTCA)n insertion region was shared among all five Chinese pedigrees and three Japanese pedigrees, indicating a founder effect common to FCMTE1 pedigrees from these two countries (Supplementary Table 4).
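The text does not spell out how the chi-square test was set up. One plausible reading, shown here purely as an assumption, is a 2×2 table that compares five disease haplotypes (one per pedigree, all carrying the core haplotype) with 47 carriers among 208 control haplotypes, using Yates' continuity correction. The function name is hypothetical, and the P-value for one degree of freedom is obtained from the complementary error function.

```python
import math


def yates_chi_square_2x2(a, b, c, d):
    """Yates-corrected chi-square statistic and P-value (1 d.f.) for
    the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    # Survival function of chi-square with 1 d.f.: P = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Assumed layout: rows = pedigrees vs controls,
# columns = core haplotype present vs absent.
chi2, p_value = yates_chi_square_2x2(5, 0, 47, 208 - 47)
print(f"chi2 = {chi2:.2f}, P = {p_value:.5f}")
```

Under these assumptions the result is close to the reported P-value, but with only five independent pedigree haplotypes an exact test would be an equally reasonable choice.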
Expression of transcripts containing the repeat region
The SAMD12 gene contains a sterile alpha motif domain, the function of which has rarely been studied. A total of eight transcripts of the SAMD12 gene were found in the Ensembl, UCSC and CCDS databases, four of which contained the intron harbouring the (TTTCA)n insertion (Fig. 3A). In the BrainSpan database, the largest transcript (ENST00000409003.4) of the SAMD12 gene encompassing the (TTTCA)n insertion region was the predominantly expressed transcript in the brain. Its expression increases with age and remains stable after adolescence (Fig. 3A) (Miller et al., 2014). Data from the Genotype-Tissue Expression (GTEx) project also showed that ENST00000409003.4 was abundantly expressed in brain. The brain region with the highest expression was frontal cortex, followed by cerebellar hemisphere and cerebellum (Fig. 3B) (GTEx Consortium, 2013). In mouse cerebellum, its expression is mainly found in Purkinje cells and the granule cell layer (Allen Brain Atlas).
Figure 3 Expression of SAMD12 transcripts. (A) Eight transcripts were recorded in the Ensembl, UCSC and CCDS databases, of which four transcripts (underlined red) span the region containing the (TTTCA)n insertion. In the BrainSpan database, ENST00000409003.4 was dominantly expressed in brain and its expression increased with age (yellow line, plateaued around adolescence). (B) In GTEx, ENST00000409003.4 was shown to be expressed in most tissues and highly expressed in brain. The brain region with the highest expression was frontal cortex, followed by cerebellar hemisphere and cerebellum.
Discussion
Since the first report of the FCMTE1 locus in 1999, previous studies have narrowed the candidate region to a 17.5 Mb interval between D8S1784 and D8S514 (Cen et al., 2015). Our linkage and haplotype analysis of 11 pedigrees further delineated a 4.9 Mb interval flanked by rs800532 and D8S1112 with 21 genes in this region.
Next-generation sequencing has greatly accelerated the discovery of novel causative genes. However, it is challenging for short-read WGS to identify repeat expansions, especially those located in non-coding regions, such as the 5′ UTR region [spinocerebellar ataxia (SCA) type 12, fragile X-associated tremor/ataxia syndrome, fragile X mental retardation syndrome and fragile XE syndrome], 3′ UTR region [myotonic dystrophy type 1, SCA8 and Huntington disease-like 2], and intronic region [Friedreich ataxia, myotonic dystrophy type 2, SCA10, SCA31, SCA36, SCA37, Fuchs endothelial corneal dystrophy and C9orf72 amyotrophic lateral sclerosis/frontotemporal dementia (C9ORF72-ALS/FTD)] (Seixas et al., 2017; Zhang et al., 2017). Dolzhenko et al. developed a software tool, ExpansionHunter, to detect long repeat expansions using short-read WGS data, even if the expanded repeats are longer than the read length (Dolzhenko et al., 2017). In our study, ExpansionHunter detected a (TTTTA)n expansion that led us to incidentally identify the (TTTCA)n insertion from the unmapped reads, which was finally considered to be the causative mutation of FCMTE1.
There are three non-exclusive hypotheses about how repeat expansion in non-coding regions gives rise to neurotoxicity: (i) the expanded RNA (pre-mRNA) aggregates to form nuclear RNA foci, which sequester critical RNA-binding proteins and prevent them from performing their normal functions; (ii) the expanded RNA can be non-canonically translated to produce short peptides that lead to neurotoxicity; and (iii) the expansion could result in a gain or loss of function of the associated gene (Gatchel et al., 2005; Todd and Paulson, 2010; Loureiro et al., 2016; Zhang et al., 2017). Formation of nuclear RNA foci and its neurotoxicity has been demonstrated in many diseases caused by repeat expansions (Seixas et al., 2017; Zhang et al., 2017). A recent study demonstrated that the formation of RNA foci relied on sequence-specific base-pairing properties of RNA and a threshold number of repeats (Jain and Vale, 2017). The pathogenic repeat sequence (TTTCA)n found in FCMTE1 matches the pathogenic repeat sequence (ATTTC)n found in SCA37 (Seixas et al., 2017). In the report of SCA37, RNA foci formed in HEK293T cells with ins(ATTTC)58 overexpression, and (AUUUC)58-containing RNA also impaired early embryonic development in the zebrafish (Seixas et al., 2017). In addition, the size of the TTTCA repeats in our study was estimated to be over 105, which exceeded the threshold number of repeats to form RNA foci seen in SCA37 (Seixas et al., 2017). These results suggested that the (TTTCA)n insertion found in FCMTE1 might share a similar pathogenic mechanism with SCA37. Recent studies showed that the repeat expansion in C9orf72 generates neurotoxicity via several mechanisms synergistically, including nuclear RNA foci formation and haploinsufficiency (Shi et al., 2018). These same mechanisms may apply to the repeat expansion in SAMD12. However, few studies on the function of SAMD12 have been reported so far. It would be important to elucidate how its loss of function may give rise to neurotoxicity in FCMTE1.
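The statement that the FCMTE1 and SCA37 repeat motifs match can be checked mechanically: a tandem array of TTTCA units reads as a tandem array of ATTTC units when the reading frame is shifted by one base. The short sketch below tests motifs for this kind of equivalence, allowing for the opposite strand as well. The function names are hypothetical.

```python
def rotations(motif):
    """All cyclic rotations of a repeat unit, e.g. TTTCA -> TTCAT, TCATT, ..."""
    return {motif[i:] + motif[:i] for i in range(len(motif))}


def reverse_complement(motif):
    """Sequence of the opposite strand, read 5' to 3'."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(motif))


def same_repeat(motif_a, motif_b):
    """True if tandem arrays of the two motifs describe the same repeat,
    on either strand and in any reading frame."""
    return (motif_b in rotations(motif_a)
            or motif_b in rotations(reverse_complement(motif_a)))


fcmte1_vs_sca37 = same_repeat("TTTCA", "ATTTC")    # frame shift only
fcmte1_vs_probe = same_repeat("TTTCA", "TGAAA")    # opposite strand
fcmte1_vs_ttta = same_repeat("TTTCA", "TTTTA")     # flanking polymorphic repeat
print(fcmte1_vs_sca37, fcmte1_vs_probe, fcmte1_vs_ttta)
```

The first two comparisons hold and the third does not, which is consistent with the distinction drawn in the text between the pathogenic insertion and the polymorphic TTTTA repeat.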
The transcript ENST00000409003.4 is mainly expressed in human brain, especially in frontal cortex, cerebellar hemisphere and cerebellum. These brain regions have been reported to be impaired in neuropathological, neuroimaging and neurophysiological studies (Cen et al., 2016b). In mouse cerebellum, expression of Samd12 is mainly found in Purkinje cells and granule cell layer, which matches the FCMTE neuropathological feature of Purkinje cell loss in cerebellum (Sharifi et al., 2012). More interestingly, expression of SAMD12 in human brain increased with age and remained stable after adolescence. The temporal expression pattern is consistent with the observation that most patients started to develop symptoms in adolescence or early adulthood.
In the recent report of Japanese FCMTE1 pedigrees, Ishiura et al. (2018) identified the pentanucleotide (TTTCA)n insertion in the SAMD12 gene as the causative mutation in 85 FCMTE1 patients (from 49 Japanese pedigrees). Two repeat configurations [(TTTTA)exp(TTTCA)exp and (TTTTA)exp(TTTCA)exp(TTTTA)exp] were identified by nanopore sequencing of genomic DNA. RNA foci were observed in the cortical neurons and Purkinje cells in the brains of patients using a Cy3-(TGAAA)12 probe, which provided evidence that RNA-mediated toxicity was the mechanism underlying the pathogenesis of FCMTE1 (Ishiura et al., 2018). In our report, the pentanucleotide (TTTCA)n insertion in the SAMD12 gene was identified in 105 patients with FCMTE1 (from 18 Chinese pedigrees), which replicated the findings of Ishiura et al. (2018). The probable founder effect between five representative Chinese FCMTE1 pedigrees and three representative Japanese FCMTE1 pedigrees (suggested by Ishiura et al., 2018) points to a common ancestral founder in FCMTE1 patients.
To our knowledge, FCMTE1 is the third neurodegenerative disease (in addition to SCA31 and SCA37) caused by a repeat expansion ‘insertion’ (Sato et al., 2009; Seixas et al., 2017). A parsimonious evolutionary scenario has been proposed in which the ATTCT motif of ATXN10 (the causative repeat expansion of SCA10) was directly generated from an ancestral ATTTT motif in the common ancestor of catarrhines, a process mediated by the Alu element (Kurosaki et al., 2009). Like FCMTE1, SCA31 and SCA37 also show a founder effect, and the causative repeat motifs of the three diseases (TTTCA, TGGAA and ATTTC, respectively) are adjacent to an Alu element and a TTTTA motif, which may suggest a specific mechanism governing where and how pathogenic repeat insertions occur.
In conclusion, we identified the pentanucleotide (TTTCA)n insertion in the intron of the SAMD12 gene as the causative mutation of FCMTE1. The discovery of the causative mutation of FCMTE1 could shed light on the identification of causative genes for other types of FCMTE. With the development of sequencing techniques and bioinformatic tools, we should pay more attention to the pathogenic repeat expansion in neurodegenerative diseases of unknown molecular aetiology.
Web resources
ExpansionHunter, https://github.com/Illumina/ExpansionHunter
UCSC Genome Browser, http://genome.ucsc.edu/
The NHLBI Exome Sequencing Project (ESP), http://evs.gs.washington.edu/EVS/
Exome Aggregation Consortium (ExAC), http://exac.broadinstitute.org/
KaViar database, http://db.systemsbiology.net/kaviar/
1000 Genomes Project, http://www.internationalgenome.org/
BrainSpan project, http://www.brainspan.org/
Genotype-Tissue Expression project, https://www.gtexportal.org/home/datasets
Allen Brain Atlas, http://mouse.brain-map.org/
Acknowledgements
We are indebted to the pedigree members for their generous participation in this study.
Funding
This study was supported by the National Natural Science Foundation of China (Proj. No. 81571089, No. 81371266 and No. 81600850).
Supplementary material
Supplementary material is available at Brain online.
References
Author notes
Zhidong Cen and Zhengwen Jiang contributed equally to this work.


