Jocelyn O. Eidahl, Carlee R. Giesige, Jacqueline S. Domire, Lindsay M. Wallace, Allison M. Fowler, Susan M. Guckes, Sara E. Garwick-Coppens, Paul Labhart, Scott Q. Harper, Mouse Dux is myotoxic and shares partial functional homology with its human paralog DUX4, Human Molecular Genetics, Volume 25, Issue 20, 15 October 2016, Pages 4577–4589, https://doi.org/10.1093/hmg/ddw287
Abstract
D4Z4 repeats are present in at least 11 different mammalian species, including humans and mice. Each repeat contains an open reading frame encoding a double homeodomain (DUX) family transcription factor. Aberrant expression of the D4Z4 ORF called DUX4 is associated with the pathogenesis of Facioscapulohumeral muscular dystrophy (FSHD). DUX4 is toxic to numerous cell types of different species, and over-expression caused dysmorphism and developmental arrest in frogs and zebrafish, embryonic lethality in transgenic mice, and lesions in mouse muscle. Because DUX4 is a primate-specific gene, questions have been raised about the biological relevance of over-expressing it in non-primate models, as DUX4 toxicity could be related to non-specific cellular stress induced by over-expressing a DUX family transcription factor in organisms that did not co-evolve its regulated transcriptional networks. We assessed toxic phenotypes of DUX family genes, including DUX4, DUX1, DUX5, DUXA, DUX4-s, Dux-bl and mouse Dux. We found that DUX proteins were not universally toxic, and only the mouse Dux gene caused similar toxic phenotypes as human DUX4. Using RNA-seq, we found that 80% of genes upregulated by Dux were similarly increased in DUX4-expressing cells. Moreover, 43% of Dux-responsive genes contained ChIP-seq binding sites for both Dux and DUX4, and both proteins had similar consensus binding site sequences. These results suggested DUX4 and Dux may regulate some common pathways, and despite diverging from a common progenitor under different selective pressures for millions of years, the two genes maintain partial functional homology.
Introduction
Facioscapulohumeral muscular dystrophy (FSHD) is an autosomal dominant disorder affecting 1 in 8,333 to 1 in 20,000 individuals worldwide (1,2). FSHD is characterized by weakness of facial, shoulder and limb muscles although presentation is often non-uniform. Variability can be seen in the types of muscles affected, severity of weakness, age-at-onset, rates of progression and the involvement of asymmetrical or bilateral phenotypes (3,4). FSHD is associated with aberrant expression of the toxic DUX4 gene, which encodes a double homeodomain transcription factor, embedded within an array of D4Z4 repeats located on the chromosome 4q subtelomere. The prevailing FSHD pathogenesis model states that 4q D4Z4 repeats are normally located in a transcriptionally silent region of heterochromatin, but in FSHD the 4q subtelomere chromatin structure is relaxed, allowing for DUX4 transcription (5–11). This shift to a euchromatin-like state can occur through contraction in D4Z4 repeat copy number (FSHD1) or mutation of the structural maintenance of chromosomes flexible hinge domain 1 (SMCHD1) gene (FSHD2), which functions to silence the 4q D4Z4 array (7,12–14). Although chromatin loosening and DUX4 transcription are required for FSHD development, these events are not pathogenic unless they occur on a specific chromosome 4q variant, which harbours an untranslated exon containing a polyadenylation signal for the most telomeric DUX4 open reading frame (ORF) (7,15–18). As a result, DUX4 is absent or detected at extremely low levels in non-FSHD muscle and other adult tissues except testes, but more abundantly expressed and polyadenylated in FSHD muscles (11,18–20).
Over-expression strategies have been used to study the role of DUX4 in FSHD, and it is clear that DUX4 is toxic to muscle and non-muscle cells of numerous organisms, including humans, mice, zebrafish and frogs (21–29). The conserved toxic properties of over-expressed DUX4 are consistent with the hypothesis of its causation in FSHD, and seem to justify the methodology and the model systems used. On the other hand, the conservation of DUX4 toxicity across species also raised some concerns about the over-expression methodology. Specifically, D4Z4 repeats exist only in placental mammals, and within this group, close DUX4 homologs are present only in primates (30,31). Thus, organisms that lack DUX4-like genes would not evolve with selective pressure to maintain DUX4-regulated networks, thereby suggesting that the toxic phenotypes associated with DUX4 over-expression in species lacking D4Z4 repeats might not be due to activation of some conserved deleterious pathways, but instead could be related to non-specific effects of protein overload in general, or more specifically related to over-expression of a double homeodomain (DUX) family transcription factor, of which DUX4 is but one member.
To begin addressing these questions, in this study, we used identical methods to deliver and express seven DUX family members, including human DUX4, DUX1, DUX5, DUXA, DUX4-short (DUX4-s) and the mouse genes Duxbl and Dux, to human cells and mouse muscle, and assessed toxic phenotypes, or lack thereof, using several outcome measures. Among the genes tested, we found that human DUX4 and mouse Dux were similarly damaging while other homeodomain family members were non-toxic. These results were consistent with prior studies suggesting that Dux arises from D4Z4-like repeats in the mouse genome, and possesses DUX4-like toxic properties, including reducing C2C12 cell viability and/or myogenic differentiation, and causing morphological defects and death when expressed in Xenopus laevis embryos (32). Nevertheless, it was unclear if DUX4 and Dux function by activating similar toxic pathways. Thus, to further explore the mechanism of DUX4 and Dux toxicity, we performed ChIP-seq and RNA-seq to identify genomic regions commonly bound by the two proteins and define common gene expression changes. Our results suggest that DUX4 and mouse Dux share partial functional homology.
Results
DUX family transcription factors are not universally toxic in vitro and in vivo

DUX family proteins used in this study. (A) Schematic of 7 DUX family proteins showing location of DNA binding homeodomains (HOX1 and HOX2) and variable carboxyl-terminal domains. Numbers indicate amino acid residues. In the DUX4.HOX1 and Dux.Hox1 DNA binding mutants, underlined residues were mutated to alanines. CT, conserved extreme C-terminus in DUX4 and Dux proteins, shown in (C). (B) (Top) Comparison of amino acid sequence homology in DUX family proteins. Columns 1 and 2 show amino acid identity in the HOX1 and HOX2 domains of indicated proteins, relative to the same domains in DUX4. ‘N-term length’ indicates the amino acid length from residue 1 to the beginning of the first homeodomain. ‘C-term length’ indicates the length from the end of the second homeodomain to C-terminus of the protein. ‘HOX spacer length’ indicates the number of amino acids between the first and second homeodomains. (Bottom) Alignment of conserved DNA binding residues within HOX1 and HOX2. In DUX4 and Dux, 5 of these were mutated to alanines to construct DUX4.HOX1 and Dux.Hox1, as shown in (A). (C) Alignment of extreme C-terminal residues of DUX4 (residues 365-424) and Dux (residues 615-674). Asterisks indicate residue identity. Note high identity in the last 14 amino acids.

In vitro analysis of DUX family proteins. (A) Luminescence-based ATP assay. C2C12s were transfected with indicated plasmids and abundance of ATP was monitored 48 h later. Data are reported in relative luminescent units (RLU) with background removed (cells only), and each condition was performed in triplicate in two independent experiments. * indicates significant difference from “pCI-neo” empty vector control, P < 0.05, ANOVA. (B) Western blot. Protein extracts from cells transfected with expression plasmids encoding V5-epitope tagged DUX family ORFs were visualized using an HRP-coupled anti-V5 antibody. Predicted molecular weights of each protein: DUX4 and DUX4.HOX1, 52 kDa; DUX4s, 22 kDa; Dux and Dux.Hox1, 76 kDa; DUX1, 22 kDa; DUXA, 24 kDa; Duxbl, 38 kDa; DUX5, 22 kDa.

Dux and DUX4 cause myotoxicity in vivo. (A) Schematic of adeno-associated viral vector constructs (AAV6). Each DUX family ORF was driven by the cytomegalovirus (CMV) promoter and contained a carboxyl-terminal V5-epitope tag followed by an SV40 poly-adenylation signal (PA). Each construct was flanked by AAV2 inverted terminal repeats (ITR). (B) Photomicrographs of mouse muscle cryosections stained with hematoxylin and eosin (H&E) staining and anti-V5 immunofluorescence at indicated times following AAV.DUX4 or AAV.Dux delivery. (C) DUX4s, Dux.Hox1, DUXA, Duxbl, DUX5, and DUX1 are shown at 4 weeks post-injection. Arrows indicate positive V5-stained (red) nuclei. Blue, DAPI stain to identify nuclei. * denotes degenerating fibres positive for V5 staining and # show examples of regenerated myofibres with central nuclei. Scale bar = 100 μm. (D) (Top) Percent central nuclei in muscle cryosections 4 weeks post-injection. Central nuclei from fibres of 5 independent animals were counted per treatment. * denote significantly increased numbers of centralized nuclei (P < 0.0001, chi-square) (Bottom) Distribution of fibre diameter (microns) as a percentage of total fibres counted during sampling at 4 weeks post-injection for each construct (n = 5 muscles per group; 5 representative x20 photomicrographs per section).
Identification of human DUX4 and mouse Dux binding sites in human myoblasts

Characterization of DUX4 and Dux DNA binding sites in myoblasts. (A) (Left) Scatter plot of merged peak regions for DUX4 and Dux in human myoblasts. Gray dots correspond to DUX4 specific peaks, blue dots correspond to Dux specific peaks and red dots correspond to peaks in common between DUX4 and Dux. The XY axes denote the number of alignments (“tag”) within each of the peak regions. (Right) Venn diagram showing the overlap of peak regions for DUX4, Dux and DUX4-G in human myoblasts. DUX4-G refers to a published dataset of DUX4 ChIP-seq data by Geng, et al (33). Additional readout of the merged peak regions (heatmap and density plot) to visualize the unique peak populations for DUX4 and Dux can be found in Supplementary Figure 3. (B) Conserved sequence motifs identified by MEME (http://meme.nbcr.net/meme/) in the DUX4 (top) and Dux (bottom) top 1,000 ChIP-Seq peak sequences (37). Nucleotides (measured in bits) of the identified consensus motif are displayed in a sequence logo representation (62). X-axis corresponds to the 11 nucleotide positions (Supplementary Material, Table 2). (C) Table listing genes annotated to DUX4 and Dux overlapping DNA binding sites and their corresponding Gene Ontology IDs. The peak values associated with these DNA binding sites are within the top 100 values for both DUX4 and Dux samples. Columns indicate genomic binding sites identified by ChIP-Seq and gene expression fold change identified by RNA-Seq. (D) Twenty-four hours after transfection of DUX4, DUX4.HOX1, Dux, Dux.Hox1, and GFP in human myoblasts, quantitative RT-PCR was performed to measure expression of the indicated genes containing ChIP-seq binding sites for DUX4 and Dux. Expression was normalized to GFP transfected cells with human RPL13A used as the reference gene, with means +/- SEM shown. * indicates significant difference from GFP control, P < 0.05, ANOVA. Each individual assay was performed in triplicate from n = 3 samples for each condition. 
In addition, ZSCAN4 and PRAMEF12 experiments were performed three independent times; RFPL1, GTF2F1 and SFRS8 experiments were performed two independent times; and ANKRD1, CWC15, TRAF6, SFRS8 and NFYA expression levels were determined in one experiment. The graphs here show representative data from one experiment each.
Next, we mapped DUX4 and Dux enriched peaks from the ChIP-seq study to the human genome. Each peak was annotated to an associated gene if the interval location was within a set distance of 10,000 bp upstream or downstream of the gene transcriptional start site (TSS) (Supplementary Material, Fig. S2). A correlation analysis between all DUX4 and Dux peaks (percent of total peaks in each sample) revealed that the majority (96.2 and 92.0%, respectively) of DUX4- and Dux-bound sites were unique to each respective protein. We displayed the total merged peak regions as a scatter plot showing three separate populations, one corresponding to DUX4-specific peaks, a second to Dux-specific peaks and third to peaks shared by both DUX4 and Dux (Fig. 4A, Supplementary Material, Fig. S3). The common peak population shared by the two proteins consisted of 319 binding sites (3.8% of all DUX4 sites; 8.0% of all Dux sites) annotated to 235 individual genes (Supplementary Material, Table S3). We next used “Find Individual Motif Occurrences” (FIMO) software to identify the consensus binding motifs for DUX4 and Dux within the set of 319 common ChIP-Seq binding sites (38). Of these sequences, 78% (250/319) contained consensus motifs for both DUX4 and Dux, while 18% contained only one type of consensus motif (29/319 DUX4 only; 27/319 Dux only) (Supplementary Materials, Fig. S4, Table S4). Thirteen of the 319 common binding sites (4%) contained no consensus binding sites for either protein. As an example, the previously validated DUX4 target gene, ZSCAN4, contained 6 DUX4 binding sites located at positions +474, +496, +513, +561, +583 and +600. Four of these six DUX4 consensus binding sites in ZSCAN4 also matched the Dux consensus binding motif (+474, +513, +561 and +600) (Supplementary Materials, Fig. S4, Table S4).
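The annotation rule described above (a peak is assigned to a gene when it lies within 10,000 bp of that gene's TSS) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the `Peak` and `Gene` types, the gene names and all coordinates are hypothetical, and measuring distance from the nearest edge of the peak interval (rather than, say, its midpoint) is an assumption, since the text does not specify it.

```python
from dataclasses import dataclass

WINDOW_BP = 10_000  # window around the TSS stated in the text


@dataclass(frozen=True)
class Peak:
    chrom: str
    start: int
    end: int


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    tss: int


def distance_to_tss(peak: Peak, gene: Gene) -> int:
    """Distance in bp from the peak interval to the TSS (0 if the TSS is inside the peak)."""
    if gene.tss < peak.start:
        return peak.start - gene.tss
    if gene.tss > peak.end:
        return gene.tss - peak.end
    return 0


def annotate(peak: Peak, genes: list[Gene], window: int = WINDOW_BP) -> list[str]:
    """Names of genes on the same chromosome whose TSS is within `window` bp of the peak."""
    return [
        g.name
        for g in genes
        if g.chrom == peak.chrom and distance_to_tss(peak, g) <= window
    ]


# Hypothetical coordinates, for illustration only.
genes = [
    Gene("GENE_A", "chr1", 10_000),
    Gene("GENE_B", "chr1", 40_000),
    Gene("GENE_C", "chr2", 15_100),
]
peak = Peak("chr1", 15_000, 15_200)
print(annotate(peak, genes))  # ['GENE_A']
```

Because a single peak can fall within the window of more than one TSS, the function returns a list, which is consistent with 319 shared binding sites being annotated to a different number of genes (235).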
The presence of only the DUX4 consensus binding motif in genes bound by Dux in our ChIP-Seq experiment, and conversely, the presence of the Dux consensus binding motif in genes bound by DUX4, suggested both proteins were able to recognize either sequence. To confirm that DUX4 and Dux were capable of binding similar sequences, we co-transfected each DUX family protein into HEK293 cells along with a DUX4-responsive GFP reporter (39). We found that DUX4 and Dux, and not their DNA binding mutants, activated GFP expression while other DUX family proteins did not (Supplementary Material, Fig. S5) (39).
We also performed Ingenuity Pathway Analysis (IPA) to identify some broad categories of overlap between human DUX4- and mouse Dux-bound genes, as well as pathways unique to Dux alone (Supplementary Material, Table 5). Of the 235 genes annotated to the common binding sites, only 6 (ZSCAN4, GTF2F1, PALM, C19orf21/MISP, CLK1, PPIL3) were among the top 100 DUX4 and Dux annotated genes when sorted by ChIP-seq peak value (Fig. 4C). We performed a GO enrichment analysis on these 6 genes to identify their corresponding functions, which included transcriptional factor activity, kinase and isomerase activity, plasma membrane association and actin binding (Fig. 4C).
We next compared our annotated DUX4 binding sites in human myoblasts (referred to here as DUX4) with the raw data (FASTQ files) from the previously reported DUX4 ChIP-seq study (for distinction, we refer to this published dataset here as DUX4-G, where G = Geng) (33). We performed an identical analysis using our methodology, and identified 33,919 peaks for DUX4-G (whereas Geng, et al. reported 39,737 to 62,028 peaks using different analysis parameters). DUX4-G had ∼4 times as many called peaks compared to our DUX4 sample, but 71% (6,020 of 8,442) were overlapping between the two samples (Fig. 4A). This represented 88% (3,385) of all genes in our DUX4 gene list, with overlapping network functions (Supplementary Material, Table 5). These results support good reproducibility between the two datasets. Similar to our DUX4 and Dux comparison, the correlation analysis between all DUX4-G and Dux peaks (percent of total peaks in a sample) revealed that the majority (98.8 and 90.0%, respectively) of DUX4-G- and Dux-controlled genes were unique to each respective protein and only 396 binding sites overlapped (1.2% of all DUX4-G sites; 9.9% of all Dux sites) in the ChIP-Seq experiments. The percentage of overlapping binding sites was greater between DUX4-G and Dux compared to DUX4 (9.9 versus 8.0%), likely due to the more abundant number of peaks detected by Geng et al. for DUX4.
In an attempt to correlate ChIP-seq binding with gene activation, we initially used a candidate approach to validate some potential gene targets of DUX4 and Dux by QPCR. We first focused on expression of ZSCAN4, PRAMEF12, GTF2F1, SFRS8 (aka SFRS2B), NFYA, CWC15 and RFPL1 because these genes were bound by DUX4 and Dux in our ChIP-seq experiments and previously identified as DUX4-upregulated targets (33). We also measured levels of ANKRD1 (which had previously been shown to be significantly downregulated upon DUX4 over-expression), and TRAF6, which was bound by Dux but not DUX4 in our ChIP-seq study. To do this, we transfected human myoblasts with expression plasmids containing DUX4, DUX4.HOX1, DUX4-s, Dux, Dux.Hox.1 or GFP, harvested RNA, and performed quantitative RT-PCR using primers/probes for each respective human gene. We found that ZSCAN4 and PRAMEF12, which have been used as DUX4-responsive biomarkers, were absent or very low in normal human myoblasts expressing GFP, DUX4.HOX1, DUX4-s, and Dux.Hox1, but significantly upregulated 34,772-fold and 5,790-fold in cells expressing DUX4, and 13,338-fold and 4,665-fold in cells expressing Dux, respectively (Fig. 4D, Supplementary Material, Table S6A). In contrast, GTF2F1, RFPL1, SFRS8, NFYA and CWC15 were significantly upregulated only in DUX4 expressing cells (4.7-fold, 1685-fold, 5.9-fold, 5.1-fold and 5.4-fold, respectively) but not in Dux expressing cells despite the presence of Dux binding sites within each gene (Fig. 4D, Supplementary Material, Table S6A). Interestingly, ANKRD1 was significantly downregulated by DUX4, which was consistent with a prior study, but significantly upregulated in Dux expressing cells (33). Finally, TRAF6 was not significantly changed in response to any treatment (Fig. 4D, Supplementary Material, Table S6A).
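Fold changes of this kind are commonly derived from qPCR cycle-threshold (Ct) values with the comparative 2^-ΔΔCt calculation. The sketch below assumes that convention, with RPL13A as the reference gene and GFP-transfected cells as the control, as the figure legend describes. The text does not state the exact formula used, so this is an assumption, and the Ct values shown are hypothetical. The calculation also assumes 100% amplification efficiency for both target and reference assays.

```python
def fold_change(
    ct_target_treated: float,
    ct_ref_treated: float,
    ct_target_control: float,
    ct_ref_control: float,
) -> float:
    """Relative expression by the comparative Ct (2^-ddCt) calculation.

    "ref" is the reference gene (RPL13A in the text); "control" is the
    control sample (GFP-transfected cells in the text).
    """
    d_ct_treated = ct_target_treated - ct_ref_treated  # normalize to reference gene
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control                # normalize to control sample
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values, for illustration only.
print(fold_change(20.0, 18.0, 30.0, 18.0))  # 1024.0
```

A target that crosses threshold 10 cycles earlier in treated cells, with an unchanged reference gene, therefore corresponds to a 2^10 = 1,024-fold increase. Very large fold changes such as those reported for ZSCAN4 and PRAMEF12 arise when the target is near the detection limit in control cells.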
In a second candidate gene approach to validate our ChIP-seq data with QPCR, we investigated potential gene expression changes in the P53 pathway. We focused on the P53 pathway because we and others previously showed that DUX4 toxicity operated at least partly through activation of P53-associated cell death (21–23,40). A total of 33 genes in the P53 pathway had binding sites for DUX4, Dux, or both in human myoblasts (Supplementary Material, Table 6B). Using similar methodology to the experiment in Figure 4, we performed Taqman QPCR assays on 29 of these genes containing DUX4 or Dux binding sites. None were significantly changed following Dux over-expression, and only 2 were increased upon DUX4 treatment (SIRT1, BRIP1), while one was decreased (NBA1).
DUX4 and Dux upregulate common genes in human myoblasts

Common expression changes in DUX4- and Dux-expressing myoblasts. (A) Venn diagram of genes up- or down-regulated in human myoblasts transfected with DUX4 and Dux identified by RNA-seq. (B) Compilation of ChIP-seq and RNA-seq data. The fourteen genes listed were significantly upregulated by both DUX4 and Dux and contained ChIP-seq binding sites for both proteins within 10,000 bp upstream or downstream of the corresponding gene’s TSS. FC, fold-change. PANTHER description (http://pantherdb.org/). Asterisks indicate genes with identical binding sites in our ChIP-seq experiments. One asterisk (*) indicates the presence of one consensus binding site sequence, and two asterisks (**) indicate the presence of more than one consensus binding site sequence.
We next compared our ChIP-seq and RNA-Seq results to identify upregulated gene targets common to both DUX4 and Dux. For DUX4, 316 of the 3,244 upregulated genes (9.7%) contained ChIP-seq binding sites. The same analysis for Dux revealed that 33 of 226 upregulated genes had Dux ChIP-seq binding sites (15%). Of these, 14 were in common between DUX4 and Dux, including the DUX4/FSHD biomarkers PRAMEF12 and ZSCAN4, and ERBB4, which can promote apoptosis upstream of P53 (Fig. 5B, Supplementary Materials, Table S3) (41,42). In addition to ERBB4, we also found that DUX4 and Dux both significantly upregulated GADD45G, which could contribute to P53 pathway activation by activating p38 and JNK pathways (43,44), although neither protein had binding sites located in or near the GADD45G gene, suggesting an indirect activation.
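The comparison above amounts to intersecting gene sets: genes upregulated in RNA-seq with genes carrying a ChIP-seq binding site, first per protein and then across the two proteins. The sketch below illustrates that logic and reproduces the reported percentages from the counts given in the text. The gene identifiers in the toy sets are hypothetical placeholders, and the helper names are not from the study.

```python
def bound_and_upregulated(upregulated: set[str], bound: set[str]) -> set[str]:
    """Genes that are both upregulated (RNA-seq) and bound (ChIP-seq)."""
    return upregulated & bound


def percent(part: int, total: int) -> float:
    """Share of `total` represented by `part`, in percent."""
    return 100.0 * part / total


# Hypothetical gene identifiers, for illustration only.
dux4_hits = bound_and_upregulated({"G1", "G2", "G3", "G5"}, {"G1", "G2", "G3"})
dux_hits = bound_and_upregulated({"G2", "G3", "G4"}, {"G2", "G3", "G4", "G6"})
common = dux4_hits & dux_hits
print(sorted(common))  # ['G2', 'G3']

# Counts reported in the text.
print(round(percent(316, 3244), 1))  # 9.7
print(round(percent(33, 226)))       # 15
```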
A recent report showed that DUX4 required at least two binding sites to induce gene expression of reporter plasmids (45). To address this, we generated a list of genes upregulated by DUX4 containing ChIP-seq binding sites, and a similar list for Dux. We then used FIMO software to identify consensus-binding sequences within the ChIP-Seq merged peaks corresponding to genes from these lists (Supplementary Materials, Table S4). We sorted the gene lists by fold change and number of consensus binding sites found within the peak sequences (Supplementary Material, Table S10). We did not detect an obvious correlation between fold change upregulation and quantity of consensus binding sites within the peak sequence.
Discussion
Several systems for studying DUX4 expression have emerged in recent years, but in general they can be divided into two categories: those in which DUX4 arises from an endogenous D4Z4 locus, and DUX4 over-expression models (23,25–27,46–48). These “endogenous” models hold advantages as they allow the study of DUX4 from its natural cis-acting control elements, and recapitulate epigenetic phenotypes associated with FSHD. In contrast, they also present some difficulties, as DUX4 expression in these systems may be absent, very low, or alternatively, elevated but only in a small percentage of myonuclei (0.01 – 0.1%) (11,25,40,46–49). As a result, cytotoxic or myopathic phenotypes caused by DUX4 expression in “endogenous” systems may be completely lacking, very subtle, and/or inconsistent. Alternatively, over-expression methods permit more uniform and detectable DUX4 expression across a population of cells, in whole muscle groups, or throughout an organism. In addition, since elevated levels of DUX4 are considered a key component of FSHD pathogenesis, there is logic to modeling FSHD through DUX4 over-expression. Using this approach, numerous groups have shown that DUX4 is toxic to muscle and non-muscle cells of several vertebrate species, including zebrafish, frogs, mice and humans (20–22,24–28,32,33,46,50). For example, we found that DUX4 caused dose-dependent muscle lesions and weakness in adult mice transduced with AAV.DUX4 vectors (23). In contrast, a DUX4 DNA binding domain mutant (DUX4.HOX1) was benign when over-expressed in human cells, mouse muscle and developing zebrafish, suggesting that DUX4 toxicity was specifically related to its ability to bind DNA and activate downstream deleterious pathways (23). Taken together, these data could be interpreted in two opposing ways. 
On one hand, the conserved toxic properties of DUX4, as well as the absence of toxicity from the DUX4.HOX1 mutant, support the hypothesis of DUX4 causation in FSHD, and seemed to justify the use of over-expression methodology in human and non-human model systems alike. Moreover, these data suggested DUX4 toxicity was associated with its transcription factor function, and unrelated to non-specific protein overload within target cells, which has been reported for otherwise benign proteins such as GFP (51). On the other hand, DUX4 is a primate-specific gene, so a case could be made that the conservation of DUX4 toxicity in zebrafish, frogs, mice and humans would be unlikely to occur through similar mechanisms, and therefore suggested that data obtained from DUX4 over-expression studies should be cautiously interpreted for its specificity and relevance to FSHD. This interpretation of the current data is based on the notion that animals without DUX4 would lack selective pressure to maintain DUX4 binding sites within critical pathways, and would therefore not co-evolve DUX4-regulated networks.
If the toxic effects of DUX4 were related to its ability to transactivate specific genes, it is difficult to envision that DUX4 would control the same genes in primates as it would in non-primates, which lack DUX4. Nevertheless, over-expressing DUX4 would certainly impact gene expression in each of these organisms; the DUX4 consensus binding site is an 11 nucleotide (nt) motif occurring with a random frequency of 1 in 3,815,000, meaning that the zebrafish, mouse and primate genomes would contain about 393, 708 and 786 DUX4 consensus sites based upon chance alone. Although not all sites would impact gene expression - since transcription factors and their binding sites may be regulated by transcriptional co-factors, chromatin status and position near a promoter or enhancer region - we considered the possibility that the conserved toxic properties of DUX4 could be related to non-specific, promiscuous transcription caused by over-expressing a DUX family transcription factor in general. Our results did not support this hypothesis, as only mouse Dux showed toxic phenotypes similar to DUX4 (Figs. 2 and 3). Together, these data demonstrated that DUX family members were not universally toxic, and suggested that human DUX4 and mouse Dux could be functionally homologous. If true, this has positive implications for studying DUX4 and FSHD in mice but does not explain DUX4 toxicity in zebrafish or frogs, since they do not contain D4Z4 repeats or express DUX4-like proteins.
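The chance-occurrence estimate above is a simple product of genome length and per-position motif frequency. The sketch below reproduces the three figures quoted in the text. The frequency of 1 in 3,815,000 is taken from the text, but the genome sizes (about 1.5, 2.7 and 3.0 Gb) are assumptions chosen here because they are consistent with the quoted counts; the text does not state which values were used. The estimate treats positions as independent with a uniform site frequency and counts one strand only.

```python
SITE_FREQUENCY = 1 / 3_815_000  # random frequency of the 11-nt motif, from the text

# Approximate haploid genome sizes in bp (assumed; not stated in the text).
ASSUMED_GENOME_BP = {
    "zebrafish": 1.5e9,
    "mouse": 2.7e9,
    "primate": 3.0e9,
}


def expected_sites(genome_bp: float, frequency: float = SITE_FREQUENCY) -> float:
    """Expected number of chance motif occurrences in a genome of `genome_bp` bases."""
    return genome_bp * frequency


for species, size in ASSUMED_GENOME_BP.items():
    print(species, round(expected_sites(size)))
# zebrafish 393
# mouse 708
# primate 786
```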
D4Z4 repeats have been identified in at least 11 different placental mammal species, including humans and rodents, with the greatest conservation occurring in the DUX4 or DUX4-like ORF (30,31,52). Indeed, non-primate and human D4Z4 sequences have almost no nucleotide similarity outside the ORF. Nevertheless, evolutionary studies suggest close DUX4 homologs exist only in primates, while the mouse Dux gene is considered a distantly related paralog that emerged independently of primate DUX4 by retrotransposition from a common progenitor (DUXC), which has since been lost in primates and rodents (30,31,52). Although the two paralogs would have been subject to different selective pressures over millions of years, the shared toxic phenotypes could have resulted from maintenance of some primordial role of their DUXC progenitor. Unfortunately, we did not directly test the potential toxicity of DUXC in our system. The DUXC ORF has been suggested to exist in dogs and cows, and although we were able to isolate DUXC clones from the dog, we were never able to identify complete open reading frames to determine if DUXC was toxic like Dux and DUX4 (Supplementary Material, Fig. S6). However, we performed ChIP-seq and RNA-seq to explore the hypothesis that DUX4 and Dux could have maintained some conserved gene regulation function. We found that DUX4 and Dux have similar, but not identical, consensus binding motifs and share only a small percentage of binding sites within the human myoblast genome (4 – 8% of all sites bound by DUX4 and Dux, respectively). Our overall analysis of ChIP-Seq binding sites reveals a unique population is bound by Dux (Fig. 4A). Moreover, DUX4 had 2X as many binding sites as Dux, and activated 14X more genes in the same myoblast cells. Considering the genes are paralogous and evolved in different species, it is perhaps not surprising that the human cells were more responsive to the human DUX4 gene than to mouse Dux. 
Nevertheless, although a relatively small number of genes were activated by mouse Dux, 80% of these were also activated by DUX4. Thus, combined, our in vitro and in vivo toxicity experiments, and ChIP-seq and gene expression data, suggested that DUX4 and Dux may share some partial functional homology. We identified 14 genes that are bound and upregulated by both DUX4 and Dux, including ERBB4, which can promote cell death through P53 signalling (Fig. 5B) (41,42). Another potentially important candidate gene involved in P53-mediated cell death was GADD45G, which was significantly increased in both DUX4- and Dux-expressing cells, but did not appear to be a direct transcriptional target of either protein (Supplementary Material, Table 7B) (43,44). We are interested in P53 signalling because we previously showed that DUX4 overexpression in human HEK293 cells and adult mouse muscle caused cytotoxicity or myopathy that was dependent on its ability to bind DNA and indirectly activate the p53 signalling pathway (23). Several other groups reported similar findings in different systems (21,22,28,40). In future studies, we will explore the potential role of ERBB4 and GADD45G alone, or in combination, as well as other genes commonly regulated by DUX4 and Dux, as part of the mechanism underlying DUX4- and Dux-associated toxicity. However, we also note that our previous investigation of the P53 pathway utilized intact, whole mouse muscles as a model system, and it is therefore possible that DUX4 and Dux could regulate different genes in myoblasts versus myotubes or whole muscle. Thus, it may be prudent to explore other cell types, including differentiated human muscle cells.
Regarding another potential mechanism of toxicity, one hypothesis involves DUX4 activation of the DEFB103B gene, which was proposed to modulate innate immunity and inhibit muscle differentiation. Although 88% of the genes changed in our DUX4 ChIP-seq were also present in the prior report that identified DEFB103B as a potentially important gene in FSHD pathogenesis, in our study DEFB103B was not bound by DUX4 or Dux, nor was its expression significantly changed. In contrast, several previously reported DUX4 target genes were upregulated in both DUX4- and Dux-treated cells, including ZSCAN4, PRAMEF1 and TRIM43, thereby further supporting the validity of our overall methods (Fig. 4D, Supplementary Material, Table S7).
By comparing the structure of the various DUX family members, we gained additional insight into the domains mediating DUX4 and Dux toxicity. First, we showed that conserved residues within the first DNA binding domain (residues 65-70 in mouse Dux) were critical for Dux-associated toxicity, which was an identical finding to what we reported for DUX4. Importantly, these conserved residues are also present in both the HOX1 and HOX2 homeodomains of all DUX family members analysed here, including non-toxic DUX1 and DUX5, which have 81-95% conservation with the DUX4 homeodomains (Fig. 1B). These results suggested that the high homology between the DNA binding domains is not predictive of their toxicity. Indeed, among the DUX family genes tested, DUX4 and Dux had the least amount of homology within their homeodomains but were the only ones with toxic properties. The fact that DUX4 and Dux had similar consensus binding motifs but few overlapping binding sites could suggest that other transcriptional co-factors play a role in the binding specificity and function (i.e. toxicity) of the various DUX family proteins. We hypothesize that the variable carboxy-terminal regions of the respective DUX family proteins could play a role in mediating protein interactions required for efficient transactivation function (Fig. 1A and B). Indeed, overall the DUX4 and Dux carboxy-terminal domains are divergent, but their extreme C-terminal residues show high conservation, indicating a common functional role, including in stimulating the toxic phenotypes reported here (Fig. 1C) (31). None of the other non-toxic DUX family members contain this extreme C-terminal motif, including the benign DUX4-s isoform that has the same homeodomains as DUX4 but lacks this carboxy-terminal motif (Figs. 2 and 3). In addition, the interactions of DUX protein carboxy-terminal domains could result in their tethering to particular chromatin regions, which is necessary for temporal and spatial patterning (53–55).
We now know that DUX4 recruits the histone acetyltransferase p300 to chromatin, thereby enhancing transcriptional activation of target genes, including ZSCAN4 (56). By triggering an increase in acetylation at particular regions of the chromatin, DUX4 is hypothesized to function as a pioneer transcription factor (56). Conservation between DUX4 and Dux within their corresponding carboxy-terminal motifs suggests that interactions with co-activators, like p300, could be conserved as well. Further studies should be designed to address whether Dux is capable of facilitating H3K27 acetylation at particular chromatin regions and in turn promoting gene expression, as recent reports have described for DUX4 (56). Finally, we note that our list of DUX family members excluded DUX4c, which is identical to DUX4 in the double homeodomain region but divergent in the C-terminus. DUX4c shares some common protein binding partners with DUX4 (57), but its over-expression causes muscle differentiation defects and not cell death (58,59). Future work should include domain-switching experiments to determine which C-terminal residues are important for the toxic function of DUX4 and Dux, as well as to identify proteins that could interact with these motifs.
In conclusion, we reported here that DUX family members are not universally toxic, and only mouse Dux caused toxic phenotypes similar to those seen in cells and tissues over-expressing DUX4. Like DUX4, Dux-associated toxicity required residues important for DNA binding in HOX1. Finally, mouse Dux and human DUX4 showed largely divergent patterns of potential gene regulation based upon their ChIP-seq binding sites, but 80% of the genes upregulated by Dux were also upregulated by DUX4 in human myoblasts. These results suggest that DUX4 and Dux may share some partial functional overlap.
Materials and Methods
Protein alignment
Protein sequences of all DUX proteins were aligned and compared using EMBOSS Needle (http://www.ebi.ac.uk/Tools/psa/emboss_needle/). The following UniProt/Swiss-Prot identifications were used: DUX4_HUMAN (Accession Q9UBX2), A1JVI8_MOUSE (Accession A1JVI8), Q7TNE6_MOUSE (Accession Q7TNE6), DUXA_HUMAN (A6NLW8), DUX1_HUMAN (O43812) and DUX5_HUMAN (Q96PT3) for the amino acid sequence alignments of DUX4, Dux, Duxbl, DUXA, DUX1 and DUX5, respectively.
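As an illustration of the kind of global pairwise alignment that EMBOSS Needle performs, the following is a minimal Needleman-Wunsch sketch. It is not the tool used in this study: it assumes a simple match/mismatch score and a linear gap penalty (all three values are hypothetical), whereas EMBOSS Needle uses a substitution matrix and affine gap penalties, so scores and identities will differ from those reported by the tool.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with a linear gap penalty (illustrative scoring only).

    Returns (score, aligned_a, aligned_b).
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for a[i-1] vs b[j-1] (1-based matrix indices).
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming matrix.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical positions as a percentage of alignment length (gaps included)."""
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)
```

For real protein comparisons, the sequences would be retrieved by the accession numbers listed above and aligned with the published tool; this sketch only shows the underlying dynamic-programming idea.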
Cell culture
Human embryonic kidney cells (HEK293) were cultured in DMEM supplemented with 10% fetal bovine serum, L-glutamine and penicillin/streptomycin at 37 °C in 5% CO2. Human immortalized myoblasts (WS236, 15V, bicep, unaffected control cells) (19) were cultured in media containing DMEM supplemented with 16% Medium 199, 15% fetal bovine serum, 30 ng/ml zinc sulphate, 1.4 µg/ml vitamin B12, 55 ng/ml dexamethasone, 2.5 ng/ml human growth factor, 10 ng/ml fibroblast growth factor and 20 mM HEPES. Cells were maintained as myoblasts and were not differentiated for any of the described experiments.
Chromatin immunoprecipitation and sequencing (ChIP-Seq)
Human immortalized myoblasts were transfected with plasmids encoding C-terminal epitope-tagged DUX4 and Dux using the Human Dermal Fibroblast Nucleofector kit (Amaxa). Cells were fixed 24 h post-transfection by the addition of 1% formaldehyde for 15 min, and fixation was quenched with 0.125 M glycine. Chromatin was isolated by the addition of lysis buffer, followed by disruption with a Dounce homogenizer. Lysates were sonicated and DNA sheared to an average length of 300-500 bp. Genomic DNA (Input) was prepared by treating aliquots of chromatin with RNase, proteinase K and heat for de-crosslinking, followed by ethanol precipitation. Pellets were resuspended and the resulting DNA was quantified on a NanoDrop spectrophotometer. An aliquot of chromatin (30 µg) was precleared with protein A agarose beads (Invitrogen). Genomic DNA regions of interest were isolated using 4 µg of antibody against V5-tag (Abcam ab15828). Complexes were washed, eluted from the beads with SDS buffer, then subjected to RNase and proteinase K treatment. Crosslinks were reversed by incubation overnight at 65 °C, and ChIP DNA was purified by phenol-chloroform extraction and ethanol precipitation.
Illumina sequencing libraries were prepared from the ChIP and Input DNA by the standard consecutive enzymatic steps of end-polishing, dA-addition and adaptor ligation. After a final PCR amplification step, the resulting DNA libraries were quantified and sequenced on HiSeq 2500. Sequences (50 nt reads) were aligned to the human genome (hg19) using the BWA algorithm (60). Duplicate reads were removed and uniquely mapped reads (mapping quality ≥ 25) were used for further analysis. Alignments were extended in silico at the 3’-ends to a length of 200 bp, which is the average genomic fragment length in the size-selected library, and assigned to 32-nt bins along the genome. The resulting histograms were stored in BAR and bigWig files. Peak locations for DUX4 and Dux were determined using the MACS algorithm (v1.4.2) with a cutoff of P-value = 1e-6 (36). Signal maps and peak locations were used as input data to Active Motif’s (http://www.activemotif.com/) proprietary analysis program, which created Excel tables containing detailed information on sample comparison, peak metrics, peak locations and gene annotations. The data discussed in this publication have been deposited in the NCBI’s Gene Expression Omnibus and are accessible through GEO Series accession number GSE75791 (61).
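The read-extension and binning step described above can be sketched as follows. This is an illustrative reconstruction, not the analysis pipeline used for this study (which was proprietary): the 200 bp fragment length and 32-nt bin size come from the text, while the function names, the 0-based half-open coordinate convention and the rule that a fragment is counted in every bin it overlaps are assumptions.

```python
from collections import Counter

FRAGMENT_LEN = 200  # bp, average fragment length stated in the text
BIN_SIZE = 32       # nt, bin size stated in the text


def extend_read(start, end, strand, fragment_len=FRAGMENT_LEN):
    """Extend an aligned read at its 3' end to fragment_len.

    Coordinates are 0-based, half-open (an assumption of this sketch).
    """
    if strand == "+":
        # 5' end is at `start`; the fragment extends towards higher coordinates.
        return start, start + fragment_len
    # Minus strand: 5' end is at `end`; extend towards lower coordinates.
    return max(0, end - fragment_len), end


def bin_coverage(reads, bin_size=BIN_SIZE):
    """Count extended fragments overlapping each bin.

    reads: iterable of (chrom, start, end, strand) tuples.
    Returns a Counter keyed by (chrom, bin_index).
    """
    counts = Counter()
    for chrom, start, end, strand in reads:
        frag_start, frag_end = extend_read(start, end, strand)
        first_bin = frag_start // bin_size
        last_bin = (frag_end - 1) // bin_size
        for b in range(first_bin, last_bin + 1):
            counts[(chrom, b)] += 1
    return counts
```

The resulting per-bin counts correspond to the kind of signal histogram that would then be written to a track file and passed to a peak caller.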
Our transfection conditions differed from those in the Geng et al. report in the following respects. We transfected unaffected control immortalized myoblasts with plasmids encoding C-terminal epitope-tagged DUX4 or Dux using a nucleofector kit, whereas their group transduced unaffected control primary myoblasts using a lentivirus expressing DUX4. Prior to ChIP-Seq, we immunoprecipitated DUX4 using an antibody targeting the C-terminal V5 tag, whereas Geng et al. immunoprecipitated DUX4 using polyclonal rabbit anti-sera targeting DUX4. In both cases, cells were harvested 24 h post-transfection or -transduction.
RNA sequencing (RNA-seq)
Human immortalized myoblasts were transfected identically to the cells described above for ChIP-Seq. Following assessment of total RNA quality using the Agilent 2100 Bioanalyzer and RNA Nano Chip kit (Agilent Technologies, CA), RNA was DNase treated and 1.5 µg was subjected to rRNA depletion with the Ribo-Zero™ rRNA removal kit for human/mouse/rat (Illumina). To generate directional signal in RNA-seq data, libraries were constructed from first strand cDNA using the ScriptSeq™ v2 RNA-Seq library preparation kit (Epicentre Biotechnologies, WI). Briefly, 50 ng of rRNA-depleted RNA was fragmented and reverse transcribed using random primers containing a 5’ tagging sequence, followed by 3’ end tagging with a terminal-tagging oligo to yield di-tagged, single-stranded cDNA. Following purification by a magnetic-bead based approach, the di-tagged cDNA was amplified by limited-cycle PCR using primer pairs that anneal to tagging sequences and add adaptor sequences required for sequencing cluster generation. Amplified RNA-seq libraries were purified using the AMPure XP System (Beckman Coulter). Library quality was determined on an Agilent 2200 TapeStation using High Sensitivity D1000 ScreenTape, and libraries were quantified with a Qubit fluorometer using the dsDNA BR assay (Invitrogen by Thermo Fisher Scientific). Paired-end 150 bp sequence reads were generated using the Illumina HiSeq 4000 platform. The data discussed in this publication have been deposited in the NCBI’s Gene Expression Omnibus and are accessible through GEO Series accession number GSE85935 (61).
Real-time polymerase chain reaction
Human myoblasts (500,000 cells, n = 3) were transfected with 6 µg of AAV.DUX4.V5, AAV.Dux.V5, AAV.DUX1.V5, AAV.DUX5.V5, AAV.DUXA.V5, AAV.Duxbl.V5, AAV.Dux.Hox1.V5 or AAV.eGFP plasmid using the NHDF Nucleofector Kit (Lonza). Cells were harvested in TRIzol RNA isolation reagent (Life Technologies) 18–20 h post-transfection. RNA was isolated, DNase treated and reverse transcribed into cDNA using High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems). TaqMan gene expression assays (Applied Biosystems) were used to quantify human RPL13A (Hs01494366_g1), PRAMEF12 (Hs04193637_mH), ZSCAN4 (Hs00537549_m1), NFYA (Hs00953589_m1), SRSF8 (Hs00259455_s1), CWC15 (Hs00982912_m1), GTF2F1 (Hs00157845_m1), RFPL1 (Hs00798084_s1), ANKRD1 (Hs00173317_m1) and TRAF6 (Hs00939745_g1). Efficiencies were comparable among all probes. RPL13A was used as the reference gene. The normalized expression (ΔΔCq) was calculated relative to control GFP transfected cells.
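The ΔΔCq normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and Cq values are hypothetical, and the conversion to fold change as 2^(-ΔΔCq) assumes roughly 100% amplification efficiency for all probes, which the text does not state explicitly beyond noting that efficiencies were comparable.

```python
def delta_delta_cq(target_sample, ref_sample, target_control, ref_control):
    """Return the ddCq for a target gene.

    target_* are Cq values for the gene of interest, ref_* are Cq values for
    the reference gene (RPL13A in the text); *_sample is the test condition
    and *_control is the GFP-transfected control.
    """
    d_cq_sample = target_sample - ref_sample      # dCq in the test sample
    d_cq_control = target_control - ref_control   # dCq in the control
    return d_cq_sample - d_cq_control


def fold_change(ddcq):
    """Relative expression, assuming one doubling per cycle (an assumption)."""
    return 2.0 ** (-ddcq)


# Hypothetical Cq values, for illustration only (not data from this study).
ddcq = delta_delta_cq(target_sample=24.0, ref_sample=18.0,
                      target_control=30.0, ref_control=18.0)
print(ddcq, fold_change(ddcq))
```

A negative ΔΔCq indicates that the target reached threshold earlier in the test sample than in the control, relative to the reference gene, which corresponds to higher expression.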
Western blot protein visualization
HEK293 cells (900,000 cells) were transfected with 1.6 μg of AAV.DUX4.V5, AAV.Dux.V5, AAV.DUX1.V5, AAV.DUX5.V5, AAV.DUXA.V5, AAV.Duxbl.V5 or AAV.Dux.Hox1.V5 plasmid using Lipofectamine 2000 reagent (Thermo Scientific). Cells were harvested in M-PER Mammalian Protein Extraction Reagent (Thermo Scientific) 24 h post-transfection. Total protein was quantified using the DC Protein Assay (Bio-Rad), and 15 µg was separated on a 12% SDS-polyacrylamide gel. Protein was visualized by western blot using an anti-V5 antibody (Invitrogen; R961-25).
Luminescent cell viability assay
C2C12 cells (32,650 cells/well) were transfected (n = 3) with 400 ng AAV.DUX4.V5, AAV.DUX4.HOX1, AAV.Dux.V5, AAV.DUX1.V5, AAV.DUX5.V5, AAV.DUXA.V5, AAV.Duxbl.V5, AAV.Dux.Hox.V5 or pCI-neo plasmid using Lipofectamine 2000 (Thermo Scientific) and plated simultaneously on 96-well plates. Abundance of ATP was monitored using the CellTiter-Glo Luminescent Cell Viability Assay (Promega). Luminescence, proportional to the abundance of ATP, was measured 48 h post-transfection using a luminescent plate reader (GloMax microplate luminometer, Promega) after 30 min. Data were reported as mean luminescence with the background (cells only) value removed.
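The background correction described above amounts to subtracting the cells-only signal from the mean luminescence of each transfected group. A minimal sketch follows; the function name and the choice to average the background wells are assumptions, and the numbers in the usage line are hypothetical.

```python
from statistics import mean


def background_corrected_mean(sample_wells, background_wells):
    """Mean luminescence of replicate wells minus the mean cells-only signal."""
    return mean(sample_wells) - mean(background_wells)


# Hypothetical relative light units for n = 3 wells, for illustration only.
print(background_corrected_mean([1200, 1000, 1100], [100, 100, 100]))
```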
DUX4-activated reporter in cells
HEK293 cells (1 × 10^6 cells/well) were cotransfected with 600 ng of AAV.DUX4.V5, AAV.DUX4.HOX1, AAV.Dux.V5, AAV.Dux.Hox or pCI-neo plasmid and 600 ng of the previously reported pLenti.DUX4-activated GFP reporter plasmid (39) using Lipofectamine 2000 (Thermo Scientific) and plated simultaneously on 6-well plates. GFP expression was visually monitored 24 h post-transfection with a fluorescent stereo microscope (Leica M165 FC, Leica Microsystems).
Adeno-associated virus injections
Six- to eight-week-old C57BL/6 females received 3e9 or 3e10 DNase resistant particles (DRP) of adeno-associated virus (AAV) 6 bilaterally via direct 50 µl intramuscular injection into the tibialis anterior (TA), and muscles were harvested 2 or 4 weeks post-injection (n = 10 muscles per group at each time point and dose). In vivo transduction was determined by V5 immunofluorescence staining as described below. To ensure repeatability of data, two different AAV lots were produced per vector, such that two cohorts of animals were used with n = 5 per cohort per dose.
Histological analysis
Tibialis anterior (TA) muscles were dissected from intramuscularly injected mice at the indicated times post-injection for histological analysis (n = 10 muscles per group at each time point at 3e9 or 3e10 DRP units). Ten-micron cryosections were H&E stained as previously described (23). Fibre diameter and central nuclei quantifications were determined from muscles 4 weeks post-injection (n = 5 muscles per group; 5 representative ×20 photomicrographs per section), using cellSens Standard Software (Olympus, Tokyo, Japan). Fibre diameter histograms represent the percentage of total fibres analysed.
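A fibre diameter histogram expressed as a percentage of total fibres can be computed as in the sketch below. The function name and the 10 µm bin width are hypothetical choices for illustration; the text does not state the bin width used, and the measurements themselves were made in cellSens rather than in code.

```python
from collections import Counter


def diameter_histogram_percent(diameters_um, bin_width_um=10.0):
    """Return {bin_lower_edge_um: percent_of_total_fibres}.

    Each fibre is assigned to the bin [edge, edge + bin_width_um).
    """
    if not diameters_um:
        return {}
    counts = Counter(int(d // bin_width_um) * bin_width_um for d in diameters_um)
    total = len(diameters_um)
    return {edge: 100.0 * n / total for edge, n in sorted(counts.items())}


# Hypothetical diameters in micrometres, for illustration only.
print(diameter_histogram_percent([12, 18, 25, 41]))
```

Expressing each bin as a percentage rather than a raw count allows groups with different numbers of measured fibres to be compared on the same axis.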
Immunohistochemistry
Gene expression and subcellular localization of all Dux proteins were visualized using V5 immunofluorescence as previously described (23).
Statistical analysis
All statistical analyses (Luminescent Cell Viability Assay, in vivo histological analysis, RT-PCR) were performed in GraphPad Prism 5 (GraphPad Software, La Jolla, CA) using indicated statistical tests.
Supplementary Material
Supplementary Material is available at HMG online.
Acknowledgements
We thank members of the Harper laboratory for assistance and support, Dr. Yi-Wen Chen for mouse Dux cDNA, Dr. Alexandra Belayew for DUX1 plasmid, Dr. Daniel G. Miller for the DUX4-responsive reporter plasmid, TRINCH Viral Vector Core Facility members for assistance with AAV production and the University of Massachusetts Wellstone Muscular Dystrophy Program for providing the human immortalized myoblasts used in our studies. ChIP-seq analysis was performed under contract by ActiveMotif.
Conflict of Interest statement. None declared.
Funding
This work was supported by grants from the Chris Carrino Foundation; the National Institutes of Health [5R01AR062123-04 to S.Q.H.]; Wellstone Muscular Dystrophy Centers at Nationwide Children’s Hospital [U54HD066409 to C.R.G.] and University of Massachusetts [5U54HD060848-09 to S.Q.H.]; the National Center for Advancing Translational Sciences Award [TL1TR001069 to C.R.G.]; FSH Society [FSHS-MGBF-020 to J.O.E.]; OSU/NCH Center for Muscle Health and Neuromuscular Disorders [5T32NS077984-02 to J.O.E.] and [82132515 to C.R.G.]. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Center for Advancing Translational Sciences or the National Institutes of Health. Funding to pay the Open Access publication charges for this article was provided by the National Institutes of Health [5R01AR062123-04].
References
Author notes
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.