Mohamad Karaky, María Fedetz, Victor Potenciano, Eduardo Andrés-León, Anna Esteve Codina, Cristina Barrionuevo, Antonio Alcina, Fuencisla Matesanz, SP140 regulates the expression of immune-related genes associated with multiple sclerosis and other autoimmune diseases by NF-κB inhibition, Human Molecular Genetics, Volume 27, Issue 23, 1 December 2018, Pages 4012–4023, https://doi.org/10.1093/hmg/ddy284
Abstract
The SP140 locus has been associated with multiple sclerosis (MS) as well as other autoimmune diseases by genome-wide association studies (GWAS). The causal variant of these associations (rs28445040-T) alters the splicing of SP140 gene transcripts, reducing protein expression. We aimed to understand why the reduction of SP140 expression produced by the risk variant can increase susceptibility to MS. To this end, we used RNA sequencing (RNA-seq) to determine the genes differentially expressed after SP140 silencing in lymphoblastoid cell lines (LCLs). We analyzed these genes by gene ontology (GO), comparative transcriptome profiles, enrichment of transcription factors (TFs) in the promoters of these genes and colocalization with GWAS risk variants. We also monitored the activity of the nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB) in SP140-silenced cells with a luciferase reporter system. We identified 100 genes that were up-regulated and 22 genes that were down-regulated in SP140-silenced LCLs. GO analysis revealed that genes affected by SP140 were involved in regulation of cytokine production, inflammatory response and cell–cell adhesion. We observed enrichment of NF-κB in the promoters of up-regulated genes and increased NF-κB activity in SP140-silenced cell lines. Genes regulated by SP140 were enriched in GWAS-detected risk loci for MS (14.63-fold), Crohn’s disease (4.82-fold) and inflammatory bowel disease (4.47-fold); this enrichment was not observed in other unrelated immune diseases. Our findings show that SP140 is an important repressor of genes implicated in inflammation, suggesting that decreased expression of SP140, promoted by the rs28445040-T risk variant, may lead to up-regulation of these genes in B cells through reduced inhibition of NF-κB.
Introduction
Genome-wide association studies (GWAS) have identified more than 300 susceptibility loci for autoimmune diseases. Many of these associated regions overlap across different diseases (1), which suggests that common pathological mechanisms underlie autoimmune diseases. One of these loci, which has been associated with multiple sclerosis (MS) (2), Crohn’s disease (CD) (3,4) and inflammatory bowel disease (IBD) (5), is the SP140 encoding region. All the described risk variants are in high linkage disequilibrium, indicating that the same causal variant could be responsible for all the described disease associations. We recently identified the causal variant of this association in MS. It is an SNP that alters the splicing of the seventh exon of the SP140 gene, producing a transcript that lacks this exon and decreasing the expression of the full-length transcripts and the encoded protein in the blood of patients with MS who carry the associated variant (6).
SP140 is a nuclear protein belonging to the speckled protein (SP) family, composed of several members implicated in transcriptional regulation. SP family members are mainly expressed in leukocytes (7,8) and their expression is strongly enhanced by interferons and other viral infection-related stimuli (9). All members of the SP family have a multidomain structure typical of proteins able to bind to chromatin. They harbor a plant homeodomain (PHD) finger that can read the N-terminal tail of histone H3 (10) and a bromodomain at the C-terminus, which recognizes acetylated lysine residues such as those on the N-terminal tails of histones (11). PHD-Bromo cassettes of Sp100C and Sp140 have been shown to be readers of unmethylated histone H3 Lys4, establishing a direct chromatin targeting function that may regulate transcriptional gene silencing (12). They also contain a nuclear localization signal, a dimerization domain (heat shock response (HSR) or caspase-recruitment domain (CARD)) and a SAND domain, an 80-residue sequence present in Sp100, AIRE-1, NucP41/75 and DEAF-1/suppressin proteins and found in nuclear proteins, many of which function in chromatin-dependent transcriptional control (13). The presence of chromatin-related modules in the SP140 protein, as well as its homology with chromatin-mediated regulators such as the autoimmune regulator (AIRE) and the other genes of the SP family, suggests that SP140 regulates gene expression (14). Recent work has shown that SP140 is essential for transcription programs that maintain the status of macrophages (15).
MS is an autoimmune disease of the central nervous system (CNS). It has been considered a T cell-mediated disease because, among other evidence, experimental autoimmune encephalomyelitis (EAE), the animal model of MS, can be induced by transferring T cells from an EAE myelin-induced animal. However, there is substantial evidence for an important role of B cells in the pathology of MS, such as the detection of intrathecal synthesis of oligoclonal immunoglobulins (oligoclonal bands) in most patients with MS, the detection of B cell populations in the lesions of patients with MS and the presence of meningeal B cell follicles in patients with secondary-progressive MS, associated with cortical demyelination and atrophy (16–18). Furthermore, in the past few years, the essential role of B cells in MS pathology has become more evident owing to the success of B cell-depleting therapies for MS (19). The functional effect of MS-associated variants on genes that are expressed mainly in B lymphocytes, such as the SP140 gene, can be essential to understand the relevance of this variant in MS and to delineate the pathways implicated in MS pathology. In this work, we focused on identifying the genes altered by the reduction of SP140 expression in lymphoblastoid cells, the biological processes affected by the SP140-regulated genes and their implication in pathways that have been demonstrated to be involved in the development of MS and other autoimmune diseases.
Results
SP140 repressed a set of genes involved in inflammatory response, cytokine production and cell–cell adhesion
To mimic the reduction of SP140 expression that takes place in lymphoblastoid cell lines (LCLs) bearing the rs28445040-T variant, we knocked down SP140 in LCLs by electroporation with small interfering RNA (siRNA), achieving a 6-fold reduction of the SP140 full-length and spliced transcripts and a 2.5-fold reduction in the expression of the protein (Fig. 1A and B).

Genes regulated by SP140. LCLs were transfected with an siRNA non-targeting sequence (siNT) or siRNA sequence targeting SP140 (siT). (A) Plots representing mRNA expression of the SP140 full transcript and the spliced transcript lacking exon 7 (SP140Δ7) after 48 h of transfection measured by ddPCR (copies/μL). (B) Relative quantification of SP140 protein was performed by densitometry of SP140 protein bands in western blots of siNT and siT nuclear extracts after 48 h of siRNA transfection (n = 2). Loading control was performed by quantification of nuclear extract proteins and western blots of Lamin B. (C) Quantitative ddPCR analysis of the expression of six of the SP140-regulated genes in control and SP140-silenced cells (copies/μL). (D) Quantification of CCL4 in supernatants of LCLs after transfection with the siRNAs by ELISA.
To identify the genes that could be altered by the reduction of SP140 expression at genome-wide scale, we performed RNA sequencing (RNA-seq) gene expression analysis in triplicate in LCLs treated with an SP140-target siRNA or a control non-target siRNA. A strong batch effect was observed in the principal component analysis (PCA) and the hierarchical clustering of the samples (Supplementary Material, Fig. S1), likely due to different processing times. After correcting for the batch effect in the model design for differential expression analysis, we identified 100 genes that were up-regulated and 22 genes that were down-regulated with a false discovery rate (FDR) < 0.05 (Supplementary Material, Table S1).
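The FDR threshold above can be illustrated with a short sketch. The text does not state which FDR procedure was used, so the Benjamini-Hochberg adjustment below is an assumption, and the gene names and p-values are hypothetical placeholders rather than study data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene p-values (not taken from the study).
raw_p = {"geneA": 0.0002, "geneB": 0.01, "geneC": 0.03, "geneD": 0.6}
fdr = dict(zip(raw_p, benjamini_hochberg(list(raw_p.values()))))

# Genes passing the FDR < 0.05 cut-off used in the text.
significant = [gene for gene, q in fdr.items() if q < 0.05]
print(significant)
```

In practice a differential expression package would handle both the batch term in the model and the multiple-testing correction; the sketch only shows how an FDR cut-off selects genes.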
The RNA-seq data confirmed the 6-fold reduction of SP140 gene expression. We validated the results of the differential gene expression analysis obtained from RNA-seq data by reverse transcription and droplet digital polymerase chain reaction (ddPCR) analysis. The expression of eight genes was analyzed in two biological replicates of SP140-target siRNA and a non-specific control. The selected genes showed a similar expression pattern in the ddPCR analysis to that observed with the RNA-seq data (Fig. 1C). The statistical analysis also showed very good correspondence (R² = 0.9657) between the results of the ddPCR and RNA-seq analyses (Supplementary Material, Fig. S2). In addition, we quantified the chemokine CCL4 in the supernatant of LCLs at 24, 48 and 72 h after SP140 silencing, obtaining an 11% increase in the concentration of secreted CCL4 after 48 h (Fig. 1D).
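As a rough illustration of how such a correspondence can be quantified, the sketch below computes the squared Pearson correlation between two sets of fold changes. The values are hypothetical, and whether the original comparison used log-transformed fold changes is not stated in the text, so the log2 scale here is an assumption.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy ** 2) / (sxx * syy)


# Hypothetical log2 fold changes for the same genes measured by two methods.
rnaseq_lfc = [1.8, 1.2, 0.9, -1.1, 2.4, 0.6]
ddpcr_lfc = [1.6, 1.3, 0.7, -0.9, 2.1, 0.8]

r2 = r_squared(rnaseq_lfc, ddpcr_lfc)
print(f"R^2 = {r2:.4f}")
```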
A protein–protein interaction network (STRING) based on known and predicted functional associations (20) showed that most differentially expressed genes were connected, indicating that they were physically or functionally related (Fig. 2). To gain insight into the relationships among these genes in real data, we generated a cross-correlation heatmap based on their expression patterns (Supplementary Material, Fig. S3). We observed several highly connected clusters.

Protein–protein interaction network of genes differentially expressed after SP140 silencing. The network was based on known and predicted functional associations using STRING database. Line thickness indicates the strength of data support.
To determine the pathways that are altered by the decreased expression of SP140, we performed a gene ontology (GO) analysis of the 122 SP140-regulated genes. We obtained 235 biological process terms, 11 molecular function terms and five cellular component terms (Fig. 3A) (Supplementary Material, Table S2). The strongest enrichment was observed in biological processes related to positive regulation of cell–cell adhesion and positive regulation of cytokine and inflammatory responses (Fig. 3B).

GO enrichment analyses. (A) GO terms (biological process, molecular function and cellular component) resulting from GOrilla analysis and REVIGO summary of the differentially expressed genes in SP140 down-regulated LCLs. (B) Hierarchical tree of GO terms of the biological process category. P-value color scale: red: < 10⁻⁹; dark orange: 10⁻⁷–10⁻⁹; light orange: 10⁻⁵–10⁻⁷; yellow: 10⁻³–10⁻⁵.
Genes altered by SP140 silencing were also mapped into Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. We observed enrichment in five KEGG pathways with false discovery rate (FDR) < 0.05 (Supplementary Material, Table S3). The most enriched pathways were hsa04064 (NF-kappa B signaling pathway) (Supplementary Material, Fig. S4), hsa04060 (Cytokine–cytokine receptor interaction) and hsa04062 (Chemokine signaling pathway).
Genes affected by SP140 knockdown were regulated by nuclear factor kappa-light-chain-enhancer of activated B cells
Our next goal was to identify the transcription factors (TFs) in the promoter region of the SP140-regulated genes. To this end, we searched within the 5000 bp upstream region of these genes for TF consensus motifs from TRANSFAC (database of eukaryotic TF), obtaining high enrichment of nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB) consensus-binding motifs. We also analyzed the deoxyribonucleic acid (DNA)-associated TF in LCLs from Encyclopedia of DNA Elements (ENCODE) obtaining the highest enrichment for NF-κB (Fig. 4 and Supplementary Material, Table S4).
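The promoter window used for this search can be sketched as follows. The coordinate convention (0-based, half-open intervals), the function name and the example positions are assumptions made for illustration; the text only specifies a 5000 bp upstream region.

```python
def upstream_window(tss, strand, length=5000):
    """Return the (start, end) half-open interval upstream of a TSS.

    `tss` is the 0-based position of the transcription start site.
    On the '+' strand, upstream means lower coordinates; on the '-'
    strand, upstream means higher coordinates.
    """
    if strand == "+":
        return (max(0, tss - length), tss)
    if strand == "-":
        return (tss + 1, tss + 1 + length)
    raise ValueError(f"unknown strand: {strand!r}")


# Hypothetical transcription start sites, not real gene coordinates.
print(upstream_window(10_000, "+"))  # region on the lower-coordinate side
print(upstream_window(10_000, "-"))  # region on the higher-coordinate side
```

Motif or ChIP-seq peak enrichment would then be computed over these windows by a dedicated tool, as in the analysis described above.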

Analysis of TF enrichment of gene promoters regulated by SP140. Overrepresentation of TF binding sites (TFBS) within 5000 bp upstream of each gene differentially expressed between SP140 specific siRNA and siRNA control. (A) TRANSFAC motifs. (B) TF experimentally detected by chromatin immunoprecipitation sequencing (ChIP-seq) in LCLs from ENCODE project. Represented are the results of TF enrichment analysis using Motif Enrichment Tool (MET).
We compared the SP140-regulated genes with gene sets generated by perturbation of a single gene in genome-wide experiments from the Gene Expression Omnibus (GEO) database using the Enrichr web server (http://amp.pharm.mssm.edu/Enrichr). We obtained 138 datasets with common up-regulated genes and adjusted P-value < 0.05. No significant enrichment was observed for the down-regulated genes. The three datasets sharing the most genes were those generated by activation of NF-κB through over-expression of inhibitor of nuclear factor kappa-B kinase subunit 2 in a Burkitt lymphoma cell line (Ramos) and in an acute lymphocytic leukemia cell line (Reh), and by knockdown of NFKB2 (Nuclear Factor Kappa B Subunit 2) in the HapMap (International Haplotype Map project) LCL GM19238 (Fig. 5 and Supplementary Material, Table S5).

Similarities between SP140 and other gene perturbation profiles. Results obtained with Enrichr web server using GEO databases. Represented are the type of treatment and the cell type or tissue object of experiment in brackets. The most significant values are represented in the graph together with the number of shared genes between both profiles (n). Supplementary Material, Table S5 contains all the results from the analysis.
In addition, we measured NF-κB activity with a luciferase reporter assay in LCLs in which SP140 was silenced. Luciferase activity increased by 9% and 11% at 48 and 72 h, respectively, after SP140 silencing, suggesting that SP140 had an effect on the inducibility of NF-κB activity (Fig. 6).

Effect of SP140 siRNA silencing on NF-κB promoter activity. Cells stably transfected with pNL3.2 NF-κB-RE (luciferase) were additionally transfected with SP140 siRNA and non-targeting siRNA and luciferase activity measured at different times. Mean ± standard deviation is indicated. Statistical significance was estimated by Student’s t-test from 12 independent experiments.
Colocalization of the SP140-regulated genes and GWAS-associated loci
To determine whether the SP140-regulated genes were implicated in susceptibility to MS or other diseases, we performed a colocalization analysis between the GWAS-associated variants and the genes regulated by SP140. To this end, we created windows of different sizes around the GWAS-associated SNPs and identified the genes located in these loci. Enrichment of SP140-regulated genes was observed in all window sizes for MS, CD and IBD (Table 1). There was also enrichment in the larger windows for Type 1 Diabetes (T1D) but not for diseases such as Type 2 Diabetes (T2D), cancer, Parkinson’s disease and Alzheimer’s disease, or for traits such as height. The highest enrichment was observed in the 20 kb window for MS. Several SP140-regulated genes colocalizing with GWAS single nucleotide polymorphisms (SNPs) were common to MS, IBD and CD (Fig. 7 and Supplementary Material, Table S6). We also performed a colocalization analysis of the 323 genes altered by SP140 knockdown in human macrophages, reported by Mehta et al. (15). With these data, we obtained enrichment of SP140-regulated genes in MS, IBD and CD risk loci. We also observed enrichment in cancer and chronic lymphocytic leukemia (CLL) that was not observed with SP140-regulated genes in LCLs (Supplementary Material, Table S7).
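The windowing step can be sketched in a few lines. This is a minimal illustration that assumes each window has the stated total size and is centered on the SNP; the text does not say whether the size is total or per side. Gene names, coordinates and the SNP position are hypothetical.

```python
def genes_in_windows(genes, snps, window_size):
    """Return the names of genes overlapping a window centered on any SNP.

    genes: dict mapping name -> (chromosome, start, end)
    snps:  list of (chromosome, position)
    window_size: total window width in base pairs (assumed centered on the SNP)
    """
    half = window_size // 2
    hits = set()
    for name, (chrom, start, end) in genes.items():
        for snp_chrom, pos in snps:
            # Overlap test between [start, end] and [pos - half, pos + half].
            if chrom == snp_chrom and start <= pos + half and end >= pos - half:
                hits.add(name)
                break
    return hits


# Hypothetical annotation and GWAS hit (not real coordinates).
genes = {
    "G1": ("chr1", 100_000, 120_000),
    "G2": ("chr1", 300_000, 320_000),
    "G3": ("chr2", 105_000, 110_000),
}
snps = [("chr1", 130_000)]

hits_20kb = genes_in_windows(genes, snps, 20_000)
hits_500kb = genes_in_windows(genes, snps, 500_000)
print(sorted(hits_20kb), sorted(hits_500kb))
```

Widening the window captures more genes per locus, which is why the counts of both regulated and non-regulated genes grow with window size.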
Table 1. Colocalization of the SP140-regulated genes in LCLs and GWAS-associated regions in different traits and diseases
Disease or trait | Window size | Number of SP140-regulated genes localized in windows containing GWAS-associated SNPs (b) | Number of genes not regulated by SP140 localized in windows containing GWAS-associated SNPs (n) | Enrichment (b/n) / (B/N) |
---|---|---|---|---|
MS | 500 kb | 16 | 2364 | 3.218 (P = 0.0001) |
MS | 250 kb | 15 | 1266 | 5.634 (P = 0.0000001) |
MS | 100 kb | 9 | 601 | 7.121 (P = 0.00001) |
MS | 50 kb | 8 | 364 | 10.45 (P = 0.000002) |
MS | 20 kb | 6 | 195 | 14.632 (P = 0.000006) |
IBD | 500 kb | 17 | 3802 | 2.126 (P = 0.007) |
IBD | 250 kb | 13 | 2156 | 2.867 (P = 0.0017) |
IBD | 100 kb | 8 | 1049 | 3.626 (P = 0.0027) |
IBD | 50 kb | 4 | 619 | 3.072 (P = 0.05) |
IBD | 20 kb | 3 | 319 | 4.472 (P = 0.035) |
Crohn | 500 kb | 17 | 3416 | 2.366 (P = 0.0025) |
Crohn | 250 kb | 13 | 1992 | 3.103 (P = 0.0006) |
Crohn | 100 kb | 10 | 983 | 4.837 (P = 0.00008) |
Crohn | 50 kb | 6 | 592 | 4.819 (P = 0.002) |
Crohn | 20 kb | 4 | 298 | 6.382 (P = 0.004) |
CLL | 500 kb | 4 | 788 | 2.413 (P = 0.09) |
CLL | 250 kb | 3 | 452 | 3.156 (P = 0.07) |
CLL | 100 kb | 2 | 218 | 4.363 (P = 0.08) |
CLL | 50 kb | 2 | 136 | 6.993 (P = 0.03) |
CLL | 20 kb | 0 | 79 | 0 (P = 1) |
T1D | 500 kb | 8 | 1507 | 2.524 (P = 0.02) |
T1D | 250 kb | 6 | 852 | 3.349 (P = 0.01) |
T1D | 100 kb | 4 | 395 | 4.815 (P = 0.01) |
T1D | 50 kb | 2 | 234 | 4.064 (P = 0.09) |
T1D | 20 kb | 1 | 108 | 4.403 (P = 0.2) |
Height | 500 kb | 12 | 5872 | 0.972 (P = 1) |
Height | 250 kb | 9 | 3460 | 1.237 (P = 0.6) |
Height | 100 kb | 4 | 1732 | 1.098 (P = 0.8) |
Height | 50 kb | 3 | 1069 | 1.334 (P = 0.5) |
Height | 20 kb | 2 | 55 | 1.714 (P = 0.3) |
Parkinson + Alzheimer | 500 kb | 20 | 7687 | 1.237 (P = 0.4) |
Parkinson + Alzheimer | 250 kb | 13 | 4225 | 1.463 (P = 0.2) |
Parkinson + Alzheimer | 100 kb | 3 | 1963 | 0.727 (P = 0.8) |
Parkinson + Alzheimer | 50 kb | 2 | 1084 | 0.877 (P = 1) |
Parkinson + Alzheimer | 20 kb | 1 | 494 | 0.962 (P = 1) |
Cancer | 500 kb | 26 | 11053 | 1.118 (P = 0.6) |
Cancer | 250 kb | 17 | 6564 | 1.231 (P = 0.4) |
Cancer | 100 kb | 12 | 3371 | 1.692 (P = 0.09) |
Cancer | 50 kb | 9 | 1999 | 2.140 (P = 0.04) |
Cancer | 20 kb | 4 | 1020 | 1.864 (P = 0.18) |
T2D | 500 kb | 10 | 3893 | 1.221 (P = 0.5) |
T2D | 250 kb | 8 | 2181 | 1.744 (P = 0.15) |
T2D | 100 kb | 5 | 1050 | 2.264 (P = 0.08) |
T2D | 50 kb | 3 | 622 | 2.293 (P = 0.15) |
T2D | 20 kb | 1 | 310 | 1.534 (P = 0.5) |
Total number of genes analyzed (N): 58 015, total number SP140-regulated genes (B): 122. P-values of Fisher’s exact test.
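The enrichment column follows directly from the formula in the table header, (b/n) / (B/N), with N and B taken from the footnote. As a worked check, the sketch below recomputes the MS rows from the table's own counts. The function and variable names are illustrative, and the Fisher’s exact test P-values are not recomputed here because the exact contingency table used is not given in the text.

```python
N_TOTAL = 58015  # total number of genes analyzed (N), from the table footnote
B_TOTAL = 122    # total number of SP140-regulated genes (B), from the footnote


def enrichment(b, n, B=B_TOTAL, N=N_TOTAL):
    """Enrichment ratio (b/n) / (B/N) as defined in the table header."""
    return (b / n) / (B / N)


# (b, n) pairs for the MS rows of the table, keyed by window size.
ms_rows = {
    "500 kb": (16, 2364),
    "250 kb": (15, 1266),
    "100 kb": (9, 601),
    "50 kb": (8, 364),
    "20 kb": (6, 195),
}

for window, (b, n) in ms_rows.items():
    print(f"MS {window}: {enrichment(b, n):.3f}")
```

The ratio compares the fraction of SP140-regulated genes among genes near GWAS SNPs with their fraction genome-wide, so a value near 1 means no enrichment.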

SP140-regulated genes that colocalize with known susceptibility loci of MS, CD and IBD. Venn diagram of SP140-regulated genes in LCL and macrophages.
Gene expression profile similarities in drug responses and SP140 knockdown cells
We looked for drugs that produce gene expression patterns similar or opposite to those of SP140 knockdown cells. To this end, we used the GEO database to compare the SP140-altered genes with single-drug perturbation experiments in mammalian systems (cells or tissues) in which gene expression was measured before and after drug administration. We obtained 53 drug datasets in which the down-regulated genes overlapped with genes that were up-regulated when SP140 expression was knocked down in LCLs (Fig. 8A and Supplementary Material, Table S8). On the other hand, other treatments produced up-regulation of genes that overlap with those increased in SP140-silenced LCLs. This was the case for monocytes treated with lipopolysaccharide (LPS), in which 29 genes were shared with those up-regulated by SP140 silencing in LCLs (Fig. 8B).

Similarities between expression profiles obtained from silencing of SP140 and perturbation of drugs on tissues or cell lines. (A) Similarity between genes up-regulated by the silencing of SP140 and down-regulated by the drug. (B) Similarity between genes up-regulated in both profiles. The most significant values are represented in the graph together with the number of shared genes between both profiles (n). The cell line or tissues treated with each drug is indicated in brackets. Details are indicated in Supplementary Material, Table S8.
Discussion
We have previously demonstrated that the causal variant of the MS association in the SP140 locus (rs28445040-T) alters the splicing of the gene driving a significant allele-dependent reduction of the SP140 protein expression (6). To determine the function of SP140 and the reason of its low expression association with this disease, we analyzed the genes regulated by SP140. To this end we performed RNA-seq gene expression analysis in LCLs transfected with an SP140-target siRNA or a control non-target siRNA. The RNA expression reduction was very similar to the one produced by rs28445040-T variant in LCLs (6) and the protein level reported by Mehta et al. (15). Most of the genes altered by the silencing of SP140 gene were up-regulated. This implies that SP140 functions as gene repressor. This SP140-expression repression activity has also been reported in differentiated macrophages by Mehta et al. (15).
GO analysis revealed that SP140-repressed genes involved in cytokine production, inflammatory response and regulation of cell–cell adhesion. Many of the SP140-repressed genes were involved in the development of MS, IBD and CD. One of these genes was IL12B which encodes the p40 subunit of the IL12 and IL23 heterodimeric cytokines (21,22). Among other cytokines, interleukin IL12/IL23 have been considered important inflammation mediators of innate and adaptive immunity (23,24). Our results indicated that the individuals who carry the SP140 polymorphism may have increased the expression of IL12/ IL23 due to the lower repression of the SP140. This could be translated in an increase of IL12/23 signaling. IL12 is a mediator of CD4+ T cell differentiation towards T helper 1 (TH1) and the IL23 functions to amplify and maintain TH17 subset of CD4+ T cells (25,26). These type of cells have been key elements in autoimmunity development (27). An increase signal of the IL12 and IL23 cytokines could lead to imbalance in the TH cell differentiation towards more TH1- and TH17-type production.
Other group of genes that were repressed by SP140 and have been associated with MS and IBD was T cell receptor (TCR) co-signaling molecules as CD226, CD86, TNFSF4, TNFRSF8, TNFRSF9 and TNFSF14 (28). B cells serve as efficient antigen-presenting cells in which co-stimulatory molecules interact with their ligand on T cells and the result is the activation of effector T cells (29). Integration of signals downstream of the TCR and co-signaling receptors directs functions in all phases of T cell responses from native T cell priming to T cell differentiation and development and function of TReg cells. The up-regulation of the TCR co-signaling molecules in B cells bearing the rs28445040 risk variant could interfere in all these important processes, crucial for the homeostasis of the immune system.
A third group of related genes regulated by SP140 was chemokines and chemokine receptors as CCL4L2, CCL4, CCL3L3, CCL3, CXCR5, XCL1, CCL17, CCR7, CCR6 and CCR4. Chemokines are a large family of small, chemotactic cytokines with roles in adhesion and directional homing of immune and inflammatory cells (30). The chemokines produced in gastrointestinal mucosa have an essential role in directed trafficking of immune cells to the gut mucosa and epithelial barrier repair (31). Also, some chemokines and chemokine receptors are increased in blood and cerebrospinal fluid of patients with MS. They are implicated in the migration of pathogenic cells, such as T cells and macrophages, to the site of lesions in CNS, which is a vital aspect of MS pathogenesis (32). Therefore, low SP140 expression would up-regulate these chemokines and chemokine receptors affecting important processes in the pathology of MS and IBD.
Other important genes that were regulated by SP140 were members of NF-κB TF system, NFKB2, NFKBIE and NFKBIA. NFKB2 gene encodes the p100 subunit of the NF-κB TF complex. The p100 protein in the complex functions as repressor but it can become active by proteasome-mediated processing, generating its mature protein, p50. NFKBIE and NFKBIA are NF-κB repressor proteins which bind to components of NF-κB complex, preventing it from entering the nucleus. These three genes have been found up-regulated in genome-wide expression studies activating the NF-κB complex (Supplementary Material, Table S5). It could be a feedback mechanism to compensate the over-activity of NF-κB.
In SP140-silenced LCLs, we obtained an important enrichment of NF-κB bound to promoters of differentially expressed genes and an increase of NF-κB activity in NF-κB reporter experiments. All these data, together with the similarities between NF-κB activation and SP140 knockdown profiles, pointed to a role of SP140 in repressing the expression of genes regulated by NF-κB. NF-κB family of TFs are regulators of a very large number of genes in many different cell types in response to a wide type of stimuli but their function is particularly important in the coordination of innate and adaptive immunity (33). NF-κB has numerous regulatory layers to get target specificity in response to different stimulus. These regulatory layers are constituted by different mechanisms such as composition of the dimers in the NF-κB complex that regulate distinct sets of target genes, interaction of NF-κB dimer with unique sets of cooperating TFs, co-regulatory proteins, chromatin proteins or general TFs (33,34). Since SP140 is a chromatin reader, it is tentative to suggest that it is acting as a functional link between the histone code and NF-κB, repressing its activity.
We have observed that many of the genes regulated by SP140 in LCLs were located in risk loci for IBD, CD and MS and other autoimmune diseases. It indicates that the rs28445040 variant, by itself, promotes the alteration of multiple disease-associated genes, possibly, mimicking the effect of having several risk variants. On the other hand, it is interesting that the colocalization analysis between risk loci and SP140-regulated genes in differentiated macrophages showed enrichment not only in autoimmune disease but also in cancer. This could be an indication of the importance of each cell type in each disease with respect to the contribution of SP140 variant.
Another interesting result of our work was the identification of drugs whose effects on gene expression are similar or opposite to those of SP140 silencing. We observed that, of the 100 genes up-regulated in SP140-silenced cells, 29 were also up-regulated by LPS activation in monocytes and 26 in peripheral blood mononuclear cells. LPS is an outer membrane component of Gram-negative bacteria that induces the transcription of a set of genes involved in inflammatory reactions through the activation of several types of TFs, particularly NF-κB (35,36). This is in concordance with all our results relating SP140 down-regulation to NF-κB activation. The group of drugs with expression profiles opposite to the SP140-silenced profile included dexamethasone, a corticosteroid with anti-inflammatory and immunosuppressant effects. Corticosteroids are used for the treatment of acute exacerbations in MS and are also a well-established treatment for active CD and IBD. Other drugs that could compensate for the effect of silencing SP140 were formoterol, a long-acting β2 agonist used in the management of asthma and chronic obstructive pulmonary disease, and rosiglitazone, an antidiabetic drug. Although they have not been used for the treatment of autoimmune diseases, our results suggest the potential of these drugs as therapies for these diseases.
In this study, we have shown how the reduction of SP140 gene expression, promoted by the rs28445040 risk variant, can affect key pathways in the development of MS, and we propose that SP140 exerts its function in B cells by inhibiting NF-κB activity.
Materials and Methods
Cell culture and siRNA
LCL (NA07000) was obtained from Coriell Institute for Medical Research, NJ, USA. Cells were cultured in RPMI medium (Gibco, Life Technologies, USA), supplemented with 10% fetal bovine serum (Gibco, Life Technologies, USA) and 1% penicillin/streptomycin (Gibco, Life Technologies, USA). 1.5 × 10⁶ cells were transfected by electroporation using the Mirus Ingenio kit and Amaxa Nucleofector II Device at 300 nM of siRNA-SP140 or 300 nM of control siRNA (Dharmacon, USA) according to the manufacturer’s instructions. Cells were harvested 48 h after transfection. All sequences of the siRNA used in SP140 knockdown are listed in Supplementary Material, Table S9.
Western blots and protein quantification
Protein extracts from 5 × 10⁶ siRNA-SP140 or control siRNA transfected LCLs were obtained using NE-PER™ (Thermo Scientific) extraction reagent according to the manufacturer’s instructions. Protein was quantified with Bicinchoninic Acid (BCA) Kit for Protein Determination (Thermo Scientific) and 8 μg of nuclear extract was electrophoresed in 8% sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). Immunoblots were incubated with anti-hSP140 (Sigma, HPA006162) or anti-lamin B1 (Abcam). Proteins were detected by enhanced chemiluminescence (Pierce ECL Western Blotting Substrate, ThermoFisher Scientific) and evaluated by densitometry with ImageJ.
Determination of CCL4 concentration in the culture supernatant of SP140-silenced LCLs was carried out at 24, 48 and 72 h in four independent experiments. ELISA quantification was performed with CCL4 Human Instant ELISA™ Kit (Thermo Fisher Scientific), following manufacturer’s recommendations.
RNA extraction and absolute quantification by ddPCR
Total RNA from the transfected NA07000 cell line was extracted using the RNeasy Mini Kit (QIAGEN GmbH, Hilden, Germany) according to the manufacturer’s instructions. RNA quantity and quality were then assessed with the Experion automated electrophoresis system (BioRad); the RNA quality indicator was 10 (intact RNA) for all RNA samples.
150 ng of RNA was reverse transcribed into cDNA with oligodT under standard conditions using the Superscript III First-Strand Synthesis SuperMix (Invitrogen, USA). Then, SP140, IL12B, CD226, CXCR5, CCL4, CCL3 and CD58 mRNA (cDNA) levels were quantified by absolute ddPCR in duplicate using QX200 Droplet Digital PCR EvaGreen Supermix (BioRad, USA). Data were analyzed using QuantaSoft (BioRad), which counts the fluorescent positive and negative droplets to calculate target DNA concentration. Mean mRNA values from different experimental conditions were compared using Student’s t-test. The primers were designed with Primer 3 software or taken from the PrimerDepot database (37) (https://primerdepot.nci.nih.gov/) (Supplementary Material, Table S10).
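The droplet-counting step can be illustrated with the standard Poisson correction used in digital PCR. The following is a minimal sketch, not the QuantaSoft implementation: the function name and the droplet volume of about 0.85 nL are assumptions introduced here for illustration and are not stated in the text.

```python
import math

# Assumed nominal droplet volume in microlitres (about 0.85 nL).
# This value is an illustrative assumption, not taken from the study.
DROPLET_VOLUME_UL = 0.00085


def ddpcr_concentration(positive, negative, droplet_volume_ul=DROPLET_VOLUME_UL):
    """Estimate target copies per microlitre from droplet counts.

    Under a Poisson model, the fraction of negative droplets equals
    exp(-lambda), where lambda is the mean number of copies per droplet.
    """
    total = positive + negative
    if total == 0 or negative == 0:
        raise ValueError("need at least one droplet and one negative droplet")
    copies_per_droplet = -math.log(negative / total)
    return copies_per_droplet / droplet_volume_ul


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(ddpcr_concentration(positive=5000, negative=10000))
```

The correction matters because a positive droplet may contain more than one template molecule, so the raw fraction of positive droplets underestimates the concentration at high template loads.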
RNA-seq library preparation and sequencing
The RNA-seq libraries of the six total RNA samples were prepared using the TruSeq™ Stranded Total RNA kit (Illumina Inc.) according to the manufacturer’s protocol. Briefly, rRNA was depleted from 0.5 μg of total RNA using the RiboZero Magnetic Gold Kit, and the RNA was fragmented by divalent cations at elevated temperature, resulting in fragments of 80–450 nt with the major peak at 160 nt. Following fragmentation, first- and second-strand synthesis was performed, the latter in the presence of 2′-deoxyuridine-5′-triphosphate (dUTP) instead of 2′-deoxythymidine-5′-triphosphate (dTTP). Illumina barcoded adapters were used for adapter ligation. Libraries were enriched with 15 cycles of PCR. The size and quality of the libraries were assessed with an Agilent DNA 7500 Bioanalyzer assay (Agilent). The libraries were sequenced on a HiSeq2000 (Illumina, Inc) in paired-end mode with a read length of 2 × 76 bp using TruSeq SBS Kit v4 in a fraction of a sequencing v4 flow cell lane, following the manufacturer’s protocol. Image analysis, base calling and quality scoring of the run were processed using the manufacturer’s software Real Time Analysis (RTA 1.18.66.3), followed by generation of FASTQ sequence files (a text-based format storing both nucleotide sequences and their corresponding quality scores) using CASAVA software.
RNA-seq quantification and differential expression
RNA-seq reads were mapped against the human reference genome (GRCh38) with Spliced Transcripts Alignment to a Reference (STAR v2.5.1b) using ENCODE parameters for long RNAs (38). Genes were quantified using a software package for estimating gene and isoform expression levels from RNA-Seq data (RSEM v1.2.28) with default parameters (39). The human annotation file was downloaded from Gencode (v24). Differential expression analysis was performed with the DESeq2 (v1.10.1) R package with default parameters (40).
GO, KEGG pathway enrichment analysis and STRING
Cellular component, molecular function and biological process GO terms were annotated for the differentially expressed genes using the GOrilla tool (41). The Benjamini–Hochberg procedure was used for multiple-testing correction. The REVIGO tool (Reduce + visualize gene ontology) was used for visualization of non-redundant GO terms (42). KEGG pathway enrichment analysis was performed with the Database for Annotation, Visualization and Integrated Discovery (DAVID) (43,44). STRING (20) was run with the medium confidence setting (0.400).
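For readers unfamiliar with the Benjamini–Hochberg correction, the adjustment can be sketched as follows. This is a generic illustration of the procedure, not the code used by GOrilla; the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest, scaling each by
    # m / rank and enforcing that adjusted values never increase.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical P-values for four GO terms.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A term is then called significant when its adjusted value falls below the chosen false discovery rate threshold.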
TF binding site analysis
Genes significantly up- or down-regulated in siRNA-SP140-silencing experiments were subjected to TF binding site (TFBS) enrichment analysis using the Motif Enrichment Tool (MET) (45). We used the genome-wide TF-binding scoring profiles of MET, which were created computationally from motifs using TRANSFAC (46) or from 143 ChIP datasets from human LCLs from ENCODE (47). Each profile assigns a score to every 500 bp window in the genome (in shifts of 250 bp), representing the strength of that feature in the window. We used the chromatin accessibility data filter to increase the accuracy of the predicted TF-DNA binding profiles. Predicted TFBS were identified in the region 5000 bp upstream of the transcription start site for all genes analyzed. MET tested the significance of the overlap between the target gene set of each regulatory feature and the SP140-regulated set using a one-sided Fisher’s exact test.
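The overlap test described above can be sketched as a hypergeometric tail probability. This is a minimal illustration of a one-sided Fisher's exact test for gene-set overlap, not MET's own code; the function name, argument names and all numbers in the example are hypothetical.

```python
from math import comb


def fisher_one_sided(overlap, n_targets, n_regulated, n_universe):
    """Probability of observing at least `overlap` shared genes by chance.

    n_targets   : genes bound by the TF (target set of the feature)
    n_regulated : genes in the regulated set being tested
    n_universe  : all genes considered
    """
    max_overlap = min(n_targets, n_regulated)
    total = comb(n_universe, n_regulated)
    # Sum the hypergeometric probabilities for overlaps >= the observed one.
    tail = sum(
        comb(n_targets, k) * comb(n_universe - n_targets, n_regulated - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 30 of 100 regulated genes are TF targets,
    # with 2000 targets among 20000 genes in total.
    print(fisher_one_sided(30, 2000, 100, 20000))
```

The test is one-sided because only an excess of overlap (enrichment), not a depletion, is of interest.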
Colocalization analysis
Data from the different GWAS were obtained from the GWAS catalogue (https://www.ebi.ac.uk/gwas/), downloaded in January 2017. Windows of 500, 250, 100, 50 or 20 kb were defined for each GWAS-associated SNP, with the SNP centered in the window. Genes located partly or totally within the windows were counted. The analyzed diseases and traits for colocalization were MS, IBD, CD, T1D, CLL, cancer (Lung cancer, Colorectal cancer, Breast cancer, Prostate cancer, Urinary bladder cancer, Thyroid cancer, Testicular cancer, Pancreatic cancer, Ovarian cancer, Bladder cancer, Esophageal cancer, Non-small cell lung cancer, Gastric cancer, Testicular germ-cell cancer, Breast cancer in BRCA2 mutation carriers, Upper-aerodigestive-tract cancers, Endometrial cancer, Gallbladder cancer, Epithelial ovarian cancer, Cervical cancer, Non-melanoma skin cancer, Non-cardia gastric cancer, Cardia gastric cancer), Parkinson–Alzheimer diseases, T2D and Height.
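The window-based counting can be sketched as an interval-overlap check. The following is an illustrative outline under stated assumptions rather than the analysis code of the study: the function name, the tuple layout for SNPs and genes, and the coordinates in the example are all hypothetical.

```python
def genes_in_windows(snps, genes, window_kb):
    """Return the names of genes lying partly or totally in any SNP window.

    snps  : iterable of (chromosome, position) tuples
    genes : iterable of (name, chromosome, start, end) tuples
    The window of `window_kb` kilobases is centred on each SNP.
    """
    half = window_kb * 1000 // 2
    hits = set()
    for chrom, pos in snps:
        start, end = pos - half, pos + half
        for name, g_chrom, g_start, g_end in genes:
            # Two intervals overlap when each starts before the other ends.
            if g_chrom == chrom and g_start <= end and g_end >= start:
                hits.add(name)
    return hits


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    snps = [("chr2", 1_000_000)]
    genes = [
        ("GENE_A", "chr2", 900_000, 950_000),
        ("GENE_B", "chr2", 1_240_000, 1_300_000),
        ("GENE_C", "chr2", 1_260_000, 1_300_000),
    ]
    for size in (500, 250, 100, 50, 20):
        print(size, sorted(genes_in_windows(snps, genes, size)))
```

Repeating the count for each window size shows how sensitive the colocalization signal is to the distance allowed between a risk SNP and a gene.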
NF-κB luciferase reporter assay
An LCL stably transfected with an NF-κB-inducible luciferase construct was obtained by transfection of the pNL3.2 NF-κB-RE plasmid (Promega) into the NA07000 LCL using the Nucleofector II device (Lonza-Amaxa) with Ingenio solution (Mirus). Selection was performed with 200 μg/ml of hygromycin B (Invitrogen). 1.5 × 10⁶ cells of the stable LCL (NF-κB-RE) line were transfected with siRNA-SP140 or control siRNA. Two hours after transfection, cells were aliquoted and incubated for 5, 24, 48 and 72 h in RPMI1640 medium containing 10% FCS at 37°C. The Nano-Glo Luciferase Assay System (Promega) and a Tecan Infinite 200 plate reader were used for luminescence measurements.
Conflict of Interest statement. None declared.
Funding
Agencia Estatal de Investigación del Ministerio de Ciencia, Innovación y Universidades of Spain (SAF2016-80595 to A.A. and F.M.); Junta de Andalucía-Fondo Europeo de Desarrollo Regional (FEDER) (CTS2704 to F.M.).
References
Author notes
These authors contributed equally to this work.