Discovering common pathogenetic processes between COVID-19 and diabetes mellitus by differential gene expression pattern analysis

Abstract Coronavirus disease 2019 (COVID-19) is an infectious disease caused by the newly discovered coronavirus, SARS-CoV-2. Increased severity of COVID-19 has been observed in patients with diabetes mellitus (DM). This study aimed to identify common transcriptional signatures, regulators and pathways between COVID-19 and DM. We have integrated human whole-genome transcriptomic datasets from COVID-19 and DM, followed by functional assessment with gene ontology (GO) and pathway analyses. In peripheral blood mononuclear cells (PBMCs), among the upregulated differentially expressed genes (DEGs), 32 were found to be commonly modulated in COVID-19 and type 2 diabetes (T2D), while 10 DEGs were commonly downregulated. As regards type 1 diabetes (T1D), 21 DEGs were commonly upregulated, and 29 DEGs were commonly downregulated in COVID-19 and T1D. Moreover, 35 DEGs were commonly upregulated in SARS-CoV-2 infected pancreas organoids and T2D islets, while 14 were commonly downregulated. Several GO terms were found in common between COVID-19 and DM. Prediction of the putative transcription factors involved in the upregulation of genes in COVID-19 and DM identified RELA to be implicated in both PBMCs and pancreas. Here, for the first time, we have characterized the biological processes and pathways commonly dysregulated in COVID-19 and DM, which could be in the next future used for the design of personalized treatment of COVID-19 patients suffering from DM as comorbidity.


Introduction
SARS-CoV-2 is responsible for the novel coronavirus disease 2019 (COVID- 19), which has become a massive threat for humanity worldwide [1,2]. Many factors influence the outcome of the disease, such as age, diabetes, hypertension and lung disease [3].
Diabetes mellitus (DM) is a chronic metabolic disease characterized by elevated blood glucose levels, due to the lack of insulin production, the resistance to insulin signaling or both. There are two main conditions, namely type 1 DM (T1D) and type 2 DM (T2D), with pathogenetic and clinical differences. T1D is characterized by the destruction of pancreatic beta cells by T cells of the immune systems [4]. On the other hand, although subchronic immune-inflammatory pathways are implicated in the pathogenesis of T2D, the primary culprits in the development of the disease seem to depend on peripheral insulin resistance, which is often secondary to obesity.
The prevalence of DM has increased worldwide with changing lifestyles and rising obesity. Approximately, 90% of all cases of diabetes regards T2D, and another around 10% of cases are classified as T1D. According to the International Diabetes Federation, in 2019, around 463 million people aged between 20 and 79 years had DM [5]. People with both T1D and T2D may experience diabetes-related complications, including cardiovascular disease, kidney disease, neuropathy, blindness and lower extremity amputation [6].
As expected, patients with DM and especially those with long-lasting disease and related complications [7,8] have been shown to exhibit a more severe course of COVID-19 than nondiabetic individuals [7,8]. It has also been proposed that COVID-19 exposure can precipitate T1D onset [9].
The synergistic effect of hyperglycaemia and COVID-19related hyperinflammation might predispose DM patients to increased vulnerability and lethal outcomes in SARS-CoV-2 infection [10,11].
Overall, these observations forge the hypothesis that DM patients are more susceptible to COVID-19 infection [12]. As COVID-19 patients with DM have an elevated risk of complicated outcomes, it becomes essential to identify eventual synergistic biomolecular pathways triggered by COVID- 19 and DM, which could lead to a tailored approach for the treatment of diabetic patients with COVID-19. However, the molecular alterations in common between COVID-19 with DM have not yet been investigated. The characterization of the molecular pathways is crucial to identify therapeutic targets or drug repurposing for DM patients infected with COVID-19. Recently, several studies have been profiled the transcriptomic changes occurring in COVID-19 [13][14][15][16]. However, the link between the increased vulnerability to COVID-19 in DM is still not completely clear.
This study has adopted a bioinformatic approach to identify commonly deregulated gene signatures, gene ontologies, pathways and regulators that underlie COVID-19 and DM (a summary of the study layour is presented as Figure 1).

Transcriptomic data acquisition
In order to identify suitable datasets generated from peripheral blood mononuclear cells (PBMCs) from COVID-19 and T2D patients, we queried the NCBI Gene Expression Omnibus (GEO) and the ArrayExpress databases. We retrieved the GSE152418 dataset generated by RNA sequencing (RNA-Seq) [17]. The dataset included 16 PBMCs samples from COVID-19 subjects and 17 healthy controls. The median age (years) of the COVID-19 patients was 56 (range: 25-94) [17]. We also obtained another PBMCs COVID-19 dataset from the Genome Sequence Archive, which included a total of six samples representing PBMCs from 3 COVID-19 and 3 healthy controls (Accession number: CRA002390) [18]. In order to retrieve datasets generated from PBMCs of T1D and T2D patients, our inclusion criteria were: (i) gene expression data, (ii) the dataset should contain case and matched control, (iii) human PBMCs samples and (iv) no drug treatment. Considering these criteria, we identified one microarray dataset (GSE9006) that included 12 T2D patients and 24 healthy controls. Finally, for T1D, we retrieved three datasets with accession number GSE9006 [19], GSE72377 and GSE55100 [20]. The GSE9006 dataset contained 81 T1D and 24 healthy control data [19]. The GSE72377 dataset contained mRNAs gene expression profile of unstimulated PBMCs from 15 T1D and 20 healthy controls (quality control data are presented as Supplementary Figure 1, see Supplementary Data available online at http://bib.oxfordjournals.org/). The GSE55100 dataset contained mRNAs gene expression profiles from T1D PBMCs from 12 cases and 10 healthy controls [20].
For the determination of the common biological processes affecting the pancreas in COVID-19 and DM, the GEO and Array-Express databases were also interrogated. The GSE151803 [21] dataset was chosen as it included whole-genome transcriptomic data from human pancreas organoids infected in vitro with SARS-CoV-2 at an MOI of 0.1 for 24 h. For the determination of the T2D pancreas signature, the GSE20966 [22], GSE25724 [23] and GSE38642 [24] datasets were obtained, which comprised a total of 25 T2D and 71 controls islet data. For T1D, the GSE72492 [25] and the E-MEXP-1140 [26] dataset were found, which included wholegenome mRNA data from a total of 24 T1D and 13 controls. The characteristics of the datasets included in this study are summarized in Table 1.

Data processing and differential expression analysis
The two datasets of COVID-19 PBMCs (with accession number: GSE152418 and CRA002390) were first analyzed using the DESeq2 R package, using as statistical threshold an absolute log-fold change (logFC) ≥ 1 and Benjamini-Hochberg corrected adjusted P-value (false discovery rate-FDR) < 0.05. Then, we meta-analyzed these two RNA-seq datasets using Fisher's inverse method, using the metaRNASeq R package (designed for meta-analysis of RNA transcriptomic datasets).
For the meta-analysis of the microarray datasets, we used the Network Analyst (NA) web utility tool [27]. For the normalization of the datasets, we employed variance stabilizing normalization algorithm [28], followed by the quantile normalization [29]. ComBat procedure embedded into NA was also performed to adjust study batch effect in meta-analysis. Finally, the metaanalysis of the datasets was performed using the Fisher's P-value combination method.
Since we had only one T2D PBMCs gene expression dataset (GSE9006) and one for SARS-CoV-2 infected pancreas organoids (GSE151803), the significant genes were selected using the R package LIMMA (linear models for microarray data) [30], considering an FDR < 0.05.

Functional insights into the DEGs
The enrichment analysis was performed using the widely utilized tool 'Metascape' [31]. Compared with other GO-based enrichment methods, it includes additional, optimized ontology databases, such as MSigDB, and clusters corresponding terminology to eliminate repetition [31]. By default, the Metascape gene functional enrichment makes use of several databases, including gene ontology (GO), KEGG, Reactome, and MSigDB databases. Metascape uses hypergeometric tests and the Benjamini-Hochberg P-value correction algorithm to classify all ontological parameters containing a substantially higher set of genes common to an input list than expected casually. The pairwise similarity between any two enriched terms is computed based on a Kappa-test score. The matrix of similarity is clustered hierarchically and the 0.3 threshold is applied to trim the resulting tree into different clusters. Metascape determines the most significant (lowest P-value) term in each cluster to depict the cluster in bar graph and heatmap. Significant GO terms were selected using as threshold a Bonferroni-corrected P-value <0.05.

Transcription factor analysis
The putative transcription factors (TFs) regulating the expression of the differentially expressed genes (DEGs) were predicted via Metascape [31], using the TRRUST database, which is a manually curated database of human and mouse transcriptional regulatory networks [32,33].

Drug prediction analysis
The web-utility L1000FDW was employed to detect potential repurposable pharmacological agents for treating COVIDpatients with DM as comorbidity. In order to rank drugs that can theoretically reverse a transcriptional signature, L1000FWD computes the similarity between the input gene expression vector and the data for LINCS-L1000. LINCS-L1000 includes the transcriptional profiles of around 50 human cell lines upon exposure to ∼20 000 compounds, at different doses and times. An adjusted P-value (q-value) <0.05 has been regarded as threshold for significance.

Identification of common transcriptional signatures between COVID-19 and T2D PBMCs
We obtained two PBMCs gene expression datasets from COVID-19 infected subjects and matched healthy controls (with accession numbers, GSE152418 and CRA002390) for a total of 39 samples, 19 cases and 20 healthy controls. The meta-analysis of the two datasets identified 3274 significant (FDR < 0.05) genes. The analysis of the GSE9006 dataset identified 486 DEGs (FDR < 0.05) characterizing T2D PBMCs. Among the upregulated DEGs, 32 were found to be commonly modulated in COVID-19 and T2D (P < 0.05) (Figure 2A), while 10 DEGs were commonly downregulated ( Figure 2B).

Identification of common functional GO terms in COVID-19 and DM PBMCs
Several GO terms were found in common between COVID-19 and DM ( Figure 3A, Supplementary Table 1, see Supplementary Data available online at http://bib.oxfordjournals.org/). The terms significantly enriched by the upregulated DEGs in common between COVID-19 and T2D, but not with T1D, were related to the mRNA metabolism, sub-cellular organelle organization and nucleotide synthesis ( Figures 3B and 4). The pathways that were significantly enriched by the downregulated DEGs, and in common between COVID-19 and T2D, but not in T1D were related to the mRNA metabolism and to immune responses ( Figures 3B and 4). On the other hand, the  autophagy process was found to be enriched among the upregulated DEGs in T1D and COVID-19, but not in T2D ( Figure 3B and Figure 4).

Prediction of TF overlapping between COVID-19 and DM PBMCs
Determination of the putative TFs involved in the regulation of the DEGs implicated in COVID-19 and DM showed that SPI1 and RELA are involved in the expression of commonly upregulated genes in COVID-19, T1D and T2D. On the other hand, no significant enrichment was observed for TFs controlling the downregulated DEGs ( Figure 5).

Drug prediction for COVID-19 patients suffering from DM
To predict drugs potentially helpful in treating COVID-19 patients with DM comorbidity, we used the L1000FWD webbased utility. For T1D, the top five drugs with the most antisimilar signature were: WAY-213613; gitoxigenin; fatostatin; BRD-K63425657 and emetin (Table 2). No drug anti-signature was found to be significantly enriched when considering the DEGs overlapping the COVID-19 and T2D PBMCs profiles (data not shown).

Identification of common functional GO terms and TFs in COVID-19 and DM pancreas
Comparative analysis revealed that 35 DEGs were commonly upregulated in COVID-19 infected organoids and T2D islets, while no overlapping was found for T1D. Among the downregulated DEGs in COVID-19 infected organoids, 14 overlaped with the downregulated DEGs in T2D islets. No overlapping was instead found for T1D.
GO term enrichment analysis revealed several common processes characterizing both SARS-CoV-2 infection and T2D, including 'transmembrane receptor protein tyrosine kinase signaling pathway', 'regulated exocytosis' and 'apoptotic signaling pathway. The 'regulation of secretion' and 'inorganic ion homeostasis' terms were enriched by both the up-and down-regulated DEGs characterizing COVID-19 infection and T2D ( Figure 6A Figure 6B).

Drug prediction for COVID-19 and DM pancreas
Analysis of drugs potentially useful for the treatment of COVID-19 patients concomitantly suffering from T2D was performed using the L1000FWD web-based utility, on the 35 upregulated and 14 downregulated genes in common between COVID-19 infected pancreas organoids and T2D islets. The top three drugs with the most anti-similar signature were: vemurafenib, milnacipran and erbstatin-analog (Table 3). No drug was found to be significantly enriched when considering the DEGs overlapping the COVID-19 and T1D pancreas profiles (data not shown).

Discussion
Comorbidities are associated to a higher risk of develop severe forms of COVID-19, with consequent need of mechanical ventilation and increased death rate [34]. Among the comorbidities that affect the prognosis of COVID-19 patients, DM has emerged as a potential risk factor [10]. This may in part be related to the observation that SARS-CoV-2 binds to the angiotensinconverting enzyme 2 (ACE2) receptors, which are expressed in metabolic organs and tissues, such as pancreatic beta cells, adipose tissue, the small intestine and the kidneys [35]. Therefore, elucidating the key genes and pathways in COVID-19 and DM is crucial to decipher the molecular association and mechanisms shared by these pathologies. The use of whole-genome transcriptomic analyses has been largely exploited by ourselves and others to study autoimmune diseases, cancer and neurodegenerative disorders [36][37][38][39][40], as well as to characterize potential pathogenetic mechanisms and novel therapeutic targets [41][42][43]. Here, we have comprehensively studied the pancreas and blood transcriptomic changes occurring in DM and COVID-19    by using integrative bioinformatics approaches. We detected 10 common down-regulated genes and 32 common upregulated genes between COVID-19 and T2D in PBMCs. Among them, it is worth noting TSPYL4 (Testis-specific Y-encoded-like protein 4), for which a genetic variants has been associated with chronic pulmonary obstructive disorder [44], and ACSL1 (Acyl-CoA Synthetase long chain family member 1) that encodes for the isozyme of the long-chain fatty-acid-coenzyme A ligase family. It has been shown that the inflammatory phenotype in rat model of T1D is linked to increased expression of ACSL1, and deletions of this gene prevent the acquisition of the inflammatory phenotype of macrophages associated with T1D [45]. Along the same lines, our study revealed altered immune response signaling pathways modulated in both COVID-19 and T2D PBMCs. In a recent study, Codo et al. [46] suggested that increased levels of blood glucose in T2D patients and glycolysis may promote the replication of the SARS-CoV-2 and higher production of cytokines by monocytes via mitochondrial ROS/hypoxia-inducible factor-1a-dependent pathway, which may result in T cell dysfunction and epithelial cell death. These observations suggest that metabolic inflammation in T2D patients may be favorable for viral replications and could enhance the release of cytokines. Our results also suggested crucial pathways involved in T1D and COVID-19, such as lymphocyte activation, cellular protein catabolic process, immune response and autophagy. It has been previously shown that higher activity of lymphocytes is observed in T1D [47]; however, lymphocytes and immunoinflammatory events play a role in T2D, as well [48]. Our results suggest the 'activation of immune response' and 'signaling by interleukins' as crucial GO terms in COVID-19 and T1D, which is consistent with previous observations confirming the dysregulation of immune systems in response to viral infections [49]. The impairment of innate immunity has been linked to the increased prevalence of infections in DM patients [49]. Similarly, in an animal model, comorbid DM results in immune dysregulation and increases the severity of the diseases following infection with the other coronavirus, MERS-CoV [50].
It is also believed that metabolic inflammation compromises the immune systems in DM patients, which corroborates our results that systemic immune alterations in PBMCs could play a crucial role in both COVID-19 and T1D [51]. The impairment of the innate immunity has been also linked to the increased prevalence of infections in DM patients [49]. Moreover, the 'cytokine storm' occurring in COVID-19 results from uncontrolled host immune responses, which are believed to be associated with increased complications and critical conditions in COVID-19 patients. Therefore, clinical trials on antibodies/drugs that target these pathways might be conducted to identify potential use in COVID-19 patients with comorbid DM.
The two main TFs that regulate the shared DEGs between COVID-19 and DM are 'SPI1' and 'RELA'. While SPI1 is involved in the inflammatory process and modulates host immune systems [52], RELA is implicated in NF-kB development and regulates proliferative and inflammatory cell responses [53]. We have previously detected these two TFs as critical regulators of the DEGs identified from transcriptomic analysis of COVID-19 infected lung tissue, suggesting that these two TFs might represent both biomarkers and pharmacological targets in COVID-19 [16].
Next, we profiled the gene expression signature of pancreas organoids infected with SARS-CoV-2 to detect transcriptional alterations characterizing the response to COVID-19 infection and to compare with T1D and T2D-associated transcriptomic pancreas profiles. Our analysis identified the dysregulated process involved in the 'regulation of secretion', which indicates that COVID-19 alters the secretory pathways of the pancreas, likely affecting the insulin secretion and impacting the prognosis of DM patients.
Glycaemic deterioration is often observed in COVID-19 patients with DM or with impaired glucose homeostasis [54]. Accordingly, SARS-CoV-2 infection was associated with need for higher doses of insulin [54]. Moreover, although ketoacidosis typically occurs in T1D patients, it was observed that more than 75% of COVID-19 patients who developed ketoacidosis had T2D [55].
It has been proposed an association between ACE2 and glucose homeostasis. ACE2 knockout mice are more susceptible than the wild-type counterpoart to high-fat diet-induced β-cell dysfunction [56]. This observation and the expression of ACE2 in the endocrine pancreas support the notion that coronaviruses could damages the pancreatic islets, leading to hyperglycaemia [57]. Notably, it was previously shown that hyperglycaemia in SARS patients persisted for up to 3 years after recovery, indicating long-term damage to β-cells upon coronovirus infection.
In our analysis, a disruption of the regulation of insulin secretion has been observed among the DEGs characterizing COVID-19 and DM. Also, among the GO terms commonly enriched by the downregulated DEGs characterizing both COVID-19 infection and T2D, we found 'response to glucose'. Overall, our data suggest the hypothesis that SARS-CoV-2 tropism for the β-cell could cause acute impairment of insulin secretion and/or destruction of β-cells causing deterioration of the metabolic control in people with pre-existing DM or leading to the development of new-onset DM.
Finally, we have used an anti-signature-based approach [16] that aimed to detect candidate drug molecules that could reverse the DEGs signatures identified from DM PBMCs and pancreas. Interestingly, when considering the PBMCs gene expression profiles, we only found drugs potentially reverting the T1D/COVID-19, but not the T2D/COVID-19, signature. This is likely to be ascribed to the predominant immunoinflammatory processes underlying T1D pathogenesis, characterizing the autoimmune destruction of the β cells of the endocrine pancreas, but not T2D, where insulin resistance and reduced secretion of insulin by the β cells are the main culprit of the disease [58]. On the other hand, when investigating the pancreas data, we found drugs with a signature reverting the T2D/COVID-19 profile, but none for the T1D/COVID-19 common gene expression profile. This is in line with the GO analysis, which showed prevalent metabolic alterations in COVID-19 and T2D pancreas, and not common immunoinflammatory processes. In particular, on the top three predicted drugs, we have found Vemurafenib, an FDA approved drug that can induce cellular apoptosis in melanoma cells that contained BRAF mutation via interfering B-raf/MEK/ERK pathway [59]. Also, we found Milancipran, an anti-depressant of serotonin-norepinephrin class, which is used for the treatment of psychotic disorders and fibromyalgia with dubious results [60,61], and an erbstatin-analog, which is a EGFR inhibitor [62]. It is known that many growth factor receptors are associated with viral infections, and previous studies have suggested their repurpose for COVID-19 pandemic [63,64]. However, despite the importance of the bioinformatics and systems biology-based analyses, the clinical validation of these findings is crucial before establishing them as biomarkers and/or interventional targets for the management of COVID-19 patients.

Conclusions
The present study aimed at discovering gene expression signatures, TFs, and dysregulated molecular pathways, which were in common between COVID-19 and T1D and T2D. Our analysis revealed common dysregulated immune-related pathways in COVID-19 and both T1D and T2D PBMCs. Accordingly, the top predicted TFs that regulate the shared DEGs between COVID-19 and DM were SPI1 and RELA, which are involved in the regulation of the infmammatory processes. However, as expected, we have observed different DEGs shared between COVID-19 and T1D and T2D.
In addition, several pathways, including those associated to the response to glucose and insulin pathways, were enriched by the DEGs in COVID-19 and DM, as observed in the pancreasrelated transcriptomic profiles. This observation supports the hypothesis that SARS-CoV-2 could directly determine an impairment of insulin secretion, with consequent diruption of the metabolic control in people already suffering from DM or leading to the development of new-onset DM.
However, it should be pointed out that there are some limitations to the present study. Indeed, although the molecular pathways commonly disregulated in COVID-19 and either of the two forms of diabetes may help to better understand the pathophysiology of the increased vulnerability of DM patients to COVID-19, the mechanisms by which DM affects COVID-19 susceptibility, severity, prognosis, mortality and long-term complications require more in-depth analysis to associate gene expression changes with respective traits. In addition, there are other major organs that are involved in the pathology of DM, which may be affected upon SARS-COV-2 infection, including kidneys, heart and blood vessels. Hence, it will be important to evaluate whether these extrapulmonary tissues are affected by pathogenetic pathways in common between COVID-19 and DM.
Notwithstanding the above-mentioned limitations, our work represents the first effort to characterize the overlapping gene expression profiles that may explain the negative prognostic association between COVID-19 and DM and, hence, may set the basis for future tailored pharmacological strategies for the better management of COVID-19 patients.

Key Points
• Coronavirus disease 2019 (COVID-19) is an infectious disease caused by the newly discovered coronavirus, SARS-CoV-2. Increased severity of COVID-19 has been observed in patients with diabetes mellitus (DM).
• This study implemented a bioinformatic frmaework to detect common transcriptional signatures, regulators and pathways between COVID-19 and DM.
• We have integrated human whole-genome transcriptomic datasets from COVID-19 and DM, followed by functional assessment with gene ontology and pathway analyses.
• Here, for the first time, we have characterized the biological processes and pathways commonly dysregulated in COVID-19 and DM, which could be in the next future used for the design of personalized treatment of COVID-19 patients suffering of DM as comorbidity.

Supplementary Data
Supplementary data are available online at https://academi c.oup.com/bib.

Data Availability Statement
All data here analyzed are publicly available on NCBI Gene Expression Omnibus (GEO) and the ArrayExpress databases.