-
PDF
- Split View
-
Views
-
Cite
Cite
Ming Zhang, Dejie Zhang, Qicai Wang, Guoliang Lin, Construction of a prognostic model for breast cancer based on moonlighting genes, Human Molecular Genetics, Volume 33, Issue 12, 15 June 2024, Pages 1023–1035, https://doi.org/10.1093/hmg/ddae040
- Share Icon Share
Abstract
Breast cancer (BRCA) is a highly heterogeneous disease, with significant differences in prognosis among patients. Existing biomarkers and prognostic models have limited ability to predict BRCA prognosis. Moonlighting genes regulate tumor progression and are associated with cancer prognosis. This study aimed to construct a moonlighting gene-based prognostic model for BRCA. We obtained differentially expressed genes (DEGs) in BRCA from The Cancer Genome Atlas and intersected them with moonlighting genes from MoonProt to acquire differential moonlighting genes. GO and KEGG results showed main enrichment of these genes in the response of BRCA cells to environmental stimuli and pentose phosphate pathway. Based on moonlighting genes, we conducted drug prediction and validated results through cellular experiments. After ABCB1 knockdown, viability and proliferation of BRCA cells were significantly enhanced. Based on differential moonlighting genes, BRCA was divided into three subgroups, among which cluster2 had the highest survival rate and immunophenoscore and relatively low tumor mutation burden. TP53 had the highest mutation frequency in cluster2 and cluster3, while PIK3CA had a higher mutation frequency in cluster1, with the majority being missense mutations. Subsequently, we established an 11-gene prognostic model in the training set based on DEGs among subgroups using univariate Cox regression, LASSO regression, and multivariable Cox regression analyses. Model prognostic performance was verified in GEO, METABRIC and ICGC validation sets. In summary, this study obtained three BRCA moonlighting gene-related subtypes and constructed an 11-gene prognostic model. The 11-gene BRCA prognostic model has good predictive performance, guiding BRCA prognosis for clinical doctors.
Introduction
Surpassing lung cancer, breast cancer (BRCA) has now become the primary cause of cancer incidence globally. The year 2020 witnessed 2 261 419 fresh incidences of BRCA, which covered 11.7% of all cancer cases [1]. Notwithstanding the progress made in treatment approaches, certain patients continue to have unfavorable prognoses as a result of inadequate therapeutic targets and imprecise prognostic tools [2]. BRCA is a highly heterogeneous disease, and even among patients with the same pathological morphology, molecular genetic changes result in high heterogeneity at the molecular level, leading to significant differences in prognosis and response to treatment [3]. Genotyping BRCA based on genes can more accurately reflect the biological behavior of tumors and predict prognosis, thereby selecting more targeted treatment methods.
Moonlighting genes encode proteins that have two or more functions despite having only one structure [4]. This means that the secondary functions of these proteins may only appear during the progression of pathological conditions [5]. For example, HMMR is a nuclear microtubule-associated protein in normal cells but is used by tumor cells during tumorigenesis to perform multiple functions, including intracellular regulation of spindle assembly during mitosis, CD44-mediated motility/invasion, and activation of the RAS signaling pathway [6]. Some moonlighting proteins have been reported to regulate tumor progression, such as cell motility, angiogenesis, DNA synthesis, or repair [7]. Ye et al. [8] have shown that moonlighting genes may be involved in the progression of hepatocellular carcinoma and may help predict the prognosis of patients. Other studies have suggested that the moonlighting gene GAPDH plays a role in tumor cell survival and tumor angiogenesis, and is associated with poor prognosis in cancer patients [9]. Ros et al. [10] found that the moonlighting gene Calnexin not only participates in extracellular matrix degradation but also inhibits the growth and lung metastasis of BRCA and liver cancer cells. Molavi et al. [11] found that low expression of the moonlighting gene RPL13A may be associated with the expression of p53 and HER2/neu in BRCA. These studies suggest that moonlighting genes may participate in the pathological process of BRCA. However, it is currently unclear whether moonlighting genes are linked to the immunity and prognosis of BRCA patients.
To investigate the role of these moonlighting genes in BRCA, this study aimed to subdivide BRCA into subgroups based on moonlighting genes in BRCA, analyzed the immune response of different subgroups, and constructed a prognostic model based on subgroup screening of prognosis-related genes to provide help for the prognosis of BRCA patients.
Results
Screening and analysis of BRCA-associated moonlighting genes
Comparison of gene expression between healthy individuals and BRCA patients led to the identification of 5280 DEGs in BRCA, including 3335 upregulated genes and 1945 downregulated genes (Fig. 1A). 22 BRCA-associated moonlighting genes were identified by overlapping BRCA DEGs with 103 moonlighting genes. Among them, 14 genes were upregulated, and 8 genes were downregulated (Fig. 1B) (Supplementary Table 1). A PPI network was constructed to show the interactions among the BRCA-associated moonlighting genes, which may have an impact on the expression of biological phenotypes (Confidence score = 0.4) (Fig. 1C). The remaining proteins did not interact with each other, therefore, they were not shown in the fig. GO analysis of these genes showed that they were significantly enriched in biological functions such as cellular response to abiotic stimulus, cellular response to environmental stimulus, and cell chemotaxis (Fig. 1D). The KEGG enrichment analysis showed that the related genes were enriched in pathways such as Carbon metabolism, Biosynthesis of amino acids, BRCA, Gastric cancer, and Pl3k-Akt (Fig. 1E). To better understand the copy number variation (CNV) mutation status of these moonlighting genes, we displayed their CNV locations and variation information on 24 chromosomes in Fig. 1F.

Screening and analysis of BRCA-related moonlighting genes. (A) Volcano plot of upregulated and downregulated genes differentially expressed in BRCA versus healthy samples in the training set. (B) Diagram of the intersection of DEGs and moonlighting genes in BRCA. (C) PPI network of BRCA-related moonlighting genes. (D) GO analysis of BRCA-related moonlighting genes. (E) KEGG pathway analysis of BRCA-related moonlighting genes. (F) CNV of BRCA-related moonlighting genes (outer circle represents 24 chromosomes, inner circle represents the distribution of CNV).
Molecular subtyping based on BRCA-associated moonlighting genes
After removing patients with a survival time of 0 days, the remaining samples underwent K-means clustering analysis using the expression matrix of BRCA-related pleiotropic genes. When k was 3, the optimal clustering stability was achieved, and cluster1 included 321 patients (Average age = 59.87), cluster2 included 688 patients (Average age = 58.82), and cluster3 included 83 patients (Average age = 59.02) (Fig. 2A and B). Subsequent survival analysis of the three subgroups showed a higher survival rate in cluster2 compared with cluster1 and cluster3 (Fig. 2C). This suggested that the younger age of cluster 2 patients may be one of the important factors contributing to their higher survival rate.

Subtyping of BRCA based on moonlighting genes. (A) Delta area plot showed relative changes in the area under the CDF curve compared to k-1 when k equals 3, indicating optimal cluster stability. (B) K-means clustering of BRCA. (C) Survival analysis of the 3 BRCA subgroups. (D) The distribution of BRCA patients of different subtypes within the three clusters.
We further investigated the proportions of different BRCA subtypes among the various clusters. The results revealed that in cluster 1, the majority of patients belonged to the ER-positive, PR-positive, HER2-negative, Luminal A, and Luminal B subtypes, while the proportion was relatively lower in cluster 2 and the lowest in cluster 3. Conversely, subtypes such as normal and basal showed the opposite trend (Fig. 2D). This may have a certain impact on the clustering of BRCA.
Analysis of immune among three BRCA subgroups
Approximately 20% of cancer patients experience extended survival after receiving immunotherapy, while some patients do not respond to this treatment [12]. Immunotherapy is an emerging treatment option for BRCA patients [13]. We first performed CIBERSORT analysis to observe the infiltration levels of different immune cells (Fig. 3A). The results showed that in cluster 1, the most highly infiltrated cells were B cells naive and Mast cells resting. In cluster 2, T cells regulatory (Tregs) and Plasma cells exhibited the highest infiltration levels. In cluster 3, the most highly infiltrated cells were T cells follicular helper, Macrophages M0, and Macrophages M1. Furthermore, we observed that HLA (especially Class I molecules) and immune checkpoint molecules appeared to have higher expression levels in cluster 1 and cluster 3 (Fig. 3B and C). In addition to these findings, we performed an IPS difference analysis on three BRCA subgroups to identify which BRCA patients may derive benefits from immunotherapy. As shown in Fig. 3D–G, the IPS, IPS-CTLA4, IPS-PD1/PDL1/PDL2, and IPS-CTLA4 and PD1/PDL1/PDL2 scores of cluster2 were higher than cluster1 and cluster3 (most of which were statistically significant). This suggested that patients in cluster 2 may be more likely to benefit from immunotherapy based on the IPS scores. However, this seems contradictory to the lower expression levels of HLA and immune checkpoint molecules in cluster 2. The specific reasons for this discrepancy require further investigation.

Immune analysis in three clusters. (A) Cibersort analysis. (B) Expression levels of HLA. (C) Expression levels of immune checkpoint. (D) Differential analysis of IPS. (E) Differential analysis of IPS-CTLA4. (F) Differential analysis of IPS-PD1/PDL1/PDL2. (G) Differential analysis of IPS-CTLA4 and PD1/PDL1/PDL2.
Mutation differences analysis among three BRCA subgroups
Studies have shown that different TMB may be related to the overall survival of BRCA patients [14]. Therefore, we calculated the TMB of samples in different subgroups using BRCA mutation data, with outcomes showing that the TMB of cluster3 was higher than that of cluster1 and cluster2 (Fig. 4A). Due to the absence of genetic mutations in some samples, only the number of samples with mutations and their associated mutation features are presented here. Analysis of commonly mutated genes among the subgroups revealed that PIK3CA appeared to have a higher frequency of missense mutations in cluster 1, while TP53 exhibited a higher frequency of nonsense mutations in cluster 3 (Fig. 4B).

TMB analysis of the three BRCA subgroups. (A) Comparison of TMB between the three BRCA subgroups. (B) Shared mutated genes among the three clusters.
Drug sensitivity analysis of BRCA-related moonlighting genes
Utilizing the CellMiner database, an analysis was performed to forecast the association between moonlighting genes linked to BRCA and drug responsiveness. The findings revealed a negative correlation between ABCB1 and the standardized IC50 values of Depsipeptide, Actinomycin D, Mithramycin, PKI-587, Sepantronium bromide, ONX-0914, and Dinaciclib. MMP3 was positively correlated with the standardized IC50 value of PF-04217903 (Fig. 5A).

Drug sensitivity analysis. (A) Correlation plot of drug sensitivity prediction related to BRCA-related moonlighting genes in the CellMiner database. (B) Transfection efficiency of ABCB1. (C) Cell viability assay. (D) Cell proliferation assay.
To further validate the relationship between the aforementioned drugs and the moonlighting gene ABCB1, we selected two widely studied drugs, Actinomycin D and Mithramycin, and conducted in vitro cell experiments. Firstly, we generated ABCB1 knockdown BRCA cell lines and confirmed the transfection efficiency using qRT-PCR (Fig. 5B). The ABCB1 knockdown MCF7 cells and control (si-NC) were treated with the two drugs separately (Actinomycin D and Mithramycin). The results of the CCK-8 cell viability assay demonstrated a significant enhancement in the vitality and proliferation capacity of the BRCA cells after ABCB1 knockdown (Fig. 5C and D). This indicated that the expression of ABCB1 might play a crucial role in the action of Actinomycin D and Mithramycin.
Construction and validation of the prognostic model
Given the significant differences between cluster2 and cluster1/cluster3, we combined the samples of cluster1 and cluster3 and selected DEGs in cluster1 and cluster3 with cluster2 as a control (|logFC| > 1, FDR < 0.05). 614 DEGs were acquired, including 275 upregulated genes and 339 downregulated genes. We intersected the DEGs in subgroups with those in TCGA-BRCA to obtain 504 differentially candidate genes (Supplementary Table 2). By subjecting the candidate genes together with clinical information to univariate Cox regression analysis, we identified 27 genes that displayed a significant association with survival (P < 0.05). To avoid overfitting, we determined the parameter λ through ten-fold cross-validation and conducted LASSO analysis to obtain 22 candidate DEGs (Fig. 6A and B). We further conducted multivariate Cox regression analysis on the 22 candidate DEGs and ultimately obtained 11 genes (GKN2, FABP7, MYBPH, AMPD1, PDIA2, IP6K3, NR0B1, CAVIN4, SLC1A1, LIN28A, CEL) to construct a prognosis model (Fig. 6C). Here is the model formula:

Construction of a prognostic model. (A) Coefficient distribution plot generated for the logarithm (λ) sequence in the LASSO model. (B) Cross-validation for parameter selection using LASSO cox regression. (C) Forest plot of 11 feature genes obtained by multivariate cox regression analysis.
Performance evaluation and validation of the model
Based on gene expression levels and corresponding coefficients, a prognostic model was developed to calculate the riskscore for each sample in the training set. The riskscore was found to be positively linked with the percentage of deceased patients, and negatively linked with patient survival time (Fig. 7A and B). Following the division of the samples into high-risk and low-risk groups based on the median riskscore, 11 feature genes exhibited differential expression between the two groups (Fig. 7C). Survival curves were generated to investigate the prognostic implications associated with varying riskscores. Notably, the results revealed that patients categorized as high-risk exhibited poorer survival outcomes in contrast to those in the low-risk group (Fig. 7D). To assess the predictive performance of the model, the ROC curve was plotted for 1, 3, and 5 years of the training set, and the corresponding AUC values were calculated, which amounted to 0.69, 0.72, and 0.72, respectively (Fig. 7E). These outcomes provided evidence supporting the model’s high accuracy in forecasting outcomes. Moreover, the model’s predictive ability was further confirmed through its validation in the independent validation set (GEO data set, METABRIC data set and ICGC data set) (Supplementary Figs 1–3). The survival curves of the METABRIC dataset exhibited significant statistical significance, but the results of the ROC curve were not satisfactory, which may be attributed to the larger sample size. Although the high- and low-risk groups of patients in the ICGC dataset did not show significant differences in survival, it could be observed that the high-risk group had relatively lower survival rates (Supplementary Fig. 3D). It was also possible that the sample size was too small because we removed TCGA patient data from the ICGC data set. Additionally, we analyzed the prognostic capability of the model feature gene, and it was found that patients with high expression of SLC1A1 exhibited significantly worse survival rates (P < 0.05) in the TCGA dataset (Fig. 7F).

Performance evaluation and validation of the BRCA prognostic model. (A) Distribution of riskscore in BRCA samples in the training set. (B) Distribution of survival status in BRCA samples in the training set. (C) Heatmap of expression levels of the 11 characteristic genes in the high- and low-risk groups in the training set. (D) Kaplan-Meier survival curves of high-risk and low-risk patients. (E) ROC curves for predicting survival at 1-, 3-, and 5-year of BRCA patients based on riskscore. (F) Comparison of patient survival rates based on the expression levels of the prognostic gene SLC1A1.
Independent prognostic factor analysis
The training set was subjected to univariate and multivariate prognostic analyses (P < 0.05), aimed at assessing whether the riskscore derived from the model, together with relevant clinical information such as age, gender, T, N, M, and stage, could serve as independent prognostic factors, whose outcomes revealed that age, stage, and riskscore were significant factors for BRCA prognosis (Fig. 8A and B). Combined with clinical parameters, nomograms were drawn to predict the survival probabilities of patients at 1-, 3-, and 5-year (Fig. 8C). The calibration curves plotted for 1-, 3-, and 5-year demonstrated a close alignment between the predicted survival rate of the model and the actual survival rate (Fig. 8D–F). These observations provided compelling evidence for the high predictive reliability of the model, thus supporting its utility for future analyses. Therefore, the prognostic scoring model constructed by these 11 feature genes could serve as an independent prognostic factor for BRCA.

Riskscore independent prognostic analysis and evaluation of nomogram. (A) Forest plot of univariate cox regression analysis of riskscore and clinical information. (B) Forest plot of multivariate cox regression analysis of riskscore and clinical information. (C) BRCA prognostic nomogram with riskscore and clinical information. (D) Calibration plot for 1-year survival prediction. (E) Calibration plot for 3-year survival prediction. (F) Calibration plot for 5-year survival prediction.
We compared and analyzed the risk scores of BRCA patients with different clinical features and subtypes. The results revealed significant differences in risk scores among patients of different ages, T stage, stage, HER2 status, and PAM50 BRCA subtypes, particularly in terms of age and HER2 status (Supplementary Fig. 4). BRCA patients aged over 65 and those with HER2-positive status exhibited higher risk scores. This finding may be related to the earlier observation of fewer HER2-positive patients in cluster 1, which corresponded to relatively poorer prognosis (Fig. 2D). Thus, the combined use of moonlighting genes and BRCA subtypes for clinical analysis holds promising prospects for prognostic evaluation in BRCA patients.
Discussion
BRCA is the most common cancer among women worldwide, and its pathological type, molecular subtype, and gene expression characteristics can lead to different survival outcomes [15, 16]. Personalized treatments can be adopted based on the evaluation of different molecular features and prognoses [17]. Moonlighting genes play multiple roles in the tumor microenvironment and disease progression [18]. To effectively evaluate the prognosis of BRCA and predict drug responses to guide treatment, we performed a subgroup analysis of BRCA patients based on moonlighting genes and then screened key genes to build a prognostic model.
BRCA can be classified into subtypes based on immunohistochemical markers, such as ER-positive, PR-positive, HER2-positive, and triple-negative [19]. Furthermore, based on PAM50 gene expression subtype classification, BRCA can be categorized into Luminal A, Luminal B, HER2-enriched, basal-like, and normal-like subtypes [20]. In this study, based on these moonlighting genes associated with BRCA, BRCA patients were divided into three subgroups. We found that ER-positive, PR-positive, HER2-negative, Luminal A, and Luminal B patients constituted the majority in cluster 1, had a relatively lower proportion in cluster 2, and the least proportion in cluster 3. Conversely, other types such as normal-like and basal-like exhibited the opposite pattern (Fig. 2D). These BRCA subtypes may influence the subtype clustering of patients in this study and their prognoses in different clusters. However, the specific underlying mechanisms still require further investigation.
We conducted a series of analyses on the three clusters, where immunological analysis revealed significant differences among the subgroups in terms of survival rate, immune infiltration, immune checkpoint expression levels, HLA expression levels, IPS, and TMB. Cluster 2 demonstrated higher survival rates and IPS compared to the other subgroups. This may be associated with the higher infiltration of immune cells (such as Tregs and Plasma cells) in Cluster 2. In cancer patients, tumor cells can block certain immune checkpoint pathways, evade immune surveillance, and resist the cytotoxic effects of host T cells [21]. Patients with higher IPS exhibit significant immunotherapy advantages and clinical benefits, which can foretell the response to PD-1/PD-L1 and CTLA-4 therapies [22]. Schalper et al. [23] found that PD-L1 is expressed in nearly 60% of BRCA tumors, and the expression status of PD-L1 is associated with better overall disease outcomes. Fang et al. [24] discovered that mRNA overexpression of immune checkpoint molecules such as CTLA-4 and LAG-3 can significantly prolong the survival of triple-negative BRCA (TNBC) patients. Clinical studies have shown that the addition of immunotherapy to first-line chemotherapy can conspicuously improve progression-free survival in PD-L1-positive BRCA patients [25]. These studies indicated that cluster2 in our study may be more suitable for immunotherapy and have better survival benefits. Furthermore, we observed that cluster 3 patients exhibited higher expression levels of immune checkpoints and HLA. In terms of the survival outcome of patients across the three clusters, cluster 3 patients also demonstrated higher survival rates even at the 10-year mark, which may be associated with the elevated expression levels of immune checkpoints and HLA.
We found that the mutation frequency of PIK3CA was higher in cluster1. More importantly, they were predominantly missense mutations. TP53 had the highest mutation frequency in cluster2 and cluster3 in our TMB analysis. Among them, TP53 in cluster 3 had a higher frequency of nonsense mutations. Sousa et al. [26] found that some BRCA patients have mutations, with higher mutation frequency and different mutation characteristics in metastatic tumors. PIK3CA gene mutations can occur in various malignant tumors such as BRCA, endometrial cancer, and colon cancer, and usually have adverse effects on the prognosis of malignant tumors [27]. Reinhardt et al. [28] found that PIK3CA mutations are more common in early BRCA patients and have an impact on patients’ recurrence-free interval and overall survival. Mosele et al. [29] found that HR+/Her2- metastatic BRCA patients with PIK3CA mutations have a poor prognosis and are resistant to chemotherapy, while TNBC patients with PIK3CA mutations show a higher survival rate. TP53 encodes the transcription factor p53, which responds to stress sources such as radiotherapy and chemotherapy, and initiates the transcription of genes involved in processes such as cell cycle arrest, apoptosis, metabolism, DNA repair, and cellular aging [30]. TP53 mutation is the process of its transformation from a tumor suppressor gene to an oncogene. After mutation, its spatial conformation can change, leading to the loss of tumor suppressor function. Studies have found that TP53 genetic mutations are also common in BRCA patients [31, 32]. In addition, studies have found that TP53 mutations can serve as prognostic biomarkers for adverse outcomes and adjuvant endocrine therapy response in metastatic BRCA patients [33]. Similar to the above studies, our study found that PIK3CA and TP53 were common mutant genes in BRCA patients and may be related to BRCA prognosis.
Due to the differences in survival and immunity among the three subgroups we studied, we screened differential genes based on the BRCA subgroup and constructed an 11-gene prognostic model (GKN2, FABP7, MYBPH, AMPD1, PDIA2, IP6K3, NR0B1, CAVIN4, SLC1A1, LIN28A, CEL). This model had a good predictive effect on the prognosis of BRCA, and most of these genes are involved in the occurrence and progression of cancer (Supplementary Table 3). FABP7 is a member of the FABP intracellular lipid chaperone family, which is usually expressed in the breast. Many studies have confirmed the key role of FABP7 in the progression and metastasis of BRCA. For example, high expression of FABP7 is associated with a low survival rate and higher incidence rate in BRCA patients and regulates the metabolic reprogramming process of HER2+ BRCA cells [34]. Other studies have shown that FABP7 is associated with a good prognosis in ER-negative BRCA patients [35]. AMPD1 is an enzyme found only in eukaryotes, encoded by a multi-gene family [36]. Wang et al. [37] found that low expression of AMPD1 in HER2-positive BRCA tumor tissues is associated with longer overall survival of BRCA patients. In addition, AMPD1 may be involved in immune activity and is positively correlated with the expression of PD-L1 and PD-L2 in HER2-positive BRCA patients. LIN28A is an RNA-binding protein that plays an important role in regulating cell cycle and growth, tissue repair, and metabolism [38]. LIN28A is an RNA-binding protein whose upregulation is associated with cancers such as BRCA, colon cancer, and ovarian cancer [39–41]. Shen et al. [42] found that LIN28A and AR are co-expressed in ER-/Her2+ BRCA and are linked with poor prognosis. In addition, other genes selected in this study have been confirmed to have carcinogenic effects and participate in the prognosis of cancer [43–45]. As a statistical tool with wide clinical applications, the nomogram was used to integrate the model and clinical characteristic factors, and a survival curve was drawn to better provide prognostic references for clinical personnel.
Although our study developed a fresh prognostic model for BRCA based on moonlighting genes, shortcomings still exist. For instance, the clinical information gained from the validation set in public databases is limited, which may affect the accuracy of our prognostic model, and more experiments and further clinical validation of the predictive performance of the prognostic model are needed. In summary, we successfully identified three molecular subtypes of BRCA based on moonlighting genes and constructed a novel BRCA prognostic model with strong predictive ability.
Materials and methods
Data download
RNA sequencing transcriptomic data and corresponding clinical profiles of BRCA were downloaded from The Cancer Genome Atlas (TCGA) (https://portal.gdc.cancer.gov/) database [46], which contained 113 healthy samples and 1118 BRCA samples. Microarray dataset GSE48391 was downloaded from the Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/geo/) [47] as a validation set, which included 104 BRCA samples. We downloaded the clinical information of TCGA BRCA, including BRCA subtype information such as HER2, ER, PR, and PAM50, from the UCSC Xena database (http://xena.ucsc.edu/). This dataset contained clinical information for a total of 2509 BRCA patients. We also obtained a BRCA patient dataset from the METABRIC database, which consisted of 2509 patients. Additionally, we downloaded ICGC-BRCA-related patient data from the International Cancer Genome Consortium Pan-cancer working group database. After excluding the TCGA patient data, we were left with a subset of 50 BRCA patients. Supplementary Table 4 provides the patient information for all the datasets mentioned. MoonProt (http://www.moonlightingproteins.org/) was used to obtain 103 human moonlighting genes [48] (Supplementary Table 5).
Identification and analysis of BRCA-related moonlighting genes
The R package “edgeR” [49] was used to compare RNA-seq data of healthy individuals and BRCA patients (|logFC| > 1, FDR < 0.05) to obtain differentially expressed genes (DEGs) in BRCA. The DEGs were crossed with moonlighting genes to obtain BRCA-related moonlighting genes. The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) (https://string-db.org/) was used to construct a protein–protein interaction (PPI) network [50] to visualize the intrinsic connections between BRCA-related moonlighting genes. The R package “clusterProfiler” [51] was used to perform Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses on BRCA-related moonlighting genes. The R package “Circos” (http://circos.ca/) was used to show the distribution of BRCA-related moonlighting genes on chromosomes.
Identification of BRCA subgroups based on BRCA-related moonlighting genes
Utilizing the R package “ConsensusClusterPlus” [52], a consistency clustering analysis was performed on the expression data of moonlighting genes related to BRCA. Subsequently, survival analysis was executed on each subgroup. We used the ggplot2 package to create a stacked bar plot illustrating the percentage distribution of BRCA patients with different subtypes (Luminal A, Luminal B, HER2-positive, and triple-negative) across different clusters.
Immunophenoscore (IPS) analysis based on BRCA-related moonlighting genes
IPS is a score calculated based on four different immune phenotypes (antigen presentation, effector cells, inhibitory cells, and checkpoints), which is an indicator of the response to CTLA-4 and PDL-1 antibodies. A higher IPS indicates a higher level of PD-L1 expression in the patient, which correlates with an increased benefit from immunotherapy [22]. The Cancer Immunome Atlas (TCIA) database [22] has characterized the intratumoral immune landscape across 20 different cancer types, identifying the determinants of tumor immunogenicity and developing the IPS score as a measure of immunogenic potential. IPS of BRCA was downloaded from TCIA to compare the IPS between subgroups. We utilized the pheatmap package to construct a heatmap displaying the expression levels of immune checkpoint and HLA genes.
Mutation analysis
We download mutation data for BRCA from TCGA, and utilized the R package “maftools” (https://github.com/PoisonAlien/maftools) to perform a Wilcoxon rank-sum test on tumor mutation burden (TMB) values of each subgroup. We utilized the Maftools and ComplexHeatmap packages to generate visualizations of the shared mutated genes among the three clusters.
Cancer drug sensitivity prediction based on gene expression
The association between genes and drugs can be quickly determined by querying the confirmed 22 379 genes and 20 503 analyzed compounds in the NCI-60 cell line through the CellMiner database (https://discover.nci.nih.gov/cellminer/) [53]. We utilized the CellMiner online database to obtain the correlation between drug treatment efficacy and gene expression levels.
Screening of prognostic-related feature genes, and construction and validation of a prognostic model
We performed differential gene expression analysis (|logFC| > 1.0, FDR < 0.05) between subgroups, and obtained a list of candidate genes by intersecting the DEGs between subgroups with those that are differentially expressed in both TCGA-BRCA and healthy samples. We identified survival-associated genes from the candidate gene list using univariate Cox regression analysis and performed LASSO regression analysis to eliminate redundant variables to prevent model overfitting. We conducted multivariate Cox regression analysis on the retained genes and used a stepwise approach with forward/backward/bidirectional rules to eliminate redundant or non-predictive factors that did not improve the predictive performance. The genes that remained were used to construct the model. Lastly, we divided the samples into high and low expression groups based on the median expression values of each model gene. We utilized the survminer package to generate survival curves for the model genes.
Performance evaluation and validation of the prognostic model
Based on the mRNA expression level and hazard ratio of each gene, the riskscore of each patient was calculated, which was used to predict the prognosis. Patients were classified into a high-risk group with a poor prognosis and a low-risk group with a relatively good prognosis using the median riskscore as a cut-off. The riskscores of all samples were sorted from low to high, and then the score distribution plot and survival status distribution plot of BRCA high- and low-risk groups were plotted. The survival curves of high- and low-risk groups were plotted using the R package “survival”. We utilized the R package “timeROC” [54] to depict receiver operating characteristic (ROC) curves, with false positive rate as the horizontal axis and true positive rate as the vertical axis. We then depicted the 1-, 3-, and 5-year ROC curves to evaluate the predictive performance of the model. Finally, the same analysis was performed on the validation set GSE48391, ICGC and METABRIC to further validate the reliability of the model.
Independent prognostic analysis
The independent clinical prognostic factors for survival were selected through univariate and multivariate Cox regression analyses. Following multivariate Cox regression analysis, “rms” package [55] was imported to generate nomograms of 1-, 3-, and 5-year survival rates. These nomograms were developed by considering age, stage, and risk score as predictors. Additionally, calibration curves were generated to validate the accuracy of nomograms. When evaluating the risk scores of BRCA patients based on different BRCA subtypes and clinical characteristics, we utilized the ggplot package to generate violin plots. The differences among the three groups were assessed using the Kruskal-Wallis test, while comparisons between two groups were conducted using the Wilcoxon test.
Cell culture and drug sources
Actinomycin D (GC16866) and Mithramycin (GC15060) were purchased from GLPBIO. Human BRCA cells MCF7 (CTCC-001-0042) were purchased from MEISEN CELL. All the above cell lines were cultured in RPMI-1640 medium supplemented with 10% fetal bovine serum (FBS) at 37°C in a 5% CO2 incubator.
Cell transfection
The si-ABCB1 and corresponding negative control were purchased from Ribobio (China). Lipofectamine 2000 transfection reagent (Thermo Fisher, USA) was used, and the si-ABCB1 and corresponding negative control were transfected into BRCA cell lines following the instructions provided with the transfection kit. After 24 h of transfection, the transfected cells were used for further experiments.
qRT-PCR
Total RNA was extracted from cells using Trizol reagent (Invitrogen, USA) and reversely transcribed into cDNA. cDNA was used as a template for PCR amplification using the One Step SYBR PrimeScript RT-PCR Kit (Takara, Japan). PCR was performed using a fluorescent quantitative PCR instrument. Gene expression levels were quantitatively calculated by the 2−ΔΔCt method. GAPDH was the internal reference. Primer sequences are shown in Table 1.
Gene . | Forward Primer (5′-3′) . | Reverse Primer (5′-3′) . |
---|---|---|
ABCB1 | CTTCAGGGTTTCACATTTGGC | GGTAGTCAATGCTCCAGTGG |
GAPDH | CAATGACCCCTTCATTGACC | GACAAGCTTCCCGTTCTCAG |
Gene . | Forward Primer (5′-3′) . | Reverse Primer (5′-3′) . |
---|---|---|
ABCB1 | CTTCAGGGTTTCACATTTGGC | GGTAGTCAATGCTCCAGTGG |
GAPDH | CAATGACCCCTTCATTGACC | GACAAGCTTCCCGTTCTCAG |
Gene . | Forward Primer (5′-3′) . | Reverse Primer (5′-3′) . |
---|---|---|
ABCB1 | CTTCAGGGTTTCACATTTGGC | GGTAGTCAATGCTCCAGTGG |
GAPDH | CAATGACCCCTTCATTGACC | GACAAGCTTCCCGTTCTCAG |
Gene . | Forward Primer (5′-3′) . | Reverse Primer (5′-3′) . |
---|---|---|
ABCB1 | CTTCAGGGTTTCACATTTGGC | GGTAGTCAATGCTCCAGTGG |
GAPDH | CAATGACCCCTTCATTGACC | GACAAGCTTCCCGTTCTCAG |
CCK-8 assay
The Cell Counting Kit-8 (MedChemExpress, USA) assay was used to measure cell viability. Transfected cells were collected and seeded into a 96-well plate. After 0, 24, 48, 72, and 96 h of cell culture, 10 μl of CCK-8 detection reagent was added to each well. The cells were then incubated for 2 h in a cell culture incubator. The absorbance was measured at a wavelength of 450 nm using a microplate reader. Each experimental group was repeated three times for biological replicates.
Colony formation assay
The transfected cells from different groups were digested using 0.25% trypsin and seeded into 6-well plates at a density of 4 × 102 cells per well. Subsequently, the cells were incubated at room temperature in DMEM supplemented with 10% FBS and 5% CO2 for 2 weeks. The culture medium was discarded when visible colonies were observed. The colonies were fixed with 4% paraformaldehyde, stained with 0.1% crystal violet for 10 min, and rinsed with PBS. Finally, the cell colonies were counted.
Data analysis
All statistical analyses were performed using GraphPad Prism 6 software (GraphPad Software, USA). All experiments were conducted at least three times, and the results were expressed as mean ± standard deviation. Differences between groups were compared using t-tests or one-way analysis of variance (ANOVA). A P-value of less than 0.05 was considered statistically significant, denoted by * in the figures.
Author contributions
MZ and DZ contributed to conception and design of the study. QW contributed to acquisition of data and data analysis. MZ and GL drafted the manuscript. GL contributed to investigation and visualization.
All the authors revised the article and gave final approval of the version to be published.
Acknowledgements
Not applicable.
Conflict of interest statement: The authors declare that they have no conflict of interest.
Funding
This study has no funding.
Data availability
The data in this study are available from the corresponding author on reasonable request.
Ethics approval
Our study did not require an ethical board approval because it did not contain human or animal trials.