Jingyu Yang, Meng Wang, Jürgen Dönitz, Björn Chapuy, Tim Beißbarth, Advancing personalized cancer therapy: Onko_DrugCombScreen—a novel Shiny app for precision drug combination screening, NAR Genomics and Bioinformatics, Volume 7, Issue 1, March 2025, lqaf004, https://doi.org/10.1093/nargab/lqaf004
Abstract
Identifying and validating genotype-guided drug combinations for a specific molecular subtype in cancer therapy represents an unmet medical need and is important in enhancing efficacy and reducing toxicity. However, the exponential increase in combinatorial possibilities constrains the ability to identify and validate effective drug combinations. In this context, we have developed Onko_DrugCombScreen, an innovative tool aiming at advancing precision medicine based on identifying significant drug combination candidates in a target cancer cohort compared to a comparison cohort. Onko_DrugCombScreen, inspired by the molecular tumor board process, synergizes drug knowledgebase analysis with various statistical methodologies and data visualization techniques to pinpoint drug combination candidates. Validated through a TCGA-BRCA case study, Onko_DrugCombScreen has demonstrated its proficiency in discerning established drug combinations in a specific cancer type and in revealing potential novel drug combinations. By enhancing the capability of drug combination discovery through drug knowledgebases, Onko_DrugCombScreen represents a significant advancement in personalized cancer treatment by identifying promising drug combinations, setting the stage for the development of more precise and potent combination treatments in cancer care. The Onko_DrugCombScreen Shiny app is available at https://rshiny.gwdg.de/apps/onko_drugcombscreen/. The Git repository can be accessed at https://gitlab.gwdg.de/MedBioinf/mtb/onko_drugcombscreen.
Introduction
Cancer treatment is an intricate field, with the ongoing quest to develop therapies that effectively target the disease while minimizing side effects. Synergistic drug combinations aim to reduce the concentration of each drug while achieving the same therapeutic effect, thereby minimizing side effects. This represents a key advancement in cancer therapy [1]. This approach aims to overcome the limitations of single-agent treatments by improving efficacy and reducing the likelihood of drug resistance [2]. However, the task of identifying optimal drug combinations is complicated by the significant variability in tumor types and patient responses, along with the complexities of cancer biology, the high dimensionality of data, and the number of drug combinations far beyond what is possible for clinical testing [3, 4]. These challenges make it difficult to predict which combinations are most likely to be effective in specific cancer molecular type cohorts compared to other molecular subtypes, necessitating advanced computational tools and extensive experimental validation to navigate the vast landscape of potential drug combinations and tailor treatments to cohort patients’ needs [5].
Molecular tumor boards (MTBs) are crucial in personalizing cancer treatment, integrating multidisciplinary expertise to interpret genetic data and guide treatment decisions based on a patient’s unique tumor characteristics [6–8]. Following the methodology of MTBs, combining drug databases with detailed level-of-evidence information for each drug is a key strategy for advancing patient-specific treatment recommendations [6–8]. This approach not only facilitates the identification of the most suitable drugs for individual patients based on their genetic profiles but also lays the foundation for drug combination prediction based on patient cohorts. Applying statistical methods based on drug databases to the set of recommended drugs for these patient cohorts enables researchers and clinicians to predict effective drug combinations more accurately. This strategy underscores a significant shift toward a data-driven and evidence-based framework to optimize combination therapy for cancer patients, leveraging the increasingly available genetic and pharmacological data to enhance treatment efficacy and patient outcomes.
In recent decades, a growing number of computational approaches have been developed for predicting drug combinations and their effects. Preuer et al. developed DeepSynergy, a deep learning-based approach that accurately predicts drug combination synergies for cancer treatments, significantly outperforming traditional methods [9]. Similarly, Wang et al. introduced DeepDDS, a deep learning model that employs graph neural networks and attention mechanisms to predict and prioritize synergistic drug combinations for cancer treatments, with the added advantage of enhanced interpretability through chemical substructure analysis [10]. Cheng et al. demonstrated that a network-based methodology, focusing on the relative configuration of drug–target modules with respect to disease modules, can effectively prioritize potentially efficacious drug combinations for complex diseases such as cancer [11]. GAECDS, presented by Li et al., combines graph autoencoders and convolutional neural networks to predict drug synergy, showing superior performance in identifying efficacious drug combinations [12]. Concurrently, numerous classical machine learning (ML) models have exhibited performance comparable to deep learning methods, demonstrating their robustness and utility in this complex domain. Gayvert et al. showed that a random forest model, using single-drug dose responses as features, could accurately predict drug pair synergy and effectiveness in mutant BRAF melanomas [13]. Janizek et al. introduced TreeCombo, an XGBoost-based approach that leverages gradient boosting to improve predictive accuracy, outperforming DeepSynergy by using drug physicochemical features and cancer cell line gene expression data. The use of XGBoost, which combines multiple decision trees to make robust predictions, demonstrated comparable efficacy to deep learning on medium-scale datasets, while offering the additional benefits of reduced complexity in hyperparameter tuning and enhanced interpretability through TreeSHAP, a feature attribution method that identifies the contribution of each variable in a clear and consistent manner [14]. However, current preclinical screenings primarily focus on the synergistic effects of drug combinations, often overlooking key factors for clinical success, such as potential toxicity and selective efficacy against tumors [3]. At the same time, there is a clear lack of innovative computational solutions that demonstrate their feasibility and benefits in translational applications, especially in the field of cancer, where there is an urgent need to identify combination therapies suitable for specific groups of cancer patients based on patient-specific biomarkers [15, 16].
In this paper, we present Onko_DrugCombScreen, a Shiny app designed to address this gap in cancer therapy by predicting significant drug combination candidates through statistical comparison of a target patient cohort against a comparison cohort. The primary goal of Onko_DrugCombScreen extends beyond merely providing treatment recommendations based on drug databases such as GDKD [17], CIViC [18], and OncoKB [19]. It integrates statistical methods and data visualization to analyze recommendations based on extensive drug databases within the target cancer cohort and comparison cohort genetic data, thereby uncovering potential drug combinations and mapping them onto cell line data, providing a robust basis for clinical drug screening. Based on the drug evidence levels in the knowledge databases, one can directly ascertain whether the drugs mapped to a variant are selective for the cancer types in the target patient cohort, and the previous studies collected in the databases can reduce the workload of drug toxicity analysis. This brings renewed hope for the clinical translation of cancer-type-specific drug combination therapies.
Materials and methods
Fisher’s exact test in cancer subtype recommendations
Besides predicting single drugs, clinicians and researchers are interested in determining whether two drugs are simultaneously recommended for the target tumor type and whether this co-recommendation differs significantly from the comparison tumor group. Here, we defined co-recommended drugs as candidate drug combinations, which are presented in the Drug_comb column of the DrugComb analysis table. For each candidate drug combination, we then counted the patients in the target tumor cohort and in the comparison cohort who did and did not receive the co-recommendation. Subsequently, we used these four counts to construct a contingency table (Table 1) and performed a Fisher’s exact test for each candidate drug combination. By analyzing the P-value and odds ratio obtained from Fisher’s exact test (marked with a red rectangle in Fig. 1), we can determine whether the occurrence of a candidate drug combination is significantly different and assess the magnitude of this difference. The P-values were adjusted using the Benjamini–Hochberg method, as reflected in the adjust_p.value column, to account for multiple hypothesis testing and control the false discovery rate. Additionally, we report drug combination candidate recommendations for cell line data in the final four columns of the DrugComb analysis table to assist with wet-lab validation (Fig. 1).
Table 1. Contingency table for a candidate drug combination (Drug1 + Drug2)

| | Target | Comparison | Row total |
|---|---|---|---|
| Drug1 + Drug2 | a | b | a + b |
| Non-(Drug1 + Drug2) | c | d | c + d |
| Column total | a + c | b + d | a + b + c + d (n) |

In this table, a represents the number of patients in the target tumor cohort receiving Drug1 + Drug2 co-recommendation, b is the number of comparison cohort patients receiving Drug1 + Drug2 co-recommendation, c is the number of patients in the target tumor cohort not receiving Drug1 + Drug2 co-recommendation, and d is the number of comparison cohort patients not receiving Drug1 + Drug2 co-recommendation.

Figure 1. A “drug co-recommendation” refers to the recommendation of two drugs for a single patient, exemplified as Drug1 + Drug2. The bar plot illustrates the counts of such co-recommendations within the target cancer subtype cohort compared to the comparison cancer subtype cohort, leading to the construction of the contingency table on the right. Fisher’s exact test was applied to each drug co-recommendation to assess statistical significance, resulting in the DrugComb analysis table displayed below. The percentage column (percentage) indicates the proportion of drug co-recommendations, with the P-value and adjusted P-value columns (p.value, adjust_p.value) reflecting the significance level. The odds ratio (oddsRatio) provides a measure of the effect size in comparison to the control group. The final four columns offer details on the cell line drug recommendation status, including specific cell line IDs (individual_id) used for wet-lab validation.
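To make the counting step concrete, the following minimal Python sketch derives the four counts a, b, c, and d of Table 1 for every drug pair. It is an illustration only and not the app's R implementation: the function names, the dictionary layout (patient ID mapped to a list of recommended drugs), and the toy patients and drugs are all assumptions introduced here.

```python
from collections import Counter
from itertools import combinations

def count_co_recommendations(recs_by_patient):
    """Count, for each unordered drug pair, the patients recommended both drugs."""
    counts = Counter()
    for drugs in recs_by_patient.values():
        # sorted(set(...)) removes repeats and gives each pair one canonical order.
        for pair in combinations(sorted(set(drugs)), 2):
            counts[pair] += 1
    return counts

def contingency_tables(target, comparison):
    """Return {pair: (a, b, c, d)} following the layout of Table 1."""
    target_counts = count_co_recommendations(target)
    comparison_counts = count_co_recommendations(comparison)
    n_target, n_comparison = len(target), len(comparison)
    tables = {}
    for pair in set(target_counts) | set(comparison_counts):
        a = target_counts[pair]        # target patients with the co-recommendation
        b = comparison_counts[pair]    # comparison patients with the co-recommendation
        tables[pair] = (a, b, n_target - a, n_comparison - b)
    return tables

# Toy cohorts with placeholder patients and drugs (hypothetical, not study data).
target = {"T1": ["DrugA", "DrugB"], "T2": ["DrugA", "DrugB", "DrugC"], "T3": ["DrugC"]}
comparison = {"C1": ["DrugC"], "C2": ["DrugA", "DrugB"], "C3": [], "C4": ["DrugC", "DrugA"]}

tables = contingency_tables(target, comparison)
for pair, table in sorted(tables.items()):
    print(pair, table)
```

Each patient contributes at most once per pair, so a + c always equals the target cohort size and b + d the comparison cohort size.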
Here, Fisher’s exact test serves as a robust statistical method to determine the significance of the association between the candidate drug combination and the tumor type. Fisher’s exact test is particularly suitable for small sample sizes and for datasets where the assumptions of chi-squared tests are not met.
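As an illustration of the test itself, the sketch below computes a two-sided Fisher's exact P-value from the hypergeometric distribution and applies the Benjamini–Hochberg adjustment, using only the Python standard library. It is a sketch under stated assumptions rather than the app's code: the function names and toy tables are hypothetical, and the sample odds ratio (a·d)/(b·c) is used for simplicity, whereas statistical packages may report a conditional estimate instead.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    total = comb(n, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # given the fixed row and column totals.
        return comb(row1, x) * comb(n - row1, col1 - x) / total

    lowest = max(0, col1 - (n - row1))
    highest = min(row1, col1)
    observed = prob(a)
    # Sum all tables that are at most as likely as the observed one;
    # the small relative tolerance guards against floating-point ties.
    p_value = sum(p for p in map(prob, range(lowest, highest + 1))
                  if p <= observed * (1 + 1e-7))
    return min(1.0, p_value)

def sample_odds_ratio(a, b, c, d):
    """Sample odds ratio (a*d)/(b*c); infinite when b*c is zero."""
    if b * c == 0:
        return float("inf") if a * d > 0 else float("nan")
    return (a * d) / (b * c)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy contingency tables (a, b, c, d); the numbers are illustrative only.
toy_tables = {"Drug1 + Drug2": (3, 1, 1, 3), "Drug1 + Drug3": (10, 0, 0, 10)}
p_values = [fisher_exact_two_sided(*t) for t in toy_tables.values()]
adjusted = benjamini_hochberg(p_values)
for (name, t), p, q in zip(toy_tables.items(), p_values, adjusted):
    print(name, sample_odds_ratio(*t), p, q)
```

Because the P-value is summed exactly over all tables with the observed margins, no large-sample approximation is involved, which is why the test suits sparse tables.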
Drug level of evidence
Here, we adopted the MTB approach to categorizing a drug’s level of evidence proposed by Perera-Bel et al. [6]. As shown in Table 2, “A” signifies evidence for the same cancer type, while “B” indicates evidence for any other cancer type. Horizontally, Level 1 represents evidence supported by regulatory agencies or clinical guidelines, Level 2 includes evidence from clinical trials, and Level 3 consists of preclinical evidence. Therefore, based on the cancer type for which a drug has evidence and the strength of that evidence, six levels of drug evidence are derived: A1, A2, A3, B1, B2, and B3. With this level of evidence, the selection of recommended drugs for specific cancer types and the strength of their clinical support can be clearly defined, which can guide clinical decisions.
Table 2. Drugs obtained from the drug knowledge database are classified into clinically relevant categories using a system of six levels of evidence

| | Approved | Clinical | Preclinical |
|---|---|---|---|
| Same cancer | A1 | A2 | A3 |
| Other cancer | B1 | B2 | B3 |

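The six-level scheme of Table 2 can be written as a small lookup, sketched below in Python. The function name, its arguments, and the simple case-insensitive comparison of cancer type names are assumptions made for illustration; a real implementation would likely need a more careful mapping between cancer type vocabularies.

```python
# Column of Table 2: evidence stage -> numeric level.
STAGE_TO_LEVEL = {"approved": "1", "clinical": "2", "preclinical": "3"}

def evidence_level(evidence_cancer_type, patient_cancer_type, evidence_stage):
    """Return one of A1, A2, A3, B1, B2, B3 following Table 2."""
    same_cancer = (evidence_cancer_type.strip().lower()
                   == patient_cancer_type.strip().lower())
    row = "A" if same_cancer else "B"   # A: same cancer type, B: other cancer type
    return row + STAGE_TO_LEVEL[evidence_stage.strip().lower()]

print(evidence_level("breast cancer", "Breast Cancer", "approved"))   # A1
print(evidence_level("melanoma", "breast cancer", "preclinical"))     # B3
```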
Analysis tools
Onko_DrugCombScreen was implemented using R (v.4.3.1) and R Shiny (v.1.8.0). The Shiny app integrates a variety of R packages for comprehensive bioinformatics analysis. For parsing and generating data structures, we utilized readxl v1.4.1 [20, 21]. To facilitate data manipulation and transformation, we employed packages such as reshape2 v1.4.4 [22], tidyr v1.2.1 [21], and dplyr v1.0.10 [23]. We applied packages such as maftools v2.12.0 [24], clusterProfiler v4.4.4 [25], and VariantAnnotation v1.42.1 [26] for the analysis of somatic variants and functional profiles of genes. For data visualization, we used packages such as circlize v0.4.15 [27] for circular visualizations, ggalluvial v0.12.3 [28, 29] for alluvial diagrams, ggrepel v0.9.2 [30] for label clarity, ComplexHeatmap v2.12.1 [31] for sophisticated heatmaps, and ggplot2 v3.4.0 [32] for creating customizable static plots.
Data source
The harmonized drug database, derived from open-source drug knowledge databases, including GDKD [17], CIViC [18], and OncoKB [19], utilizes the DrugBank Vocabulary dataset from DrugBank [33] to standardize drug synonyms. TCGA-BRCA data and breast cancer cell line data used in the case study were collected from UCSC Xena hubs [34].
Results
Case study: application and validation using TCGA-BRCA data
Dataset selection and processing
In this case study, the Onko_DrugCombScreen was applied to the TCGA-BRCA dataset to validate its efficacy in identifying effective drug combinations for breast cancer. The TCGA-BRCA dataset, derived from the TCGA Pan-Cancer (PANCAN) initiative, was chosen for its comprehensive genetic profiling, including extensive data on copy number variations, single nucleotide variations, and molecular subtype profiles [35]. This dataset provides a broad coverage of genetic variations, making it an ideal resource for this analysis. Additionally, cell line data from the Cancer Cell Line Encyclopedia (Breast) were incorporated to complement the analysis [36]. All of the above datasets are available in UCSC Xena hubs [34].
In the preprocessing phase, somatic mutation data from the PANCAN was converted into a compatible CSV file for analysis by the Onko_DrugCombScreen. This process involved filtering the dataset to isolate BRCA cancer data and further stratifying it into molecular subtypes: luminal A/B, HER2+, normal-like, and basal-like. In this case study, we used normal-like breast cancer as the comparison cohort, and HER2+ and basal-like subtypes as the target cohorts, respectively, to analyze and validate the efficacy of Onko_DrugCombScreen. Additionally, by integrating cell line data, Onko_DrugCombScreen provided guidance on suitable cell lines for subsequent experimental validation.
Validation and results
The drug co-recommendation comparison analysis revealed significant disparities between the two target BRCA subtypes (HER2+ and basal-like) and the normal-like BRCA data. Significant drug co-recommendations extracted from Onko_DrugCombScreen were compared with combinational therapies in Wang and Minden’s review [37], as well as FDA-approved drug combinations, to validate the effectiveness. As Supplementary Table S1 shows, the “adjust_p.value” and “OR” (odds ratio), obtained from the Fisher’s exact test, indicate the significance and magnitude of the drug combination in the target cohort compared to the comparison cohort, and the “Percentage” depicts the proportion of the drug combination recommended in the target cohort. Setting the threshold at adjust_p.value < 0.05, OR > 1, and Percentage > 50% retains a fraction of the significant candidate drug combinations (30 250/111 987, about 27%, in HER2+ versus normal-like and 48 348/112 069, about 43%, in basal-like versus normal-like). Notably, these stringent criteria preserved almost all approved and clinical trial drug combinations, including the approved combinational therapy of pertuzumab + trastuzumab for the HER2+ subtype and pembrolizumab + paclitaxel for triple-negative breast cancer. These results highlight Onko_DrugCombScreen’s accuracy in identifying clinically relevant drug combinations, confirming its effectiveness. In addition, upon comparison with the DrugComb.org database [38], it was found that none of the breast cancer drug combinations that are approved or currently in clinical trials had recorded synergy scores.
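The three thresholds described above amount to a simple row filter, sketched here in Python. The column names follow the DrugComb analysis table as described in this paper (adjust_p.value, oddsRatio, percentage); the assumption that the percentage is stored on a 0 to 100 scale, the "Drug_comb" value format, and the toy rows are hypothetical.

```python
def passes_thresholds(row, alpha=0.05, min_odds_ratio=1.0, min_percentage=50.0):
    """Keep a candidate only if it is significant, enriched, and frequent."""
    return (row["adjust_p.value"] < alpha
            and row["oddsRatio"] > min_odds_ratio
            and row["percentage"] > min_percentage)

# Toy rows with placeholder drugs (illustrative values, not study results).
rows = [
    {"Drug_comb": "DrugA + DrugB", "adjust_p.value": 0.001, "oddsRatio": 4.2, "percentage": 72.0},
    {"Drug_comb": "DrugA + DrugC", "adjust_p.value": 0.20, "oddsRatio": 3.0, "percentage": 80.0},
    {"Drug_comb": "DrugB + DrugC", "adjust_p.value": 0.01, "oddsRatio": 0.5, "percentage": 60.0},
    {"Drug_comb": "DrugC + DrugD", "adjust_p.value": 0.01, "oddsRatio": 2.0, "percentage": 35.0},
]

kept = [row["Drug_comb"] for row in rows if passes_thresholds(row)]
print(kept)
```

Only the first toy row satisfies all three conditions; each of the others fails exactly one of them.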
The validation analysis demonstrates that the Onko_DrugCombScreen is adept at identifying established breast cancer drug combinations in the BRCA subtypes such as HER2+ and basal-like when compared to normal-like BRCA subtype. This finding not only validates the tool’s effectiveness but also highlights its potential in discovering novel drug combinations for various cancer types. Consequently, the case study accentuates the utility of the Onko_DrugCombScreen in providing targeted and efficacious drug recommendations.
Data analysis workflow of Onko_DrugCombScreen
The data analysis workflow of Onko_DrugCombScreen is depicted in Fig. 2: Variant data such as single nucleotide variants (SNVs) and copy number variants (CNVs) from both the target cancer subtype cohort and the comparison cancer subtype cohort are preprocessed and converted into variant tables compatible with Onko_DrugCombScreen. These patient variant data are then mapped to public drug databases (CIViC, GDKD, OncoKB) after integration with variant interpretation annotations and drug evidence levels for drug recommendations. The resulting drug recommendations are subjected to statistical analysis, focusing on the statistical differences in drug combination candidates observed between the target cancer subtype group and the comparison cancer subtype group. To identify drug combination candidates that are significantly and frequently recommended in the target group compared to the comparison group, Fisher’s exact test is applied. Subsequently, the selected drug combination candidates undergo an integrated analysis with cell line data to identify available cell line samples, facilitating wet-lab validation. Additionally, all analysis results are visualized, making the findings clearer and more intuitive. The integration of these processes is crucial for confirming drug combination recommendations for the cancer type of interest. The final validation stage may include conducting wet-lab drug screenings to confirm the analysis results and deepen the understanding of the underlying biological mechanisms (Fig. 2). To assist users in becoming familiar with the analysis workflow of Onko_DrugCombScreen, Supplementary File S1 is provided as a test dataset for practice and exploration.

Figure 2. The workflow for Onko_DrugCombScreen drug combination data analysis. After the recommendation process based on drug knowledge, SNVs and CNVs are merged into an annotated drug table. Following statistical analysis and integration of cell line data, the final DrugComb analysis table will be used for visualization and wet-lab validation.
Data preprocessing
SNVs and CNVs are typically stored in formats such as VCF, MAF, TXT, or Excel. A preprocessing step is necessary to convert these various formats into CSV format (Fig. 2). These data frames are then suitable for use in the knowledge-based drug recommendation analysis within Onko_DrugCombScreen. To assist users, we have provided an example preprocessing script along with a detailed instruction markdown file for the TCGA-BRCA case study in the GitLab repository under the “test_data” directory. Users can modify the provided example preprocessing script according to their needs to convert their own variant data into a compatible input format (Fig. 2).
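As a rough illustration of such a conversion, the Python sketch below reduces a tab-separated MAF-like table to a CSV with a few selected columns. The column names are standard MAF fields, but whether Onko_DrugCombScreen expects exactly these columns is an assumption; the example script in the repository's “test_data” directory remains the authoritative reference for the input format. The function name and the toy rows are hypothetical.

```python
import csv
import io

# Standard MAF column names; treating them as the required input columns
# is an assumption made for this sketch.
KEEP_COLUMNS = ["Tumor_Sample_Barcode", "Hugo_Symbol",
                "Variant_Classification", "HGVSp_Short"]

def maf_to_csv(maf_lines, out_handle, columns=KEEP_COLUMNS):
    """Write the selected columns of a tab-separated MAF-like table as CSV."""
    # Skip comment lines such as "#version 2.4" before reading the header.
    data_lines = (line for line in maf_lines if not line.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t")
    writer = csv.DictWriter(out_handle, fieldnames=columns,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    n_rows = 0
    for row in reader:
        writer.writerow(row)  # columns not listed in `columns` are dropped
        n_rows += 1
    return n_rows

# Toy input with placeholder genes and samples (not real data).
maf_text = (
    "#version 2.4\n"
    "Hugo_Symbol\tVariant_Classification\tTumor_Sample_Barcode\tHGVSp_Short\tCenter\n"
    "GENE1\tMissense_Mutation\tSAMPLE-01\tp.A10V\tX\n"
    "GENE2\tFrame_Shift_Del\tSAMPLE-02\tp.C20fs\tX\n"
)
csv_out = io.StringIO()
n_written = maf_to_csv(io.StringIO(maf_text), csv_out)
print(csv_out.getvalue())
```

In practice the in-memory `io.StringIO` objects would be replaced by file handles opened on the MAF input and the CSV output.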
Matching rule between variant annotations in patients’ data and database
Due to the different annotation descriptions of variants in the three drug databases (GDKD [17], CIViC [18], and OncoKB [19]) and original patients’ variant data, we harmonized the three drug databases and designed a matching rule based on the interpretation of biological significance (Table 3). All variant classes or effects map to the biological interpretations of “loss,” “gain,” or “mutation.” We can then associate the original variants (Table 4) with the information in the knowledge database based on biological interpretations and obtain the relevant target drug information.
Table 3. Matching rule for variant annotations in the drug databases

| Database | Variant | Interpretation |
|---|---|---|
| GDKD/CIViC/OncoKB | “splice” | Loss |
| GDKD/CIViC/OncoKB | “delins” | Loss |
| GDKD/CIViC/OncoKB | “ins” | Insertion |
| GDKD/CIViC | “del” | Deletion |
| GDKD | “indel” | Loss |
| GDKD/CIViC | “fs” | Loss |
| GDKD/CIViC/OncoKB | “deletion” | Loss |
| GDKD/CIViC/OncoKB | “amplification” | Gain |
| GDKD | mut | Mutation |
| GDKD | any | Mutation |
| CIViC | loss/loss-of-function | Loss |
| CIViC | “mutation” | Mutation |
| CIViC | “^expression” | Gain |
| CIViC | “Overexpression” | Gain |
| CIViC | “Underexpression” | Loss |
| OncoKB | Truncating mutations | Loss |
| OncoKB | Oncogenic mutations | Mutation |
| CIViC | “FRAMESHIFT” | Loss |
| CIViC | “FRAME SHIFT” | Loss |
| OncoKB/CIViC | Exon 17 mutations | Mutation (exact match) |
| CIViC | Exon 19 deletion | Loss (exact match) |
| CIViC | Exon 14 skipping mutation | Mutation (exact match) |

Table 4. Matching rule for variant classes in the original patient data

| Variant | Interpretation |
|---|---|
| In_Frame_Ins | Insertion (exact match) |
| In_Frame_Del | Deletion (exact match) |
| Frame_Shift_Ins | Loss |
| Frame_Shift_Del | Loss |
| Splice_site | Loss |
| amplification | Gain |
| deletion | Loss |
| Missense_Mutation | Mutation |
| Nonsense_Mutation | Loss |
| Nonstop_Mutation | Exact match |
| Translation_Start_Site | Exact match |

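The matching rule in Tables 3 and 4 can be sketched as two lookups that map each side to a shared biological interpretation. The Python sketch below is a simplified illustration, not the app's implementation: it covers only part of Table 3, it does not model the “exact match” cases, and the case-insensitive substring matching with more specific patterns checked first (for example “delins” before “del”) is an assumption about how overlapping patterns might be resolved.

```python
# Patient-side variant classes (Table 4); exact-match-only classes are omitted.
PATIENT_RULES = {
    "Frame_Shift_Ins": "loss", "Frame_Shift_Del": "loss", "Splice_site": "loss",
    "Nonsense_Mutation": "loss", "deletion": "loss", "amplification": "gain",
    "Missense_Mutation": "mutation",
    "In_Frame_Ins": "insertion", "In_Frame_Del": "deletion",
}

# Database-side patterns (subset of Table 3), most specific first (assumed order).
DB_PATTERNS = [
    ("delins", "loss"), ("indel", "loss"), ("deletion", "loss"),
    ("splice", "loss"), ("frameshift", "loss"), ("frame shift", "loss"),
    ("fs", "loss"), ("truncating", "loss"), ("underexpression", "loss"),
    ("loss", "loss"), ("amplification", "gain"), ("overexpression", "gain"),
    ("ins", "insertion"), ("del", "deletion"), ("mut", "mutation"),
]

def interpret_db_variant(annotation):
    """Map a database variant annotation to loss, gain, mutation, etc."""
    text = annotation.strip().lower()
    if text.startswith("expression"):   # the "^expression" rule of Table 3
        return "gain"
    for pattern, meaning in DB_PATTERNS:
        if pattern in text:
            return meaning
    return None                          # no rule applies

def same_interpretation(patient_class, db_annotation):
    """True when both sides map to the same biological interpretation."""
    patient_meaning = PATIENT_RULES.get(patient_class)
    return (patient_meaning is not None
            and patient_meaning == interpret_db_variant(db_annotation))

print(interpret_db_variant("Exon 19 deletion"))
print(same_interpretation("Frame_Shift_Del", "fs"))
```

Checking “delins” and “deletion” before the shorter “del” and “ins” patterns keeps a substring from shadowing a more specific rule.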
Drug recommendation annotation
Once the matching rule is defined, the drug knowledge-based analysis is performed to export the drug recommendation tables for all target and comparison subtype data. Because drug nomenclature differs across the three drug databases, we employed the “DrugBank Vocabulary” dataset from DrugBank to standardize synonymous drug names. Subsequently, each drug name was annotated with its final drug class. This annotation is stored in the columns “Origin_Drug_Name” and “Classified_Drug_Name” of the DrugComb analysis table. Additionally, other useful information such as variant type is annotated in the “mutType” column, and the variant match status, which indicates whether the amino acid change in the raw data exactly matches the database records, is saved in the “Match_Sign” column.
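The synonym standardization step can be pictured as a lookup from any known name to one common name, as in the Python sketch below. The vocabulary layout (common name mapped to a list of synonyms), the function names, and the placeholder drug names are assumptions for illustration and do not reproduce the DrugBank Vocabulary file or the app's code.

```python
def build_synonym_index(vocabulary):
    """Map every lowercased name or synonym to its common name."""
    index = {}
    for common_name, synonyms in vocabulary.items():
        for name in [common_name, *synonyms]:
            index[name.strip().lower()] = common_name
    return index

def standardize(name, index):
    """Return the common name, or the input unchanged if it is unknown."""
    return index.get(name.strip().lower(), name)

# Toy vocabulary with placeholder names (hypothetical entries).
vocabulary = {
    "Exampletinib": ["EXT-001", "exampletinib mesylate"],
    "Samplezumab": ["SMZ", "anti-target antibody 7"],
}
index = build_synonym_index(vocabulary)

for raw_name in ["ext-001", " Samplezumab ", "UnknownDrug"]:
    print(raw_name, "->", standardize(raw_name, index))
```

Returning unknown names unchanged keeps unmatched drugs visible for manual review instead of silently dropping them.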
Data visualization
Onko_DrugCombScreen provides a variety of charts for visual analysis results, allowing users to understand data more intuitively (Fig. 3). The application integrates multiple plotting functions, including volcano plots, heatmaps, Circos plots, alluvial diagrams, UpSet plots, and bar charts. These visualization tools help to easily identify recommended drugs or candidate drug combinations for subsequent wet-lab analysis and validation. Users can configure settings in the left panel of Onko_DrugCombScreen and customize the resolution for PDF file export.

Figure 3. Visualization of the Onko_DrugCombScreen. (A) Volcano plot identifying significant drug combinations. (B) Circos plot depicting the most proportional drug co-recommendations. (C) Alluvial diagram tracing mutations back to recommended single drugs. (D) UpSet plot showing the top-recommended single drugs and their intersections.
Discussion
Drug combinations are widely recognized for their benefits in cancer therapy. Here, we developed the Onko_DrugCombScreen Shiny app, which integrates statistical analysis to identify the most significant candidate drug combinations for a target tumor type cohort. We utilized drug knowledgebase recommendations derived from mutation data of the target cancer patient cohort and the comparison cohort to identify drug co-recommendations. This is complemented by integrating cell line data to assist in experimental validation. Onko_DrugCombScreen’s ability to identify effective drug combinations, as demonstrated in the TCGA-BRCA case study, suggests a promising path toward more effective cancer combination therapy tailored to tumor type.
In contrast to current computational methods that mainly focus on synergy and dose–response matrices, Onko_DrugCombScreen is a drug knowledge-based analysis approach. It not only provides therapeutic recommendations but also offers guidance for clinical research, thereby integrating more closely with clinical applications. Moreover, all drug recommendations can be traced back to patient genetics and variants through the visual alluvial diagram of Onko_DrugCombScreen. Because the tool uses the drug database that is also employed by the MTB-Report [6], each recommended drug’s level of evidence and response are explicitly defined. This clarity helps address the selectivity of the recommended drug combinations, an issue that previous methods often overlook. Moreover, the use of drug databases to recommend candidate drug combinations based on patient gene mutation profiles can potentially reduce the effort required for toxicity analysis [39, 40]. These databases provide valuable information on the relationships between individual drugs and specific gene mutations or molecular targets, which can guide the selection of drug combinations with potentially favorable efficacy profiles. Furthermore, the drugs included in these databases are often approved or under clinical trials, meaning that their toxicity profiles have been extensively studied and characterized [41, 42]. These existing safety data can serve as a foundation for assessing the toxicity of drug combinations, as they provide insights into the common adverse events, dose-limiting toxicities, and recommended dosing schedules of the individual drugs. By leveraging this information, researchers can streamline the toxicity assessment process and make more informed decisions when designing drug combination studies. However, the toxicity of a drug combination is not simply the sum of individual drug toxicities. Comprehensive safety assessments are still necessary, considering factors such as drug–drug interactions, dosing, scheduling, and specific patient populations.
Onko_DrugCombScreen also offers flexibility for application across various cancer types beyond breast cancer, provided there is sufficient sample size and well-defined molecular stratification within the target and comparison cohorts. For example, in the TCGA-BRCA case study, the target cohorts included 280 patients with the basal-like subtype and 82 patients with the HER2+ subtype, while the comparison cohort (normal-like subtype) consisted of 143 patients. These cohort sizes provide adequate statistical power for Fisher’s exact test [43, 44], which remains valid with small samples, for example when expected cell counts fall below 5 and the chi-squared approximation becomes unreliable. However, when any expected cell counts in a contingency table are very low, statistical interpretation may become limited. Therefore, we recommend caution when analyzing smaller cohorts for drug combination candidate identification, as reduced statistical robustness can affect result interpretation. The tool’s ability to analyze stratified cohorts makes it versatile across different cancer types, but its success depends on sufficient sample sizes and well-defined molecular subtypes.
Looking forward, the potential for further development of Onko_DrugCombScreen is substantial. While the current version does not yet incorporate detailed drug synergy, dose–response, or combination toxicity analyses, we recognize these as crucial factors for clinical implementation of combination therapies. Future iterations of Onko_DrugCombScreen could integrate drug synergy data from sources such as DrugComb [45, 46], where available, and refine recommendations using dose–response metrics, particularly during validation in cell line or patient-derived model systems. Additionally, toxicity analysis can be streamlined by leveraging existing toxicity profiles from drug knowledgebases, though a more comprehensive assessment of combination-specific toxicity will be necessary. To address these gaps, we propose a possible solution that incorporates artificial intelligence (AI) and ML techniques to enhance the tool’s capabilities. Specifically, ML models such as graph convolutional neural networks [47], random forest, or boosted models could analyze large-scale patient variant data and drug interactions to predict combination efficacy more accurately and to minimize adverse effects. By incorporating these AI-driven approaches, Onko_DrugCombScreen could evolve into a more robust and clinically relevant tool, capable of offering precision-guided, synergistic drug combinations tailored to individual patients.
Conclusion
In conclusion, the Onko_DrugCombScreen Shiny app represents an innovative tool in the field of precision cancer therapy, offering a novel drug knowledge-based approach to drug combination screening. This application leverages drug knowledge database analysis along with advanced statistical and visualization techniques to identify effective drug combinations. It uses drug recommendations from the target cancer cohort and the comparison cohort, combined with cell line data, to highlight prominent drug co-recommendations for the target cancer type. Validated through a TCGA-BRCA case study, the application has demonstrated its potential to identify both existing and novel drug combinations, aligning with the evolving field of precision oncology. Looking ahead, the integration of AI and ML technologies holds the promise of further enhancing its predictive capabilities, making it a valuable tool in the quest for more targeted and effective cancer treatment approaches.
Acknowledgements
The authors would like to acknowledge the Volkswagen Foundation’s support of the MTB-Report project. J.Y. was supported by the Ph.D. program “Genome Science” (International Max Planck Research School), part of the Göttingen Graduate Center for Neurosciences, Biophysics, and Molecular Biosciences. J.D. and T.B. are members of the Göttingen Campus Institute Data Science.
Author contributions: J.Y., M.W., B.C., J.D., and T.B. designed the study. J.Y. and B.C. provided major contributions to the computational method and visualization. J.Y. designed, developed, and implemented the Shiny app. M.W. tested the Shiny app and helped identify problems and bugs. J.Y. wrote the manuscript. All authors critically reviewed the content and approved the final manuscript.
Supplementary data
Supplementary data is available at NAR Genomics & Bioinformatics online.
Conflict of interest
None declared.
Funding
This project was supported by the Volkswagen Foundation within research project MTB-Report (ZN3424).
Data availability
The Onko_DrugCombScreen source code is available in the GitLab repository at https://gitlab.gwdg.de/MedBioinf/mtb/onko_drugcombscreen, and also on Zenodo at https://doi.org/10.5281/zenodo.14614900. Test datasets are provided in Supplementary File S1. The application is accessible at https://rshiny.gwdg.de/apps/onko_drugcombscreen/.