SynAI: an AI-driven cancer drugs synergism prediction platform

Abstract Summary SynAI is a flexible AI-driven drug synergism prediction platform that aims to discover the potential therapeutic value of compounds at an early stage. Rather than offering a finite choice of drug combinations or cell lines, SynAI can predict potential drug synergism/antagonism in silico from compound SMILES (Simplified Molecular Input Line Entry System) sequences. The AI core of the SynAI platform has been trained on the cell lines and compound pairs listed in the NCI (National Cancer Institute)-Almanac and DrugCombDB datasets. In total, the training data consist of over 1,200,000 in vitro synergism tests on 150 cancer cell lines of different organ origins; each cell line is tested against over 6,000 pairs of FDA (Food and Drug Administration)-approved compound combinations. Given one or both candidate compounds as SMILES sequences, SynAI can predict the potential Bliss score of the combined compound test with the designated cell line, without the need for compound synthesis or structural analysis, and can therefore significantly reduce candidate screening costs during compound development. SynAI demonstrates performance comparable to existing methods while offering more flexibility in data input. Availability and implementation The evaluation version of SynAI is freely accessible online at https://synai.crownbio.com.


S2.1 Preliminary Design and Evaluation of SynAI Core Model
The definition of the DL model is elaborated below. Due to the genetic variation discussed in earlier literature (An 2022), a separate DL model is trained for each cell line. During training of the DL model, an n-fold cross-validation is performed to assess the baseline performance of the SynAI model (cf. Fig. 1). In an early experiment we observed a strong tendency of the model to overfit (cf. Fig. 2), so an additional hyperparameter tuning test was performed to understand the influence of the parameters on the final model performance (cf. Fig. 2~4).
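Below is a minimal sketch of the per-cell-line n-fold cross-validation described above. The helper names `load_pairs_for_cell_line` and `build_synai_model` are hypothetical placeholders, not part of the released SynAI code; the snippet only illustrates the workflow, not the actual implementation.

```python
# Per-cell-line n-fold cross-validation sketch (hypothetical helpers).
import numpy as np
from sklearn.model_selection import KFold
from scipy.stats import pearsonr

def cross_validate_cell_line(cell_line, n_folds=5):
    # Fingerprints of drug pairs and their measured Bliss scores for one cell line.
    X, y = load_pairs_for_cell_line(cell_line)
    kf = KFold(n_splits=n_folds, shuffle=True, random_state=0)
    fold_pcc = []
    for train_idx, val_idx in kf.split(X):
        model = build_synai_model()          # freshly initialised model for every fold
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[val_idx])
        fold_pcc.append(pearsonr(y[val_idx], pred)[0])
    return np.mean(fold_pcc), np.std(fold_pcc)
```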

Initial PCC (Epoch=0)
The PCC at epoch=0 (cf. Fig. 1) shows that the initial predicted Bliss scores have no correlation with the measured Bliss scores. The data also show that the model training procedure was properly initialized in each cross-validation iteration.
No trained model is reused from other iterations. However, the PCC calculated between measured and predicted synergy readings is independent of the algorithm and strongly correlated with loss functions such as mean squared error (MSE) or mean absolute error (MAE). In addition, the PCC value range is scale-free and independent of the data variation.
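As a small numerical illustration of this point (the values below are made up), PCC is unchanged by a rescaling of the predictions, whereas MSE is highly scale-dependent:

```python
# PCC vs MSE: PCC is scale-free, MSE is not.
import numpy as np
from scipy.stats import pearsonr

measured  = np.array([0.12, -0.35, 0.48, 0.05, -0.10])   # measured Bliss scores (illustrative)
predicted = np.array([0.10, -0.30, 0.45, 0.08, -0.12])   # model predictions (illustrative)

print(pearsonr(measured, predicted)[0])            # close to 1.0
print(np.mean((measured - predicted) ** 2))        # MSE, scale-dependent

# Rescaling the predictions changes MSE dramatically but leaves PCC unchanged.
print(pearsonr(measured, 10 * predicted)[0])       # identical correlation
print(np.mean((measured - 10 * predicted) ** 2))   # much larger MSE
```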
Three regression solutions were compared to SynAI: random forest regression (RandomForestRegressor from the scikit-learn library), gradient boosted regression (GradientBoostingRegressor from the scikit-learn library), and a recurrent neural network (RNN from the PyTorch library). To ensure an objective comparison across regressor models, hyperparameter tuning was performed using the successive-halving grid search strategy provided by the scikit-learn library, which searches over the specified parameter values with successive halving: it starts by evaluating all candidates with a small amount of resources and iteratively selects the best candidates, using progressively more resources.
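The sketch below shows how such a successive-halving search could be set up with scikit-learn. The parameter grids and scoring choice are illustrative assumptions, not the exact settings used for SynAI.

```python
# Successive-halving grid search over two of the reference regressors (illustrative grids).
from sklearn.experimental import enable_halving_search_cv  # noqa: F401 (enables the estimator below)
from sklearn.model_selection import HalvingGridSearchCV
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor

regressors = {
    "random_forest": (RandomForestRegressor(random_state=0),
                      {"n_estimators": [200, 500, 1000], "max_depth": [3, 5, 8]}),
    "gradient_boosting": (GradientBoostingRegressor(random_state=0),
                          {"learning_rate": [0.01, 0.05, 0.1], "max_depth": [2, 3, 4]}),
}

def tune(X, y):
    results = {}
    for name, (estimator, grid) in regressors.items():
        search = HalvingGridSearchCV(estimator, grid, factor=2,
                                     scoring="neg_mean_squared_error", cv=5)
        search.fit(X, y)
        results[name] = (search.best_params_, search.best_score_)
    return results
```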

S2.3.1 Training Per-Cell Model with NCI + DrugCombDB Combined
In the initial design of SynAI, the deep learning networks were designed to be retrainable with new data, the goal being to allow internal models to be updated with future data. Questions were raised regarding which strategy is better: (1) training the networks with all data combined, or (2) training the networks on one initial dataset and retraining them recursively with additional datasets. Here we run a short comparison of these two strategies.
1. In the retraining strategy, per-cell networks were first trained on the NCI dataset with the standard training workflow. The trained networks were then retrained on the DrugCombDB dataset, allowing them to adapt to the new information it contains, and the retrained networks were validated against both NCI and DrugCombDB. The comparison shows that the retraining strategy yields a higher final performance (cf. Fig. 19), while the combined-data training strategy produces a less desirable output (cf. Fig. 22). Our further analysis of the comparison data shows that the less desirable output of combined-data training is largely because model building is dominated by one dataset, while the other dataset shows very unstable performance (cf. Fig. 22). Our hypothesis is that a combined dataset may also require the network complexity to increase to compensate for the data complexity; however, such a study is beyond the scope of this paper. At the current stage, the retraining strategy produces reasonable results.
An interesting phenomenon observed in Fig. 17 is that the initial performance on DrugCombDB already approaches PCC=0.95 while that on NCI is around PCC=0.0. This is expected, as the model was already pretrained on the NCI dataset. The validation performance shows that the retrained network provides better outcomes (cf. Fig. 19).
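For concreteness, the retraining step can be pictured as continued training of an already-fitted per-cell network on the second dataset. The snippet below is a hedged PyTorch sketch, assuming a hypothetical pretrained `model` and a DataLoader `drugcombdb_loader` yielding (fingerprint, Bliss score) batches; it is not the actual SynAI training code.

```python
# Retraining strategy sketch: continue training a pretrained per-cell model on DrugCombDB data.
import torch
import torch.nn as nn

def retrain(model, drugcombdb_loader, epochs=100, lr=1e-4):
    """Fine-tune a per-cell network pretrained on NCI-Almanac using DrugCombDB batches."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # small LR to preserve prior knowledge
    loss_fn = nn.MSELoss()
    model.train()
    for _ in range(epochs):
        for fingerprints, bliss in drugcombdb_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(fingerprints), bliss)
            loss.backward()
            optimizer.step()
    return model
```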

Fig. 1. Fitting visualization of the training and validation sets in each cross-validation iteration for different cell lines. Each mark represents a combination of two FDA-approved chemo-compounds for cancer treatment. In general, each cell line responds differently to treatment; for example, treatment potency is overall stronger in the SK-MEL-5 cell line and much weaker in OVCAR-8.

Fig. 2. Pearson correlation coefficient (PCC) at epoch=0 between the measured and predicted Bliss scores per cell line, from all cross-validation iterations.

Fig. 3. Final training-set Pearson correlation coefficient (PCC) between the measured and predicted Bliss scores per cell line, from all cross-validation iterations.

Fig. 4. Final validation-set Pearson correlation coefficient (PCC) between the measured and predicted Bliss scores per cell line at epoch=2048, from all cross-validation iterations.

Fig. 5. Hyperparameter tuning results using halving grid search at iter-0. The test starts with a full permutation of 288 possible combinations of the SynAI hyperparameter set. The results (test-set PCC) are dispersed across the choices of hyperparameter combinations.

Fig. 6. Hyperparameter tuning results using halving grid search at iter-1. The test-set PCC starts to converge, and different combinations of hyperparameters produce similar performance on the test set.

Fig. 7. Hyperparameter tuning results using halving grid search at iter-2. The test-set PCC has converged, and the final hyperparameters produce nearly identical performance.

Fig. 8. Hyperparameter tuning results using halving grid search at iter-0. The test starts with a full permutation of 216 possible combinations of the hyperparameter set for the RNN network. The results (test-set PCC) are dispersed across the choices of hyperparameter combinations.

Fig. 9. Hyperparameter tuning results using halving grid search at iter-1. The test-set PCC starts to converge, and different combinations of hyperparameters produce similar performance on the test set.

Fig. 11. Hyperparameter tuning results of the random forest regressor using halving grid search at iter-0. The search starts with a full combination of 432 parameter sets for three different cell lines. The PCC between the measured and predicted Bliss scores is used as the final performance criterion of the search.

Fig. 12. Iter-1 hyperparameter tuning results of the random forest regressor, showing improved convergence compared to iter-0.

Fig. 14. Hyperparameter tuning results of the gradient boosting regressor using halving grid search at iter-0. No clear convergence can be observed; the training results remain dispersed.

2. In the combined training strategy, per-cell networks were trained on the NCI and DrugCombDB datasets together; the combined data are treated as a single dataset during training.
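A minimal sketch of this combined-training set-up, assuming hypothetical PyTorch Dataset objects `nci_dataset` and `drugcombdb_dataset` that yield (fingerprint, Bliss score) pairs:

```python
# Combined-training strategy sketch: concatenate the two datasets and train as one.
from torch.utils.data import ConcatDataset, DataLoader

combined = ConcatDataset([nci_dataset, drugcombdb_dataset])
combined_loader = DataLoader(combined, batch_size=256, shuffle=True)
# The per-cell network is then trained from scratch on `combined_loader`
# using the standard training workflow.
```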

Fig. 17. Per-cell model PCC for networks trained on NCI and retrained on DrugCombDB, initial round.

Fig. 20. Per-cell model PCC for networks trained on the combined NCI+DrugCombDB dataset, initial round.

The goal of hyperparameter tuning is to confirm the globally optimal choice of parameters for the SynAI DL model. In addition, a selection of popular regression solutions was included as a reference/comparison for SynAI performance. The following regression solutions were tested and compared against SynAI; the selection is based on the existing literature on drug synergism prediction (An 2022), and we employ the out-of-the-box scikit-learn library to provide the other regressor implementations. The final test PCC (Pearson correlation coefficient) results are elaborated in Table 1. PCC is used as the algorithm performance criterion, following the existing literature (An 2022), because different algorithms may employ different loss functions during their fitting/training procedures, which makes a direct comparison of loss values across algorithms unreliable.

The hyperparameters for each regressor algorithm are elaborated in the following sections. Based on the tuning tests, it is difficult to observe a unanimous choice of hyperparameters across cell lines for any algorithm. Our hypothesis is that each cell model possesses a different genetic profile, which may respond differently to the same combination of drugs. In addition, the tuning tests confirm that a single prediction model may not be the optimal solution; instead, a per-cell model training strategy yields higher performance. This conclusion is also in line with previous literature such as Holbeck 2017.

Table 1. Comparison of regression solutions across three cell lines from the NCI-Almanac dataset.

The result of the halving grid search shows that the final test-set PCC converges for each cell line. In total, a full combination of 288 parameter sets was tested at iteration 0, narrowing down to a smaller collection of 49 parameter sets in iteration 2 (final) (cf. Fig. 4~6). However, there is no clear indication of a globally optimal parameter set (cf. Fig. 6).
As shown in Table 1, compared to SynAI, the RNN model produces a very similar yet slightly lower performance. The random forest (RF) and gradient boosting (GBX) regression solutions both yield lower performance than SynAI. Initial tests of all algorithms show a tendency to overfit with default parameter settings; we therefore deliberately introduce higher dropout and lower model complexity by reducing the key parameters contributing to algorithm complexity. For example, in the RF algorithm we use a much higher number of estimators while lowering the maximum tree depth to avoid overfitting. Similarly for GBX, much lower values of minimum samples per leaf and minimum samples per split are employed during the tuning test. The same applies to the RNN and SynAI, where much smaller numbers of hidden layers and smaller hidden layer sizes are employed.
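Purely as an illustration of these complexity-reducing choices (the values below are assumptions, not the tuned settings reported above), such constraints might look as follows in scikit-learn/PyTorch:

```python
# Illustrative anti-overfitting settings: many shallow trees for the random forest,
# and a small MLP head with aggressive dropout for the neural models.
from sklearn.ensemble import RandomForestRegressor
import torch.nn as nn

rf = RandomForestRegressor(n_estimators=1000, max_depth=4, random_state=0)

mlp_head = nn.Sequential(          # reduced hidden size plus high dropout
    nn.Linear(2048, 128),          # 2048 = concatenated pair of 1024-bit fingerprints (assumed)
    nn.ReLU(),
    nn.Dropout(0.5),
    nn.Linear(128, 1),
)
```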

S2.2.2 Hyperparameters for RNN Regressor
The RNN is a popular choice of DL model, similar to the multi-layer perceptron (MLP) network employed by SynAI; RNNs and MLPs are both popular choices for classification and regression problems. Although a CNN is often considered as well, it is more popular in the image and sequence domains. Here the compound is first transformed into a molecular fingerprint, which is standard tabular data rather than sequence data. We employ the default RNN implementation provided by the PyTorch library.
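A minimal sketch of this fingerprint step, assuming the RDKit library is available; the SMILES strings, the 1024-bit length, and the radius are illustrative assumptions rather than the exact featurization used by SynAI.

```python
# Convert SMILES strings into fixed-length Morgan fingerprints (tabular input for the regressors).
import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def smiles_to_fingerprint(smiles, n_bits=1024, radius=2):
    """Parse a SMILES string and return a fixed-length Morgan fingerprint vector."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles}")
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits)
    arr = np.zeros((n_bits,), dtype=np.float32)
    DataStructs.ConvertToNumpyArray(fp, arr)
    return arr

# A drug pair is represented as tabular data by concatenating the two fingerprints.
pair_features = np.concatenate([
    smiles_to_fingerprint("CCO"),        # illustrative SMILES only
    smiles_to_fingerprint("c1ccccc1"),
])
```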