KEA3: improved kinase enrichment analysis via data integration

Abstract Phosphoproteomics and proteomics experiments capture a global snapshot of the cellular signaling network, but these methods do not directly measure kinase state. Kinase Enrichment Analysis 3 (KEA3) is a webserver application that infers overrepresentation of upstream kinases whose putative substrates are in a user-inputted list of proteins. KEA3 can be applied to analyze data from phosphoproteomics and proteomics studies to predict the upstream kinases responsible for observed differential phosphorylations. The KEA3 background database contains measured and predicted kinase-substrate interactions (KSI), kinase-protein interactions (KPI), and interactions supported by co-expression and co-occurrence data. To benchmark the performance of KEA3, we examined whether KEA3 can predict the perturbed kinase from single-kinase perturbation followed by gene expression experiments, and phosphoproteomics data collected from kinase-targeting small molecules. We show that integrating KSIs and KPIs across data sources to produce a composite ranking improves the recovery of the expected kinase. The KEA3 webserver is available at https://maayanlab.cloud/kea3.


INTRODUCTION
Protein kinases catalyze the transfer of a phosphate group from ATP to other proteins' threonine, serine, or tyrosine residues (1). The reversible addition of the phosphate group to a protein can affect the substrate protein activity, stability, localization, and interactions with other molecules (2). Each protein kinase recognizes between one to a few hundred substrates (3). Mass-spectrometry phosphoproteomics experiments can yield over 50,000 unique phospho-peptides that span >75% of all cellular proteins (4). Thus, phosphoproteomics experiments can capture the cellular state of cell-signaling networks. However, kinase activity levels are difficult to discern from the results of such experiments. Since kinases serve a critical and central role in regulating essentially all cellular processes (5), and their aberrant constitutive activation is recognized as a cause of many human cancers (6)(7)(8)(9)(10), identifying alterations in kinase state given results from phosphoproteomics experiments is critical.
Protein kinases are one of the most targeted protein families amenable for inhibition by small molecules (11), while most clinically approved protein kinase inhibitors target receptor tyrosine kinases (RTKs) to block cancer prolifera-tion and angiogenesis (12). However, an increasing number of kinase inhibitors for non-oncological indications have been recently approved. New druggable protein kinase targets can be identified by experiments that detect deregulated kinase-mediated processes contributing to disease progression. For example, mass-spectrometry phosphoproteomics profile the differential phosphorylation states of cellular proteins between two cellular states (13). Such data provides a snapshot of the intracellular signaling networks that are differentially activated between two conditions, for instance, between diseased and healthy cells. The enrichment of known kinase substrates in a set of differentially phosphorylated proteins can serve as a potential marker of the upstream kinases' state and provide insights into physiological and pathophysiological mechanisms (14).
Available tools that predict relevant kinases associated with a set of genes, proteins or phosphorylation sites include Expression2Kinases (X2K) (15,16), PTMsigDB (17), Inference of Kinase Activities from Phosphoproteomics (IKAP) (18), Kinase Perturbation Analysis (KinasePA) (19) and Kinase Substrate Enrichment Analysis (KSEA) (20). X2K is a web tool that predicts cell-signaling pathways upstream from differentially expressed mRNAs. It first performs transcription factor enrichment analysis (TFEA) (21)(22)(23); it then connects these factors based on known protein-protein interactions (24), and then performs kinaseenrichment analysis (KEA) to rank the most relevant protein kinases. One of the limitations of X2K is that it performs the enrichment analysis at the protein level. PTM-sigDB is a database of post-translational modification site (PTM-site) specific signatures curated from publications, including kinase state signatures. PTM Signature Enrichment Analysis (PTM-SEA) is an R package for modified gene set enrichment analysis (GSEA) used to query a userinputted set of PTMs against the PTMsigDB database. IKAP consists of a collection of MATLAB functions that estimate kinase state from a phosphoproteomics dataset by minimizing a cost function relating the kinase state to the phosphosite measurements. KinasePA, available as an R package called directPA and as a webserver, uses experimentally determined kinase-phosphosite interactions to perform kinase enrichment analysis applied directly to massspectrometry proteomics readouts. KSEA is a web-based tool that uses known kinase-phosphosite relationships to compute a normalized score for each protein kinase based on the relative hyper-or hypo-phosphorylation of its substrates.
In contrast with prior implementations, Kinase Enrichment Analysis 3 (KEA3), which also computes kinase overrepresentation for a query of human or mouse protein or gene sets, integrates kinase-substrate interactions (KSI) from a multitude of resources to compute a composite kinase enrichment score. To develop KEA3, we adapted the web application and benchmarking framework previously deployed for creating the transcription factor enrichment analysis tool ChEA3 (23). To benchmark KEA3, we evaluated the utility of publicly available KSI, PPI, cooccurrence, and co-expression data to compute overrepresentation of putative kinase substrates for a user inputted protein set. KEA3 expands significantly on and is a complete reimplementation of KEA (25). KEA3 contains more kinase-substrate libraries, incorporates three independent systematic benchmarks, and integrates results across data sources to improve recovery of the expected upstream kinases. This integration method performs consistently better than any single library across the three benchmarks. Two of the KEA3 benchmarks also demonstrate the utility of KEA3 for analyzing signatures from drug perturbation experiments to infer candidate kinase targets for kinase inhibitor drugs and small molecules. To further demonstrate the utility of KEA3, as a case study, we applied kinase enrichment analysis to phosphoproteomics data collected from recent SARS-CoV-2 studies.

Arriving at a consensus list of human kinases
We mapped protein and gene symbols to HGNC-approved gene symbols (26) and discarded gene or protein symbols that did not map using synonym or alias matching. To accomplish this, we developed an R package called genesetr (https://github.com/MaayanLab/genesetr). The union of kinome lists from Manning et al. (5), Miranda-Saavedra and Barton (27), and the Illuminating the Druggable Genome (IDG) project (11) produced the set of 520 unique KEA3 HGNC-mappable human protein kinases.

Protein-protein and kinase-substrate interaction libraries
The KEA3 gene-set libraries are kinase-substrate sets aggregated from several resource types: PPI, KSI, co-occurrence, and transcript co-expression. One additional library not described below, termed STRING, was composed of all human kinase-protein links in version 11.0 of the STRING database (28). The code used to generate the KEA3 libraries and for benchmarking KEA3 can be found at https://github. com/MaayanLab/KEA3webData. The PPI and KSI datasets (Tables 1 and 2) include interactions where at least one interacting partner was a member of the KEA3 consensus kinase set. Within each dataset, all kinases are human kinases that have at least five distinct putative human protein substrates. Kinase-interacting proteins were collected from the following PPI databases: Bi-oGRID (29), mentha (30), hu.MAP (31), prePPI (32,33), MINT (34,35), HIPPIE (36), PIPs (37,38), PSOPIA (39), REACTOME (40), Cheng et al. (41) and STRING (28). The BioGRID and MINT databases contain PPIs from high-and low-throughput experiments that were manually curated from the literature. Mentha is a PPIN that contains molecular interactions aggregated and updated weekly from MINT, IntAct (42), BioGRID, MatrixDB (43), and the Database of Interacting Proteins (DIP) (44). HIPPIE aggregates experimentally determined PPIs from IntAct, MINT, BioGRID, HPRD (45), DIP, BIND (46) and MMPI (47). Cheng et al. used PPIs collected from IntAct, MINT, BioGRID, DIP, HPRD and MIPS MPact (48). There are overlaps and redundancy among these databases, especially those that aggregate PPIs from the literature and other PPI databases. We examined each of them individually despite these overlaps because each incorporates different combinations of resources with varying reliability, quality, and coverage.  (49) is a manually curated and peer-reviewed pathway database with annotations that generally focus on the most extensively studied pathways and molecules. hu.MAP (31) integrates thousands of published mass spectrometry (MS) experiments to find all interactions not identified in the original publications. We also constructed a library from the experimentally derived datasets used for testing the PSOPIA PPI prediction model. PIPs and prePPI both consist of predicted PPIs. PIPs (37) is using a naïve Bayes classifier that integrates information from expression, orthologs, domain co-occurrence, PTMs, and subcellular localization. PrePPI (32) also uses a Bayesian framework but relies on three-dimensional structural information in addition to functional, evolutionary, and expression information to make PPI predictions.
Putative KSIs were collected from PhosphoNetworks (50), Phospho.ELM (51), PTMsigDB (17), Phospho-SitePlus (52), PhosD (53) and Cheng et al. (41). Phos-phoNetworks relies on a combined protein microarray and computational strategy to construct human phosphorylation networks (54,55). PhosphoNetworks 'raw' KSIs consist of kinase-substrate relationships (KSRs) identified by protein microarray. PhosphoNetworks 'reference' KSIs consist of high-confidence KSIs that were filtered by multiple criteria and validated by transfection experiments. Finally, the PhosphoNetworks 'combination' KSIs consist of the union of the reference KSIs and KSIs that were manually curated from the literature. Phospho.ELM is a database of experimentally verified protein phosphorylation sites in eukaryotes, annotated with the phosphorylating kinase when known. PTMsigDB is a collection of phosphosite signatures of kinase activities, perturbations, and signal-ing pathways curated from the literature. PhosphoSitePlus is a database of manually curated kinase-substrate interactions from thousands of publications. PhosD predicts kinase-substrate interactions based on protein domains. We examined three PhosD kinase libraries generated by the PhosD model trained on Phospho.ELM data, on Phospho-SitePlus data, and on multiple datasets. Cheng et al. (41) constructed a KSIN from Phospho.ELM, HPRD, Phos-phoNetworks, and PhosphoSitePlus. Finally, the STRING resource is a collection of direct and indirect protein-protein interactions (28). To generate the STRING.bind library, we subset the STRING interactions to only those interactions that were annotated as involving physical binding to generate the STRING.bind library. We also used the entire STRING database to form the STRING library, including physical interaction, co-expression, co-occurrence in the literature, and evolutionary co-occurrence, among other association types.

Gene co-expression libraries
To create the KEA3 gene co-expression libraries, all GTEx RNA-seq samples were downloaded from the GTEx web server. Samples were quantile-normalized, and for duplicate genes, only the genes with the most significant variance were retained. For each kinase, the 300 genes with the most significant Pearson's correlation coefficients were selected to generate the kinase sets in the GTEx.coexp library. To create the ARCHS4.coexp library, human RNA-seq samples were downloaded from ARCHS4 (56). Fifty-thousand samples were randomly selected for co-expression analysis and then processed in the same way as for the GTEx data to generate the ARCHS4.coexp library.

Generating the benchmarking datasets
The Characteristic Direction (CD) method (57) was used to compute gene expression signatures from 329 kinase perturbation experiments containing 96 kinases. The list of studies was obtained from the manually curated signatures in the CREEDS resource (58). The perturbations include knockdowns, knockouts, overexpression, constitutively active mutants, chemical activation, and chemical inhibition of single kinases followed by microarray profiling. Gene sets containing the top 600 differentially expressed genes were determined by the absolute value of the Characteristic Direction coefficients constructed for each perturbation experiment. We term this benchmarking dataset KinCREEDSupdn.
For the DrugL1000updn benchmarking dataset, LINCS L1000 drug perturbation CD signatures retrieved via the L1000FWD API were subset to drugs with known kinase targets using the L1000FWD drug target annotations (59). For each LINCS perturbation identifier, the signature with the greatest cosine similarity from the batch center was selected. The union of the most significant upregulated and downregulated genes was used to compose 292 signatures.
The human PTMsigDB (17) phosphoproteomics signatures were derived from PhosphoSitePlus (52) quantitative mass-spectrometry experiments. Entries were subset to drug perturbations with known kinase targets. Drug targets were obtained from L1000FWD drug annotations (59). Drug perturbations with fewer than five associated HGNCmappable proteins were discarded, resulting in a benchmarking set of 15 phosphoproteomics drug perturbation signatures which cover 15 unique drugs with 98 annotated kinase targets, 50 of which are unique. We term this set PTMsigDB.drug.

Assessing inter-library concordance
The Fisher's Exact Test (FET) was used to compute similarity between all pairs of protein sets of the 24 kinasesubstrate libraries with a default background of 20,000 genes. For a library pair A and B, an integer ranking of the protein sets in B, termed the 'prediction library' was produced for each protein set in A, termed the 'query library,' based on the FET P-values. A rank of 1 represented the most significant FET P-value and a rank of k represented the least significant FET P-value where k is the number of protein-sets in the 'prediction library'. The rankings were then scaled from 1/k to 1. An empirical cumulative distribution function (ECDF) was computed from the scaled ranks of the protein set pairs in A and B that represent the same kinase. The ranks were scaled to values between 0 and 1 to obtain an area under the ECDF (AUECDF) for each library pair AB and BA.

Benchmarking libraries and integration methods
Each protein set from the KinCREEDSupdn, DrugL1000updn and PTMsigDB.drug benchmarking datasets was submitted to KEA3 and benchmarked. Kinases were ranked within each library according to the FET P-values, with ties broken randomly. Ranks within each library were then scaled between 0 and 1. The R package PRROC was used to compute the area under the receiver-operating characteristic (AUROC) curve and Precision-Recall (PR) curve for each library. The positive class consists of the scaled ranks of the 'true' kinase(s) associated with the query protein set. The negative class consists of the scaled ranks of all other kinases that were not associated with the query set. To generate PR curves, we downsampled the negative class to the same size as the positive class, similarly to the method described by Garcia-Alonso et al. (60). Each library has a different number of kinases and therefore has a different 'random classifier' PR curve. By down sampling the negative class to the same size as the positive class, a random classifier would have a PR area under the curve (AUC) of 0.5. PR curves were bootstrapped in this manner 1000 times and then the mean PR AUC was reported. The base R function approx() was used to interpolate between all points from the 1000 PR curves in order to generate composite PR curves for each library and integration method for visualization. We also employed an additional measure of performance by letting r be the scaled ranks of the 'true' kinase(s) associated with the protein set queries. We then examined the ECDF of this set of ranks, D(r). If the 'true' kinases do not display preferentially low or high ranks, then we expect a uniform distribution D(r) = U. We examined D(r) -U for significant deviations from zero to evaluate different libraries and methods. Anderson-Darling tests implemented via the goftest R package were used to evaluate the null hypothesis, D(r) = U.

Kinase enrichment analysis
KEA3 uses Fisher's Exact Tests with a background set of 20,000 genes to compute the significance of the overlap between the query input protein set and each protein set in the KEA3 protein set libraries. An integer rank from 1 to k for each protein set in a library size of k indicates sets with the lowest and highest P-values accordingly. A scaled rank is computed by dividing each integer rank by k. Thus, for a single query, there is one kinase rank list for each protein set library in KEA3. False discovery rates (FDRs) are computed via the Benjamini-Hochberg correction for each library separately. Out of the 24 candidate libraries, rank lists for the 11 final KEA3 libraries which met the benchmarking threshold are integrated via the MeanRank and TopRank methods (23). MeanRank is calculated from re-ranking a composite list of kinases by each kinase's mean integer rank across all libraries containing that kinase. A composite list of kinases is re-ranked with each kinase's best-scaled rank across all libraries to calculate TopRank.

The KEA3 web application
The backend of KEA3 is a Java servlet running on Tomcat 9 (61). The user interface was constructed with jQuery, Bootstrap and the web template application Mobirise (62). The web application runs in a Docker (63) container. The KEA3 source code repository is available at https://github. com/maayanlab/KEA3web.

Kinase co-expression network visualization
Weighted Gene Co-expression Network Analysis (WGCNA) (64) was applied to GTEx (65), ARCHS4  (67). To annotate the GTEx, ARCHS4 and TCGA networks, WGCNA module eigengenes were correlated to GTEx tissue sample labels, ARCHS4 sample labels, and TCGA tumor types, respectively. Nodes were colored by the most significant tissue/tumor correlation to their parent module.

Kinase co-regulatory network visualization
A kinase co-regulatory network was constructed from all kinase-kinase interactions described by the 11 KEA3 libraries. Edges are directed where kinase-substrate evidence supports the interaction and are undirected in the case of PPI or unspecified interaction evidence only. The network is a subset based on the top kinase results from a user query and is visualized using D3.js.

Computing kinase enrichment
KEA3 computes kinase substrate overrepresentation for a query protein set against 11 kinase-substrate set libraries covering 520 unique protein kinases (Table 3). KEA3 uses the FET to compare a user-submitted protein set query to each kinase-substrate set in each KEA3 library. A kinase ranking is returned for each library separately based on the FET P-values. For a given library, kinase rankings range from 1, which corresponds to the most significant FET, to k, where k is the number of kinase-substrate sets in the library. KEA3 results also return a scaled rank from 1/k to 1.

Constructing the KEA3 libraries
We constructed 24 known and putative kinase-substrate libraries with publicly available data from co-expression analysis, experimentally measured PPIs, predicted PPIs, measured KSIs, predicted KSIs and a database that integrates the interactions mentioned above with literature associations and evolutionary associations (28). Each resource was subset to interactions and associations involving the 520 HGNC-mappable protein kinases identified in Manning et al. (5), Miranda-Saavedra and Barton (27), and the Illuminating the Druggable Genome (IDG) project (11) (Table 3, Figure 1). We evaluated the 24 candidate libraries with three benchmarking datasets. We also assessed the performance of KEA3 in recovering perturbed kinases from microarray kinase gene perturbation experiments, kinase drug targets from microarray drug perturbation experiments, and kinase drug targets from phosphoproteomics drug perturbation experiments. We then selected the top 11 libraries for use in the KEA3 webserver based on these benchmarking results. We also benchmarked two methods that integrate the results from each of the 11 selected top KEA3 libraries to generate a composite kinase ranking.

Assessing inter-library predictability
We examined all pairs of the candidate KEA3 libraries, where one library was designated as the 'query' library, and the other library was designated as the 'prediction' library. We ranked all the protein sets of the prediction library according to the P-values resulting from pairwise FETs calculated for each kinase-associated gene set in the query library. We then constructed empirical cumulative distribution functions (ECDFs) from the scaled rank values where the two sets being compared were associated with the same kinase. The areas under the ECDFs (AUECDFs) were evaluated to visualize pairwise library predictability ( Figure 2).

Benchmarking the KEA3 libraries
We used three independent benchmarking datasets to evaluate the initial 24 KEA3 libraries. Each benchmarking dataset consists of gene/protein sets that are each associated with one or more kinases. The KinCREEDSupdn benchmark dataset consists of gene sets extracted from 329 kinase loss-of-function/gain-of-function (LOF/GOF) human and mouse microarray experiments mined from GEO by contributors to a crowd-sourcing project (58). Each gene set within the KinCREEDSupdn dataset consists of the 600 most differentially regulated genes from each kinase perturbation experiment. The DrugL1000updn is comprised of statistically significant up-regulated and down-regulated genes extracted from transcriptome-wide signatures imputed from the LINCS L1000 drug perturbation signatures (59). We took the subset of the drug perturbation signatures that have annotated kinase drug targets, such that each of the 292 DrugL1000updn gene sets is associated with one or more protein kinase drug targets. The third benchmarking set, PTMsigDB.drug, consists of human phosphoproteomic drug perturbation signatures derived from quantitative MS studies that measured differential phosphorylation states before and after drug perturbations (17,52). The 15 PTMsigDB.drug signatures represent 15 unique drugs with 98 annotated kinase targets total, 50 of which are unique kinase targets.
Each KEA3 candidate library was evaluated to see how well it recovers the 'true' kinase(s) in the query protein set from the benchmark datasets. ROC and PR curves were constructed from the scaled ranks of the 'true' kinases associated with the query set, composing the positive class, with the scaled rankings of the kinases not associated with the query composing the negative class ( Figure 3). The STRING library performed the best for the KinCREED-Supdn and DrugL1000updn benchmarking datasets, but interestingly, its performance falls for the PTMsigDB.drug dataset. In general, the PPI libraries performed better than the KSI libraries. HIPPIE, prePPI, and mentha were the best-performing PPI libraries for KinCREEDSupdn; HIP-PIE, mentha, and String.bind were the best performing PPI libraries for the DrugL1000updn dataset; and HIPPIE, mentha, and Cheng.PPI were the overall best performers for the PTMsigDB.drug dataset. The KSI libraries' performance improved in the PTMsigDB.drug benchmarking dataset compared to the other two benchmarking datasets. This may be because the PTMsigDB.drug dataset is derived from a readout type that directly measures kinase activity. The top performing KSI libraries in this benchmark were Cheng.KSIN, PhosphoSitePlus, and PTMsigDB.

Benchmarking the integrative methods
To construct the final KEA3 library set, we selected the 11 libraries with a ROC AUC and mean PR AUC in the top 50% of all libraries for at least two of the three benchmarks. Using the 11 libraries that passed this threshold, we assessed the predictive performance of two integration methods, MeanRank and TopRank, as previously described (23) for the three benchmarking datasets (Figures 4 and 5). MeanRank was the top-performing method for the Kin-CREEDSupdn dataset and was second-best to STRING for the DrugL1000updn dataset. MeanRank was also second to HIPPIE for the PTMsigDB.drug dataset. The TopRank in- Heatmap showing all pairwise library comparisons. The tile color shows the AUC of the ECDF that represents how well a given 'prediction' library was able to recover the 'correct' kinase associated with gene sets from the 'query' library. tegration method performed third, third and fifth for the KinCREEDSupdn, DrugL1000updn and PTMsigDB.drug datasets. While MeanRank was not the best performer in all three benchmarking datasets, it was the most consistent, as it is the only method to always be among the top two performers.

Using the KEA3 web application
When users first navigate to the KEA3 homepage (https: //maayanlab.cloud/kea3/), they are presented with an input form. To begin an analysis session, users would need to paste a list of proteins encoded as human or mouse gene symbols. Alternatively, users may also upload an existing text file containing the protein names, with one entry per line. KEA3 currently supports HGNC-approved gene symbols, and the webserver application will automatically tell users if there are any invalid or duplicate symbols in their input. Once the input has been submitted, the user may scroll down to view the kinase enrichment results. The 'Integrated results' tab is displayed by default, and shows the bar charts, tables, subnetwork and clustergrammer visualizations for the MeanRank and TopRank methods. These integrated results are shown on the first tab because they account for the results across all libraries, are less redundant, and performed well across the KEA3 benchmarks. Individual library results can be accessed using the other tabs.  'Clustergrammer' tab provides an interactive clustergram of overlapping substrate targets between the input and the top library results, produced using the Clustergrammer application (68).

The KEA3 Appyter
To provide users with the option to obtain KEA3 results as a downloadable Jupyter Notebook, we also developed the KEA3 Appyter. Appyters are standalone web-based applications that generate a Jupyter Notebook from a user input (69). The KEA3 Appyter takes as input a list of proteins, for example, differentially phosphorylated proteins, in the form of plain text or a text file. Using the KEA3 API, the Appyter queries the KEA3 server, and displays the results as a Jupyter Notebook. The notebook displays the results as an interactive bar chart and with tables of the top 10 kinases for integrated scores and all individual KEA3 libraries. This Jupyter Notebook can be saved, repurposed for different inputs, or used as part of other analysis pipelines and workflows. The KEA3 Appyter is available at: https://appyters.maayanlab.cloud/KEA3 Appyter/

The SARS-CoV-2 kinase enrichment analysis case study
Over the past year, the coronavirus disease 2019 (COVID-19) pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus has become a predominant focus of the scientific research community. Many research teams have altered their focus toward gaining a better understanding of the viral mechanisms underlying SARS-CoV-2 infection. Currently, there is still much unknown about SARS-CoV-2 infection and replication within human cells. In this case study, we attempt to demonstrate how kinase enrichment analysis using KEA3 can be applied to data compiled from a recent phosphoproteomics study to provide additional insight on some of those intracellular molecular mechanisms. The phosphoproteomics data were derived from a study of the phosphorylation changes induced by SARS-CoV-2 infection in Vero E6 cells (70). Upand down-phosphorylated consensus protein sets were generated by filtering the data for phosphosites with log2 ure 6A), and removing duplicate entries. To identify potential upstream regulatory mechanisms responsible for the observed changes in protein phosphorylation upon SARS-CoV-2 infection, or involved in viral-host protein interactions, we performed kinase enrichment analysis on each of the consensus sets created above using KEA3 ( Figure 6B and C).
The top ranked most enriched kinases for the upphosphorylated proteins show three members of the casein kinase family. Casein kinases are serine/threonine kinases that participate in many cell-signaling pathways, including DNA repair (71). It was recently shown that CSNK2A2 directly interacts with SARS-CoV-2 N protein (72). Hence, it is possible that viral evading strategies are mediated by altering cell-signaling regulated by CSNK2A2. The top ten enriched kinases for the up-phosphorylated proteins also include SRPK1 and SRPK3. SRPK1 is highly expressed in most tissues and mostly associated with DNA and RNA processing (73), while SRPK3 is involved more specifically in muscle related functions (74) and as such could be linked to cardiac complications observed in some COVID-19 patients (75). Another interesting protein kinase that is found in the top-ranked up-phosphorylated proteins is CDK9. In previous studies, activated CDK9 has been demonstrated to play a role in regulating innate immune responses (76).
The top kinases enriched for the phosphoproteomics consensus down set are CDK1 and CDK2, suggesting downregulation of the cell cycle. This is a common cellular immune response upon viral or bacterial infection of Vero cells. Other top kinases include p38, GSK3B, and AKT1 which are known as the downstream kinases for several interleukin signals. In addition, AURKB is a cell cycle kinase that plays a significant role in chromosome segregation during mitosis (77), while GSK3B is a serine-threonine kinase and part of the glycogen synthase kinase-3 family that has been associated with viral genome replication in COVID-19 (78). While further study is needed to elucidate the specific impact of these kinases in SARS-CoV-2 and other viral infections, this case study illustrates the usefulness and applicability of KEA3 to current and future phosphoproteomics studies. Using the KEA3 approach, kinase inhibitors could be designed to mitigate the effect of SARS-CoV-2 on cells, although this needs to be done carefully because some identified kinases are part of innate immune response pathways, while others are altered by the virus to evade such immune responses.

SUMMARY
Phosphoproteomics efforts have detected tens of thousands of phosphorylation sites in cellular proteins. However, in most cases, the kinases that are responsible for these posttranslational modifications are unknown. For instance, less than 5% of the phosphorylation sites in PhosphoSitePlus are annotated with kinases (52,53). To develop KEA3 we combined directly measured KSIs, withPPIs and coexpression data sources to predict upstream kinases given lists of differentially phosphorylated proteins. PPI detection methods do not uncover the directionality or effect of the interaction between two proteins; however, we used these datasets as a proxy for KSIs. In this same vein, we also included kinase co-expression libraries with the notion that members of pathways tend to be co-expressed (79). While ultimately the co-expression libraries did not show a strong enough signal in our benchmarks to pass the threshold for inclusion in the final set of 11 libraries used within the KEA3 web-server application, these are made available for download from the KEA3 website.
The approach we used to assess interlibrary predictability ( Figure 2) simultaneously evaluates: (i) the concordance of sets associated with the same kinases within the library pair under consideration and (ii) how well a given library can distinguish between kinases. The libraries derived from PPI sources show high inter-library predictabil- ity, which is unsurprising given the substantial redundancy among many sources (29,30,34,41). The AUCs for PPI-KSI library pairs indicate that, while PPIs may be a less direct source for kinase substrates than directly measured KSIs, PPIs are useful in identifying the correct upstream kinases.
We used three independent benchmarking datasets to evaluate the predictive performance of the KEA3 candidate libraries. The KinCREEDSupdn and DrugL1000updn datasets are derived from gene expression signatures. They rely on the assumption that when a kinase is perturbed experimentally or is the target of a small molecule, the transcripts encoding the kinase's substrates will also be measurably perturbed as a downstream effect. The signal in ROC, PR, and bridge plots of the libraries derived from KSI sources tested on these benchmarking datasets supports this hypothesis. The PTMsigDB.drug benchmarking dataset more directly tests the predictive performance of KEA3 by querying the libraries with drug perturbation phosphoproteomics signatures, with the underlying assumption being that the substrates of the small molecule-affected kinase(s) will be differentially phosphorylated. However, experiments measuring global changes in phosphorylation following perturbation are few, and the PTMsigDB.drug benchmarking set is small. Taken together, however, the three benchmarking sets indicate the KEA3 candidate libraries' comparative performance, as well as the performance of the two integrative methods, MeanRank and TopRank. Mean-Rank performed consistently well across the benchmarking datasets. The two top-performing libraries, STRING and HIPPIE, displayed variable performance depending on the benchmark query type. We would therefore recommend that users rely most heavily on the integrated Mean-Rank method. Finally, by reprocessing data from a recent SARS-CoV-2 phosphoproteomics study (70), we demonstrate how KEA3 can complement the analysis of differential mass-spectrometry phosphoproteomics studies. Our results are consistent with the authors of the original study, but also add clarity and confirmation about the key kinases involved.
It should be noted that kinase activity may not change, even if the modification level of their substrates increased or decreased. This is because the kinases and their substrates function in a complex environment that involves other interacting proteins. For example, the increase or decrease in the phosphorylation level of substrate proteins for a specific kinase might be attributed to changes in their localization, interactions with other partners, or due to competition with phosphatases that may also increase or decrease in quantity and/or activity.
Overall, KEA3 can be a useful tool for biologists to generate hypotheses from gene expression and phosphoproteomic profiling experiments. We note that KEA3 relies heavily on libraries with knowledge curated from the literature or high-throughput experiments on well-characterized kinases. Literature-based PPI and KSI interactions suffer from research focus biases where well-studied proteins are overrepresented (80). Expanding KEA3 libraries to incorporate global studies of kinase state and lesser-studied kinases (11) is the subject of future work. Future work will also include connecting top predicted kinases to the known small molecules that target them.