Using single cell atlas data to reconstruct regulatory networks

Abstract Inference of global gene regulatory networks from omics data is a long-term goal of systems biology. Most methods developed for inferring transcription factor (TF)–gene interactions either relied on a small dataset or used snapshot data which is not suitable for inferring a process that is inherently temporal. Here, we developed a new computational method that combines neural networks and multi-task learning to predict RNA velocity rather than gene expression values. This allows our method to overcome many of the problems faced by prior methods leading to more accurate and more comprehensive set of identified regulatory interactions. Application of our method to atlas scale single cell data from 6 HuBMAP tissues led to several validated and novel predictions and greatly improved on prior methods proposed for this task.


INTRODUCTION
The reconstruction of regulatory networks from functional genomics data has been a major research focus in computational biology (1)(2)(3)(4). Several methods for inferring transcription factor (TF)-gene interactions focused on the use of the expression of TFs to predict gene expression (5)(6)(7)(8)(9). While computational methodology and data differed, many methods shared a common, underlying, assumption: If the expression of a TF, or a combination of TFs, is a good proxy for the expression of a specific gene, then these TFs are likely the regulators of that gene. Starting from the early days of microarrays, through the use of next generation sequencing and more recently with single cell technologies, several computational methods have been developed and tested using these ideas (1)(2)(3)(4)(5)(6)(7)(8)(9)(10)(11). Some predictions of these methods, either general or context specific, have also been successfully experimentally validated (4,7,10,11).
While the assumption about the relationship between TF and gene expression levels proved successful, it did not al-ways reflect biological realities. Most studies relied on snapshot or static data. In such studies, the expression levels of genes may not correlate with that of their regulating TFs due to the time delay between TF binding and the accumulation of expressions (12,13). In other cases, gene expression levels may be elevated prior to the expression of the TF and remain high after the activation with no causal relationship between the two (14). These issues can lead to both, false positive and false negative predictions.
Another issue arises from the data itself. Obviously, single cell data is much more powerful than bulk since it learns the interactions based on expression values from the same cells (15). However, expression alone may not be enough to predict such interactions. Several studies have shown that many TFs are post-translationally activated (16)(17)(18). For these TFs, high expression levels may not be directly correlated to activity, leading to false positive predictions (19).
Several methods have been used to infer such interactions. Early efforts mapped expression data to regulatory interactions by evaluating correlations between genes, using mutual information (20) and co-expression (21). Other approaches utilized regression to identify the best TFs for predicting target gene expressions. These methods include linear model based on LARS (22) and LASSO (23)(24)(25), and autoregressive models (26,27) which attempt to capture temporal expression patterns. Models that can learn more complex non-linear relationships have also been developed, including random forest (6,7), gradient-boosting trees (5), neural network (28), and Bayesian network (29)(30)(31). A few methods have also incorporated additional types of data beyond expression including ChIP-seq, and TF motif information (2,32,33), ATAC-seq and other epigenetic data (9) and protein protein interactions (32). However, these methods have also attempted to use the interaction information to predict gene expression which, as discussed above, may not reflect the current activation level of the target gene.
To overcome the above problems and improve the prediction of a global, cross tissue, gene regulatory network we developed methods that combined three novel strategies for regulatory networks inference. First, we predicted RNA velocity (34) instead of gene expression. Unlike gene expres-sion, which may not be able to represent dynamic activity of a gene, RNA velocity measures the real time activity of genes and thus can serve as a much better proxy for the level of gene regulation. A second aspect of our model is the use of scATAC-seq in addition to expression for inferring TF activation. Unlike expression, scATAC-seq is impacted by post-translational modifications and so can be used to infer the actual activity of TFs. Finally, our method integrates much larger single cell data from recent atlas studies (HuBMAP (35)), allowing it to utilize big data for the regulatory network inference task.
To combine different data types across tissues we developed a multi-task based deep learning framework to predict RNA velocity values for target genes. We constructed the final tissue-specific regulatory networks by ranking TFs using deep SHAP algorithm (36) on the trained models.
We tested our method and compared it to several previous methods. As we show, by using RNA velocity we can obtain more accurate results compared to previous methods. We discuss both global tissue-specific networks and specific organ function related subnetworks identified by our method and provide a comprehensive list of tissue specific predictions of TF-gene interactions for use in further downstream analysis.

Data preprocessing
Gene quantification and calculation of RNA velocities. Gene counts for scRNA-seq data were generated using the in-house HuBMAP transcriptomic pipelines which use Salmon (37) to map reads to NCBI GRCh38 reference genome (38) and quantify gene counts. Next, scVelo package (34) was used to estimate RNA velocity of each gene. Only genes having sufficient read counts in both spliced and unspliced regions and cells having enough genes with spliced and unspliced counts were kept.
Peak calling for scATAC-seq and SNARE-seq data. Reads mapping was performed using BWA short read aligner (39). SnapATAC package (40) was used to generate cellby-bin matrix and MACS2 package was used to call peaks (saved as BED format) over all cells. For SNARE-seq data, reads mapping, peak calling was performed using the same pipeline.
TF activity score. Activity score of each TF is computed based on binding sites information from ChIP-seq data and chromatin accessibility information from scATAC-seq data. We downloaded all ChIP-seq data from Cistrome database (41), which includes binding site locations for 1359 human TFs in its recent release. We kept only TFs that satisfied the following criteria: (i) sample median sequence quality score ≥25 (scores calculated from FastQC software); (ii) uniquely mapped ratio ≥60%; (iii) PBC score ≥80%; (iv) FRiP score ≥1%; (v) number of peaks with fold change >10 (PeaksFoldChangeAbove10 score ≥ 500). See instruction page of Cistrome for more details: http://cistrome. org/chilin/ downloads/instructions.pdf. After the filtering steps, 623 TFs were selected for calculating TF activity score matrix A (sum) and A (mean) .
calculates the summed TF activity of the ith TF with respect to the promoter region of jth gene. p TF(i, j,k) represents the kth binding site of the ith TF in the promoter region of jth gene. P (atac) (i, j,k) denotes the set of scATAC-seq peaks overlapping with p TF (i, j,k) . Note that P (atac) (i, j,k) is specific to each binding site of the TF-gene pair (See Figure 1B). In this study, we used ChIPseeker (42) to map the ChIP-seq binding sites to their corresponding nearest genes. We defined the promoter region of a gene using the default parameter in ChIPseeker package (42) (3000 bps upstream ∼3000 bp downstream region relative to the transcription start site). The function S performs the following operations: (i) computes the overlapped length between the kth peak of TF i relative to gene j and any scATAC-seq peak regions and (ii) divide the length from (i) by the length of kth peak of TF i relative to gene j. A (sum) i, j then summed over m such overlapped regions for TF i relative to gene j. Similarly, A (mean) i, j averaged over m such overlapped regions for TF i relative to gene j. For scATAC-seq data, we used peak regions from bed files, which represents aggregated open chromatin regions from all single cells. For SNARE-seq data, we used peak regions from bed files but mapped each region back to each cell if any reads that produced the peak was from that cell. Therefore, for scATAC-seq data, there is a single A (sum) and A (mean) for each tissue and for SNARE-seq data, there is a single A (sum) and A (mean) for each single cell.
Filtering, standardization and scaling. First round of filtering was performed to remove target genes without enough cells producing RNA velocity values. For liver, we removed target genes having <1000 cells, and for other tissues, we removed target genes having <5000 cells. This filtering process was performed on the velocity matrix Y (see Table 2 for number of target genes, and cells after filtering in each tissue). Then we used the remaining cells in the velocity matrix to filter RPKM matrix X. To train MTLRank models, we only used the TF columns in RPKM matrix X and removed other columns (the list of current known human TFs was downloaded from (43)). RPKM values of each TF were then transformed by log 10 and standardized to zeromean and unit-variance. As a result, each tissue is assigned with a column-wise standardized RPKM matrix X ∈ R m×n , where m is the number of cells and n is the number of TFs (see Table 2 for number of TFs in each tissue). Similarly, velocity matrix Y was also column-wise standardized. TF activity scores were scaled by A (sum) = log 2 ( A (sum) + 1) and A (mean) = log 2 ( A (mean) + 1). In the next step, RPKM matrix X, velocity matrix Y, TF activity score matrix A (sum) and A (mean) were used as inputs to the MTLRank pipeline. Hubmap tissue-specific scRNA-seq and scATAC data sets were used. TF RPKM matrix was generated from tissue-specific scRNA-seq data and TF activity matrix was generated from the integration of Cistrome DB ChIP-seq TF prior information and scATAC-seq data. (B) Computation of TF activity scores from ChIP-seq data and scATAC-seq data. Blue blocks represent TF binding sites from ChIPseq data and green block represents open chromatin region from scATAC-seq data. TF binding sites were weighted by the scATAC-seq open chromatin regions (see Materials and Methods for more details). (C) Multi-task based model training and deep SHAP based TF ranking. The final tissue-specific regulatory networks were constructed from the TF ranking results.

MTLRank framework
Overview of MTLRank framework. MTLRank prediction mainly consists of two steps. The first step involves training multi-task models with RPKM expressions, and activity scores of TFs as inputs to predict velocities of genes. Each gene has its own model, but model parameters are shared through a multi-task learning framework. This reduces overfitting and can lead to better performance (44). The second step is to rank the TFs for each of the models trained in the first step. In MTLRank's framework, we adopted deep SHAP algorithm to compute SHAP value for each TF and ranked TFs by sum of absolute SHAP values. A regulatory network was then constructed based the rank- ing results. Details of each step are described in the following subsections.
Model architecture and loss function. We trained a multilayer neural network to identify TFs regulating genes. Supplementary Figure S1 illustrates the model architecture. The first layer is the input layer that takes in TF activity score matrices A (sum) and A (mean) , and RPKM expression matrix X for the same TFs. The second layer is a TF aggregation layer in which each neuron represents a TF that takes the activity scores and RPKM expressions for that TF. For example, in a model that predicts velocity values for the ith gene, the kth neuron in the TF aggregation layer performs the op- for a single input, where w 1,k , w 2,k and w 3,k are learned during training and are specific to kth TF, and X c,k represents standardized RPKM expression for the kth TF in cell c. In such case, each neuron in the TF aggregation layer transforms the weighted means of TF activity scores and TF RPKM expressions. TF aggregation layer is then followed by three fully connected layers with 64, 32 and 16 neurons. We named these layers as FC1, FC2 and FC3. FC3 is connected to the final output layer that predicts RNA velocity of the target gene i. We note that we tested additional architectures with more or less layers but did not observe large changes in results. We applied trace norm (44) to perform multi-task learning across different models. Trace norm regularization adopts a 'soft sharing' strategy, meaning that all models share their parameters in an indirect manner. This is achieved by first concatenating parameters from different models into a single matrix, then minimizing the sum of singular values of the concatenated matrix to encourage it to become a lowrank matrix. To reduce the computational cost, instead of sharing parameters among all models, our framework only shared parameters within each gene cluster. For each tissue, we assigned genes into nearly equal-sized clusters based on their velocity values. Gene clustering was performed using constrained k-means clustering algorithm (minimum cluster size = 24, maximum cluster size = 25) (45). Within each cluster, the loss function of models is defined as: where T is the number of tasks within each cluster, which is equal to the number of genes in each cluster. N(i ) denotes the number of cells (training examples) available for each task. Y i,c is the true RNA velocity for gene i in cell c, and , E c,· , θ) computes the predicted RNA velocity through forward propagation using a single input and current model parameters θ . A , E c,· denotes activity scores of all TFs with respect to gene i and expressions of all TFs in cell c. L is the mean squared loss function. W [i,k] is the model weight matrix for kth layer of the ith task.
The term λ 1 represents L1 regularization that penalizes complexity of all models and all layers. We used the default value λ value from Keras (version 2.8.0). W [k] represents a 2D matrix of model parameters concatenated from the kth layer of each task. Parameter tensor from each task is flattened into a 1D vector before concatenation so that each column of W [k] represents parameters from a single task (Supplementary Figure S1 and Supple- * is the regularization term that sums trace norm over m shared layers and γ controls the strength of trace norm regularization. Given that lowerlevel features are usually more similar than higher-level features across different tasks (44), we only shared layer FC1 and FC2 (Supplementary Figure S1 and Supplementary file 1). Accordingly, layer FC3 and the output layer were trained as task-specific layers. Trace norm of W [k] is computed as (44):  (44): The loss function was optimized using Adam optimizer (46) provided by TensorFlow version 2.7.0 (47).
Training and testing. Since not all genes had velocity values for all cells, training and testing sets were constructed on a per-gene basis. For each tissue, we split RPKM matrix and TF activity score matrices by cells. The ratio between training cells and testing cells is 9:1. Then for each gene, we removed the cells that did not have velocity value available for that gene. This produced inputs with different training examples among the shared models. During training, an ensemble training batch was generated by sampling an equally sized batch for each task. The number of ensemble training batches for one epoch is equal to ceili ng( C max b ), where C max is the number of cells for the gene with most training examples among all tasks and b is the pre-specified batch size. ceili ng rounds the value to the least integer greater than that value. Bootstrapping was used for genes with fewer available training samples. The above sampling process was repeated until all training examples have been used at least once and this is a complete epoch for training. After the specified number of epochs were finished, testing was performed on each gene separately using the test set. Standardization and scaling of the RPKM matrix and TF activity score matrices were performed separately for training and testing set as described previously. For each tissue, the above training and testing procedures were repeated three times. R 2 score was computed for each run and the final score Nucleic Acids Research, 2023, Vol. 51, No. 7 e38 was the average R 2 from the three runs. All hyperparameters used during training are shown in Supplementary Table S2.
Ranking TFs using deep SHAP. TFs were ranked by deep SHAP (Shapley Additive exPlanation) algorithm (36). Given a single input (a cell), deep SHAP assigns a SHAP value to each input feature. The SHAP value represents the impact of a specific input on the output compared to using a set of reference examples as inputs. In this study, we defined a reference example as an input vector with mean expressions of all cells and zero-value TF activity scores. The final importance score for each TF is then calculated by the following steps: (i) for each tissue, we computed a single reference example; (ii) we ran deep SHAP and compared all training examples to the single reference example; (iii) we summed absolute SHAP value for each input over all training examples and further aggregated for each TF the SHAP values from expressions and the TF activity scores. This resulted in a single non-negative importance score for each TF. The larger the importance score is, the more it contributes to the predicted outcome.
Construction of regulatory networks. We performed the following steps to construct a single regulatory network for each gene: (i) for each gene, we ran model training and deep SHAP TF ranking for five rounds; (ii) we extracted the topranked 50 TFs in each run and obtained the common top TFs across all five runs; (iii) we connected the TFs obtained in step (ii) to the corresponding target genes; (iv) repeat the above steps for all genes. Once a regulatory network is constructed, TFs could be further ranked by their degrees.

Benchmarking different methods and validating TFs
Benchmarking different methods. We compared MTL-Rank to other regulatory network inference methods. Linear models have been widely adopted for inferring regulatory networks using only gene expression data (9,23,24,25). We thus included a baseline LASSO regression method which predicts gene expression rather than RNA velocity. Furthermore, we also included a baseline NN model that uses the same architecture with MTLRank but without parameter sharing, two widely used network inference methods that only take expressions as inputs (GENIE3 (6) and GRNboost2 (5)), and a network inference approach that takes expressions and ATAC-seq data (CellOracle (48)). For each tissue, we randomly sampled 500 genes and ran all prediction methods on the 500 genes. Specifically, for MTL-Rank, we first clustered the 500 genes into 20 equally sized clusters and ran MTLRank training and testing framework as described previously. Unlike neural network, LASSO, GENIE3, GRNBoost2 cannot include different types of prior information as inputs. Hence, for these methods we only used RPKM matrix as input to predict expressions of target genes. For CellOracle, in addition to expression data, we also used our TF activity matrices computed from scATAC-seq and ChIP-seq data as inputs. We set any entries not equal to zero as 1. We used the ridge regression model adopted by CellOracle (48) to train tissue-level models and predicted velocities. Training and testing were performed as described previously. Hyperparameters used for these methods are summarized in Supplementary Table S3. R 2 score. We used standard R 2 score to evaluate how well the model predict RPKM expressions or RNA velocities of the target genes. It is defined as the equation below: where y i denotes the true RNA velocity/RPKM expression for the ith target gene in the testing set andȳ denotes the mean of these true values.
Validating TFs. While the gold-standard for tissue-specific regulatory network is lacking, the roles of the predicted TFs could be validated using various types of experimental evidence. To validate TFs in the predicted tissuespecific regulatory networks, we downloaded curated TFs deposited at TF-Marker database (49) (http://bio.liclab.net/ TF-Marker/). TF-Marker has collected various types of tissue-specific TFs that are tissue markers, or are regulating the tissue marker genes, or are regulated by tissue marker genes. We used the overlap between tissue-specific marker TFs and the input TFs in RPKM matrix as the 'ground truth TF' for the predicted tissue regulatory networks. Supplementary Table S4 shows the number of 'ground truth TF' in each tissue. The recall rate was used to evaluate the recovery of tissue-specific TFs: where TP stands for true positive, the number of successfully recovered tissue-specific TF and FN stands for false negative, the number of true tissue-specific TF not identified by the model. The alternative metric could be precision score, which is defined as: where FP stands for false positive, the number of predicted positive TF that were not collected in the TF-Marker database. The precision score, however, could be biased because we might not have enough 'true' TFs. The nominator could be underestimated when the set of TP is incomplete. We have collected from the database a number of tissue-specific TFs (TP) but we are likely missing several more. Those TFs selected by the model but not collected by the database are not necessarily false. Due to incomplete ground truth, we did not include precision score in our evaluation.

MTLRank, a multi-task learning based framework for predicting regulatory associations
To characterize the dynamics of gene regulations, we developed a multi-task learning (MTL) based framework, MTL-Rank, that uses tissue specific single cell and general information to determine TF-gene interactions. Our method  TFs  864  2147  2276  1736  1350  2399  Genes  1308  554  2019  515  1827  2380  Cells  4656  25 280  93 640  11 960  67 226  54 127  scATAC-seq  Peaks  113 726  270 033  8887  98 077  27 645  27 371  Cells  50 748  310 661  48 264  8368  11 857  17 482 uses single cell RNA-Seq and ATAC-Seq, and ChIP-Seq from bulk studies. For this, we collected and processed scRNA-seq and scATAC-seq data sets from 6 human tissues and 29 individual donors ( Figure 1A, Supplementary Table S1). All data sets were from the HuBMAP consortium (35). To model RNA velocity using activities from TFs, we used TF expressions from the processed scRNA-seq data. scATAC-seq data was combined with ChIP-seq binding site data to generate TF-gene activity scores ( Figure 1C, Table 1). Using these data sets as inputs, we built gene specific models in each tissue to predict the velocity of genes from TF expression and activity. Model parameters were shared for models within the same tissue using a 'soft sharing' MTL method ( Figure 1B, Materials and Methods). We further used the learned models to rank TFs and construct tissue-specific regulatory networks.

Comprehensive evaluation of MTLRank models
Evaluation based on prediction of velocities. Following the preprocessing steps (Methods), we obtained 4656-93 640 scRNA cells and 8368-310 661 scATAC cells for each tissue (Table 1). To evaluate MTLRank we compared velocity and gene expression predictions based on cross validation R 2 scores between MTLRank and several prior methods for learning TF-gene interactions. These methods included LASSO regression (50), a baseline neural network model (NN), GENIE3 (6) and GRNboost (5) (see Materials and Methods). We tested these methods either as expressionbased methods or velocity-based methods. For expressionbased methods, we used the TF RPKM as input and the gene expression RPKM as outputs, similar to prior strategies (5,6,51). For velocity-based methods, we incorporated two additional sources of information: TF activity scores for inputs and the gene velocity values as outputs. Results, presented in Figure 2 show that MTLRank using RPKM of TFs + TF activity scores & velocities as outputs outperforms other methods and other input-output combinations ( Figure 2A). Additionally, baseline NN model that uses the same architecture as MTLRank yielded better predictions for velocity values than RPKM expressions even when using the same inputs (Figure 2A). We observed only a marginal improvement when adding TF activity scores for liver, large intestine, and left kidney in baseline NN models (Figure 2A). Unlike TF expressions, which are derived from the same cells from which we obtained the velocity for the gene, scATAC-seq is from a different cell (or, in our case, a set of cells) for the above tissues (Materials and Methods). Our current framework addresses this by averaging the scATAC-seq signals across different cells and used these averaged signals as inputs for model training. This may have impacted model performance. To validate this, we further collected SNARE-seq (single-nucleus chromatin accessibility and mRNA expression sequencing (52)) data sets for lung and kidney tissues (35). Unlike standard scATAC-Seq, SNARE-seq data profiles chromatin accessibility and RNA expressions in the same single cell ( Table 2). Training and testing were performed using the same strategy as previously described (Materials and Methods). Our results demonstrate that the single-cell-specific TF activity scores indeed improved the model performance when such information was available ( Figure 2B). This means averaging the signals from scATAC-seq data can indeed lower the model performance. Note that the data used for Figure 2A and B is from different experiments (and tissues) and therefore the results are not directly comparable.
scRNA-seq is known to suffer from drop-out event characterized by high proportion of zero expressions (53). We performed additional experiment to test the robustness of our method to drop-out event. Results, presented in Supplementary Figure S3 and Supplementary file 1, indicate that our framework can still perform well even when the dropout rate reaches 60%, although there is a slight loss of performance at this percentage.
Evaluation based on recall of marker TFs. The R 2 scorebased evaluation can quantify the model performance in general. However, it does not validate whether the models correctly recover specific TFs. Therefore, we sought to focus on validating whether the TFs predicted in the tissuespecific networks are indeed key TFs for that tissue (Materials and Methods). Due to the lack of 'gold-standard' data sets, systematic evaluation of TFs is a challenging task. We thus first looked at whether the recovered TFs are known to be tissue-specific markers using TF-Marker database (49). We compared recall percentage of TF markers to a random selection method, which selected the same number of TFs as present in each tissue-specific network. The results, presented in Figure 2C, indicate that MTLRank models outperformed the random selection method for TF marker recall (Table 3, Supplementary file 4). MTLRank achieved the best performance in liver, which is also the best performing tissue from the R 2 score evaluation. We computed p-values using hypergeometric test (using the number of input TFs in each tissue as background) to evaluate whether the identified TFs were enriched for known tissue specific TF markers. We observed significant results for liver (corrected Pvalue = 1.59 × 10 −8 ), kidney (corrected P-value = 1.70 × 10 −4 ), and lung (corrected P-value = 2.01 × 10 −2 ). In contrast, TFs selected by random method did not significantly overlap with TF markers in any tissue ( Figure 2C). Further, the recall values for MTLRank are significantly higher than the recall values from the random method (Wilcoxon sum rank test, P-value = 9 × 10 −3 ). Next, we ranked identi- for the 500 randomly sampled genes in each tissue from SNARE-seq data. Left part represents R 2 for all 500 genes and right part represents R 2 for genes that have >200 available TF activity scores from the SNARE-seq data. The bars marked with 'single cell' stand for the scATAC-seq signals that were paired with the scRNA-seq data at single cell level and the bars marked with 'average' stand for the scATAC-seq signals that were averaged out across the cells. (C) Recall percentage of the TF-Marker database marker genes. The recall percentage was computed for the TFs in each tissue-specific network. The two numbers on top of each bar indicates the total number of recallable TFs versus recalled TFs by the method. P-values from hypergeometric test were also marked on top the bars. P-values <0.05 were marked by red color. Total number of recallable TFs are the intersection between the tissue markers from TF-Marker database and all available TFs in RPKM matrix. fied TFs by degree in each tissue and found that MTLRank successfully identified some well-known tissue TF markers. For example, JUN, FOS and MAF were TF markers that appeared among the top 10 TFs in liver (Table 3). JUN and FOS are involved in liver development and regeneration (54), and MAF plays important role for erythropoiesis in fetal liver (55). In the top 10 TFs of kidney, we found two TF markers (Table 3) including ID1, a transcriptional inhibitor that has been reported to drive dedifferentiation of kidney epithelial cells (56), and EGR1, an early growth response protein associated with diabetic kidney disease (57,58). ID1 and FOS were also found in the top 10 TFs of right lung (Table 3). ID1 has been shown to promote migration of lung cancer cells (59), and FOS was reported to regulate inflammatory response during acute lung injury (60).
Evaluation based on TF perturbation data. Currently, the gold-standard GRNs (gene regulatory networks) are lacking for evaluating GRN inference approaches. We therefore resort to using a high-throughput TF perturbation data set as an additional validation data set, in which the ground truth is anticipated to be similar to genes related to the perturbed TF. We applied our framework to a single cell dataset with perturbed c-Jun TF in human CAR-T cell (61). We obtained target genes for c-Jun from JASPAR motif analysis (downloaded from Harmonizome website (62)). We aimed to evaluate whether our framework can successfully recover the TFs as target genes for c-Jun (hereafter referred as c-Jun related TFs). We next used scRNA-seq data and ATAC-seq data from CAR-T cells, and ChIP-seq data from CistromDB [7] as inputs and examined whether our method can recover c-Jun related TFs. We ranked the top TFs following the same procedure in our previous analysis. The result suggests that the top ranked TFs from our model are significantly enriched for c-Jun related TFs (Supplementary Figure S2 and Supplementary file 1), and that MTLRank successfully captured the regulatory response to the perturbation on c-Jun related TFs.

Tissue-specific networks reveals key pathways and essential organ functions
We next constructed tissue-specific networks by performing model-based ranking for TFs. Briefly, we trained models using all available target genes in each tissue. To construct tissue-specific networks, the top 50 commonly predicted TFs among different runs were selected as the regulators for the corresponding target gene. We next used the networks to compute the importance of each TF for each gene using deep SHAP (Methods). See Supplementary File 2 for the predicted tissue-specific networks. We ranked TFs in each network by their degrees and found that there are four TFs (FOS, JUN, RPS27A and ELF3) commonly present among the top 10 TFs in at least three tissues. FOS was found in liver, spleen, left kidney, and right lung; JUN was found in liver spleen, and right lung; RPS27A was found in spleen, left kidney, and right lung; ELF3 was found in left kidney, large intestine, and right lung. Among the four TFs, FOS, JUN and RPS27A were labeled as 'low tissue specificity' in The Human Protein Atlas portal (https://www.proteinatlas.org/) (63), supporting their universal roles in gene regulation. Among the four TFs, FOS and JUN are known to interact by forming protein heterodimer AP-1 (64), which is involved in various cellular key pathways including cell proliferation (64), apoptosis (65), differentiation (66) and activation of T cells (67).
In the tissue-specific networks, FOS and JUN were jointly selected in liver, spleen, and lung. Particularly, for spleen there are 266 predicted targets for FOS and 184 predicted targets for JUN, with 65 common targets (hypegeometric P-value = 5.14 × 10 −14 , using genes listed in Table 1 as background). We further investigated the union of FOS-JUN targets by hypergeometric test-based GO overrepresentation analysis. The results show that GO terms related to T cell activation/differentiation and interleukin production are significantly overrepresented in these genes (Supplementary Table S5 and Supplementary file 1). This observation is consistent with previous findings that FOS and JUN can co-regulate interleukin 2, which further modulate cell-mediated immunity (68). RPS27A is a ribosomal protein involved in cell proliferation, regulation of cell cycle, and apoptosis (69). ELF3 is one of the ETS (Erythroblast Transformation Specific) family transcription factors that are involved in regulation of inflammatory response (70). The predicted active roles of ELF3 in kidney and lung were also previously reported (71). We also performed gene set enrichment analysis (GSEA) for the TFs and target genes in each network, using their respective degree ranking as inputs to GSEA. Results (supplementary file 3), show that TFs/target genes in the liver, spleen and kidney networks were enriched for distinctive annotations related to essential organ functions. For example, the KEGG (72) 'Drug metabolism' pathway is enriched among liver target genes (corrected P-value = 0.040); KEGG 'Th1 (T helper) and Th2 cell differentiation' pathway is enriched among spleen target genes (corrected Pvalue = 0.002), and KEGG 'MAPK signaling pathway' is enriched among spleen TFs (corrected P-value = 0.010). Based on GSEA results, we performed leading edge analysis (73) to extract genes that contribute most to the identified enriched terms. We then used these genes and their regulators to construct three tissue subnetworks with distinctive organ functions, as we discuss below.
Liver drug metabolism network. Drug metabolism is one of the key liver functions (74). Leading edge analysis has identified 12 genes from the KEGG pathway 'Drug metabolism' (Supplementary file 3), among which CYP3A4is one of the top-10 target genes in terms of degree in the liver network ( Figure 3A). CYP3A4 is an important enzyme from the cytochrome P450 family that catalyzes many drug metabolism related reactions (75). As predicted, CYP3A4 is primarily found in liver (76). The other four target genes in the liver drug metabolism network (CYP2B6, CYP2E1, CYP3A5, CYP2C19) are also from the cytochrome P450 family (Figure 3A). Their active roles during drug metabolism processes were previously reported (77)(78)(79)(80).
Spleen MAPK-th1/th2 network. Activation of immune response is the essential function for spleen (81). Target genes identified for spleen are enriched for various immune related KEGG pathway terms, including 'C-type lectin receptor signaling pathway' (corrected P-value = 0), 'Th1 (T helper 1) and Th2 cell differentiation' (corrected P-value = 0.002) and 'T cell receptor signaling pathway' (corrected Pvalue = 0.023). (Supplementary file 2). MAPKs (mitogenactivated protein kinases) proteins are known to promote Th1 immune response through regulating production of cytokines (82). To further investigate how MAPKs can modulate Th1/Th2 immune response, we extracted leading edge genes annotated as 'MAPK signaling pathway' for TFs and those annotated as 'Th1 and Th2 cell differentiation' for target genes. This yielded a small network with 5 TFs and 13 target genes. This subnetwork contained two HSPA (heat shock protein A) family TFs targeting IL2RB (interleukin 2 receptor subunit beta) ( Figure 3B), a cytokine receptor important for Th1 cell differentiation (83). Previous studies have also observed a strong correlation between MAPK genes and HSPA (heat shock protein A) family genes (84,85). Though direct interactions between HSPA genes and IL2RB have not been reported, it is likely that HSPA genes activate the MAPKs upon sensing of extracellular/intracellular stress, which further regulate immune response by regulating cytokines and their receptors. Taken together, the spleen MAPK-Th1/Th2 network may represent MAPKs mediated immune response in spleen tissue.
Kidney calcium signaling network. One of the important functions for kidney is maintaining the balance of calcium (86,87). Target genes for kidney are enriched for KEGG pathway 'Calcium signaling pathway' (corrected P-value = 0). Leading edge analysis identified 49 target genes associated with calcium signaling. We further extracted these 49 target genes and their regulators from the kidney network ( Figure 3C). Among these genes we found two CAMK (Ca 2+ /calmodulin-dependent protein kinase) family genes, (CAMK4, CAMK2D), and one CALM (calmodulin) family gene (CALM1). Genes from both families are important for calcium signaling (88,89). Additionally, we found multiple genes from the FGF (fibroblast growth factor) fam-ily and its receptor family FGFR in calcium signaling network (FGF10, FGF7, FGFR3, FGF9, Figure 3C). FGFs can regulate calcium metabolism by interacting with klotho proteins (90), suggesting the possible roles of the identified FGFs and FGFR genes. We also conducted experiments to test the ability of MTLRank to infer cell type specific regulatory networks. See Supplementary Figure S4 and Supplementary file 1 for more information.

Web portal for the query of tissue-specific networks
To facilitate the query of the predicted tissue-specific interactions, we built a web portal, HuBNet (https://hubnetqs.herokuapp.com/). Interactions in each tissue are ranked by their importance scores (Materials and Methods). Users may query the predicted tissue-specific interactions by providing a list of TFs and target genes of interest and select a tissue. The web portal will return the query results in real time, which can be viewed in a table format or as network graphs (Supplementary Figure S5 and Supplementary file 1).

DISCUSSION
Recent advances of single cell technologies have led to a surge of studies that generate tissue specific genomic profiles at atlas scale. The HuBMAP consortium is one such effort generating several tissue-specific single cell genomic data sets (91)(92)(93).
Here, we developed MTLRank, a deep NN method to incorporate chromatin accessibility, TF binding site information, and RNA velocity values to predict TF-gene interactions across multiple HuBMAP tissues. We showed that MTLRank can accurately identify TF-gene interactions and that it provides known and new interactions that can shed new light on the activity and pathways in several different tissues.
While RNA velocity analysis improves the performance, it also raises new challenges. Due to the nature of RNA velocity computation, many genes do not have available velocity values in all cells. To address this challenge, MTLRank relies on multi-task learning to share parameters across individual gene models. Similar parameter sharing strategy has been shown in the past to reduce model overfitting and improve model generalization (44).
The predicted regulatory associations between TFs and target genes are quantified in a model-specific manner, rather than by using a model-agnostic ranking method. This improves interpretability of the trained models and may help identify higher order TF-TF correlations that contribute to the ranking of input features. As co-regulation among TFs is also a topic of interest in regulatory genomics, the trained models may be further explored to reveal such interactions.
The lack of gold standard validation data set poses a great challenge for methods that predicts TF-gene associations. While synthetic data set can construct ground truth to be recovered (94), we believe such strategy cannot evaluate approaches using multiple types of prior information and does not characterize the importance of tissue-specific TFs. To address this issue, we performed a R 2 score-based evaluation and a TF tissue specific marker-based evaluation in this study. We also examined whether these sources of validation overlapped with the ChIP-seq TF data we used for learning the models. We found that in the TF-Marker database, only five TFs for heart were assigned using ChIP-seq experiment (ELK1, ESR1, HAND1, NFIL3, POU3F2), and we thus removed these from the analysis. All other validated TFs are based on independent data and so can be used for validation.
While the tissue-specific networks have successfully reconstructed the organ functional pathways, our framework has a few limitations. First, the size of training data set is limited by the number of cells and genes with available RNA velocity values. This could possibly be improved in the future with improvements for methods to compute RNA velocity values. Second, it is difficult to fine tune the hyperparameters for thousands of models. For tissue with >2000 target genes and 2000 TFs, training models and computing SHAP values could be challenging (Supplementary Figure  S6). Third, due to the small number of TF known markers, we were not able to test the recall of TF marker for spleen, where strong immune response pathways were found.
Finally, there are several potential directions to further expand the work. When time series data is available in addition to snapshot data such data can be integrated to further improve predictions. In addition, extracting TF-TF correlations from the trained model may identify co-factors important for the given functional pathways.

AVAILABILITY
The script for preprocessing, modeling and analysis of predicted interactions are available at GitHub repository: https://github.com/alexQiSong/MTLRank.

DATA AVAILABILITY
The data underlying this article were provided by the HuBMAP consortium and Cistrome database. Data will be shared on request to the HuBMAP consortium and Cistrome database maintenance team.

ACCESSION NUMBERS
scRNA-seq, scATAC-seq and SNARE-seq datasets used in this study can be accessed at the HubMap data portal: https://portal.hubmapconsortium.org/. All accession numbers used in this study are listed in Supplementary Table S1.