scMCs: a framework for single-cell multi-omics data integration and multiple clusterings

Abstract

Motivation: The integration of single-cell multi-omics data can uncover the underlying regulatory basis of diverse cell types and states. However, contemporary methods disregard the individuality of each omics, and the high noise, sparsity, and heterogeneity of single-cell data also degrade the fusion quality. Furthermore, available single-cell clustering methods focus only on cell type clustering and cannot mine alternative clusterings to analyze cells comprehensively.

Results: We propose a single-cell data fusion based multiple clustering (scMCs) approach that can jointly model single-cell transcriptomics and epigenetic data, and explore multiple different clusterings. scMCs first mines the omics-specific and cross-omics consistent representations, then fuses them into a co-embedding representation, which can dissect cellular heterogeneity and impute data. To discover the potential alternative clusterings embedded in multi-omics data, scMCs projects the co-embedding representation into different salient subspaces. Meanwhile, it reduces the redundancy between subspaces to enhance the diversity of alternative clusterings and optimizes the cluster centers in each subspace to boost the quality of the corresponding clustering. Unlike a single clustering, these alternative clusterings provide additional perspectives for understanding complex genetic information, such as cell types and states. Experimental results show that scMCs can effectively identify subcellular types, impute dropout events, and uncover diverse cell characteristics by giving different but meaningful clusterings.

Availability and implementation: The code is available at www.sdu-idea.cn/codes.php?name=scMCs.


Introduction
The advancement of single-cell sequencing techniques enables researchers to obtain multiple omics data simultaneously, which in turn more precisely characterize the joint regulatory mechanisms of multiple molecules (Luecken and Theis 2019). Specifically, single-cell RNA-sequencing (scRNA-seq) quantifies the mRNA abundance of genes in each cell, while single-cell Assay for Transposase-Accessible Chromatin using sequencing (scATAC) characterizes the openness of cis-regulatory elements near genes (Zhu et al. 2020). The joint analysis of scRNA-seq and scATAC data can strengthen the key genetic information of different omics, and decipher gene regulatory relationships related to cellular heterogeneity (Macaulay et al. 2017; Hao et al. 2021).
Although the integration of single-cell multi-omics data can facilitate the study of complex biological information, the inherent characteristics of single-cell data, such as high sparsity, noise, and dimensionality mismatch, bring great computational and analytical challenges. Researchers have been developing single-cell multi-omics integration methods by leveraging machine learning and bioanalytical techniques. One line of methods builds on non-negative matrix factorization or principal component analysis to integrate single-cell multi-omics data and resolve cellular heterogeneity (Duren et al. 2018; Welch et al. 2019; Argelaguet et al. 2020; Ma et al. 2022). However, these shallow methods mostly project multi-omics data into a shared latent space and ignore omics-specific information. Furthermore, linear models disregard the non-linear geometries of multi-omics data. Manifold alignment methods aim to align the embedded low-dimensional manifolds of different omics data and characterize intrinsic cellular structures (Liu et al. 2019; Cao et al. 2021). Although these alignment-based methods can capture non-linear geometries across multi-omics data, they suffer from a high time complexity of $O(N^3)$ ($N$ is the number of samples), which limits their applications.
By virtue of their expressive feature extraction capability, deep-learning methods have emerged as the mainstream technique for single-cell data analysis (Xiong et al. 2019; Liu et al. 2021a). Recently, the single-cell Multimodal Variational AutoEncoder (scMVAE) was proposed to integrate scRNA-seq and scATAC data. Specifically, scMVAE combines probabilistic Gaussian mixture models with three different joint learning strategies to explore latent features that can characterize multi-omics data. But merely embedding different omics data into the same latent space may lose the specificity of individual omics. Unlike scMVAE, Deep Cross-omics Cycle Attention (DCCA) uses different deep generative networks to model the scRNA-seq and scATAC data, then applies attention-transfer to explore the regulations between different omics and cell heterogeneity.
The aforementioned deep methods still have some issues. First, most of them focus on a shared representation, but disregard the omics individuality, and cannot integrate different levels of biological features to learn a more discriminative representation for data imputation and cell clustering. Furthermore, contemporary single-cell clustering methods only aim at one clustering of cell types. In practice, cells can also be clustered by other biological characteristics, such as cell functions or states, and these biological characteristics can be regulated by gene expression. Existing methods cannot sufficiently integrate and merge the genetic information from different omics to reveal potential alternative clusterings with diversity and high quality, while these multiple clusterings can reveal the different roles and characteristics of cells from different perspectives.
To address these challenges, we propose a method called scMCs and present its conceptual framework in Fig. 1. The main idea of our solution is to design an information extraction and fusion module that finely processes the individuality and commonality learned from heterogeneous omics, and constructs a more comprehensive and informative representation for single-cell multi-omics data fusion, clustering, and multiple clustering. Specifically, scMCs uses omics-independent deep autoencoders to learn the low-dimensional representation of each omics, and utilizes the attention mechanism and an omics-label discriminator to capture the omics individuality. Meanwhile, scMCs utilizes a contrastive learning strategy to capture the commonality, and fuses the individuality and commonality features into a compact co-embedding representation for cell clustering and data imputation. To uncover the potential alternative clusterings in multi-omics data, scMCs applies a multi-head attention mechanism (Vaswani et al. 2017) to the co-embedding representation to generate multiple salient subspaces, and reduces the redundancy between subspaces. Meanwhile, scMCs optimizes a Kullback-Leibler (KL) divergence-based clustering loss in each salient subspace and generates different high-quality clusterings in an end-to-end framework.

Materials and methods
The framework overview of scMCs is shown in Fig. 1, where Fig. 1a covers multi-omics data fusion and cell clustering, and Fig. 1b explores multiple clusterings with quality and diversity embedded in multi-omics data. The technical details of scMCs are presented below.

Multi-omics data encoder for individuality
With the increasing complexity of single-cell data, researchers have merged deep learning with single-cell data clustering (Liu et al. 2021a). As a classical neural network, an autoencoder can map high-dimensional data into a low-dimensional representation space while ignoring noise and outliers. Given that, we use separate autoencoders to map single-cell multi-omics data into their respective non-linear embedding spaces, thereby preserving the individuality and resisting noise and outliers.
Let $X \in \mathbb{R}^{N \times D_X}$ and $Y \in \mathbb{R}^{N \times D_Y}$ be the normalized scRNA-seq data and scATAC data, where $N$ is the number of samples, and $D_X$ and $D_Y$ are the numbers of features. scMCs first employs two independent encoders $f_{E_X}(\cdot)$ and $f_{E_Y}(\cdot)$ to learn the respective $d$-dimensional feature representations $\{Z_X, Z_Y\} \in \mathbb{R}^{N \times d}$:

$$Z_X = f_{E_X}(X), \quad Z_Y = f_{E_Y}(Y), \qquad (1)$$

where $d$ is the dimension of the embedding space; $Z_X$ is the latent low-dimensional representation of cells and genes in scRNA-seq data, while $Z_Y$ encodes the latent patterns between cells and peaks in scATAC data. To extract the individuality and explore the complementary information among different omics, we incorporate the attention mechanism and an omics-label discriminator into the encoder module. Concretely, scMCs defines two normalized attention score matrices $A_X$ and $A_Y$, whose elements quantify the similarity of a pair of cells within each omics. $\mathrm{Softmax}(\cdot)$ normalizes the weights to [0, 1] to avoid modeling negative correlations; it also helps to prevent the local optima caused by overly large weights on some cells. With the normalized attention scores, we reorganize the low-dimensional representations by considering the similarity among cells, which yields the individuality representations $Z_{gX}$ and $Z_{gY}$. The attention mechanism plays two important roles in the encoding module. On the one hand, it measures the importance of biological signals in the intrinsic feature spaces of different omics, and extracts the omics individuality; on the other hand, it explores the similarity between cells and makes it possible to examine the relationship between cells and features from a global perspective.
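As a rough illustration of the attention step, the sketch below builds a row-wise Softmax attention matrix from cell-cell inner products of an embedding and uses it to reorganize that embedding. The inner-product score and the product `A @ Z` are assumptions made for illustration, since the exact score function is not reproduced here; `softmax_rows` and `attend` are hypothetical helper names.

```python
import numpy as np

def softmax_rows(S):
    """Row-wise softmax; subtracting the row max keeps exp() numerically stable."""
    S = S - S.max(axis=1, keepdims=True)
    E = np.exp(S)
    return E / E.sum(axis=1, keepdims=True)

def attend(Z):
    """Reweight an N x d embedding by cell-cell attention (assumed form)."""
    A = softmax_rows(Z @ Z.T)   # N x N scores; each row lies in [0, 1] and sums to 1
    return A, A @ Z             # reorganized embedding, still N x d

rng = np.random.default_rng(0)
Z_X = rng.normal(size=(5, 3))   # toy embedding: 5 cells, d = 3
A_X, Z_gX = attend(Z_X)
```

Because every row of the attention matrix sums to one, each reorganized cell representation is a convex combination of the cell embeddings, so no single cell can dominate through an unbounded weight.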
In supervised learning tasks, labels indicate the class or identity of the samples. Accordingly, omics labels can be used as supervised signals to extract the individual features of each omics. Here, we explicitly define the omics labels, i.e. cells from the same omics are labeled as one type. Next, we design an omics-label discriminator to further enhance the quality of the individuality in $Z_{gX}$ and $Z_{gY}$. The discriminator loss $L_{dis}$ is the cross-entropy (CE) loss between the output of the omics-label predictor $f_{dis}(\cdot)$, which is a fully connected neural network with two layers, and the true omics-label matrix $P \in \{0, 1\}^{2N \times K}$, where $K$ is the number of omics.

Figure 1. Framework overview of scMCs for single-cell multi-omics data fusion and multiple clusterings. (a) scMCs first projects scRNA-seq data $X$ and scATAC data $Y$ into different low-dimensional spaces $Z_X$ and $Z_Y$. Next, it utilizes the attention mechanism and omics-label discriminator to extract the omics individuality $Z_{gX}$ and $Z_{gY}$, and uses the contrastive learning mechanism to capture the omics commonality $Z_{XY}$. After that, scMCs fuses the omics individuality and commonality to obtain an informative co-embedding matrix $Z_I$ for clustering, and learns the parameter representations $\{M_X, \Theta, \Pi\}$ of the ZINB distribution and $\{M_Y\}$ of the Bernoulli distribution (Ber). (b) scMCs projects $Z_I$ into different subspaces $\{O_l\}_{l=1}^{L}$, and leverages the redundancy control constraint and the clustering loss $L_{clu}$ to enhance their diversity and quality, generating multiple clusterings $\{R_l\}_{l=1}^{L}$ in an end-to-end manner. Besides, scMCs minimizes the reconstruction loss $L_{rec}$ to ensure the consistency of feature information.

Cross-omics contrastive learning for commonality
The attention layers and omics-label discriminator may induce the model to pay more attention to the individual features or noise of each omics, which is not conducive to data fusion and cell clustering. Furthermore, individual features only unilaterally characterize the complementarity between omics, while the cross-omics consistent (shared) information reflects the commonality between omics, which is important for a consistent clustering with high quality. Existing methods (e.g. MOFA+, CoNMF, and scMVAE) mainly concatenate the multi-omics data and project them into a common low-dimensional representation to explore the shared information. However, due to the sparsity and high dimensionality of different omics, the resulting representation may be of low quality. Although DCCA uses different deep generative autoencoders and attention-transfer to link multi-omics data, it pays more attention to the knowledge learned from scRNA-seq and less to scATAC. To extract compact commonality features between different omics, we introduce the cross-omics contrastive learning strategy (Liu et al. 2021b) to extract shared knowledge from scRNA-seq and scATAC data for fusion. As a self-supervised learning paradigm, contrastive learning maximizes the consistency between different views by maximizing their mutual information (Chen and Geng 2021). In this way, we can obtain more informative embedded features by maximizing the information entropy, and avoid the trivial solution of assigning all samples to the same cluster. The details of learning commonality are as follows:
i. Feature multilayer perceptron (MLP): To eliminate the influence of heterogeneity and ensure the semantic consistency of $Z_X$ and $Z_Y$, scMCs maps $Z_X$ and $Z_Y$ into one latent semantic space via a shared feature MLP, $Q_X = f_{MLP}(Z_X)$ and $Q_Y = f_{MLP}(Z_Y)$, where $\{Q_X, Q_Y\} \in \mathbb{R}^{N \times d}$ are low-dimensional embedding representations of $X$ and $Y$ with similar semantics.
ii. Cross-omics contrastive learning: In the latent space parameterized by $f_{MLP}$, we optimize the contrastive loss $L_{cl}$ between $Q_X$ and $Q_Y$ to learn the commonality representation. The loss is built from the mutual information $I(\cdot)$ and the information entropy $H(\cdot)$, balanced by a weight parameter. Finally, scMCs integrates the consistent representations into $Z_{XY}$ through $f_{XY}$, a fully connected neural network with two layers; $Z_{XY}$ encodes the commonality of the different omics.
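The role of the mutual information and entropy terms can be illustrated with a small numerical sketch. It assumes one common way of estimating these quantities: each row of $Q_X$ and $Q_Y$ is turned into a probability vector with a softmax, and the joint distribution is the averaged outer product of paired rows. This estimator, the function name `contrastive_loss`, and the weight `gamma` are assumptions for illustration and are not taken from the scMCs code.

```python
import numpy as np

def softmax_rows(S):
    S = S - S.max(axis=1, keepdims=True)
    E = np.exp(S)
    return E / E.sum(axis=1, keepdims=True)

def contrastive_loss(Q_X, Q_Y, gamma=1.0, eps=1e-12):
    """Negative (mutual information + gamma * marginal entropies) of two views."""
    px_rows, py_rows = softmax_rows(Q_X), softmax_rows(Q_Y)
    J = px_rows.T @ py_rows / Q_X.shape[0]   # d x d joint distribution estimate
    J = (J + J.T) / 2                        # symmetrize the two views
    J = np.clip(J, eps, None)
    J = J / J.sum()
    px = J.sum(axis=1, keepdims=True)        # marginal of view X, d x 1
    py = J.sum(axis=0, keepdims=True)        # marginal of view Y, 1 x d
    mi = np.sum(J * (np.log(J) - np.log(px) - np.log(py)))
    ent = -np.sum(px * np.log(px)) - np.sum(py * np.log(py))
    return -(mi + gamma * ent)

# Toy views: four cells, two latent dimensions, near one-hot after the softmax.
Q_a = 20.0 * np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)
Q_b = 20.0 * np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
loss_consistent = contrastive_loss(Q_a, Q_a)    # the two views agree on every cell
loss_independent = contrastive_loss(Q_a, Q_b)   # the two views are unrelated
```

In this toy setting the loss is lower when the two views agree, and the entropy term rewards spreading cells over the latent dimensions, which is the mechanism that discourages assigning all samples to one cluster.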

Multi-omics data fusion and imputation for clustering
As discussed, scMCs learns two latent representations $Z_{gX}$ and $Z_{gY}$ to encode the omics individuality, and a latent representation $Z_{XY}$ to encode the commonality, which are key factors for clustering and imputing single-cell multi-omics data. Here, we perform an element-wise sum with scale parameters $\lambda_x$ and $\lambda_y$ to aggregate them and generate a more discriminative co-embedding representation $Z_I$. A simple way to optimize the co-embedding representation $Z_I$ is to use different MLPs as decoders to reconstruct each omics. However, frequent dropout events may seriously affect the quality of $Z_I$ and lead to inaccurate clustering results. In practice, we can impute the dropout events and use the feedback from the imputed data to optimize $Z_I$, further enhancing the accuracy of key genetic features. Previous studies show that scRNA-seq data are often discrete, over-dispersed (variance greater than the mean), and highly sparse (Risso et al. 2018), and that the zero-inflated negative binomial (ZINB) probability distribution can account for these characteristics (Eraslan et al. 2019). Therefore, we propose a ZINB model based decoder network to explore the global probabilistic structure of scRNA-seq data. Mathematically, ZINB is defined with the mean ($\mu_x$) and dispersion ($\theta$) parameters of the negative binomial (NB) distribution and a coefficient ($\pi$) that describes the probability of dropout events:

$$\mathrm{NB}(x; \mu_x, \theta) = \frac{\Gamma(x + \theta)}{x!\,\Gamma(\theta)} \left(\frac{\theta}{\theta + \mu_x}\right)^{\theta} \left(\frac{\mu_x}{\theta + \mu_x}\right)^{x},$$

$$\mathrm{ZINB}(x; \pi, \mu_x, \theta) = \pi\,\delta_0(x) + (1 - \pi)\,\mathrm{NB}(x; \mu_x, \theta),$$

where $x$ is a vector from the original scRNA-seq data and $\delta_0$ is the point mass at zero.
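For a single count, the ZINB negative log-likelihood can be sketched with the standard library alone. The parameterization below (mean `mu`, dispersion `theta`, dropout probability `pi`) is the standard ZINB form; the function names are illustrative and are not taken from the scMCs code.

```python
import math

def nb_log_pmf(x, mu, theta):
    """log NB(x; mu, theta) with mean mu and dispersion theta (both > 0)."""
    return (math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
            + theta * math.log(theta / (theta + mu))
            + x * math.log(mu / (theta + mu)))

def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of one count x under ZINB(pi, mu, theta), 0 <= pi < 1."""
    if x == 0:
        # A zero is either a dropout (probability pi) or a genuine NB zero.
        return -math.log(pi + (1.0 - pi) * math.exp(nb_log_pmf(0, mu, theta)))
    # A positive count can only come from the NB component.
    return -(math.log(1.0 - pi) + nb_log_pmf(x, mu, theta))

nll_zero = zinb_nll(0, mu=2.0, theta=1.5, pi=0.3)
nll_three = zinb_nll(3, mu=2.0, theta=1.5, pi=0.3)
```

Summing `zinb_nll` over all entries of the count matrix, with per-entry parameters, gives a loss of the kind a ZINB-based decoder minimizes.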
In detail, the ZINB-based decoder estimates the parameters $\{\pi, \mu_x, \theta\}$ from $Z_I$ through three different fully connected layers:

$$\Pi = \mathrm{sigmoid}(W_{\pi} f_{D_X}(Z_I)), \quad M_X = \exp(W_{\mu_x} f_{D_X}(Z_I)), \quad \Theta = \exp(W_{\theta} f_{D_X}(Z_I)),$$

where $\{\Pi, M_X, \Theta\}$ is the matrix form of $\{\pi, \mu_x, \theta\}$; $f_{D_X}$ is a decoder with a fully connected layer; and $W_{\pi}$, $W_{\mu_x}$, and $W_{\theta}$ are three learnable parameter matrices. The activation function of $\Pi$ is $\mathrm{sigmoid}(\cdot)$ because the dropout probability lies between 0 and 1. In addition, since the mean and dispersion parameters are non-negative, the exponential function $\exp(\cdot)$ is selected as the activation function for $M_X$ and $\Theta$. Unlike the traditional autoencoder based on the mean squared error loss, the loss function of the ZINB-based decoder network is the negative log of the ZINB likelihood, $-\log \mathrm{ZINB}(X; \Pi, M_X, \Theta)$. Considering the extremely sparse and nearly binary nature of scATAC data, we use a Bernoulli distribution (Ber)-based decoder network to model scATAC data:

$$\mathrm{Ber}(y; \mu_y) = \mu_y^{\,y} (1 - \mu_y)^{1 - y},$$

where $y$ is a vector from the original scATAC data and $\mu_y$ is the mean parameter of Ber. The Bernoulli-based decoder estimates $\mu_y$ from $Z_I$ through a fully connected layer with $\mathrm{sigmoid}(\cdot)$ as the activation function, where $M_Y$ is the matrix form of $\mu_y$ and $W_{\mu_y}$ is the weight parameter matrix. Finally, the Bernoulli-based decoder is optimized with the cross-entropy loss $L_{Ber}$. To pursue a more discriminative and informative co-embedding representation that incorporates the individuality and commonality of multi-omics data, we unify the objectives of imputing the scRNA-seq data and scATAC data, predicting the omics labels, and cross-omics contrastive learning into Equation (17), where $\Phi_1$ denotes the network parameters, and $\alpha_1$, $\alpha_2$, and $\alpha_3$ are three scalar parameters that weight $L_{Ber}$, $L_{dis}$, and $L_{cl}$, respectively. By optimizing Equation (17), the individual and shared feature representations can be learned from multi-omics data and merged into an informative co-embedding representation for clustering and multiple clustering.
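The choice of output activations can be made concrete with a short NumPy sketch: sigmoid for quantities that must lie in (0, 1), exp for quantities that must be positive, and a binary cross-entropy for the Bernoulli branch. The hidden features `h`, the layer sizes, the random weights, and the use of a mean rather than a sum in the cross-entropy are all illustrative assumptions.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def zinb_heads(h, W_pi, W_mu, W_theta):
    """Map decoder features h (N x k) to ZINB parameter matrices (N x D_X)."""
    Pi = sigmoid(h @ W_pi)        # dropout probabilities in (0, 1)
    M = np.exp(h @ W_mu)          # means, strictly positive
    Theta = np.exp(h @ W_theta)   # dispersions, strictly positive
    return Pi, M, Theta

def bernoulli_ce(Y, M_Y, eps=1e-8):
    """Mean binary cross-entropy between binary data Y and Bernoulli means M_Y."""
    M_Y = np.clip(M_Y, eps, 1.0 - eps)   # avoid log(0)
    return -np.mean(Y * np.log(M_Y) + (1.0 - Y) * np.log(1.0 - M_Y))

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 8))                               # stand-in decoder features
W_pi, W_mu, W_theta = (rng.normal(scale=0.1, size=(8, 6)) for _ in range(3))
Pi, M, Theta = zinb_heads(h, W_pi, W_mu, W_theta)

Y = rng.integers(0, 2, size=(4, 10)).astype(float)        # toy binary accessibility
M_Y = sigmoid(h @ rng.normal(scale=0.1, size=(8, 10)))    # Bernoulli means
loss_ber = bernoulli_ce(Y, M_Y)
```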

Multiple clusterings mining module
Contemporary single-cell multi-omics analysis methods mainly aim to integrate cross-omics shared features to find an optimal cell division pattern, which ignores other potential important patterns. Due to the multiplicity of multi-omics data, different cell clustering patterns, such as cell type clustering or cell state clustering, can coexist. Unlike traditional multi-view clustering methods that can only discover a single clustering, multi-view multiple clustering can incorporate the omics consistent and specific features and simultaneously generate multiple meaningful and non-redundant clusterings, which help us to divide cells from different perspectives and explain the cell heterogeneity. Different from subspace clustering that finds one clustering with clusters spanned in different subspaces, multiple clustering explores alternative clusterings in different subspaces. To more comprehensively mine single-cell multi-omics data, scMCs introduces another module (as illustrated in Fig. 1b), and proposes to sufficiently utilize the omics individuality and commonality to explore alternative clusterings embedded in the multi-omics data.
A naive idea to generate multiple clusterings is to define multiple embedding subspaces based on the original or imputed data. However, the resulting embeddings and clusterings may largely overlap, due to the high noise and sparsity of single-cell data. Here, scMCs uses $Z_I$ to generate different salient subspaces because it is compact and rich in informative features. Specifically, it applies multi-head attention on $Z_I$ to generate $L$ salient heads $\{O_l\}_{l=1}^{L}$, which capture different perspectives of $Z_I$ and thus define $L$ salient subspaces. The $l$-th head $O_l \in \mathbb{R}^{N \times m}$ is calculated as:

$$O_l = \mathrm{Softmax}\left(\frac{Q_l K_l^{T}}{\sqrt{m}}\right) V_l, \qquad (18)$$

where $\{Q_l, K_l, V_l\}$ are the linear transformations of $Z_I$ with respect to different parameters $\{W_l^{Q}, W_l^{K}, W_l^{V}\}$, and $m$ is the dimension of each head. It is worth noting that projecting $Z_I$ with different parameters can, in theory, control the difference between heads, and thus help to generate diverse subspaces and clusterings.
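A minimal NumPy sketch of generating heads from a co-embedding follows. It assumes the scaled dot-product attention of Vaswani et al. (2017) applied per head; the random projection matrices stand in for learned parameters, and `salient_heads` is a hypothetical name.

```python
import numpy as np

def softmax_rows(S):
    S = S - S.max(axis=1, keepdims=True)
    E = np.exp(S)
    return E / E.sum(axis=1, keepdims=True)

def salient_heads(Z_I, params):
    """Return one N x m head per (W_Q, W_K, W_V) triple."""
    heads = []
    for W_Q, W_K, W_V in params:
        Q, K, V = Z_I @ W_Q, Z_I @ W_K, Z_I @ W_V
        m = Q.shape[1]
        A = softmax_rows(Q @ K.T / np.sqrt(m))   # N x N attention weights
        heads.append(A @ V)                      # N x m subspace representation
    return heads

rng = np.random.default_rng(0)
N, d, m, L = 6, 8, 4, 2
Z_I = rng.normal(size=(N, d))
params = [tuple(rng.normal(size=(d, m)) for _ in range(3)) for _ in range(L)]
heads = salient_heads(Z_I, params)
Z_cat = np.concatenate(heads, axis=1)   # N x (L * m) concatenation of all heads
```

Different projection triples give different heads from the same input, which is what allows each head to emphasize a different view of the co-embedding.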
To ensure the consistency between the subspace features and $Z_I$, we concatenate all the heads as $Z = \mathrm{concat}(O_1, \ldots, O_L)$ and decode $Z$ toward $Z_I$ with a reconstruction loss $L_{rec}$. One key concern of multiple clusterings is how to reduce the redundancy between clusterings. Even with different linear transformation parameters, the multi-head attention may still produce redundant subspaces. Here, we leverage the Hilbert-Schmidt Independence Criterion (HSIC) (Gretton et al. 2005) to quantify the dependency between heads, which also approximately measures the redundancy between subspaces and clusterings. HSIC quantifies the dependency between two heads $O_l$ and $O_{l'}$ based on the norm of the cross-covariance operator, and it measures both the linear and non-linear dependency between representations. The larger the HSIC value, the stronger the dependency. The empirical HSIC is computed as:

$$\mathrm{HSIC}(O_l, O_{l'}) = (m - 1)^{-2}\, \mathrm{Tr}(U_l H U_{l'} H), \qquad (20)$$

where $\mathrm{Tr}(\cdot)$ is the trace of a matrix, $U_l = O_l^{T} O_l$ is the Gram matrix, and $H = I_m - \frac{1}{m}\mathbf{1}\mathbf{1}^{T}$ centers the Gram matrices to have zero mean. The dependency among the $L$ heads is computed as:

$$\sum_{l=1}^{L} \mathrm{Tr}(U_l \tilde{U}_l), \qquad (21)$$

where $\tilde{U}_l = (m - 1)^{-2} \sum_{l'=1, l' \neq l}^{L} H U_{l'} H$. Minimizing Equation (21) penalizes the dependency among the $L$ heads, and reduces the redundancy between different subspaces and the clusterings therein. Another concern of multiple clusterings is how to maintain the quality of each clustering, that is, the compactness within clusters and the separation between clusters. Here, we propose to learn $L$ sets of cluster centers $\{\Omega_l\}_{l=1}^{L}$ in the $L$ subspaces $\{O_l\}_{l=1}^{L}$, where $\Omega_l = \{\omega_l^{1}, \omega_l^{2}, \ldots, \omega_l^{J_l}\}$ indicates that $O_l$ has $J_l$ cluster centers. To optimize the cluster centers in each subspace, we utilize a KL divergence loss to enhance the association between similar cells.
Specifically, we measure the pairwise similarity between the sample point $o_l^{i}$ and the centroid $\omega_l^{j}$ in $O_l$ with a t-distribution kernel, which gives $p_{ij} \in P_l$, the probability of assigning sample $i$ ($1 \le i \le N$) to cluster $j$ ($1 \le j \le J_l$). Equation (22) uses a t-distribution constraint to optimize the distance between samples and cluster centers, which generates larger gradients for dissimilar samples and prevents them from being clustered together.
To further optimize the cluster centers and strengthen the affinity between similar samples, we introduce an auxiliary target distribution $R_l$, with elements $r_{ij}$, to refine the clusters in each clustering by learning from their high-confidence assignments (Xie et al. 2016). In theory, $R_l$ improves the compactness between similar samples while paying less attention to dissimilar ones. In addition, it balances the contribution of each cluster center through normalization, and avoids the clustering distortion caused by a larger cluster.
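The soft assignment, the target distribution, and the KL divergence between them can be sketched as follows. The formulas are those of Xie et al. (2016), with a Student's t kernel with one degree of freedom; that the exact form used here matches them is an assumption, and the function names are illustrative.

```python
import numpy as np

def soft_assign(O, centers):
    """Student's t similarity (one degree of freedom) between rows of O and centers."""
    d2 = ((O[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)   # N x J squared distances
    q = 1.0 / (1.0 + d2)
    return q / q.sum(axis=1, keepdims=True)       # each row is a distribution over clusters

def target_distribution(P):
    """Square the assignments and divide by the soft cluster sizes, then renormalize rows."""
    w = P ** 2 / P.sum(axis=0, keepdims=True)
    return w / w.sum(axis=1, keepdims=True)

def kl_clustering_loss(R, P, eps=1e-12):
    """KL(R || P) summed over all samples and clusters."""
    return np.sum(R * (np.log(R + eps) - np.log(P + eps)))

O = np.array([[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [3.1, 2.8]])   # toy subspace, 4 cells
centers = np.array([[0.0, 0.0], [3.0, 3.0]])                      # 2 cluster centers
P = soft_assign(O, centers)
R = target_distribution(P)
labels = R.argmax(axis=1)            # hard labels from the target distribution
loss = kl_clustering_loss(R, P)
```

Squaring the assignments pushes each row of the target toward its most confident cluster, and dividing by the soft cluster size keeps a large cluster from dominating.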
Based on these two similarity distributions, we define the clustering loss $L_{clu}$ over the $L$ heads as the KL divergence between the target distributions $R_l$ and the assignments $P_l$. To generate multiple diverse subspaces from $Z_I$ and explore high-quality clusterings therein, we unify the reconstruction loss, the redundancy between subspaces, and the clustering loss into one objective, Equation (25), where $\Phi_2$ denotes the network parameters, and $\beta_1$ and $\beta_2$ are two scalar parameters that balance the diversity and quality. By optimizing Equation (25), we can find multiple salient subspaces from the co-embedding representation $Z_I$, and also generate multiple high-quality clusterings therein in an end-to-end manner. When updating the $l$-th clustering $C_l$, the label of the $i$-th sample is assigned as $c_i = \arg\max_j r_{ij}$, so that $C_l = \{c_i\}_{i=1}^{N}$. If we fix $L = 1$, the redundancy control term in Equation (25) is dropped; we then learn an embedded representation $O_1$ of the multiple omics and discover a single clustering therein.
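The redundancy control term can be sketched numerically as a sum of pairwise empirical HSIC values between heads. The sketch follows the definitions given in the text, with $m \times m$ Gram matrices $O_l^{T} O_l$ and an $m \times m$ centering matrix; `hsic` and `redundancy` are hypothetical names, and the random heads are toy data.

```python
import numpy as np

def hsic(O_a, O_b):
    """Empirical HSIC between two N x m heads using m x m linear Gram matrices."""
    m = O_a.shape[1]
    H = np.eye(m) - np.ones((m, m)) / m      # centering matrix
    U_a, U_b = O_a.T @ O_a, O_b.T @ O_b      # Gram matrices as defined in the text
    return np.trace(U_a @ H @ U_b @ H) / (m - 1) ** 2

def redundancy(heads):
    """Sum of HSIC over all ordered pairs of distinct heads."""
    return sum(hsic(a, b)
               for i, a in enumerate(heads)
               for j, b in enumerate(heads) if i != j)

rng = np.random.default_rng(0)
head_1, head_2 = rng.normal(size=(6, 4)), rng.normal(size=(6, 4))
penalty_two_heads = redundancy([head_1, head_2])
penalty_one_head = redundancy([head_1])      # no pairs, so the term vanishes
```

With a single head there are no pairs to compare, so the penalty is zero, which matches the $L = 1$ case above. Note that Gretton et al. (2005) state the estimator with $N \times N$ kernel matrices over samples; the $m \times m$ form here simply mirrors the definitions in this section.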

Experiment setup
Datasets: scMCs is a flexible framework that can integrate different single-cell omics data. In the experiments, we mainly evaluate the performance of scMCs by jointly modeling scRNA-seq data and scATAC data. We collect four preprocessed single-cell multi-omics datasets with paired profiles from a previous study: (i) CellMix (D1, GSE126074) with 1047 cells is downloaded from GEO; the chromatin accessibility and gene expression of each single cell are co-assayed using SNARE-seq; (ii) PBMC_3K (D2) with 3012 cells is downloaded from 10X Genomics; (iii) Mouse_skin (D3, GSE140203), downloaded from GEO, contains 34 774 cells and is derived from adult mouse skin by SHARE-seq; (iv) AdBrain (D4, GSE126074) with 10 309 cells is downloaded from GEO; the chromatin accessibility and gene expression of each single cell are derived from the adult mouse cerebral cortex. We use the Signac package (Stuart et al. 2021) to preprocess the AdBrain dataset, and retain the top 5000 highly variable genes of the scRNA-seq data and 52 818 peaks of the scATAC data.
Evaluation protocols: For 'single clustering', k-means is applied to cluster the cells based on the learned low-dimensional co-embedding representation $Z_I$. Then, we use Normalized Mutual Information (NMI) and Adjusted Rand Index (ARI) to evaluate the clustering performance. NMI ranges over [0, 1], while ARI is at most 1 and can fall below 0 when the agreement is worse than expected by chance; for both, a higher value indicates a better clustering performance. For 'multiple clusterings', we use the NMI and Jaccard Index (JI) to measure the overlap between different clusterings, and the Silhouette Coefficient (SC) and Dunn Index (DI) to evaluate the quality of each clustering.
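For reference, NMI and ARI between two labelings can be computed from their contingency counts with the standard library. The sketch below normalizes the mutual information by the geometric mean of the two entropies, which is one of several conventions in use and an assumption here; both functions divide by zero if a labeling puts every cell in one cluster.

```python
from collections import Counter
from math import comb, log, sqrt

def adjusted_rand_index(a, b):
    """ARI from pair counts: 1 for identical partitions, about 0 for chance agreement."""
    n = len(a)
    together = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = pairs_a * pairs_b / comb(n, 2)
    maximum = (pairs_a + pairs_b) / 2
    return (together - expected) / (maximum - expected)

def normalized_mutual_info(a, b):
    """NMI with geometric-mean normalization of the two label entropies."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum(c / n * log(c * n / (ca[x] * cb[y])) for (x, y), c in cab.items())
    h_a = -sum(c / n * log(c / n) for c in ca.values())
    h_b = -sum(c / n * log(c / n) for c in cb.values())
    return mi / sqrt(h_a * h_b)

truth = [0, 0, 1, 1]
relabeled = [1, 1, 0, 0]    # same partition, different label names
crossed = [0, 1, 0, 1]      # splits every true cluster in half
ari_same = adjusted_rand_index(truth, relabeled)
ari_crossed = adjusted_rand_index(truth, crossed)
nmi_crossed = normalized_mutual_info(truth, crossed)
```

Both scores ignore label names, so a relabeled copy of the ground truth scores as a perfect match, while the crossed labeling gives a negative ARI.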
Comparing baselines: We implement scMCs with the MindSpore deep learning framework and compare it against four competitive single-cell multi-omics data fusion methods. (i) JSNMF (Ma et al. 2022) decomposes different omics data into different latent spaces, and learns the consistent information of multi-omics data through a consensus graph; (ii) UnionCom (Cao et al. 2020) projects multi-omics data into a common embedding space, and matches the complex non-linear features by a global scaling parameter to cluster the cells; (iii) scMVAE provides three strategies, scMVAE-PoE, scMVAE-NN, and scMVAE-Direct, to learn the joint latent features for data fusion and clustering. scMVAE-Direct concatenates the raw features of each omics, scMVAE-NN combines the low-dimensional features extracted from different omics, while scMVAE-PoE uses the product-of-experts framework to estimate a joint posterior distribution; and (iv) DCCA projects different omics into their corresponding low-dimensional spaces, and uses a 'teacher-student' mechanism to fuse multi-omics data. The experimental configurations of these compared methods are given in Supplementary Table S1.

Cell clustering and visualization
Table 1 summarizes the clustering performance of scMCs and the baselines on the four datasets. Each method is run five times to report the average and variance, and bold font indicates the best result. UnionCom is too time-consuming on large datasets, so its results on Mouse_skin are not reported. scMCs performs well on the four datasets in terms of NMI and ARI, and its clustering results are statistically better than those of the other methods in most cases. Other important observations are as follows:
i. scMCs versus JSNMF: JSNMF focuses more on the linear and shared features, but overlooks the individual features of each omics. In addition, it neglects the impact of dropout events. Thus, it has a poor clustering performance in most cases. In contrast, scMCs learns the omics individuality and commonality to jointly optimize the co-embedding and data imputation for better cell clustering.
ii. scMCs versus UnionCom: UnionCom fails to consider the influence of individual manifold features on clustering and cannot effectively handle dropout events, so it loses to scMCs in most cases. Furthermore, the huge time overhead of learning the manifold topology also limits its application to high-dimensional data.
iii. scMCs versus scMVAE: There is a clear margin between scMCs and scMVAE-PoE, scMVAE-Direct, and scMVAE-NN, which demonstrates the advantages of scMCs. scMVAE-Direct has the worst performance, because concatenating the high-dimensional features can significantly increase the sparsity and complexity of the data representation. scMVAE-NN performs better than scMVAE-Direct, because it explores a common representation in a more compact feature space. scMVAE-PoE learns a consistent probability distribution of multi-omics data with fewer model parameters from a global perspective, and it gives better results than scMVAE-Direct and scMVAE-NN. However, scMVAE disregards the individuality of multi-omics data in data fusion and cell clustering. In contrast, scMCs treats not only the shared features but also the individual features as key factors for a consensus cell clustering.
iv. scMCs versus DCCA: Although DCCA utilizes different neural networks to project multi-omics data into different representation spaces, it loses to scMCs by a clear margin. This is because DCCA mainly focuses on the individual features of different omics data, and neglects the shared features of these omics for a consistent clustering. In contrast, scMCs simultaneously extracts the shared and individual features from different omics, and fuses them into a co-embedding space, which can encode the cellular heterogeneity and find a more accurate clustering.
In addition, to illustrate the quality of $Z_I$, we apply uniform manifold approximation and projection (UMAP) (Becht et al. 2019) to visualize the cell clusters of scMCs and the baselines on each benchmark dataset. As shown in Supplementary Figs S1-S4, scMCs has the clearest division boundaries and the lowest misclassification rate. These results also explain why scMCs achieves a better clustering performance.

Evaluation of data imputation
Besides accurate cell clustering, scMCs also performs data imputation based on $Z_I$ using two independent deep generative decoder networks. To evaluate the quality of the imputed scRNA-seq data and scATAC data, we visualize the raw data and the imputed data generated by scMCs, scMVAE-PoE, scMVAE-Direct, scMVAE-NN, and DCCA. Specifically, we project the raw data and imputed data into different 2D spaces via UMAP, and explore the cell clusterings therein. Meanwhile, we also use NMI and ARI to evaluate the clustering given by each method.
Supplementary Figs S5-S12 report the visualization and clustering performance of each method on raw and imputed CellMix, PBMC_3K, Mouse_skin, and AdBrain, respectively. The NMI and ARI scores of scMCs are significantly higher than those of the baselines. The visualization results also confirm that the cell clustering found by scMCs is better separated between clusters and more compact within clusters. Together, these results confirm that scMCs generates an informative embedding representation $Z_I$, which can be used for data imputation.
In addition, to assess whether scMCs helps to discover important biological signals, we use Signac to process the raw multi-omics data as well as the imputed data. Taking AdBrain as an example, we report the results in Supplementary Fig. S13. Concretely, we normalize the raw scRNA-seq data and scATAC data and project the normalized data into a 2D space via UMAP. Then, we annotate cell types and provide the results in Supplementary Fig. S13a, where the top shows the clustering results on raw AdBrain and the bottom shows the results on the imputed data. We observe that the clusters obtained from the imputed data are more compact, and the boundaries between clusters are clearer. To study differences in gene activity across clusters, we create a gene activity matrix based on the imputed scATAC data. Taking 'L2/3 IT', 'L6 IT', 'L5 CT', and 'L4' as examples, we use the FindAllMarkers() function to determine the differentially expressed genes of each cell cluster, and report the results in Supplementary Fig. S13b. We can accurately identify the marker genes of different cell types using the imputed scATAC data, which indicates that scMCs can find associations between genes and peaks by imputing the missing values in scATAC data. Moreover, we uncover the differentially accessible peaks between clusters using the imputed scATAC data, and report the results on four clusters in Supplementary Fig. S13c. The peaks differ significantly among clusters, which indicates cell-type-specific accessibility in heterogeneous cell types. Overall, these results show that scMCs can achieve effective imputation of single-cell multi-omics data, and reveal significant relationships between cells and genes as well as the biological correlation between cell types and peak accessibility.

Evaluation of multiple clusterings
Existing single-cell data clustering methods can find only one clustering pattern, that of cell types. However, with the increased multiplicity of single-cell data, there exist alternative and meaningful clusterings, which can uncover new patterns of cells in a more comprehensive way.
As shown in Fig. 1b, scMCs projects the co-embedding representation Z_I into different salient subspaces and finds a different clustering in each. The number of clusterings, and the number of clusters in each clustering, can be specified based on the dataset or on the user's expectation. If the dataset has reference labels, users can refer to these labels to specify both numbers. Otherwise, users can specify the expected number of alternative clusterings, adopt widely used stable clustering techniques (Wang et al. 2021) to determine the number of clusters in each clustering, and then visualize these clusterings or use internal evaluation metrics (e.g. SC) to settle the number of alternative clusterings and the clusters therein in an explorative data mining way.
In the experiments, we project Z_I into two subspaces {O_1, O_2} and generate two clusterings {C_1, C_2}. We then use SC and DI to measure the overall quality of {C_1, C_2}, and further compare {C_1, C_2} against the distinct ground truth C_t of CellMix, PBMC_3K, and AdBrain. Table 2 lists the average clustering results over five independent runs of scMCs. In addition, we evaluate the diversity between C_1 and C_2 using NMI and JI. Supplementary Fig. S14 reports the diversity (1-NMI, 1-JI) of scMCs on CellMix, PBMC_3K, and AdBrain. Concretely, NMI and JI measure the similarity between the two generated clusterings, so a larger 1-NMI or 1-JI means that the clusterings overlap less. Several observations can be made from these results:
i. From Table 2, C_1 has a high similarity with the ground truth C_t, while the smaller NMI and JI values indicate that C_2 is not similar to C_t. In addition, the high SC and DI values suggest that C_2 is a potential alternative clustering of high quality.
ii. The results in Supplementary Fig. S14 show that there is rather low redundancy between C_1 and C_2. This indicates that scMCs can find not only the significant cell type clustering in the co-embedding representation Z_I, but also another potential alternative clustering.
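The diversity scores 1-NMI and 1-JI used above can be illustrated with a minimal sketch. It is not the scMCs implementation: it assumes that NMI is normalized by the arithmetic mean of the two entropies and that JI is the pair-counting Jaccard index, neither of which is stated in this section, and the function names and toy labels are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def nmi(labels_a, labels_b):
    """Normalized mutual information (arithmetic-mean normalization, an assumption)."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    # Entropy of each clustering.
    h_a = -sum(c / n * log(c / n) for c in count_a.values())
    h_b = -sum(c / n * log(c / n) for c in count_b.values())
    # Mutual information from the joint label counts.
    mi = sum(
        c / n * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in count_ab.items()
    )
    denom = (h_a + h_b) / 2
    return mi / denom if denom > 0 else 1.0


def jaccard_index(labels_a, labels_b):
    """Pair-counting Jaccard index: co-clustered in both / co-clustered in either."""
    both = either = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        both += same_a and same_b
        either += same_a or same_b
    return both / either if either else 1.0


def diversity(labels_a, labels_b):
    """Return (1 - NMI, 1 - JI); larger values mean less overlapping clusterings."""
    return 1 - nmi(labels_a, labels_b), 1 - jaccard_index(labels_a, labels_b)


# Toy labels (made up): c2 splits every cluster of c1 evenly, so the two
# clusterings share no information and no co-clustered pair.
c1 = [0, 0, 1, 1, 2, 2, 3, 3]
c2 = [0, 1, 0, 1, 0, 1, 0, 1]
print(diversity(c1, c2))
```

Both scores are invariant to how cluster labels are named, so a clustering compared with a relabeled copy of itself has zero diversity under this sketch.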
To verify the biological significance of C_1 and C_2, we conduct a series of downstream analyses. Taking CellMix as an example, the relevant results are shown in Supplementary Figs S15-S17. First, we perform cell clustering and annotation on CellMix based on the ground truth C_t. As shown in Supplementary Fig. S15a, CellMix is divided into four cell clusters. To determine the identity of each cell cluster, we identify the marker genes of each cluster using the FindAllMarkers() function and report four differentially expressed genes in Supplementary Fig. S15b and c. According to the Cell Taxonomy database (Jiang et al. 2023), these four genes mark four different cell lines: H1, BJ, K562, and GM12878. In addition, Supplementary Fig. S16 provides the results for O_1 based on C_1. Cells in O_1 can also be grouped into four clusters, and from their marker genes we annotate these clusters as H1, BJ, K562, and GM12878, respectively. These results further indicate that a cell type clustering is embedded in O_1, which is consistent with the results in Table 2.
scMCs finds not only a clustering in accordance with the known C_t, but also an alternative clustering C_2 embedded in O_2, which reveals the tissue specificity of the cells from a new perspective. Concretely, Supplementary Fig. S17a and b shows that cells in O_2 can be divided into two clusters, where the marker genes of cluster 0 are UCHL1 and CALD1, and the marker genes of cluster 1 are TXNIP and DDIT3. Moreover, Supplementary Fig. S17c shows that different genes are differentially expressed in each cluster. According to the Cell Taxonomy database (Jiang et al. 2023) and the Human Protein Atlas (Uhlen et al. 2010), the expression of UCHL1 and CALD1 enhances the tissue specificity of cells, while the expression of TXNIP and DDIT3 decreases it. Therefore, as shown in Supplementary Fig. S17d, cluster 0 can be defined as cells with 'high tissue specificity', and cluster 1 as cells with 'low tissue specificity'. This observation suggests that scMCs can mine single-cell multi-omics data more comprehensively by giving different clusterings.
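The marker gene step above can be sketched in simplified form. The sketch below is not the FindAllMarkers() function and omits any statistical test; it only ranks genes per cluster by a one-vs-rest log2 fold change of mean expression. The function name, the pseudocount, the gene names, and the expression values are all hypothetical.

```python
from math import log2


def top_markers(expr, labels, genes, top_n=2, pseudocount=1.0):
    """Rank genes per cluster by one-vs-rest log2 fold change of mean expression.

    expr   : list of cells, each a list of expression values (one per gene)
    labels : cluster label of each cell (at least two distinct clusters)
    genes  : gene names, in the column order of expr
    """
    markers = {}
    for cluster in sorted(set(labels)):
        inside = [row for row, lab in zip(expr, labels) if lab == cluster]
        outside = [row for row, lab in zip(expr, labels) if lab != cluster]
        scores = []
        for g, gene in enumerate(genes):
            mean_in = sum(row[g] for row in inside) / len(inside)
            mean_out = sum(row[g] for row in outside) / len(outside)
            # The pseudocount avoids division by zero for unexpressed genes.
            fold_change = log2((mean_in + pseudocount) / (mean_out + pseudocount))
            scores.append((fold_change, gene))
        scores.sort(reverse=True)
        markers[cluster] = [gene for _, gene in scores[:top_n]]
    return markers


# Toy expression matrix (made up): four cells, four hypothetical genes.
genes = ["geneA", "geneB", "geneC", "geneD"]
expr = [
    [9, 8, 1, 0],
    [8, 9, 0, 1],
    [1, 0, 9, 8],
    [0, 1, 8, 9],
]
labels = [0, 0, 1, 1]
markers = top_markers(expr, labels, genes)
print(markers)
```

In practice, a rank-based test and multiple-testing correction would be applied before a gene is reported as a marker; the fold-change ranking here only conveys the one-vs-rest idea.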

Ablation study and parameter sensitivity analysis
To study the contribution of each component of scMCs, we introduce four variants: w/oAtt, w/oDiscriminator, w/oCL, and w/oZB, which respectively remove the attention layer, the omics-label discriminator, contrastive learning, and the ZINB and Bernoulli losses. Supplementary Fig. S18 reports the average NMI and ARI values of scMCs and its variants. scMCs outperforms its variants by a clear margin, which confirms that the attention layer, the omics-label discriminator, the contrastive learning mechanism, and the generative decoder all contribute to the quality of cell clustering. More analyses are given in Supplementary Section S4. Taking CellMix as an example, we also conduct experiments to evaluate the parameter sensitivity of scMCs. The details are reported in Supplementary Figs S19-S21 in Supplementary Section S5. In general, scMCs achieves good clustering performance without much effort spent on parameter tuning.

Conclusion
In this article, we propose scMCs for single-cell multi-omics data fusion, cell clustering, and multiple clusterings. scMCs extracts the individual and shared features of multi-omics data and fuses them into an informative co-embedding representation for clustering and imputation. Moreover, scMCs mines multi-omics data comprehensively by projecting the co-embedding representation into different salient subspaces to generate different and meaningful alternative clusterings. Experimental results show that scMCs achieves superior or competitive performance in cell clustering and data imputation. More importantly, scMCs finds multiple clustering structures that are both diverse and of high quality, which provide new insights into the diverse roles of cells from different perspectives. Coupling data fusion and multiple clustering mining into a unified method, and simplifying scMCs to use fewer parameters (ideally none), are two directions for future work on multiple clusterings of single-cell data.

Supplementary data
Supplementary data is available at Bioinformatics online.