Probabilistic cell/domain-type assignment of spatial transcriptomics data with SpatialAnno

Abstract In the analysis of both single-cell RNA sequencing (scRNA-seq) and spatially resolved transcriptomics (SRT) data, classifying cells/spots into cell/domain types is an essential analytic step for many secondary analyses. Most of the existing annotation methods have been developed for scRNA-seq datasets without any consideration of spatial information. Here, we present SpatialAnno, an efficient and accurate annotation method for spatial transcriptomics datasets, with the capability to effectively leverage a large number of non-marker genes as well as ‘qualitative’ information about marker genes without using a reference dataset. Uniquely, SpatialAnno estimates low-dimensional embeddings for a large number of non-marker genes via a factor model while promoting spatial smoothness among neighboring spots via a Potts model. Using both simulated and four real spatial transcriptomics datasets from the 10x Visium, ST, Slide-seqV1/2, and seqFISH platforms, we showcase the method’s improved spatial annotation accuracy, including its robustness to the inclusion of marker genes for irrelevant cell/domain types and to various degrees of marker gene misspecification. SpatialAnno is computationally scalable and applicable to SRT datasets from different platforms. Furthermore, the estimated embeddings for cellular biological effects facilitate many downstream analyses.


Introduction
With the rapid advancement of spatially resolved transcriptomics ( SRT ) technologies, it has become feasible to comprehensively characterize the gene expression profiles of tis-sues while retaining information on their physical locations.Among the already developed SRT methods, in situ hybridization ( ISH ) technologies, such as MERFISH ( 1 ) and seqFISH ( 2 ) , provide single-molecule resolution for targeted genes but require prior knowledge of the genes of interest.To enable single-cell analysis, cell segmentation must be performed to assign transcripts to individual cells.Alternatively, in situ capturing technologies, such as 10x Visium, Slide-seqV1 / 2 ( 3 ) and Stereo-seq ( 4 ) , are unbiased and provide transcriptomewide expression measurements.Among the in situ capturing technologies, there has been a dramatic improvement in spatial resolution, with spot sizes ranging from 55 μm in 10x Visium, 10 μm in Slide-seqV2, to < 1 μm in Stereo-seq.These SRT technologies provide an opportunity to study how the spatial organization of gene expression in tissues relates to tissue functions ( 5 ) .To characterize the transcriptomic landscape within a spatial context, assigning cell / domain types in relation to tissue location is an essential analytic step that provides comprehensive spatially resolved maps of tissue heterogeneity ( 6 ) .
Conventionally, spatial annotation relies on the manual assignment of cell / domain clusters using known marker genes that are readily available from existing studies or databases ( 7 ,8 ) .A general workflow begins with the unsupervised clustering of spots based on their transcriptomic profiles; this is followed by an examination of the differentially expressed genes ( DEGs ) specific to each cluster; and finally, the DEGs are manually matched with known marker genes to assign cell / domain types to spatial spots.This type of workflow requires sufficient knowledge of the biology and markers of the cell / domain types, but it can be time-consuming, laborintensive and less reproducible ( 6 ,9 ) .Moreover, these workflows are sensitive to the choice of clustering methods, presenting challenges in the downstream interpretations ( 10 ) .An improved strategy for spatial annotation is to automatically annotate the identified clusters using either reference data or leveraging existing information on the cell / domain types.Performing annotations with reference data has been shown to be successful in the context of single-cell RNA sequencing ( scRNA-seq ) analysis.For example, scmap performs cell annotation by projecting existing reference data with known cell types onto cells in the study data ( 11 ) .However, the success of this type of analysis relies on the availability of reference data that are 'similar' to the study data.On the other hand, the availability of data on cell-type-specific maker genes from existing studies or databases, potentially obtained using either low-throughput or high-throughput systems, further necessitates the efficient utilization of marker-gene information in a 'qualitative' manner.To this end, a number of methods have been developed for scRNA-seq data without any consideration of spatial information, including SCINA ( 12 ) , Garnett ( 13 ) , CellAssign ( 14 ) and scSorter ( 15 ) .While SCINA and Cel-lAssign use only the expression of marker genes, scSorter and Garnett can utilize information from non-marker genes.Although these methods can be applied to SRT data, they do not consider the invaluable spatial localization information among spots.
To efficiently utilize the existing knowledge based on marker genes for cell / domain types, an ideal annotation method for SRT datasets should be capable of leveraging this 'qualitative' information on marker genes with data on nonmarker genes while incorporating spatial information to promote spatial smoothness in the cell / domain-type annotation.Because the proportion of non-marker genes is much larger than that of marker genes, non-marker genes also harbor substantial amounts of biological information that can be used to separate cell / domain types.Annotation methods capable of leveraging marker with non-marker genes can improve our ability to detect spatial cell / domain clusters ( 14 ,15 ) .However, the high-dimensional nature of non-marker genes makes the annotation task more challenging and, moreover, requires proper and efficient modeling of this information.Furthermore, for SRT datasets, especially those from tissue sections with laminar structures, for example, brain regions, a desirable spatial annotation method would additionally be able to leverage spatial information.
To address the challenges presented by spatial annotation, we propose the use of a probabilistic model, SpatialAnno, which performs cell / domain-type assignments for SRT data and has the capability of leveraging non-marker genes to assign cell / domain types via a factor model while accounting for spatial information via a Potts model ( 16 ,17 ) .To effectively leverage a large number of non-marker genes and overcome the curse of dimensionality, SpatialAnno uniquely models expression levels in a factor model governed by separable cell / domain-type low-dimensional embeddings.As a result, SpatialAnno not only performs spatial cell / domaintype assignments with better accuracy but also estimates cell / domain-type-aware embeddings that can facilitate downstream analyses.We illustrate the benefits of SpatialAnno through extensive simulations and analyses of a diverse range of example datasets collated using different spatial transcriptomics technologies.To show the improved spatial annotation accuracy, we applied SpatialAnno to analyze a 10x Visium datasets for 12 human dorsolateral prefrontal cortex ( DLPFC ) samples.To illustrate the effectiveness of SpatialAnno in leveraging non-marker genes, we analyzed a mouse olfactory bulb ( OB ) dataset generated using the ST technology.Using Slide-seqV1 / 2 datasets for the mouse hippocampus, we demonstrated that SpatialAnno can correctly identify cell-type distribution at near-cell resolution.The utility of SpatialAnno to estimate low-dimensional embeddings is demonstrated by a seqFISH dataset for the mouse embryo.

Model specification
In the SpatialAnno model, we denote X as the spot-by-gene expression matrix on n spatial locations.These locations have known spatial coordinates and unknown labels y i , i = 1, …, n .We can separate genes into a group of m marker genes and a group of p non-marker genes, denoted as x 1 i = (x i 1 , . . ., x im ) and x 2 i = (x i,m +1 , . . ., x i,m + p ) , respectively.Suppose prior knowledge of marker genes for K cell / domain types is encoded as an indicator matrix ρ of dimension m × K , with ρ jk = 1 if gene j is a maker for cell / domain type k and 0 otherwise.Following (18)(19)(20) , we assume that the expression measurements have already been normalized through variance stabilizing transformation and further centered for each gene to have zero mean ( see Supplementary Notes ) .
SpatialAnno models the centered normalized expression vector, x 1 i , for marker genes in cell i , and latent label, y i , as with the constraint that β jk ≥ 0. Here, α j is the base expression level for gene j in the marker group.The intuition is that if gene j is a marker for cell / domain type k , then we expect the expression of j to be higher in these cell / domain types ( 14 ) with an increased magnitude β jk .Note that there is no restriction stating maker genes cannot be expressed in other cell / domain types.We assume the covariance = diag (σ 2 1 , . . ., σ 2 m ) .This simplification significantly reduces the computational cost.
For the high-dimensional non-marker genes, SpatialAnno models their centered normalized expression vector, x 2 i , and latent label, y i , as where factor z i ∈ R q represents a q -dimensional embedding of x 2 i ; L is a p × q factor loading matrix; m k ∈ R q is the mean vector for the k th cell / domain type, and V is the covariance matrix that is shared across cell / domain types; and e i is the residual error and follows an independent normal distribution with mean zero and variance , which is a diagonal matrix, or e i ∼ N (0 , ) .
To promote neighborhood similarity in cell / domain types, we follow previous computation ( 21 ,22 ) and assume that cell / domain type y i ∈ {1, …, K } follows a Potts model characterized by an interaction parameter ξ and a neighborhood graph S, where i ∼ i denotes all neighboring pairs in the neighborhood graph S; I(y i = y i ) is an indicator function that equals 1 if both the i th and i th locations belong to the same cell / domain type and equals 0 otherwise; ξ is an unknown interaction parameter that determines the extent of cell / domain type similarity among neighboring locations; and C ( ξ) is the normalizing constant, also known as the partition function that ensures the above probability mass function has a summation of one across all possible configurations of y .

Methods for comparison
We compared SpatialAnno with four annotation methods: ( i ) SCINA ( 12 ) implemented in the R package SCINA ( version 1.2.0 ) , ( ii ) Garnett ( 13 ) implemented in the R package garnett ( version 0.1.21) , ( iii ) CellAssign ( 14) implemented in the R package cellassign ( version 0.99.21 ) and ( v ) scSorter ( 15 ) implemented in the R package scSorter ( version 0.0.2 ) .We used the default parameter settings as recommended in their respective tutorials.SCINA models the expression levels of marker genes using a Gaussian mixture model and enforces a constraint that marker genes should have higher mean expression levels in their corresponding cell types.One key advantage of SCINA is its computational efficiency.CellAssign models the expression count data of marker genes based on a Bayesian probabilistic model that takes into account batch-or sample-specific effects.This approach enhances the accuracy of cell type annotation, particularly when dealing with data from a heterogeneous scRNA-seq population.However, both SCINA and CellAssign rely solely on the expression of marker genes for cell-type annotation.Garnett takes a different approach by first identifying representative cells for known cell types using only marker genes.It then trains a multinomial classifier using all genes with the representative cells and uses this classifier to classify the remaining cells.Garnett also offers a method for rapidly annotating additional datasets by apply-ing the pre-trained classifier.Notably, non-marker genes are not employed in the initial step of Garnett's approach.In contrast, scSorter combines the expression of both marker genes and non-marker genes.It uses K-means optimization strategies and relies on pre-specified weight parameters to adjust the contribution of marker genes.These weights need to be specified manually.

Simulations
We performed comprehensive simulations to evaluate the performance of SpatialAnno and compared it with that of alternative annotation methods.The spatial locations of 3639 spots were taken from DLPFC section 151673.Cell / domain types were assigned with manually generated annotations from the original studies ( 23 ) .We simulated gene expression data for each spot using the splatter package ( version 1.20.0 ) .
Five marker genes for each cell / domain type were selected from the top DEGs based on log-fold change in expression.We tested the accuracy and robustness of SpatialAnno with the following settings that reflect real-world scenarios.I. To test the robustness of SpatialAnno to the erroneous specification of the number of cell / domain types, we considered three scenarios.In the first scenario, marker genes for all seven cell / domain types were provided, and no unknown cell / domain types existed in the expression data.In the second scenario, marker genes for two cell / domain types were removed to create a scenario in which fewer cell / domain types were specified in the marker gene matrix than actually exist in the data.Thus, cells from these two cell / domain types should be assigned to 'unknown'.In the third scenario, the marker genes for nine cell / domain types were added, but two cell / domain types did not appear in the expression data.This mimics a scenario in which more cell / domain types are specified in the marker gene matrix than actually present in the data.II.To evaluate the robustness of SpatialAnno to marker gene misspecification, we next created a scenario in which marker genes may be incomplete or incorrect.We randomly flipped a fraction of entries in the binary marker gene matrix ρ to introduce errors.Specifically, the procedure consisted of two steps.In the first step, a proportion of entries in ρ that contained one were flipped.In the second step, the same number of entries flipped in the first step was flipped for the entries that contained zero in the original ρ.The proportions considered were set to 10%, 20% or 30%.Other settings were similar to those in the first scenario in Simulation I. III.To assess the capability of SpatialAnno to utilize highdimensional non-marker genes, we varied the number of non-marker genes as 60, 100, 500, 1000 and 2000.
In this setting, we only compared scSorter and Garnett, as only these methods can utilize non-marker genes.Other settings were similar to those in the first scenario in Simulation I.
For each simulation setting, we performed 50 replicate simulations.In each replicate, we applied SpatialAnno and the other methods to annotate each spot.

Human dorsolateral prefrontal cortex data generated using 10x Visium
We downloaded a human DLPFC dataset ( 23 ) generated using the 10x Visium platform from http://spatial.libd.org/spatialLIBD/.In this dataset, there were 12 tissue sections, which contained a total of 33 538 genes measured on average over 3973 spots.We used the sample ID151673, which contains expression measurements of 33 538 genes on 3639 spots, as the main analysis example.We presented the results for the other 11 samples in the Supplementary Figures.For all the sections, we extracted the top 2000 spatially variable genes with SPARK-X ( 24 ) before performing annotations.To identify layer-specific marker genes for annotation, we used tissue section 15 1507 as the reference data.This dataset contained 33 538 genes for 4226 spots.For each layer, the top 5 DEGs were selected as its marker genes.The final marker gene list is available in Supplementary Table S1.

Mouse olfactory bulb data by spatial transcriptomics (ST)
We obtained the mouse olfactory bulb ST data from the spatial transcriptomics research website ( https://www.spatialresearch.org/).These data consist of gene expression levels in the form of read counts that were collected for a number of spatial locations.We followed the methods of previous studies ( 25 ,26 ) to focus on the mouse OB Section 12, which contains 16 034 genes and 282 spatial locations.We presented the results for the other 11 sections in the Supplementary Figures.We extracted the top 3000 most highly variable genes with function SCTransform implemented in Seurat (version 4.0.5)( 27) before performing annotations.To construct the marker gene list for annotation, we perform differential expression analysis on scRNA-seq data ( 28 ) from the Gene Expression Omnibus (GEO; accession number GSE121891).This scRNA-seq data was collected from the mouse olfactory bulb and contains 18 560 genes and 12 801 cells for five cell types: granule cells (GC, n = 8614), olfactory sensory neurons (OSNs, n = 1200), periglomerular cells (PGC, n = 1693), mitral and tufted cells (M-TC, n = 1133), and external plexiform layer interneurons (EPL-IN, n = 161).For each cell type, the top four DEGs were selected as its marker genes.The final marker gene list is available in Supplementary Table S2.

Mouse hippocampus Slide-seq data and Slide-seqV2 data
We obtained the mouse hippocampus Slide-seq dataset and Slide-seqV2 dataset ( 3 ) from the Broad Institute's Single Cell Portal ( https:// singlecell.broadinstitute.org/single _ cell/ study/ SCP948/ robust-decomposition-of-cell-type-mixturesin-spatial-transcriptomics ).The Slide-seq dataset consists of gene expression measurements in the form of read counts for 22 457 genes and 34 199 spatial locations.The Slide-seqV2 dataset consists of gene expression measurements in the form of read counts for 23 264 genes and 53 208 spatial locations.In the analysis, we filtered out genes that had fewer than 20 counts on all locations and filtered out locations that had fewer than 20 genes with nonzero counts.These filtering criteria led to final sets of 14 481 genes and 31 664 cells for Slide-seq dataset, and 16 121 genes and 51 212 cells for Slide-seqV2 dataset.In addtion, for both datasets, we extracted the top 2000 most spatially variable genes with SPARK-X ( 24 ) before performing annotations.To construct marker genes for annotation, we obtained the DropViZ scRNA-seq dataset ( 29 ) from the Broad Institute's Single Cell Portal.This data was collected from the mouse hippocampus, which contained 22 245 genes and 52 846 cells for 19 cell types.For each cell type, the top five DEGs were selected as marker genes.Besides the 19 cell types, we added another two cell types, Slc17a6 neurons and Hb neurons, and their marker genes were extracted from the original study ( 29 ).The final marker gene list used is available in Supplementary Table S3.

Mouse embryo data by seqFISH
We obtained the mouse embryo seqFISH data ( 2 ) from https:// marionilab.cruk.cam.ac.uk/SpatialMouseAtlas/ .This dataset profiles the expression of of 387 selected target genes from three mouse embryo tissue sections.Cell segmentation is performed using a combination of aligning membrane stains to the first hybridization round and training them with a machine learning toolkit called ilastik ( 30 ).This process generates probability maps, which are then used to create 2Dlabeled cells for each z slice.mRNA transcript signals are located by finding local maxima above a threshold, and these spots are assigned to corresponding cells based on location, generating a gene-cell count matrix.In total, the dataset includes gene-cell count matrices for 19 451, 14 891 and 23 194 cells, respectively.We calculated normalized expression log counts for each cell using logNormCounts function in the R package scuttle ( 31 ) with cell-specific size factors.To construct the marker gene list, we used Embryo 3 as a reference; this dataset contains 24 cell types.For each cell type, the top eight DEGs were selected as marker genes.We removed marker genes for two cell types, ExE endoderm cells and blood progenitors, as there were too few ( < 30) of these cells.The cell type 'Low quality' was also removed.The final marker gene list used contained 21 cell types and is available in Supplementary Table S4.

Evaluation metrics
We evaluated annotation performance using three metrics, that is, Kappa, mF1 score and ACC, as suggested in previous single-cell data annotation studies ( 11 , 32 ).A CC was defined as the proportion of spots that were classified into the correct types.Kappa is generally thought to be a more robust measure than ACC, since it takes into account the possibility that the agreement occurs by chance.The cell-level F1 score considers each cell to be an individual classification task with a true cell-type assignment (and potentially multiple incorrect cell-type assignments) for the purposes of calculating precision and recall (Supplementary Notes).
We also compared the low-dimensional embeddings estimated in SpatialAnno with those from PCA and DR-SC ( 22 ).In detail, we first extracted the top 15-dimensional components and then summarized those top components as three tSNE components and visualized the resulting tSNE components with RGB colors in the RGB plot.To show that the estimated embeddings carry the most information about cell / domain types, we evaluated the conditional correlation coefficients between the true cell / domain labels and the observed gene expression, given the estimated embeddings in SpatialAnno.Furthermore, the embeddings in SpatialAnno improve clustering performance.With embeddings from Spa-tialAnno, PCA and DR-SC, we performed clustering analysis using the Louvain community detection algorithm imple- mented in the R package Seurat (version 4.1.1),and evaluated clustering performance using the ARI ( 33 ).

Overview of SpatialAnno
Similarly to other methods that assign known cell / domain types to cells using information about marker genes, Spa-tialAnno takes as input normalized gene expression matrix, spatial location information, and a list of marker genes for known cell / domain types (Figure 1 A).SpatialAnno automatically performs cell / domain-type assignments while providing low-dimensional embeddings for all spatial spots.Although most in situ capturing technologies have limited spatial resolution, with each measured location possibly containing multiple cell types, SpatialAnno can still provide a crucial understanding of tissue organization by annotating domain types.
When marker genes for known cell types are available, Spa-tialAnno can be used to annotate cell types for measured locations.However, it is important to note that relying solely on the major cell type identified in each location could potentially lead to biased results.Based on the latent cell / domain type for each spot, SpatialAnno builds a 'semi-supervised' Gaussian mixture model to modulate the over-expression of marker genes and a hierarchical factor model to relate non-marker gene expression to the cell / domain separable latent embeddings while accounting for the spatial smoothness of the cell / domain types with a Potts model (Figure 1 B).Uniquely, SpatialAnno, via the factor model, allows for the assignment of cell / domain types that leverage a large number of non-marker genes, and, via the Potts model, is more likely to assign the same cell / domain type to neighboring spots, promoting spatial smoothness in the cell / domain types.Notably, with expression data for both marker and non-marker genes, SpatialAnno simultaneously assigns each spot known cell / domain types while obtaining low-dimensional embeddings for each spot, which can facilitate other downstream analyses.Similarly to other methods, SpatialAnno automatically labels spatial spots that do not belong to any known cell / domain type as 'unknown', preventing incorrect assignment when novel cell / domain types are present.

Validation using simulated data
We conducted simulations to evaluate the performance of Spa-tialAnno and compared the results with those of non-spatial annotation methods commonly applied to scRNA-seq data: SCINA, Garnett, CellAssign, and scSorter.Briefly, we simulated gene expression counts using a splatter model ( 34 ) for seven cortical layers using labels from the DLPFC data.Then, we selected five marker genes for each layer based on the logfold change in expression (see Supplementary Notes).In total, we obtained 35 marker genes and 2000 non-marker genes for 3639 spots from seven layers.For each simulated SRT dataset, we applied SpatialAnno and the four other methods to perform spatial domain annotation.We used Cohen's Kappa, mean F1 (mF1) score, and classification accuracy (ACC) (see Supplementary Notes) to quantify the concordance between the detected spatial domains and the seven labeled cortical layers ( 11 ,14 ).We performed 50 replicate simulations for each setting.To determine if the three performance measures of the compared methods were distinguishable, we computed the Bayes factor ( 35 ,36 ) to directly compare the performance of each method against SpatialAnno.A Bayes factor > 3 was considered statistically different.( 36 ).
When the correct number of layers was specified, Spa-tialAnno (Kappa = 0.903, mF1 = 0.807 and ACC = 0.922) outperformed all other methods in terms of annotation accuracy (Figure 1 C; number of cell / domain types = 7), with Bayes factors > 10 (Supplementary Figure S1A).After varying the number of cell / domain types with marker genes, the SpatialAnno annotation still outperformed all other methods (Figure 1 C; number of cell / domain types = 5 or 9), with Bayes factors > 10 (Supplementary Figure S1A).Unsurprisingly, Spa-tialAnno performed worse when there were five cell / domain types with marker genes (Kappa = 0.839, mF1 = 0.729 and ACC = 0.883) than seven or nine (Kappa = 0.900, mF1 = 0.803 and ACC = 0.918).The latter two cases (seven and nine cell / domain types) led to comparable annotation performances for SpatialAnno and CellAssign.In contrast, annotation performance decreased for the other methods when we included marker genes for irrelevant cell / domain types.We examined the robustness of SpatialAnno when there were various degrees of marker gene misspecification (Figure 1 D), as well as the presence of shared marker genes across different cell types.As the proportion of misspecified marker genes increased, the annotation performance decreased for all methods, but SpatialAnno still outperformed all other methods in terms of annotation accuracy (Kappa, mF1 and ACC) , with Bayes factors > 3 (Supplementary Figure S1B).Similarly, when the proportion of shared marker genes across different cell types increased, we observed consistent outcomes (Supplementary Figure S1C and S1D).
Next, we examined the effectiveness of SpatialAnno, which leverages various amounts of non-marker information compared with the scSorter and Garnett methods, also capable of leveraging non-marker genes (Supplementary Figure S2A).As the number of non-marker genes increased from 60 to 2000, SpatialAnno showed 10.3%, 21.9% and 8.1% improvements in annotation accuracy for Kappa, mF1 and ACC, respectively, while the annotation accuracies of scSorter and Garnett were almost unchanged, with the changes being -0.6% and -0.6% for Kappa, 1.7% and -0.1% for mF1, and -0.1% and -0.7% for ACC, respectively.These results suggest that SpatialAnno can effectively leverage various numbers of non-marker genes.
In addition to the spatial spots being accurately annotated, the low-dimensional embedding of non-marker genes from SpatialAnno was cell / domain-type informative.Clustering performance using low-dimensional embeddings with either marker genes or non-marker genes, or a combination of the two, with a comparable adjusted rand index (ARI) between marker and non-marker genes, is shown in Supplementary Figure S2B and C.Not surprisingly, combining both embeddings for marker and non-marker genes led to improved ARIs in all scenarios, demonstrating the benefits of borrowing information from non-marker genes when annotating cell / domain types.In addition, the Pearson's correlation coefficients for the relationship between the observed expression and the estimated labels, given the embeddings from Spa-tialAnno, were much smaller than those for the principal component analysis (PCA), but comparable to those for the DR-SC ( 22 ) (Supplementary Figure S2D and E).These results suggest SpatialAnno embeddings can capture cell / domain-typerelevant information for each spot, thus facilitating the downstream analysis.
Finally, we evaluated the computational efficiency of all methods for different numbers of cell / domain types, as shown in Supplementary Figure S2F.SpatialAnno was computationally efficient and comparable in efficiency to SCINA and sc-Sorter, and all three were faster than Garnett and CellAssign.

SpatialAnno improves annotations of known layers in human dorsolateral prefrontal cortex
We applied SpatialAnno and the four methods to the analysis of human dorsolateral prefrontal cortex (DLPFC) 10x Visium data ( 23 ).In this dataset, there were 12 tissue sections from three adult donors with a median depth of 291 million reads for each sample, a median of 3844 spatial spots per section and a mean of 33 538 genes per spot (Supplementary Table S5).Based on a manual examination of cytoarchitecture and specific marker genes, each tissue section was carefully annotated by the original study ( 23 ) in one of the six layers of the prefrontal cortex or white matter (WM).Taking sample ID151507 as a reference, we constructed a marker-gene list that contained five marker genes for each of the seven layers (see Supplementary Notes).
Taking manual annotations as ground truth, we first evaluated the performance of spatial annotation using Kappa, mF1, and ACC for each of the 12 tissue sections (Figure 2 A).Spa-tialAnno annotated spatial domains more accurately (median Kappa = 0.524, median mF1 = 0.494 and median ACC = 0.628) than scSorter (median Kappa = 0.381, median mF1 = 0.366 and median ACC = 0.489), SCINA (median Kappa = 0.209, median mF1 = 0.337 and median ACC = 0.307), Garnett (median Kappa = 0.24, median mF1 = 0.32 and me-dian ACC = 0.339) and CellAssign (median Kappa = 0.253, median mF1 = 0.29 and median ACC = 0.326), with Bayes factors > 50 (Supplementary Figure S3).The heatmap of the spatial assignments from SpatialAnno and the other methods and the manual annotations for sample ID151673 are shown in Figure 2 B. SpatialAnno achieved the best annotation accuracy (Kappa = 0.634, mF1 = 0.619 and ACC = 0.685), while the annotations from scSorter, SCINA and CellAssign were only accurate for the WM, and Garnett completely failed to assign the WM region.Notably, the domains identified in Spa-tialAnno were spatially smooth, continuous, and well matched with the elevated expression levels of marker genes for each layer (Figure 2 C and Supplementary Figure S4-S15), such as PCP4 and MOBP that are marker genes for layer 5 and WM, respectively ( 23 ,37 ).
To evaluate the robustness of SpatialAnno, we obtained marker genes from the other DLPFC tissue section that contained seven layers and performed spatial annotation for the remainder of the 11 tissue sections (see Supplementary Notes).Using the top 5 / 10 / 15 DEGs as marker genes for each layer, SpatialAnno achieved the best annotation accuracy according to Kappa, mF1 and ACC.The annotation accuracies of all other methods for the other tissue sections were slight worse than for those when sample ID151507 was used as a reference (Supplementary Figure S16A), which is consistent with the simulations involving the misspecification of marker genes (Figure 1 D).This suggests that annotation accuracy can be impaired when inaccurate marker genes are used.However, this difference became negligible when the number of marker genes for each layer was 15.Furthermore, we examined the robustness of SpatialAnno using marker genes for irrelevant cell types, those not present in the studied SRT dataset.For samples ID151669-151672 from Donor 2, which only contained five cortical layers, we applied SpatialAnno and other methods using marker genes for the seven layers.As shown in Supplementary Figure S16B, SpatialAnno achieved the best annotation performance for these samples.
Uniquely amongst the methods, SpatialAnno's estimated embeddings were highly informative for the DLPFC layers in the 12 sections.The clustering accuracies, determined using the ARI for embeddings from marker , non-marker , and a combination of the two, respectively, were shown in Supplementary Figure S16C, with the largest ARI value for embeddings from a combination of the two.Clearly, embeddings from non-marker genes harbored substantial amount of information about spatial domains, even more than the marker genes.When using a combination of marker and nonmarker genes, the embeddings led to improved clustering performance, suggesting that annotation based on both marker and non-marker genes improved the annotation accuracy.Red / green / blue (RGB) plots using three tSNE components for the embeddings in sample ID151673 estimated by Spa-tialAnno revealed a more clear laminar structure for DLPFC than those by PCA or DR-SC (Figure 2 D).Such stronger structure predictivity from SpatialAnno is numerically supported by its higher ARI (0.450) compared to PCA (ARI = 0.296) and DR-SC (ARI = 0.365).Moreover, an estimated PAGA graph ( 38 ) using SpatialAnno embeddings demonstrated the almost linear development trajectory from WM to layer 1, while the PAGA graphs using both PCA and DR-SC embeddings were less clearly delineated (Figure 2 E and Supplementary Figure S4-S15).To better understand the impact of each component in SpatialAnno, we conducted additional experiments on the DLPFC dataset.As shown in Supplementary Figure S16D, we demonstrate the performance of the model when one or more components were disabled.

SpatialAnno correctly identifies cells in mouse olfactory bulb
To quantitatively demonstrate the performance of Spa-tialAnno compared with SCINA, scSorter, CellAssign and Garnett in domain-type annotation, we analyzed a mouse OB data generated using ST technology.This dataset represented 12 tissue sections with a median of 16 024 gene expression measurements among a median of 266 spots (Supplementary Table S6).
Taking the four anatomic layers manually annotated based on H&E staining as ground truth (Figure 3 A), we first evaluated the performance of the spatial annotation using Kappa, mF1 and ACC for Section 12 (Figure 3 B).SpatialAnno annotated spatial domains more accurately (Kappa = 0.739, mF1 = 0.812 and ACC = 0.800) than scSorter (Kappa = 0.608, mF1 = 0.718 and ACC = 0.696), SCINA (Kappa = 0.598, mF1 = 0.670 and ACC = 0.689), CellAssign (Kappa = 0.395, mF1 = 0.607 and ACC = 0.707) and Garnett (Kappa = 0.552, mF1 = 0.686 and ACC = 0.646).We examined the robustness of SpatialAnno by including marker genes for two irrelevant cell types (endothelial and mural cells) that were not present in this section, and SpatialAnno achieved the best annotation performance (Supplementary Figure S17A).To illustrate the effectiveness of leveraging non-marker information, we evaluated the performance of the spatial annotation by SpatialAnno, scSorter, and Garnett with 30, 300 or 3000 nonmarker genes, as only these three methods are able to leverage non-marker gene information.SpatialAnno achieved higher annotation accuracy when more non-marker genes were used, while the difference in performance between 300 and 3000 non-marker genes was minimal for SpatialAnno (Supplementary Figure S17B).In contrast, scSorter and Garnett performed similarly with 30 or 300 non-marker genes, but their performance deteriorated when 3000 non-marker genes were applied.
SpatialAnno recovered the laminar structure of the mouse OB across 12 sections (Supplementary Figure S18).The mouse OB has a multi-layered cellular architecture in the order, from the inner to outer layer, of granule cell layer (GCL), mitral cell layer (MCL), glomerular layer (GL) and the nerve layer (ONL).Detailed assignments by SpatialAnno and the other four methods for Section 12 are shown in Figure 3 A. The cell types annotated by SpatialAnno accurately represented this laminar structure, while CellAssign incorrectly assigned 'unknown' cells to regions belonging to GCL, MCL and GL.Moreover, the annotation patterns of Garnett were rather chaotic, while scSorter and SCINA failed to distinguish periglomerular cells (PGC) in the GL.
We further examined the expressions of marker genes specific to each layer, including Kit for external plexiform layer interneuron (EPL-IN) ( 28 ), Penk for granule cells (GC) ( 39 ), Cdhr1 for mitral and tufted cells (M / TC) ( 40 ), S100a5 for olfactory sensory neurons (OSN) ( 41 ) and Th for PGC ( 42 ) (Figure 3 C).Although the three methods provided similar assignments for GC, M / TC, OSN and PGC, their assignments for EPL-IN were quite different.EPL-IN are located adjacent to GL in the external plexiform layer comprises PGC ( 28 ).SpatialAnno assigned spots near PGC to EPL-IN; however, sc-Sorter and Garnett did not (Supplementary Figure S19).As the ground truth for the EPL-IN locations was unknown, we manually combined the inferred EPL-IN with the adjacent layers in different ways: (i) by combining the inferred EPL-IN and PGC and (ii) by combining the inferred EPL-IN, M / TC and PGC.SpatialAnno still achieved the best annotation accuracy (Supplementary Figure S17C & D).
Another key benefit of SpatialAnno is its ability to extract low-dimensional embeddings relevant to different cell types from the high-dimensional non-marker genes, which is useful for many downstream analyses.We summarized the low-dimensional embeddings inferred by SpatialAnno (Sup- plementary Figure S17E), PCA and DR-SC into 3D tSNE components and visualized the resulting components in the RGB plot.The RGB plot (Figure 3 D) shows the multi-layered architecture of the mouse OB, with neighboring spots sharing more similar colors to those farther away.To compare the predictive powers of these low-dimensional embeddings for the four anatomic layers annotated based on H&E staining, we applied the Louvain community detection algorithm to spot clustering using the Seurat R package.The clusters identified by SpatialAnno depicted the multi-layered structures more accurately (ARI = 0.599) than those of PCA (ARI = 0.549) or DR-SC (ARI = 0.569).

SpatialAnno reveals cell-type distribution in mouse hippocampus with SRT data at near-cell resolution
To show the cell-type distribution in the mouse hippocampus, we applied SpatialAnno and the other methods to the analysis of a mouse hippocampus dataset generated using Slide-seqV2, which quantifies transcriptome-wide expression levels at near-cellular resolution with 10μm barcoded beads ( 3 ).This dataset contains expressions for 23 264 genes over 53 208 spatial locations (Supplementary Table S7).As shown in the Allen Reference Atlas (Figure 4 A), the primary regions in the mouse hippocampus were composed of the cornu ammonis (CA1-3) and dentate gyrus (DG).SpatialAnno clearly identified a 'cord-like' structure as well as an 'arrow-like' structure in the hippocampal subfields in C A1, C A3 and DG (Figure 4 B), which is consistent with the annotation of hippocampus structures in the Allen Reference Atlas (Figure 4 A).In contrast to SpatialAnno, the other methods SCINA, Garnett, and CellAssign showed blurred / incorrect localizations for the primary hippocampal subfields in CA3 and DG and were unable to reveal the main   S23A).Specifically, compared with the RGB plots for PCA and DR-SC, the plot for SpatialAnno clearly depicted the Hb region.
Finally, we validated the cell-type distributions identified for an independent slide from the mouse hippocampus profiled using Slide-seq.As with the initial version of Slide-seqV2, the transcript detection sensitivity of Slide-seq is relatively low (Figure 4

Embeddings estimated by SpatialAnno lead to biologically relevant trajectories in mouse embryo
We further applied SpatialAnno and the other methods to the analysis of a dataset obtained from three mouse embryo sections collated at the 8-12 somite stage using seqFISH ( 2 ), which has the capability of probing the expression of a targeted gene set at the single-cell resolution by image processing and single-cell segmentation ( 2 ).Each of the three mouse embryo sections contained expression level measurements for 351 genes, chosen to recover the cell-type identities at these developmental stages, from around 20 000 cells, as well as their physical locations (Supplementary Table S8).After selecting 168 marker genes for 21 cell types (see Supplementary Notes), 183 non-marker genes remained for annotation analysis.
The original study provided manual annotations for the cells based on their nearest neighbors in the Gastrulation atlas ( 43 ).For each method, we summarized the annotation accuracy using both Kappa, mF1 and ACC for each embryo section (Figure 5 A and Supplementary Figure S27).SpatialAnno achieved the highest Kappa, mF1 and ACC in two of the three sections and was only surpassed by CellAssign for the second embryo section.For Embryo 1, the annotations of different methods are shown in Figure 5 B. Clearly, cell-type distributions identified by SpatialAnno were well matched with the expression of their corresponding marker genes (Figure 5 C).
For the embeddings uniquely estimated by SpatialAnno, we performed trajectory inference on brain cells to investigate the spatiotemporal development of the mouse brain and detected two linear trajectories (Figure 5 D).We observed the lowest pseudotime values in the mesencephalon, which diffused smoothly toward the tegmentum followed by the rhombencephalon in one branch, and towards the prosencephalon in another branch (Figure 5 D).More importantly, the diffusion patterns were spatially continuous and smooth.The detected trajectories delineated the spatial trajectories of mouse brain development, which are in agreement with the findings of recent studies ( 2 ).In contrast, the trajectories identified using embeddings from either PCA or DR-SC lacked spatial continuity (Supplementary Figure S28A and B).We further examined genes associated with the inferred pseudotime, and a heatmap of the expression levels of the top 20 significant genes suggested that there were interesting expression patterns over pseudotime (Supplementary Figure S28C).A mesencephalon and prosencephalon maker gene, Otx2 ( 44 ,45 ), showed higher expression levels in the early stage of development, while at a later stage, its expression levels were sub-stantially suppressed (Supplementary Figure S28D).In contrast, the expression levels of a gene enriched in the rhombencephalon, Sfrp1 ( 46 ), changed from low to high (Supplementary Figure S28D).These results concur with the formation of the midbrain-hindbrain boundary ( 47 ,48 ), and this is supported by the observation that these two genes could be used to identify the precise boundary between the mesencephalon and rhombencephalon (Supplementary Figure S28E).

Discussion
SpatialAnno takes, as input, the normalized gene expression matrix, the physical location of each spot, and a list of marker genes for known cell / domain types.The output of SpatialAnno comprises the estimated posterior probability of each spot belonging to each cell / domain type and the low-dimensional embeddings of each spot for nonmarker genes.To efficiently capitalize on both marker and non-marker genes, SpatialAnno uniquely models the expression levels of non-marker genes via a factor model governed by cell / domain-type separable low-dimensional embeddings and simultaneously promotes spatial smoothness via a Potts model.As a result, SpatialAnno provides improved spatial cell / domain-type assignments, and its estimated lowdimensional embeddings are cell-type-relevant and can facilitate downstream analyses such as trajectory inference.Spa-tialAnno is computationally efficient, easily scalable to spatially resolved transcriptomics with tens of thousands of spatial locations and thousands of genes (Supplementary Table S9).With simulation studies, we demonstrated that Spa-tialAnno presents improved spatial annotation accuracy with either correct, under-or over-specification of the number of cell / domain types, robustness to the marker gene misspecification and efficient leveraging of non-marker genes compared with other annotation methods.
We examined the SRT data generated using different platforms, such as 10x Visium, ST, Slide-seqV1 / 2 and seqFISH, with various spatial resolutions.Using both DLPFC 10x Visium datasets and mouse OB ST datasets with manual annotations, we demonstrated the improved annotation accuracy of SpatialAnno with the capability of recovering laminar structures, while the identified PAGA graph using embeddings in SpatialAnno recovers an almost linear trajectory from WM to layer 1.In DLPFC datasets, the domains identified were well matched with the elevated expression for marker genes, such as PCP4 and MOBP that are marker genes for layer 5 and WM, respectively ( 23 ,37 ).Using mouse hippocampus Slide-seqV1 / 2 datasets, we demonstrated that SpatialAnno can successfully detect the primary hippocampal subfields for CA1, CA3 and DG, with almost a perfect correlation between celltype proportions in both datasets and the elevated expression levels for Wfs1 , Cpne4 and C1ql2 are well matched with C A1, C A3 and DG regions identified by SpatialAnno, respectively.Wfs1 showed differential expression in hippocampal field CA1 and has been reported to be highly expressed in the CA1 region ( 49 ).Cpne4 , a known marker gene for hippocampal subfield CA3, was highly expressed in a region identified as CA3 ( 50 ).In addition, C1ql2 , a marker gene for dentate principal cells, was expressed in a region identified as DG ( 51 ).When applied to mouse embryo seqFISH datasets, SpatialAnno not only provided improved annotation accuracy, but uniquely estimated cell-type-aware embeddings leading to the identification of two trajectories in brain regions, originating in mesencephalon towards the rhombencephalon and prosencephalon, respectively.Moreover, cell-type distributions identified by SpatialAnno were well matched with the expression of their corresponding marker genes.For example, Popdc2 , a cardiomyocyte marker, was expressed in the developing heart tube ( 52 ).Foxa1 , a gut endoderm marker, showed the highest expression levels in the developing gut tube along the anterior-posterior axis of the embryo ( 53 ).In addition, Foxf1 , a mesoderm marker that encodes a forkhead transcription factor expressed in the splanchnic mesenchyme surround-ing the gut, was highly expressed at the identified splanchnic mesoderm ( 54 ).
SpatialAnno paves the way for future spatial annotation analyses in multiple scenarios.For example, a similar strategy can be applied to the problem of cell-type assignment in other spatial omics data, such as spatial resolved singlecell chromatin accessibility data ( 55 ) and spatial proteomics ( 56 ).To establish a complete spatial atlas of organism architecture, a critical bottleneck is to perform an automatic celltype assignment with both considerations of molecular fea-PAGE 13 OF 15 tures with / without prior knowledge as well as their spatial organization, SpatialAnno can substantially reduce both the irreproducibility and human effort in the processes of manual cell / domain-type assignment ( 56 ).We have primarily focused on examining Spatial Transcriptomics (SRT) technologies that measure high-dimensional gene expression at each tissue location.In addition, there are spatial proteomics technologies, such as Cytometry by Time-of-Flight (CyTOF) and CODEX, which characterize proteomic profiles of single cells using 30-40 protein channels ( 57 ).These technologies generate lowdimensional data.Since predefined marker proteins that define cell types are available, SpatiaAnno can also be applied to analyze these datasets.In our analysis of real CyTOF data from breast cancer samples ( 58 ) and simulated CyTOF data from Cytomulate (unpublished manuscript), we observed that SpatialAnno and SCINA demonstrate similar performance.Furthermore, these methods outperform other approaches in terms of accuracy and robustness, as shown in Supplementary Figure S29.
The benefits of SpatialAnno come with some caveats that may require further exploration.First, SpatialAnno is applicable for spatial annotation in a single tissue slide.With multiple tissue slides available, methods that are capable of integrating multiple SRT datasets for cell / domain-type annotation are sincerely needed ( 59 ).Second, SpatialAnno was designed to perform annotation analysis of data with a single modality.However, incorporating multi-modal data with data of other modalities can further improve annotation accuracy.Third, many of the early SRT technologies do not have a single-cell resolution, and SpatialAnno is only able to assign domains with prior knowledge of each spot for those datasets.Cell-type annotation for this type of dataset further requires simultaneous deconvolution with spatial cellular annotation.

Figure 1 .
Figure 1 .Sc hematic o v ervie w of SpatialAnno and its perf ormance in simulation studies.( A ) SpatialAnno emplo y s spatial transcriptomics data along with a known marker-gene list in its analysis.With these two datasets as input, SpatialAnno performs spatial annotation via a probabilistic model that combines both marker and non-marker gene expression data, and produces both domain / cell-type assignments and low-dimensional embeddings for all spatial locations as output.( B ) Ov ervie w of the SpatialAnno probabilistic model.Latent cell / domain types (shown in the gray circle) and observed data (shown in the blue boxes) are shown along with the distributional assumptions.( C ) Kappa, mF1, and ACC of SpatialAnno, scSorter, SCINA, Garnett and CellAssign for simulation data from seven cortical layers; different numbers of cell / domain types are provided as a list of marker genes.( D ) Kappa, mF1 and ACC of SpatialAnno, scSorter, SCINA, CellAssign and Garnett for simulation data from seven cortical layers; different proportions of marker genes are erroneously specified.

Figure 2 .
Figure 2. Spatial domain annotation in the DLPFC 10x Visium dataset.( A ) B o xplots of Kappa, mF1, and ACC showing the accuracy of different methods for domain annotation across 12 tissue sections.( B ) Spatial domain annotation in tissue sample ID1 51 673 for ground truth, SpatialAnno, scSorter, SCINA, Garnett and CellAssign.( C ) Top, expression levels of corresponding layer-specific marker genes.Bottom, annotations by SpatialAnno are shown on each spot.( D ) RGB plots for low-dimensional embedding inferred by SpatialAnno, PCA and DR-SC.As end-to-end annotation approaches, scSorter, SCINA, Garnett and CellAssign cannot be utilized to extract low-dimensional embeddings.( E ) PAGA graphs generated by SpatialAnno, PCA and DR-SC embeddings for DLPFC Section ID1 51 673.

Figure 3 .
Figure 3. Spatial annotation in the mouse olfactory bulb dataset.( A ) Anatomic la y ers annotated based on H&E staining of the olfactory bulb, and cell-types inferred by SpatialAnno, scSorter, SCINA, Garnett and CellAssign.( B ) Bar plots of Kappa, mF1 and ACC showing the domain-type annotation accuracy of different methods.( C ) Top, expression levels of corresponding cell-type-specific marker genes.Bottom, annotations by SpatialAnno are shown on each spot.( D ) RGB plots of low-dimensional embeddings inferred by SpatialAnno, PCA and DR-SC.As end-to-end annotation approaches, scSorter, SCINA, Garnett and CellAssign cannot be utilized to extract low-dimensional embeddings.

Figure 4 .
Figure 4. Spatial cell-type annotation of the mouse hippocampus dataset.( A ) Annotation of hippocampus str uct ures from the Allen Reference Atlas of an adult mouse brain.( B ) Spatial annotation of the Slide-seqV2 hippocampus section by SpatialAnno, scSorter, SCINA, Garnett and CellAssign.( C ) Top, e xpression le v els of corresponding cell-type-specific mark er genes.B ottom, annotations b y SpatialAnno of the Slide-seqV2 hippocampus section are shown on each spot.The examined cell types were CA1 cells, CA3 cells and dentate cells.( D ) Results of P earson 's chi-squared test of correlation betw een e xpression patterns of mark er genes and the three hippocampal subfields identified b y different methods.( E ) Total UMIs per bead for Slide-seq (y ello w, n = 34, 199 spots) versus Slide-seqV2 (blue, n = 53, 208 spots) in the mouse hippocampus sections.( F ) Top, e xpression le v els of corresponding cell type specific marker genes.Bottom, annotation by SpatialAnno of the Slide-seq hippocampus section is shown on each spot.
E). SpatialAnno successfully identified the hippocampal subfields in this Slide-seq data (Supplementary Figure S23B-D and Supplementary Figure S24-S26).The annotated regions for C A1, C A3 and DG with their marker gene expressions are shown in Figure 4 F.

Figure 5 .
Figure 5. Spatial cell-type annotation of the mouse embryo dataset.( A ) Bar plots of Kappa, mF1 and ACC showing the cell-type annotation accuracy of different methods.( B ) Spatial annotations for ground truth, SpatialAnno, scSorter, SCINA, Garnett and CellAssign.( C ) Top, expression levels of corresponding cell-type-specific marker genes.Bottom, annotations of ground truth and SpatialAnno are shown on each spot.( D ) Left: latent time trajectory generated by slingshot on low dimensional embeddings of SpatialAnno.Right: clustering of the forebrain / midbrain / hindbrain cells into four spatially distinct clusters representing different regions of the de v eloping brain.