SMDB: a Spatial Multimodal Data Browser

Abstract
Understanding the relationship between fine-scale spatial organization and biological function necessitates a tool that effectively combines spatial positions, morphological information, and spatial transcriptomics (ST) data. We introduce the Spatial Multimodal Data Browser (SMDB, https://www.biosino.org/smdb), a robust visualization web service for interactively exploring ST data. By integrating multimodal data, such as hematoxylin and eosin (H&E) images, gene expression-based molecular clusters, and more, SMDB facilitates the analysis of tissue composition through the dissociation of two-dimensional (2D) sections and the identification of gene expression-profiled boundaries. In a digital three-dimensional (3D) space, SMDB allows researchers to reconstruct morphology visualizations based on manually filtered spots or expand anatomical structures using high-resolution molecular subtypes. To enhance user experience, it offers customizable workspaces for interactive exploration of ST spots in tissues, providing features like smooth zooming, panning, 360-degree rotation in 3D and adjustable spot scaling. SMDB is particularly valuable in neuroscience and spatial histology studies, as it incorporates Allen's mouse brain anatomy atlas for reference in morphological research. This powerful tool provides a comprehensive and efficient solution for examining the intricate relationships between spatial morphology and biological function in various tissues.


INTRODUCTION
Spatially resolved transcriptomics has revolutionized our understanding of tissue organization by enabling in situ transcriptomic profiling, leading to groundbreaking discoveries in neuroscience (1-3), developmental biology (4) and disease research (5, 6). ST data has been used to create unbiased spatial transcriptional landscapes of various tissues (7, 8), including the brain (1-3, 9), human kidney (10), heart (4), testis (11) and lung (12). Moreover, ST provides invaluable insights into the tissue disorganization observed in diseases, especially cancer, by revealing molecular features and mechanisms at the tumor microenvironment's leading edge with normal tissue. A comprehensive understanding of the biological functions reflected in the transcriptional state requires simultaneous knowledge about morphological context (13). Consequently, visualization tools that seamlessly combine spatial-omics and morphological information are becoming increasingly popular among researchers.
Several tools supporting spatial analysis and visualization have been released recently. Brain Explorer (14) allows the exploration of gene expression and anatomical information in 2D and 3D space using built-in reference data for adult mice. Other tools, such as spatialLIBD (15), Giotto (16), STUtility (17) and Cirrocumulus (https://github.com/lilab-bcb/cirrocumulus), enable users to load their data to analyze and visualize cell typing results with spatial positions in 2D space. Although ST-Viewer (https://github.com/jfnavarro/st_viewer) can present ST data in 3D space, it lacks support for morphology information. While these tools can visualize the gene expression pattern of adjacent 2D slices or align stacked experiments to create a static view of the tissue, it is inconvenient to screen the distribution of genes and related cell types from a 3D morphology perspective by traversing multiple slices simultaneously. There is an urgent need for a tool that can support the interactive exploration of molecular features and spatial morphology of ST data in 3D space.
The Spatial Multimodal Data Browser (SMDB) is a visualization tool designed to seamlessly connect gene expression profiles with spatial context and morphological information. It offers a range of features to enhance user experience and facilitate in-depth analysis, including:
• 2D spatial mapping that integrates molecular features from spatial transcriptomic data with morphological aspects like stained tissue sections.
• 3D morphological reconstruction and visualization based on gene expression, molecular subtypes or image segmentation. SMDB features Allen's mouse brain anatomy atlas as a built-in reference for easy comparison with reconstructed morphology to investigate the differences.
• Interactive spot and region-level analysis, such as user-defined label filtering and manual lassoing tools for outlining characteristic morphological regions with arbitrary shapes.
• Comprehensive summaries of ST data quality, as well as correlation and gene expression difference statistics between clusters.

Implementation
SMDB is built with a Browser/Server architecture, leveraging Spring Boot for the backend and the Vue and ElementUI frameworks for the frontend. The data is stored in MongoDB, a document-based NoSQL database. To process graphical data, SMDB employs open3d (18), DBSCAN (19) and vtk (https://vtk.org/). Echarts is used for loading 2D data, while the Three.js library is utilized for rendering 3D geospatial data.

Data workspace
SMDB is compatible with various spatial omics technologies and relies on three types of essential data: expression matrix, physical position of spots, and spot annotation information, along with optional anatomical structure data. Users must provide an expression matrix in TSV or TSV-based ZIP format, spot coordinates in a 4-column TSV file, and metadata-annotated spot information in TSV format.
To illustrate its potential, SMDB uses the Ortiz dataset, which comprises 75 coronal sections of a hemisphere of the whole brain, alongside matching H&E-stained images and reference outlines (9). Additionally, SMDB allows users to load tissue outlines and offers convenient email notifications and workspace management for an enhanced user experience. Additional user data requirements and workspace usage can be found on the SMDB website.
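The three required inputs can be checked for consistency before upload. The following is a minimal sketch using only the Python standard library. The column names in `COORD_COLUMNS`, the assumption that the expression matrix has genes as rows and spots as columns, and the function names are all hypothetical: the text above only specifies the file types and that the coordinate file has four columns.

```python
import csv
import io

# Hypothetical column names: the source only states that the file has 4 columns.
COORD_COLUMNS = ["spot_id", "x", "y", "z"]


def read_tsv(text):
    """Parse TSV text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text), delimiter="\t"))


def check_inputs(expr_text, coord_text, annot_text):
    """Return the sorted spot IDs present in all three tables, or raise ValueError."""
    expr = read_tsv(expr_text)
    coords = read_tsv(coord_text)
    annot = read_tsv(annot_text)

    # Assumed layout: first row is a header, first column holds gene names,
    # remaining header cells are spot IDs.
    expr_spots = set(expr[0][1:])

    for row in coords[1:]:
        if len(row) != len(COORD_COLUMNS):
            raise ValueError(f"coordinate row needs 4 columns: {row}")
        for value in row[1:]:
            float(value)  # raises ValueError if a coordinate is not numeric

    coord_spots = {row[0] for row in coords[1:]}
    annot_spots = {row[0] for row in annot[1:]}

    shared = expr_spots & coord_spots & annot_spots
    if not shared:
        raise ValueError("no spot ID is present in all three tables")
    return sorted(shared)
```

A check like this catches mismatched spot identifiers early, which is a common cause of empty or partial views when separate tables are joined by spot ID.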

3D structure reconstruction
When dealing with ST datasets that include a reference anatomy atlas, the reconstruction process involves two crucial steps: 2D image registration and 3D reconstruction. Initially, each 2D section image is registered to a 3D stereotactic atlas such as the Allen CCFv3 (20), and the registered results serve as the foundation for 3D reconstruction.
However, if the ST datasets lack a reference anatomy atlas, users can perform 3D reconstruction directly using molecular morphological features, spatial features of spots and other similar features. The registration process for ST datasets usually involves three steps: locating the 2D slice position in the stereotactic atlas, re-slicing the reference slice image and registering the 2D slice to the reference slice image. There are currently three main types of registration methods: manual (21), semi-automatic (22), and automatic (23-25). The accuracy of these methods varies, with manual methods achieving up to 25 µm accuracy, while semi-automated or automated methods typically range between 50 and 100 µm. Manual correction is often necessary to further improve registration accuracy, so registration cannot be fully automated, especially for studies requiring fine subregion segmentation of the brain. To aid users with registration, we have provided a detailed description of the registration process on the HELP page of SMDB. This allows users to perform registration locally and upload high-quality results to the browser for automatic 3D point cloud reconstruction.
The 3D reconstruction process contains three steps: noise removal, point cloud reconstruction, and smoothing. Before 3D reconstruction, we utilize the DBSCAN method (19) to eliminate noisy points through density-based clustering. This technique starts with a core point and continuously expands into areas of consistent density. Noise points, being more dispersed, cannot be assigned to any cluster and are consequently removed from the data. We then evaluate popular 3D reconstruction algorithms, including Convex Hull, Alpha Shape, Ball Pivoting, and Poisson Surface (refer to Supplementary Methods). Ultimately, we select the highly efficient Alpha Shape algorithm (18) for point cloud reconstruction, as it effectively captures the external contours of the point cloud to facilitate 3D reconstruction. As the Alpha Shape algorithm forms triangular angles at the three intersecting points of the point cloud during contour formation, we further smooth the outer contours using standard low-pass filters in signal processing (26) in vtk. To evaluate the accuracy of our reconstruction, we utilized the Ortiz dataset (9) to reconstruct different regions such as the hippocampal region and fiber tracts. Our 3D reconstructions exhibited high accuracy in matching the anatomical morphology, as detailed in the Supplementary Methods.
Nucleic Acids Research, 2023, Vol. 51, Web Server issue W555
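The noise-removal step described above can be illustrated with a small, self-contained DBSCAN written in pure Python. This is a sketch of the general algorithm, not SMDB's implementation: the parameter values `eps` and `min_pts` are not given in the text and must be chosen per dataset, and the function names are hypothetical. The brute-force neighbor search is quadratic and only suitable for small examples.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster index, or NOISE if it joins no cluster."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep expanding
    return labels


def remove_noise(points, eps, min_pts):
    """Keep only the points that DBSCAN assigns to some cluster."""
    labels = dbscan(points, eps, min_pts)
    return [p for p, label in zip(points, labels) if label != NOISE]
```

In practice a library implementation with a spatial index would be preferable for full datasets; the sketch only shows why isolated points, which never reach `min_pts` neighbors, end up labeled as noise and are dropped before surface reconstruction.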

Reference anatomy atlas collection
Given the widespread use of ST technology in neuroscience, SMDB has integrated the latest version of the Allen Mouse Brain Common Coordinate Framework (CCFv3) (20), which includes 43 isocortical areas and their layers, 329 subcortical gray matter structures, 81 fiber tracts, and 8 ventricular structures. With SMDB, users can load reference anatomical structures into the workspace and accurately map different types of data into a common 3D space for comparison and correlation. This allows for a flexible and comprehensive approach to exploring tissue molecular landscapes and gaining insights into the complex interactions between gene expression and spatial context in the mouse brain.

2D visualization
SMDB enables researchers to spatially integrate molecular data with morphological information in 2D space. Users can visualize spatially resolved transcriptomics data alongside references, such as Allen's mouse brain anatomy atlas borders (Figure 1A), or in situ hybridization (ISH) images to align molecular data with morphological regions (Figure 1B). SMDB highlights gene expression-profiled boundaries, revealing molecular characteristics within or between different regions. Employing supervised or unsupervised clustering analysis, researchers can segment regions or subregions to investigate tissue composition (Figure 1C).
Users can manually align morphology and spots by overlaying and scaling images freely. Using the lasso tool, 2D tissue sections can be segmented by interactively selecting regions of interest based on morphological information (Figure 1D). SMDB supports filtering spots in highlighted areas for subsequent annotation correlation, grouping statistics, leading edge analysis, and more.
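Selecting spots inside a freehand lasso outline amounts to a point-in-polygon test. The sketch below uses the standard ray-casting rule in pure Python. It is an illustration of the general technique rather than SMDB's frontend code, and the function names and the `spots` dictionary layout are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count how many polygon edges a rightward ray from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for k in range(n):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def lasso_select(spots, polygon):
    """Return IDs of spots whose (x, y) position falls inside the lasso polygon.

    spots: dict mapping spot ID to an (x, y) tuple (hypothetical layout).
    polygon: list of (x, y) vertices in drawing order; it may be non-convex.
    """
    return [sid for sid, (x, y) in spots.items() if point_in_polygon(x, y, polygon)]
```

Because the test only counts edge crossings, it works for the arbitrary, non-convex shapes a user might draw; points lying exactly on an edge may fall on either side.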

3D visualization
Various strategies for ST data visualization can be applied in 3D space as a continuous superposition of information from multiple adjacent 2D slices in a third dimension (axis Z). SMDB enables the combined display of anatomical morphological features, molecular morphological features, spatial features of spots, and cell type features in the same workspace to interactively explore gene expression maps and the spatial distribution of cell types in tissues in 3D space (Figure 2).
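The superposition of adjacent 2D slices along Z can be pictured as assigning each section a depth and merging the spots into one point cloud. The sketch below assumes uniformly spaced sections and a hypothetical `spacing` parameter; real datasets may instead carry an explicit Z coordinate per spot, in which case that value should be used directly.

```python
def stack_sections(sections, spacing):
    """Merge per-section 2D spot positions into a single 3D point cloud.

    sections: list of sections ordered along Z, each a list of (x, y) tuples.
    spacing: assumed constant distance between consecutive sections.
    """
    cloud = []
    for index, spots in enumerate(sections):
        z = index * spacing  # depth of this section along the Z axis
        cloud.extend((x, y, z) for x, y in spots)
    return cloud
```

The resulting list of `(x, y, z)` tuples is the kind of input a 3D renderer or a point cloud reconstruction step would consume.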
The introduction of 3D spatial morphological reference contours allows for more accurate localization of the spatial position of different cell types and gene expression in tissues (Figure 2). By freely rotating the view and positioning molecular classification markers, SMDB enables users to reveal hidden patterns and molecular features that are easily missed in a 2D perspective.
Unlike existing spatial transcriptome visualization tools that only align and stack 2D sections for display, SMDB includes a 3D reconstruction module that uses pure 3D tissue volume reconstruction technology to interactively display tissue morphology obtained by reconstructing tissue segmentation represented by anatomical features based on image slices, as well as molecular morphology obtained by features based on ST data. The reconstruction algorithm, optimized based on point cloud reconstruction, not only converts 3D coordinate point features into solid morphological features but also preserves morphological details with fine tuning (Figure 2A). Certain morphological features were concentrated in distinct anatomical regions (Figure 2B); for example, HIPsub-10 was prevalent in DG-po, HIPsub-09 in DG-sp, and HIPsub-05 in CA1sp. Some HIPsubs further divided anatomical structures into molecular clusters, such as HIPsub-02, HIPsub-04 and HIPsub-07, which constituted DG-mo. Others encompassed multiple regions, like HIPsub-01, while some necessitated further exploration as they did not clearly correspond to a specific region. This method connects the physical structure of biological systems to molecular characteristics, providing researchers with a comprehensive understanding of tissue molecular landscapes.
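The correspondence between molecular clusters and anatomical regions described above can be summarized by counting, for each cluster, which region contains most of its spots. The following sketch shows one way to compute this in pure Python. The function name and input layout are hypothetical, and it is not presented as the analysis used in the paper.

```python
from collections import Counter, defaultdict


def dominant_regions(spot_labels):
    """Find the most frequent anatomical region for each molecular cluster.

    spot_labels: iterable of (cluster, region) pairs, one pair per spot.
    Returns a dict mapping cluster to (region, fraction of the cluster's spots).
    """
    counts = defaultdict(Counter)
    for cluster, region in spot_labels:
        counts[cluster][region] += 1

    result = {}
    for cluster, regions in counts.items():
        region, n = regions.most_common(1)[0]
        result[cluster] = (region, n / sum(regions.values()))
    return result
```

A fraction close to 1 suggests a cluster is concentrated in a single region, while a low fraction flags clusters that span several regions or do not map cleanly to any one structure.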

CASE STUDY
The dorsal striatum (DS) in rodents is a single mass of gray matter often referred to as the caudate-putamen complex (27), and comprises two functionally distinct regions: the dorsolateral striatum (DLS), responsible for integrating sensorimotor information, and the dorsomedial striatum (DMS), involved in processing associative information (28-31). However, the morphological boundary of the DLS and DMS remains to be clearly defined.
Using SMDB, we identified two primary subregions of the DS in the striatum, based on molecular subtypes from the Ortiz mouse brain molecular atlas (9). We analyzed the differentially expressed genes (DEGs) of these two subregions and discovered Crym and Cnr1 as typical markers for DMS and DLS, corresponding to specific markers of two distinct medium spiny neuron (MSN) subtypes (32). We then delineated the morphological subregions of these two genes, overlaying Allen's mouse brain anatomy atlas borders on each slice. Subsequently, we utilized the 3D reconstruction tool in SMDB to create separate 3D representations of the DMS and DLS regions (Figure 3A).
We extracted the spots of each region in SMDB and explored the molecular differences between the DLS and DMS morphological regions. We identified 14 ISH-supported regional signature genes (Figure 3B, Supplementary Table 1), including Coch, Kcnk2 and Calb1, which have been previously reported to be associated with DLS and DMS regions (33-37). SMDB offers insights into the morphological characterization of the DLS and DMS subregions through a combination of molecular and morphological approaches.
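Comparing expression between two sets of extracted spots can be sketched as a per-gene log2 fold change of mean expression. The text does not state which differential expression method was used, so the code below is only a crude illustration and not a statistical test. The pseudocount, function names and input layout are assumptions.

```python
import math


def log2_fold_change(expr_a, expr_b, pseudocount=1.0):
    """log2 ratio of mean expression in region A over region B.

    The pseudocount (an assumed value) avoids division by zero for unexpressed genes.
    """
    mean_a = sum(expr_a) / len(expr_a)
    mean_b = sum(expr_b) / len(expr_b)
    return math.log2((mean_a + pseudocount) / (mean_b + pseudocount))


def rank_genes(region_a, region_b):
    """Rank genes shared by both regions from most A-enriched to most B-enriched.

    region_a, region_b: dicts mapping gene name to a list of per-spot expression values.
    """
    shared = region_a.keys() & region_b.keys()
    scores = {g: log2_fold_change(region_a[g], region_b[g]) for g in shared}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Genes at the top of the ranking are candidates enriched in the first region and those at the bottom in the second; any real analysis would add a significance test and multiple-testing correction.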

Comparison with other tools
Several tools are available for visualizing ST data combined with 2D morphological information. Table 1 compares SMDB with several well-known tools supporting ST data visualization, including Giotto (16), spatialLIBD (15), STUtility (17), ST-Viewer and Cirrocumulus (Table 1, Supplementary Table 2). While most of these offline tools cover 2D spatial visualization functions (raw data), SMDB stands out as the only visualization tool that can cover embedded 3D structure, customized outline, clustering selection, and 3D reconstruction. What's more, SMDB is web-based and runs directly in the browser, eliminating the need for a local installation process. Its compatibility with different ST technologies and incorporation of the Allen Mouse Brain Common Coordinate Framework (CCFv3) further expands its application in neuroscience research.
[Figure caption fragment: ... slices form of ST image; (C) color-highlighted ST data corresponding to molecular clusters and aligned with Allen's mouse brain anatomy atlas borders; (D) a manually selected subregion highlighted based on background morphological information.]

DISCUSSION
SMDB is a versatile tool that empowers researchers to interpret cell composition and gene expression in tissues in both 2D and 3D space. Users can visualize molecular expression information with reference atlases, morphological segmented sections without a reference atlas, or customized partition information using a lasso tool for challenging annotations.
In the future, SMDB aims to expand its capabilities as an online visualization web service by enhancing its functionality to include more sophisticated analyses based on spatial information, such as cell type annotation, spatial domain identification, and cell-cell communication. By integrating high-throughput proteomics approaches with spatial transcriptomics, SMDB plans to improve compatibility with both transcriptome and protein expression, further strengthening the joint analysis and visualization of multiomics data. Additionally, the tool will continue to incorporate research on reference spatial structures for various tissues and organs, leveraging reconstructed 3D tissue morphology based on stained images and molecular expression information to facilitate a deeper understanding of tissue structure and function for researchers.
In summary, SMDB is an innovative and powerful tool that enables the exploration of molecular landscapes in tissues across 2D and 3D spaces. Its adaptability and applicability in multiple spatial omics fields make it an invaluable resource for researchers.

DATA AVAILABILITY
SMDB is available at https://www.biosino.org/smdb. The application is free and open to all users with no login requirement.

SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.