In this demo, we will analyze a 1k-plex CosMx data (Bruker) of a mouse brain sample; see Chapter 6. Following quality control and preprocessing, we will perform cell type annotation, identification of marker genes, and cell-cell interaction analysis.
We can have an overview of the data, by plotting the spatial distribution of the cells, represented by their centroids. This is useful to have a global bird’s-eye-view of the tissue.
Unlike, Xenium and MERSCOPE, CosMx does not stitch fields of view (FOVs) in a single image, but rather treats each FOV independently. It is therefore useful to visualize a map of the FOVs to understand their spatial arrangement.
Chapter 18 gives a thorough overview of quality control steps for imaging-based spatial transcriptomics data, including CosMX-specific metrics. Here, we will simply compute SpaceTrooper’s aggregated QC score and keep the cells that pass the threshold.
Note that the annotation performed here is a transfer label from a reference dataset that only partially matches with the tissue under study and purposely annotated at a coarse level (level 1). Ideally, one can use a more appropriate reference dataset, or better yet, build a custom reference from single-cell RNA-seq data of the same tissue.
A different, often preferable, approach consists in using, in addition to the gene expression profiles, the image data and the spatial context of the cells. InSituType(Danaher et al. 2022) provides a framework to perform such analysis in unsupervised, semi-supervised, or supervised manners. Please refer to its documentation for more details.
24.4.1 Marker genes
Here, we identify cluster-related spatially-variable genes using the DESpace package (Cai, Robinson, and Tiberi 2024). In this workflow, we use this as a way to identify marker genes for the cell types annotated above.
We focus on the genes with the smallest p-values, and compute the average expression of each gene in each cell type. We can visualize this with a heatmap.
Code
df<-res$gene_resultsgs<-df$gene_id[rank(df$PValue, ties.method="min")==1]mu<-t(apply(counts(sub)[gs, ], 1, tapply, sub$SingleR_label, mean))pheatmap(t(mu), scale="column", show_colnames=FALSE, cluster_rows=TRUE, cluster_cols=TRUE, main="Average expression of top SVGs per cell type")
We can also visualize the spatial distribution of some of these genes, e.g., Mbp that marks oligodendrocytes, and Calm1, Calm2, and Snap25 that mark different subsets of neurons. Note also how some genes show a difference of expression within the pyramidal neuron population, suggesting the presence of subtypes.
We start by creating a cell neighborhood graph, based on a distance threshold of 50 microns. To do so, we use the SpatialFeatureExperiment package (Moses et al. 2023).
Note that an alternative approach uses the \(k\) nearest neighbors (kNN) method to define the neighborhood graph. Here, we prefer to use a distance-based approach, as it better captures the morphological structure of this tissue area.
Code
# convert to SFE & remove cells with NA labelssfe<-as(sub, "SpatialFeatureExperiment")sfe<-sfe[, !is.na(sfe$SingleR_label)]# this is needed for Voyager's plotting functionscolnames(sfe$polygons)[6]<-"geometry"st_geometry(sfe$polygons)<-"geometry"colGeometry(sfe, "cellSeg")<-sfe$polygonscolGraph(sfe, "poly2nb")<-findSpatialNeighbors(sfe, type="cellSeg", method="poly2nb", style="W", snap=50)plotColGraph(sfe, colGraphName="poly2nb", colGeometryName="cellSeg")+theme_void()
Given the neighborhood graph, we can calculate local measures of spatial autocorrelation for specific genes in the dataset.
Here, we focus on Snap25 and Calm1, neuronal marker genes identified above that show interesting spatial expression distributions (see above). First, we compute the local Moran’s I coefficient.
From the plots, we can see that Snap25 shows positive spatial autocorrelation in a very localized part of the tissue, while Calm1 shows a more diffuse autocorrelation structure.
We focus on Snap25 and visualize its autocorrelation structure using a Moran scatter plot, which shows the relationship between the expression of the gene in each cell and the average expression in its neighbors.
In this case, we can see a significant positive autocorrelation in the lower left part of the structure, while the rest of the tissue does not show significant autocorrelation.
Note that other local spatial autocorrelation measures, such as Geary’s C coefficient and Getis-Ord statistic, as well as their global versions can be used for similar analyses Moses et al. (2023).
24.6 Cell-cell interactions
24.6.1 Joint counts
We can also use the neighborhood graph to investigate cell-cell interactions between different cell types annotated above. Here, we use the join count statistic implemented in the spdep package to quantify the tendency of cells of different types to be neighbors.
We can see that there is a strong tendency for cells of the same type to be neighbors (high positive z-values). There are also some interesting negative interactions between different cell types: e.g., pyramidal CA1 tend to be away from oligodendrocytes and astrocytes. This highlights the compartimentalized structure of the brain tissue.
24.6.2 Point-pattern analysis
A similar result can be obtained using point-pattern analysis, such as Ripley’s K function, and similar methods, using for instance the spatialFDA package.
Here, we use Besag’s L function to quantify the interaction between pyramidal CA1 neurons, and the interaction between them and oligodendrocytes (cross-L function).
Other methods are available, such as Ripley’s K function, and the pair correlation function (Emons et al. 2025). Note that here we are making the quite strong assuption of homogeneity of the point pattern, which is irrealistic in tissue biology. In particular, CSR is a naïve null model that does not take into account tissue structure and cell density variations across the tissue. Hence, these plots should be considered as exploratory analyses rather than formal hypothesis tests.
In this type of plots, the dashed line represents the theoretical value of the metric under complete spatial randomness (CSR). Values above this line indicate clustering, while values below indicate dispersion. Here, we can see that pyramidal CA1 neurons tend to cluster together, as expected.
Note that the L-function lays below the CSR line for the first 50 microns, which is a consequence of the cell segmentation that does not allow cells to overlap, indicating the impossibility of finding two cells at a distance smaller than their size.
While in the plot above we focused on a single cell type, we can also investigate the interaction between two different cell types, e.g., pyramidal CA1 neurons and oligodendrocytes, using the cross-L function.
In this case, we can see that pyramidal CA1 neurons and oligodendrocytes tend to be spatially segregated, as indicated by the cross-L function being below the CSR line.
We only briefly discussed cell-cell interactions and spatial exploratory analyses here: we refer interested readers to the comprehensive documentation and tutorials available at the pasta and Voyager websites.
24.7 Appendix
References
Cai, Peiying, Mark D Robinson, and Simone Tiberi. 2024. “DESpace: Spatially Variable Gene Detection via Differential Expression Testing of Spatial Clusters.”Bioinformatics 40 (btae027, 2). https://doi.org/10.1093/bioinformatics/btae027.
Danaher, Patrick, Edward Zhao, Zhi Yang, David Ross, Mark Gregory, Zach Reitz, Tae K Kim, et al. 2022. “Insitutype: Likelihood-Based Cell Typing for Single Cell Spatial Transcriptomics.”BioRxiv, 2022–10.
Emons, Martin, Samuel Gunz, Helena L. Crowell, Izaskun Mallona, Reinhard Furrer, and Mark D. Robinson. 2025. “Harnessing the Potential of Spatial Statistics for Spatial Omics Data with Pasta.”Nucleic Acids Research 53 (17): gkaf870. https://doi.org/10.1038/s41467-022-28020-5.
Moses, Lambda, Pétur Helgi Einarsson, Kayla Jackson, Laura Luebbert, A. Sina Booeshaghi, Sindri Antonsson, Nicolas Bray, Páll Melsted, and Lior Pachter. 2023. “Voyager: Exploratory Single-Cell Genomics Data Analysis with Geospatial Statistics.”bioRxiv. https://doi.org/10.1101/2023.07.20.549945.
Zeisel, Amit, Ana B Muñoz-Manchado, Simone Codeluppi, Peter Lönnerberg, Gioele La Manno, Anna Juréus, Sueli Marques, et al. 2015. “Cell Types in the Mouse Cortex and Hippocampus Revealed by Single-Cell RNA-Seq.”Science 347 (6226): 1138–42.