SeekSpace Spatial Transcriptomics Data Analysis Tutorial (Seurat / Mouse Brain)
Background Information
This document explains the data obtained from SeekGene SeekSpace technology and its analysis method using Seurat.
SeekGene SeekSpace technology can detect gene expression data at single-cell resolution while also locating the spatial coordinates of each cell within the tissue.
Data analysis for SeekSpace is straightforward and compatible with common single-cell transcriptomics analysis software such as Seurat and Scanpy.
The demo dataset is a mouse brain section profiled with SeekSpace technology. It contains a single-cell transcriptome matrix of 32,758 cells, a spatial coordinate matrix, a DAPI staining image of the tissue, and H&E images.
File Description
The companion basic data processing software for SeekSpace technology is SeekSpace® Tools, which identifies cell expression information from sequencing libraries and locates the spatial position of each cell.
SeekSpace® Tools produces the following output files:
├── WTH1092_filtered_feature_bc_matrix # Expression matrix directory, can be read using Seurat's Read10X command
│ ├── barcodes.tsv.gz
│ ├── features.tsv.gz
│ ├── matrix.mtx.gz
│ └── cell_locations.tsv.gz # Spatial coordinate file for the cells in the mouse brain sequencing data. Column 1 is the barcode, in the same order as filtered_feature_bc_matrix/barcodes.tsv.gz; columns 2 and 3 are the spatial positions (pixel coordinates on the spatial chip) of the cell identified by that barcode.
├── WTH1092_aligned_DAPI.png # DAPI staining image of mouse brain tissue section
├── WTH1092_aligned_HE.png
├── WTH1092_aligned_HE_TIMG.png
├── seekspace_of_Seurat.ipynb # Jupyter example file for analyzing this mouse brain spatial data using Seurat
└── seekspace_of_scanpy.ipynb # Jupyter example file for analyzing this mouse brain spatial data using Scanpy
One pixel in SeekSpace technology corresponds to approximately 0.2653 micrometers. Multiplying pixel coordinates by 0.2653 converts them to physical distances in micrometers.
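As a quick illustration of this conversion, the sketch below scales a few made-up pixel coordinates to micrometers and computes the physical distance between two cells. The coordinate values and the names px_to_um and um_per_pixel are hypothetical; only the 0.2653 factor comes from the text above.

```r
# Approximate pixel size stated above (micrometers per pixel)
um_per_pixel <- 0.2653

# Hypothetical helper: convert pixel coordinates to micrometers
px_to_um <- function(px, scale = um_per_pixel) {
  px * scale
}

# Made-up pixel coordinates for two cells (not taken from the dataset)
coords_px <- matrix(
  c(1000, 2000,
    4000, 6000),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("cell_a", "cell_b"), c("spatial_1", "spatial_2"))
)

coords_um <- px_to_um(coords_px)

# Euclidean distance between the two cells in micrometers
dist_um <- sqrt(sum((coords_um["cell_a", ] - coords_um["cell_b", ])^2))
print(coords_um)
print(dist_um)
```

The same scaling could be applied to the whole coordinate matrix when distances or scale bars need to be reported in micrometers rather than pixels.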
library(Seurat)
library(tidyverse)
library(ggplot2)
Create Seurat Object
Read the SeekSpace expression matrix and create a Seurat object using the standard Seurat functions Read10X and CreateSeuratObject.
mouse_brain.data <- Read10X('./Outs/WTH1092_filtered_feature_bc_matrix')
mouse_brain <- CreateSeuratObject(counts = mouse_brain.data, project = 'mouse_brain')
Dimensionality Reduction and Clustering
Here, common default parameters are used for dimensionality reduction and clustering. In actual analysis, modifications can be made according to sample characteristics.
mouse_brain <- NormalizeData(mouse_brain, normalization.method = "LogNormalize", scale.factor = 10000) %>%
  FindVariableFeatures(selection.method = "vst", nfeatures = 2000) %>%
  ScaleData() %>%
  RunPCA() %>%
  FindNeighbors(dims = 1:15) %>%
  FindClusters(resolution = seq(0.2, 1.4, 0.3)) %>%
  RunUMAP(dims = 1:15)
Idents(mouse_brain) <- 'RNA_snn_res.0.8'
Add Spatial Coordinates
Next, we use Seurat's CreateDimReducObject function to add spatial coordinates to each cell in the Seurat object.
Note:
The order of cells in the spatial coordinate matrix passed to CreateDimReducObject must match the order of cells in the meta.data of mouse_brain. Otherwise, cells and coordinates will be mismatched during multi-sample integration analysis.
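To make the ordering requirement concrete, here is a minimal base-R sketch on toy data. The barcodes and coordinates are made up, and the variable names are hypothetical; it only illustrates the match()-based reordering idea used in this section, with an added check that every cell actually has coordinates.

```r
# Made-up barcodes standing in for the cell order of the Seurat object
cells_in_object <- c("AAAC", "AAAG", "AACT")

# Toy coordinate table whose rows are in a different order than the cells
coords <- matrix(
  c(30, 31,
    10, 11,
    20, 21),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("AACT", "AAAC", "AAAG"), c("spatial_1", "spatial_2"))
)

# match() returns, for each cell in the object, its row index in coords
idx <- match(cells_in_object, rownames(coords))

# Stop early if any cell has no coordinates (match() would return NA)
stopifnot(!anyNA(idx))

coords_sorted <- coords[idx, , drop = FALSE]

# After reordering, row names line up one-to-one with the object's cells
stopifnot(identical(rownames(coords_sorted), cells_in_object))
print(coords_sorted)
```

Checking for NA indices before subsetting is a defensive choice: without it, a missing barcode would silently produce a row of NA coordinates.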
Here a single match() call reorders the spatial coordinate matrix to the cell order of the Seurat object.
# Read the coordinates; the first column (barcode) becomes the row names
spatial_df <- read.table('./Outs/WTH1092_filtered_feature_bc_matrix/cell_locations.tsv.gz', row.names = 1, sep = '\t', header = TRUE)
colnames(spatial_df) <- c("spatial_1","spatial_2")
spatial_matrix <- as.matrix(spatial_df)
# Reorder the rows to match the cell order in the Seurat object's meta.data
spatial_matrix_sorted <- spatial_matrix[match(row.names(mouse_brain@meta.data),row.names(spatial_matrix)), ]
mouse_brain@reductions$spatial <- CreateDimReducObject(embeddings = spatial_matrix_sorted, key='spatial_', assay='RNA')
head(mouse_brain@reductions$spatial@cell.embeddings)
After this step, the Seurat object contains a new coordinate system named "spatial", which holds the spatial coordinates of each cell.
str(mouse_brain@reductions$spatial)
Add Tissue Image Information
Below we show how to add background images to the spatial data.
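The image extent recorded in the first step depends on the chip ID, as described in the comments of the code that follows. A small lookup helper is one way to avoid typing the numbers by hand. This is only a sketch: the function name chip_image_size and the example chip IDs are hypothetical, and the size values are copied from those comments.

```r
# Hypothetical helper: pick the image extent from the chip ID.
# The lookup values are the ones listed in this section's code comments.
chip_image_size <- function(chip_id) {
  n <- nchar(chip_id)
  stopifnot(n >= 2)
  # Second-to-last character of the chip ID
  key <- substr(chip_id, n - 1, n - 1)
  sizes <- list(
    A = c(size_x = 55128L, size_y = 19906L),
    B = c(size_x = 55050L, size_y = 19906L),
    C = c(size_x = 55050L, size_y = 19906L)
  )
  if (!key %in% names(sizes)) {
    stop("Unrecognized chip type letter: ", key)
  }
  sizes[[key]]
}

# "XXXXA1" is a made-up chip ID used only for illustration
size <- chip_image_size("XXXXA1")
print(size)
```

The returned values could then be assigned to size_x and size_y instead of hard-coding them.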
samplename = 'WTH1092'
size_x = 55128
size_y = 19906
# If the second to last letter of the chip ID is A, then size_x=55128, size_y=19906;
# If the second to last letter of the chip ID is B, then size_x=55050, size_y=19906;
# If the second to last letter of the chip ID is C, then size_x=55050, size_y=19906.
# (You can check the chip ID in the Summary Table of the samplename_report.html QC report)
mouse_brain@misc$info[[samplename]]$size_x = as.integer(size_x)
mouse_brain@misc$info[[samplename]]$size_y = as.integer(size_y)
Add DAPI Image
img = './Outs/WTH1092_aligned_DAPI.png'
# base64 format
img_64 = base64enc::dataURI(file = img)
mouse_brain@misc$info[[samplename]]$img = img_64
# png format
img_gg <- png::readPNG(img)
img_grob <- grid::rasterGrob(img_gg, interpolate = FALSE, width = grid::unit(1,"npc"), height = grid::unit(1, "npc"))
mouse_brain@misc$info[[samplename]]$img_gg = img_grob
Add H&E Image
img = './Outs/WTH1092_aligned_HE_TIMG.png'
# base64 format
img_64 = base64enc::dataURI(file = img)
mouse_brain@misc$info[[samplename]]$img_he = img_64
# png format
img_gg <- png::readPNG(img)
img_grob <- grid::rasterGrob(img_gg, interpolate = FALSE, width = grid::unit(1,"npc"), height = grid::unit(1, "npc"))
mouse_brain@misc$info[[samplename]]$img_he_gg = img_grob
str(mouse_brain@misc$info)
Cell Annotation Results
Next, we add the cell annotation results to each cell so we can see the distribution of different cell types in space.
The process of SeekSpace cell annotation is exactly the same as ordinary single-cell annotation.
We have previously annotated major groups and subgroups for the demo data, and the annotation results are in the annotation.csv file.
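Before attaching an annotation table, it can be worth confirming that it covers every cell in the object. The sketch below does this on toy data in base R. It assumes the annotation's row names are cell barcodes (which is what reading with row.names = 1 implies). The barcodes and the labels "Inh", "Ext_1", "Inh_1", and "Ext_2" are made up; the column names Main_CellType and Sub_CellType are the ones used by the plotting code later in this tutorial.

```r
# Made-up cell barcodes standing in for the cells of the Seurat object
cells <- c("AAAC", "AAAG", "AACT")

# Toy annotation table; row names are barcodes, as row.names = 1 would give
anno_toy <- data.frame(
  Main_CellType = c("Ext", "Inh", "Ext"),
  Sub_CellType  = c("Ext_1", "Inh_1", "Ext_2"),
  row.names     = c("AACT", "AAAC", "AAAG")
)

# Cells present in the object but missing from the annotation
missing_cells <- setdiff(cells, rownames(anno_toy))
stopifnot(length(missing_cells) == 0)

# Reorder the annotation to the object's cell order before inspecting it
anno_aligned <- anno_toy[cells, , drop = FALSE]
print(anno_aligned)
```

On real data, cells would come from the Seurat object (for example its meta.data row names) and anno_toy would be the table read from annotation.csv.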
Next, we simply read and process it.
anno <- read.csv("./annotation.csv",row.names = 1,header = TRUE)
mouse_brain <- AddMetaData(mouse_brain, anno)
head(mouse_brain)
saveRDS(mouse_brain,"WTH1092_demo_mouse_brain.rds")
Result Visualization
The processed Seurat object is saved as an rds file, and we can continue to use the standard plotting functions in the Seurat package.
Spatial Coordinate Plotting
# DimPlot on the UMAP embedding
options(repr.plot.height=7, repr.plot.width=7)
DimPlot(mouse_brain, reduction = 'umap')
options(repr.plot.height=10, repr.plot.width=10)
FeaturePlot(mouse_brain, reduction = 'umap', features=c('Mbp','Mobp', 'Olig1','Plp1'), pt.size = 1)
When we set the reduction parameter of the DimPlot function to "spatial", cells are plotted at their spatial positions.
options(repr.plot.height=7, repr.plot.width=15)
DimPlot(mouse_brain, reduction = 'spatial',pt.size = 1)
options(repr.plot.height=7, repr.plot.width=15)
DimPlot(mouse_brain, reduction = 'spatial',pt.size = 1,group.by = "Sub_CellType")
Similarly, when using the FeaturePlot function, we can set the reduction parameter to "spatial" to view gene expression in spatial positions.
options(repr.plot.height=10, repr.plot.width=20)
FeaturePlot(mouse_brain, reduction = 'spatial', features = c('Mbp','Mobp', 'Olig1','Plp1'),pt.size = 0.7)
SeekSpace data has good compatibility with Seurat, and functions like subsetting work well.
options(repr.plot.height=7, repr.plot.width=15)
sub_mouse_brain <- subset(mouse_brain, Main_CellType == 'Ext')
DimPlot(sub_mouse_brain, reduction = 'spatial',pt.size = 1.5, group.by = "Main_CellType")
Plotting with DAPI/H&E Images
We define two functions, ImageSpacePlot and FeatureSpacePlot, to plot cell clustering information and continuous variables (such as gene expression) over the tissue image.
###################### Cell Clustering Plot Function
ImageSpacePlot = function(obj, group_by, type="DAPI", sample=names(obj@misc$info)[1], size=1, alpha=1, color=MYCOLOR){
  # Default palette. The default `color=MYCOLOR` works because R evaluates
  # default arguments lazily, after MYCOLOR has been defined here.
  MYCOLOR=c(
    "#6394ce", "#2a4c87", "#eed500", "#ed5858",
    "#f6cbc2", "#f5a2a2", "#3ca676", "#6cc9d8",
    "#ef4db0", "#992269", "#bcb34a", "#74acf3",
    "#3e275b", "#fbec7e", "#ec4d3d", "#ee807e",
    "#f7bdb5", "#dbdde6", "#f591e1", "#51678c",
    "#2fbcd3", "#80cfc3", "#fbefd1", "#edb8b5",
    "#5678a8", "#2fb290", "#a6b5cd", "#90d1c1",
    "#a4e0ea", "#837fd3", "#5dce8b", "#c5cdd9",
    "#f9e2d6", "#c64ea4", "#b2dfd6", "#dbdfe7",
    "#dff2ec", "#cce8f3", "#e74d51", "#f7c9c4",
    "#f29c81", "#c9e6e0", "#c1c5de", "#750000"
  )
  # Name of the stored background image grob for the requested type
  raster_type <- switch(type,
    HE = "img_he_gg",
    DAPI = "img_gg",
    stop("Invalid type. Must be 'HE' or 'DAPI'.")
  )
  # Combine the spatial coordinates with the grouping column
  spatial_coord1 <- as.data.frame(obj[[group_by]])
  colnames(spatial_coord1) <- group_by
  spatial_coord2 <- as.data.frame(obj@reductions$spatial@cell.embeddings)
  spatial_coord <- cbind(spatial_coord2, spatial_coord1)
  # Stretch the image over [0, size_x] x [0, size_y], then overlay the cells
  p <- ggplot2::ggplot() +
    ggplot2::annotation_custom(grob = obj@misc$info[[sample]][[raster_type]],
      xmin = 0, xmax = obj@misc$info[[sample]]$size_x,
      ymin = 0, ymax = obj@misc$info[[sample]]$size_y) +
    ggplot2::geom_point(data = spatial_coord,
      ggplot2::aes(x = spatial_1, y = spatial_2, color = !!sym(group_by), fill = !!sym(group_by)),
      size = size, alpha = alpha) +
    # Title the legend after the grouping column (the original set the unused size label)
    labs(color = group_by, fill = group_by) + guides(alpha = "none") +
    ggplot2::theme_classic() +
    scale_color_manual(values = color) + coord_fixed()
  return(p)
}
################### Gene Expression Plot Function
# alpha: range of point transparency mapped to the feature value (low, high)
# color: colors for the low and high ends of the gradient
FeatureSpacePlot = function(obj, feature, type="DAPI", sample=names(obj@misc$info)[1], size=1, alpha=c(1,1), color=c("lightgrey","blue")){
  # Name of the stored background image grob for the requested type
  raster_type <- switch(type,
    HE = "img_he_gg",
    DAPI = "img_gg",
    stop("Invalid type. Must be 'HE' or 'DAPI'.")
  )
  # Combine the spatial coordinates with the feature values
  spatial_coord1 <- as.data.frame(obj@reductions$spatial@cell.embeddings)
  spatial_coord2 <- FetchData(obj, feature)
  colnames(spatial_coord2) <- feature
  spatial_coord <- cbind(spatial_coord1, spatial_coord2)
  # Stretch the image over [0, size_x] x [0, size_y], then overlay the cells
  p <- ggplot2::ggplot() +
    ggplot2::annotation_custom(grob = obj@misc$info[[sample]][[raster_type]],
      xmin = 0, xmax = obj@misc$info[[sample]]$size_x,
      ymin = 0, ymax = obj@misc$info[[sample]]$size_y) +
    ggplot2::geom_point(data = spatial_coord,
      ggplot2::aes(x = spatial_1, y = spatial_2, color = !!sym(feature), alpha = !!sym(feature)),
      size = size) +
    labs(color = feature) +
    guides(alpha = "none") +
    ggplot2::theme_classic() +
    ggplot2::scale_alpha_continuous(range = alpha) +
    scale_color_gradient(low = color[1], high = color[2]) + coord_fixed()
  return(p)
}
# Cell clustering plot with DAPI background
options(repr.plot.height=7, repr.plot.width=15)
ImageSpacePlot(obj=mouse_brain, group_by = "Sub_CellType",type="DAPI",size=0.7)
# Cell clustering plot with H&E background
options(repr.plot.height=7, repr.plot.width=15)
ImageSpacePlot(obj=mouse_brain, group_by = "Sub_CellType",type="HE")
# Gene expression plot with DAPI background
options(repr.plot.height=7, repr.plot.width=15)
FeatureSpacePlot(obj=mouse_brain, feature="Hpca",type="DAPI")
# Gene expression plot with H&E background
options(repr.plot.height=7, repr.plot.width=15)
FeatureSpacePlot(obj=mouse_brain, feature="Hpca",type="HE")
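Both plotting functions above rely on the same mechanism: a raster grob is stretched over the rectangle from (0, 0) to (size_x, size_y) with annotation_custom, and the cells are drawn on top in the same pixel coordinate system. The following toy sketch isolates that mechanism without Seurat. The image values, extent, and cell coordinates are all made up, and it assumes only that ggplot2 (with grid) is installed.

```r
library(ggplot2)

# Made-up 2 x 3 grayscale "image"; values in [0, 1]
img <- matrix(c(0.2, 0.5, 0.8,
                0.9, 0.6, 0.3), nrow = 2, byrow = TRUE)
img_grob <- grid::rasterGrob(img, interpolate = FALSE,
                             width = grid::unit(1, "npc"),
                             height = grid::unit(1, "npc"))

# Hypothetical image extent in pixel units (stands in for size_x / size_y)
size_x <- 300
size_y <- 200

# Made-up cell coordinates in the same pixel units
cells <- data.frame(
  spatial_1 = c(50, 150, 250),
  spatial_2 = c(50, 100, 150),
  group     = c("a", "b", "a")
)

p <- ggplot() +
  # Stretch the raster over the full [0, size_x] x [0, size_y] rectangle
  annotation_custom(grob = img_grob,
                    xmin = 0, xmax = size_x, ymin = 0, ymax = size_y) +
  geom_point(data = cells,
             aes(x = spatial_1, y = spatial_2, color = group), size = 3) +
  coord_fixed() +  # equal units on both axes, so the image is not distorted
  theme_classic()

print(p)
```

Note that annotation_custom does not affect the axis limits, so the visible part of the image is determined by the range of the plotted points.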