Skip to content

Banksy spatial clustering principles installation code examples and interpretation

Author: Jin Wen
Time: 4 min
Words: 657 words
Updated: 2026-01-26
Reads: 0 times
Spatial-seq Analysis Guide

Introduction

NOTE

This document introduces the principles, installation, usage, and result interpretation of the Banksy (BANKSY) spatial clustering R package, suitable for cell type identification and spatial domain segmentation in single-cell spatial transcriptomics data.

Principle Overview

IMPORTANT

Banksy is a spatial omics clustering method that enhances each cell's feature vector by integrating the mean expression of its spatial neighbors and spatial gradient features, thereby improving clustering. This approach can:

  • Improve cell type assignment in noisy data.
  • Distinguish subtle cell subtypes influenced by the microenvironment.
  • Identify spatial domains with similar microenvironments.

Banksy is applicable to various spatial transcriptomics technologies (e.g., SeekSpace, Slide-seq, MERFISH, CosMX, etc.) and is scalable to large datasets.

Image: Conceptual illustration of the Banksy spatial clustering method

Data Preparation

Example Data

The official Banksy R package provides mouse hippocampus spatial transcriptomics data for direct testing.

r
# Load example data
library(Banksy)
data(hippocampus)
gcm <- hippocampus$expression
locs <- as.matrix(hippocampus$locations)

User Data

  • Expression matrix (genes × cells/spots).
  • Spatial coordinates matrix (cells/spots × 2).

Environment Installation

IMPORTANT

Requires R >= 4.4.0. Bioconductor installation is recommended.

r
# Install via Bioconductor
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install('Banksy')

# Or install the latest version from GitHub
if (!requireNamespace("remotes", quietly = TRUE))
    install.packages("remotes")
remotes::install_github("prabhakarlab/Banksy")

TIP

It is recommended to also install SpatialExperiment, scuttle, scater, ggplot2, and cowplot for data processing and visualization.

r
BiocManager::install(c("SpatialExperiment", "scuttle", "scater"))
install.packages(c("ggplot2", "cowplot"))

Example Workflow

r
# Load required packages
library(Banksy)
library(SpatialExperiment)
library(scuttle)
library(scater)
library(ggplot2)
library(cowplot)

# Construct SpatialExperiment object
se <- SpatialExperiment(assay = list(counts = gcm), spatialCoords = locs)

# Quality control and normalization
qcstats <- perCellQCMetrics(se)
thres <- quantile(qcstats$total, c(0.05, 0.98))
keep <- (qcstats$total > thres[1]) & (qcstats$total < thres[2])
se <- se[, keep]
se <- computeLibraryFactors(se)
assay(se, "normcounts") <- normalizeCounts(se, log = FALSE)

# Compute Banksy neighborhood features
lambda <- c(0, 0.2)
k_geom <- c(15, 30)
se <- Banksy::computeBanksy(se, assay_name = "normcounts", compute_agf = TRUE, k_geom = k_geom)

# Dimensionality reduction and clustering
set.seed(1000)
se <- Banksy::runBanksyPCA(se, use_agf = TRUE, lambda = lambda)
se <- Banksy::runBanksyUMAP(se, use_agf = TRUE, lambda = lambda)
se <- Banksy::clusterBanksy(se, use_agf = TRUE, lambda = lambda, resolution = 1.2)
se <- Banksy::connectClusters(se)

# Visualization
cnames <- colnames(colData(se))
cnames <- cnames[grep("^clust", cnames)]
colData(se) <- cbind(colData(se), spatialCoords(se))
plot_nsp <- plotColData(se, x = "sdimx", y = "sdimy", point_size = 0.6, colour_by = cnames[1])
plot_bank <- plotColData(se, x = "sdimx", y = "sdimy", point_size = 0.6, colour_by = cnames[2])
plot_grid(plot_nsp + coord_equal(), plot_bank + coord_equal(), ncol = 2)

Parameter Description

ParameterDescription
lambdaWeight of spatial neighborhood features; 0 for non-spatial, 0.2 recommended for cell typing.
k_geomNumber of spatial neighbors, recommended 15 and 30.
use_agfWhether to use neighborhood mean and gradient features.
resolutionClustering resolution, affects the number of clusters.
assay_nameName of the expression matrix used (e.g., normcounts).

WARNING

  • The recommended lambda value may differ for datasets with different spatial resolutions; see the official documentation for details.
  • For large datasets, high-performance computing resources are recommended.

Main Results Interpretation

  • Clustering labels: The clust_* columns in colData(se) correspond to clustering results under different lambda values.
  • Dimensionality reduction: UMAP and other coordinates are stored in reducedDims(se).
  • Visualization: Use plotColData, plotReducedDim, etc., to visualize spatial clustering.

NOTE

Banksy clustering results can be used for downstream spatial subtype analysis, domain identification, microenvironment studies, etc.

References

0 comments·0 replies