Single-Cell Spatial Transcriptomics: Neighborhood Enrichment Analysis (based on CellCharter)

Author: SeekGene

Time: 15 min

Words: 2.9k words

Updated: 2026-06-05

Reads: 0 times

Spatial-seq Analysis Guide Notebooks Co-localization

1. Module Introduction

This module is based on the CellCharter algorithm and is designed for Neighborhood Enrichment Analysis of spatial transcriptomics data.

By constructing a spatial proximity network between cells/spots, CellCharter evaluates spatial colocalization patterns among different cell populations. It achieves the following core analysis objectives:

Single-Sample Cell Type Colocalization: Evaluates whether the spatial proximity between two cell populations (or states) within a tissue is significantly higher than random expectation, thereby revealing mutual attraction or repulsion between cells.
Multi-Sample Differential Neighborhood Enrichment Analysis: Compares significant changes in spatial cellular interactions across different biological conditions (e.g., healthy vs. disease, different groups).

💡 Note
The input for this pipeline is Seurat/Scanpy compatible format data. The analysis results will generate heatmaps and related tables displaying spatial cellular interactions.

Brief Algorithm Principles:

Network Construction: Constructs a node connection network based on spatial coordinates (e.g., via Delaunay Triangulation).
Calculate Enrichment Score (NE): Calculates the ratio of the observed number of connections (Observed) to the expected number of connections based on node degree (Expected).
Differential Analysis: Compares the significance of NE score differences between two groups using permutation testing.

2. Input File Preparation

This module requires files containing the expression matrix, spatial coordinates, and corresponding Metadata.

File Structure Example:

text

filtered_feature_bc_matrix/          # Output directory for rds to matrix conversion
├── matrix.mtx.gz                    # Expression matrix
├── barcodes.tsv.gz                  # Cell barcodes
├── features.tsv.gz                  # Gene features
└── cell_locations.tsv               # Spatial coordinates file

How to export required files for CellCharter from a Seurat object:

If you already have a Seurat object (.rds), you can use the following R code snippet to export the expression matrix and spatial coordinates into the folder structure shown above:

library(Seurat)
library(dplyr)
library(tibble)
library(readr)
library(Matrix)

# 1. Set input paths
rds_path <- "your_input.rds"
meta_path <- "your_meta.txt" 

# 2. Read data and integrate metadata
obj <- readRDS(rds_path)
meta <- read.delim(meta_path)
rownames(meta) <- meta$barcode
obj@meta.data <- meta

# 3. Create output directory
dir.create('filtered_feature_bc_matrix', showWarnings = FALSE)

# 4. Extract and save spatial coordinates
spatial_data <- Embeddings(obj, reduction = 'spatial')
spatial_data <- as.data.frame(spatial_data)
colnames(spatial_data) <- c('x', 'y')
spatial_data <- rownames_to_column(spatial_data, "Barcode")
write_delim(spatial_data, 'filtered_feature_bc_matrix/cell_locations.tsv', delim = '\t')

# 5. Extract and save expression matrix (MTX format)
writeMM(GetAssayData(obj, slot="counts"), file = 'filtered_feature_bc_matrix/matrix.mtx')

# 6. Save gene and barcode features
Gene <- data.frame(GeneID = rownames(obj), Gene = rownames(obj))
write.table(Gene, 'filtered_feature_bc_matrix/genes.tsv', row.names = FALSE, col.names = FALSE, sep = '\t', quote = FALSE)

barcode <- data.frame(Barcode = colnames(obj))
write.table(barcode, 'filtered_feature_bc_matrix/barcodes.tsv', row.names = FALSE, col.names = FALSE, sep = '\t', quote = FALSE)

3. Data Loading and Preprocessing

python

# --- Environment Preparation ---
import squidpy as sq
import cellcharter as cc
import scanpy as sc
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from itertools import combinations
import matplotlib.image as mpimg
import os
import math

output

/PROJ2/FLOAT/jinwen/apps/miniconda3/envs/cellcharter-env/lib/python3.9/site-packages/numba/core/decorators.py:246: RuntimeWarning: nopython is set for njit and is ignored
warnings.warn('nopython is set for njit and is ignored', RuntimeWarning)
/PROJ2/FLOAT/jinwen/apps/miniconda3/envs/cellcharter-env/lib/python3.9/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
from .autonotebook import tqdm as notebook_tqdm

python

# --- Input Parameter Configuration ---

## meta_path: Path to external metadata file
meta_path = "./rds/meta.tsv"

## col_sam: Column name representing the sample name in Metadata
col_sam = "Sample"

## sample_name: Sample names to analyze, separated by commas
sample_name = "25020230_nao_CS_expression"

## col_celltype: Column name representing cell types in Metadata
col_celltype = "RNA_snn_res.0.2"

## celltypes: List of cell types to focus on, separated by commas
celltypes = "0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19"

## col_group: Column name representing experimental groups in Metadata (for multi-sample differential analysis)
col_group = ""

python

# --- Parameter Parsing ---
sample_name=sample_name.strip().split(",")
celltypes=celltypes.strip().split(",")

python

# --- Data Loading and Coordinates Merging ---
adata = sc.read_10x_mtx("filtered_feature_bc_matrix/",cache=False) 

## Add coordinates information
spatial_df = pd.read_csv("filtered_feature_bc_matrix/cell_locations.tsv",index_col=0,sep='\t')
spatial_df1 = spatial_df[spatial_df.index.isin(adata.obs_names)]
adata.obsm["spatial"] = spatial_df1.values
meta_df = pd.read_csv(meta_path,index_col=0,sep='\t')
meta_df = meta_df.reindex(adata.obs_names)
adata.obs = pd.concat([adata.obs, meta_df], axis=1)

python

# --- Data Format Conversion ---
adata.obs[col_sam] = adata.obs[col_sam].astype('category')
adata.obs[col_celltype] = adata.obs[col_celltype].astype('category')

3.1 Construct Spatial Neighborhood Network

Before performing neighborhood enrichment analysis, cells must be connected into a "spatial neighborhood network graph" based on physical spatial coordinates.

Three common spatial network construction strategies: This tool provides three methods to construct the cell neighborhood network. Users can choose flexibly based on data characteristics (e.g., uniform cell density):

Delaunay Triangulation:
- Parameter setting: delaunay=True
- Principle: Connects spatial points to form a continuous triangular mesh; cells sharing a triangle edge are considered direct neighbors.
- Advantage: Highly robust to variations in cell density within the tissue, without needing to pre-guess the number of neighbors or distance thresholds.
k-Nearest Neighbors (k-NN):
- Parameter setting: n_neighs=K (e.g., n_neighs=6)
- Principle: Forces each cell to find the K nearest cells in spatial distance as neighbors.
- Advantage: Regular network structure with a fixed degree for each node.
- Disadvantage: If there are sparse cell regions in the tissue, it may force connections across large empty physical spaces between unrelated cells.
Radius-based:
- Parameter setting: radius=R (e.g., radius=50)
- Principle: Treats every cell within a radius R from each cell center as a neighbor.
- Advantage: Strictly based on physical distance thresholds.
- Disadvantage: If tissue cell density is highly uneven, cells in dense areas will have too many neighbors, while cells in sparse areas may have none (becoming isolated points).

Other Key Parameters Description:

coord_type='generic': Indicates the use of generic 2D continuous coordinates (suitable for SeekSpace, Xenium, etc.), rather than fixed hexagonal/square grids like Visium.
library_key=col_sam: This is a crucial parameter for multi-sample analysis. It instructs the algorithm to build networks independently within each sample (slice), strictly preventing cells from different slices from being erroneously connected as neighbors.

python

# --- Construct Spatial Neighborhood Network ---
# Build spatial neighbor graph based on spatial coordinates (default uses Delaunay Triangulation)
sq.gr.spatial_neighbors(adata, library_key=col_sam, coord_type='generic', delaunay=True)

4. Single-Sample Cell Type Colocalization

This section evaluates whether the spatial proximity between two cell populations (or states) within a single tissue slice is significantly higher or lower than expected from a random distribution.

By calculating the "Neighborhood Enrichment Score", we can quantitatively describe the spatial interaction patterns between different cell types:

Enrichment (Attraction): Certain cell types tend to cluster spatially, suggesting active cellular communication or synergistic execution of specific microenvironmental functions.
Depletion (Repulsion): Certain cell types deliberately maintain distance spatially, reflecting niche exclusivity or different developmental origins.

Figure Legend: The figure below displays the single-sample cell type colocalization neighborhood enrichment heatmap.
Rows: Source cell type $C_{i}$ ; Columns: Target cell type $C_{j}$ .
Color (Positive values): Indicates that cell populations $C_{i}$ and $C_{j}$ tend to be enriched spatially, i.e., more likely to be adjacent.
Color (Negative values): Indicates that cell populations $C_{i}$ and $C_{j}$ tend to be depleted (repelled) spatially, i.e., more likely to be separated.
Color (Zero): Indicates that the number of connections matches random expectation.
Interpretation: Redder/darker colors represent a stronger tendency for two cell types to colocalize/cluster spatially; bluer/cooler colors represent spatial repulsion.

python

# --- Data Format Conversion ---

for i in range(len(sample_name)):
    adata1 = adata[adata.obs[col_sam] == sample_name[i]]
    cc.gr.nhood_enrichment(
        adata1,
        cluster_key=col_celltype,
    )
    adata1 = adata1[adata1.obs[col_celltype].isin(celltypes)]
    adata1.obs[col_celltype] = adata1.obs[col_celltype].astype('category')
    #adata1.uns["FCN1_LGR5_2_nhood_enrichment"]["enrichment"] = adata1.uns["FCN1_LGR5_2_nhood_enrichment"]["enrichment"].drop(index="Others", columns="Others")

    adata1.uns[col_celltype+"_nhood_enrichment"]["enrichment"].columns = adata1.uns[col_celltype+"_nhood_enrichment"]["enrichment"].columns.astype(str)
    adata1.uns[col_celltype+"_nhood_enrichment"]["enrichment"].index = adata1.uns[col_celltype+"_nhood_enrichment"]["enrichment"].index.astype(str)
    adata1.uns[col_celltype+"_nhood_enrichment"]["enrichment"] = adata1.uns[col_celltype+"_nhood_enrichment"]["enrichment"].loc[celltypes,celltypes]
    cc.pl.nhood_enrichment(
        adata1,
        cluster_key=col_celltype,
        annotate=True,
        vmin=-0.1,
        vmax=0.1,
        figsize=(5,5),
        fontsize=7,
        save=sample_name[i]+"_"+col_celltype+"_enrichment_heatmap.png"
    )

output

/PROJ2/FLOAT/jinwen/apps/miniconda3/envs/cellcharter-env/lib/python3.9/site-packages/cellcharter/gr/_nhood.py:236: ImplicitModificationWarning: Trying to modify attribute \`._uns\` of view, initializing view as actual.
adata.uns[f"{cluster_key}_nhood_enrichment"] = result
/tmp/ipykernel_1483/3764853794.py:10: ImplicitModificationWarning: Trying to modify attribute \`.obs\` of view, initializing view as actual.
adata1.obs[col_celltype] = adata1.obs[col_celltype].astype('category')

5. Multi-Sample Differential Neighborhood Enrichment Analysis

When the study design includes multiple biological replicates or different treatment conditions (e.g., healthy vs. disease, pre-treatment vs. post-treatment), we need to evaluate whether these specific conditions lead to significant changes in spatial cellular interaction patterns.

This section compares the neighborhood enrichment score differences between two groups of samples using a statistical Permutation Test:

Identify Interaction Changes: Find which spatial contacts between cell populations are specifically enhanced or weakened under specific disease states or therapeutic interventions.
Reveal Microenvironment Remodeling: The dynamic changes in these spatial interactions are often direct evidence of tissue microenvironment remodeling, providing key clues for disease mechanisms or drug response mechanisms.

Figure Legend: The figure below displays the multi-sample differential neighborhood enrichment analysis heatmap.
Rows: Source cell type $C_{i}$ ; Columns: Target cell type $C_{j}$ .
Color (Positive values): Indicates that in the Experimental group (vs. baseline), cell populations $C_{i}$ and $C_{j}$ tend to be enriched (closer to each other).
Color (Negative values): Indicates that in the Experimental group (vs. baseline), cell populations $C_{i}$ and $C_{j}$ tend to be depleted (further from each other).
Color (Zero): No significant difference.
Interpretation: Used to discover cell spatial interaction patterns that are specifically enhanced or weakened under particular disease or treatment conditions.

python

# --- Calculate Multi-Sample Differential Neighborhood Enrichment ---
if len(sample_name) > 1:
    cc.gr.diff_nhood_enrichment(
        adata,
        cluster_key=col_celltype,
        condition_key=col_sam,
        library_key=col_group,
        pvalues=True,
        n_jobs=15,
        n_perms=100
    )

python

# --- Plot Differential Neighborhood Enrichment Heatmap ---
if len(sample_name) > 1:
    combinations_list = list(combinations(sample_name, 2))
    for i in range(len(combinations_list)):
        cc.pl.diff_nhood_enrichment(
            adata,
            cluster_key=col_celltype,
            condition_key=col_group,
            condition_groups=list(combinations_list[i]),
            annotate=True,
            figsize=(5,5),
            #significance=0.05,
            fontsize=5,
            save=list(combinations_list[i])[0]+"vs"+list(combinations_list[i])[1]+"_"+col_celltype+"_enrichment_heatmap.png"
        )

6. Cell Spatial Distribution Density Analysis

Quantify the spatial distribution density of different cell types relative to target cells based on physical distances.

Figure Legend: The figure below displays the cell type colocalization density plot.
X-axis (Distance to Target): Represents the spatial physical distance from a certain class of cells to the Target Cell.
Y-axis (Density): Represents the distribution density of cells across different distance intervals.
Color Grouping (Phenotype): Different color curves represent different cell types (Phenotype).
Interpretation: A peak further to the left (closer distance) indicates a stronger tendency for that cell type to colocalize (cluster) spatially with the target cell; a flatter curve or a peak further to the right indicates that the cell type is spatially distant from the target cell or randomly distributed.

python

import pandas as pd
import numpy as np
from scipy.spatial import cKDTree

def find_nearest_distance(df, 
                          id_col='Cell ID', 
                          pheno_col='Phenotype', 
                          x_col='Cell X Position', 
                          y_col='Cell Y Position'):

    results = {}
    # Get all unique phenotypes in the data
    phenotypes = df[pheno_col].dropna().unique()
    
    # Extract coordinates and IDs of all cells
    coords_all = df[[x_col, y_col]].values
    cell_ids_all = df[id_col].values
    
    for pheno in phenotypes:
        # Filter cells for specific phenotypes
        mask_p = (df[pheno_col] == pheno)
        df_p = df[mask_p].reset_index(drop=True)
        
        dist_col_name = f'Distance to {pheno}'
        id_col_name = f'Cell ID {pheno}'
        
        n_p = len(df_p)
        
        # Return NaN if there are no cells for this phenotype
        if n_p == 0:
            results[dist_col_name] = np.nan
            results[id_col_name] = np.nan
            continue
            
        coords_p = df_p[[x_col, y_col]].values
        ids_p = df_p[id_col].values
        
        # Build KDTree for fast spatial nearest neighbor search
        tree = cKDTree(coords_p)
        
        # If there is only 1 cell for this phenotype, query the closest 1; otherwise query the closest 2 (to exclude itself)
        k_query = 2 if n_p > 1 else 1
        distances, indices = tree.query(coords_all, k=k_query)
        
        final_dists = np.zeros(len(df))
        final_ids = np.empty(len(df), dtype=object)
        
        if k_query == 1:
            for i in range(len(df)):
                # If the closest cell found is itself (same ID), it means there are no "other" cells for this phenotype
                if cell_ids_all[i] == ids_p[indices[i]]:
                    final_dists[i] = np.nan
                    final_ids[i] = pd.NA
                else:
                    final_dists[i] = distances[i]
                    final_ids[i] = ids_p[indices[i]]
        else:
            for i in range(len(df)):
                # indices[i, 0] is the closest, indices[i, 1] is the second closest
                # If the closest cell is itself, take the second closest cell
                if cell_ids_all[i] == ids_p[indices[i, 0]]:
                    final_dists[i] = distances[i, 1]
                    final_ids[i] = ids_p[indices[i, 1]]
                else:
                    final_dists[i] = distances[i, 0]
                    final_ids[i] = ids_p[indices[i, 0]]
                    
        results[dist_col_name] = final_dists
        results[id_col_name] = final_ids
        
    # Return as DataFrame format, aligned with original data index
    return pd.DataFrame(results, index=df.index)

python

adata.obs[["Cell X Position", "Cell Y Position"]] = adata.obsm["spatial"]
position = adata.obs[["Cell X Position","Cell Y Position",col_celltype,col_sam]]
position = position.reset_index(names='Cell ID')
position.rename(columns={col_celltype: 'Phenotype'}, inplace=True)
position[["Cell X Position","Cell Y Position"]] = position[["Cell X Position","Cell Y Position"]]*0.2653

python

custom_40_colors = [
    '#1f77b4', '#aec7e8', '#ff7f0e', '#ffbb78', '#2ca02c', '#98df8a', '#d62728', '#ff9896',
    '#9467bd', '#c5b0d5', '#8c564b', '#c49c94', '#e377c2', '#f7b6d2', '#7f7f7f', '#c7c7c7',
    '#bcbd22', '#dbdb8d', '#17becf', '#9edae5', '#393b79', '#5254a3', '#6b6ecf', '#9c9ede',
    '#637939', '#8ca252', '#b5cf6b', '#cedb9c', '#8c6d31', '#bd9e39', '#e7ba52', '#e7cb94',
    '#843c39', '#ad494a', '#d6616b', '#e7969c', '#7b4173', '#a55194', '#ce6dbd', '#de9ed6'
]

# === Layout Design: Arrange up to 3 sample density plots per row ===
max_cols = 3
num_samples = len(sample_name)

# Automatically calculate required columns and rows
ncols = max_cols if num_samples >= max_cols else num_samples
nrows = math.ceil(num_samples / ncols) if num_samples > 0 else 1

# Set overall theme
sns.set_theme(style="whitegrid", rc={"axes.edgecolor": "black"})

# Dynamically generate canvas size: width 12 per subplot (allowing ample legend space), height 6 per subplot
fig, axes = plt.subplots(nrows=nrows, ncols=ncols, figsize=(12 * ncols, 6 * nrows))

# Uniformly convert to a 1D list for convenient ax access by index in loops
if isinstance(axes, np.ndarray):
    axes_list = axes.flatten()
else:
    axes_list = [axes]

for i in range(num_samples):
    ax = axes_list[i]
    
    # Extract current sample data
    sub_position = position.loc[position[col_sam] == sample_name[i]]
    distances = find_nearest_distance(sub_position)
    csd_with_distance = pd.concat([sub_position, distances], axis=1)
    
    # Plot density map, specifying the current ax
    sns.kdeplot(
        data=csd_with_distance,
        x='Distance to 1',
        hue='Phenotype',
        linewidth=2,
        common_norm=False,
        clip=(0, None),
        palette=custom_40_colors,
        ax=ax
    )

    ax.set_xlim(left=0)
    ax.set_ylabel('Density', fontsize=12)
    ax.set_title(f'Sample: {sample_name[i]}', fontsize=14, fontweight='bold')

    # Use seaborn's move_legend to move the legend of each subplot to the right
    if ax.get_legend() is not None:
        sns.move_legend(
            ax, "upper left",
            bbox_to_anchor=(1.02, 1), 
            ncol=2, 
            title="Phenotype", 
            frameon=False
        )

# If the number of subplot grids is greater than the number of samples, delete excess blank subplots
for j in range(num_samples, len(axes_list)):
    fig.delaxes(axes_list[j])

# Automatically adjust subplot spacing to prevent legend overlap with adjacent plots
plt.tight_layout()
plt.show()

output

/tmp/ipykernel_1483/54170521.py:44: UserWarning: The palette list has more values (40) than needed (20), which may not be intended.
sns.kdeplot(

python

Single-Cell Spatial Transcriptomics: Neighborhood Enrichment Analysis (based on CellCharter) ​

1. Module Introduction ​

2. Input File Preparation ​

3. Data Loading and Preprocessing ​

3.1 Construct Spatial Neighborhood Network ​

4. Single-Sample Cell Type Colocalization ​

5. Multi-Sample Differential Neighborhood Enrichment Analysis ​

6. Cell Spatial Distribution Density Analysis ​

Single-Cell Spatial Transcriptomics: Neighborhood Enrichment Analysis (based on CellCharter)

1. Module Introduction

2. Input File Preparation

3. Data Loading and Preprocessing

3.1 Construct Spatial Neighborhood Network

4. Single-Sample Cell Type Colocalization

5. Multi-Sample Differential Neighborhood Enrichment Analysis

6. Cell Spatial Distribution Density Analysis