{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "780ee634",
   "metadata": {},
   "source": [
    "---\n",
    "title: SeekSpace 高级分析：基于 stLearn 的细胞通讯热点识别\n",
    "author: SeekGene\n",
    "date: 2026-01-29\n",
    "tags:\n",
    "  - 空间转录组\n",
    "  - 分析指南\n",
    "  - Notebooks\n",
    "  - 细胞通讯分析\n",
    "---\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c9bda5ee-3f34-4b2f-a62f-133d8906c2cd",
   "metadata": {},
   "source": [
    "# SeekSpace 高级分析：基于 stLearn 的细胞通讯热点识别"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2dd37f51-7539-4df9-8968-7051a2961983",
   "metadata": {},
   "outputs": [],
   "source": [
    "import stlearn as st\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "#matplotlib版本3.5.3，高版本的与stlearn有冲突\n",
    "import matplotlib.pyplot as plt\n",
    "import scanpy as sc\n",
    "import re"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "04b82713-dc2a-4b67-8549-72c0aaa15db4",
   "metadata": {},
   "source": [
    "## stLearn 细胞通讯模块输入参数介绍\n",
    "matrix_path：matrix，feature，barcode 三个文件路径  \n",
    "spatial_path：空间位置路径  \n",
    "SAMple_name：样本名  \n",
    "image_path：空间 HE 染色相片路径  \n",
    "celltype_path：注释的细胞类型文件路径，文件中需要以 barcode 为行名，列名为“celltype”（细胞类型）为列  \n",
    "spot_diameter_fullres：细胞/spot 的大小，与分析步骤中的空间距离范围相关，可选 50 或者 100  \n",
    "species：物种，与下面参数 lr_database_bool 有关，若 lr_database_bool 为 True，则该处只能填\"human\"或者\"mouse\"；若为 False，则不需要填写  \n",
    "lr_database_bool：是否用 stlearn 自带的受配体库，默认是 True。若不使用 stlearn 自带的受配体库，填 False，并在 lr_database 处填写相应的受配体库  \n",
    "lr_database_path：与 lr_database_bool 联用，若 lr_database_bool 参数填 False，则需要提供受配体库路径，文件内容包含受配体列(有行名，没有列名)，如 ligand_receptor  \n",
    "ncpus：线程大小，关乎运行速度，可设置大写  \n",
    "cluster_name：设置 adata 对象 obs 中细胞类型的列名  \n",
    "lr_pair：是否自己设置展示的受配体对，如果不设置，填 None，则展示显著性高的 3 个受配体对；设置，填受配体对名，多个受配体对以逗号分隔，\"A_B,C_D\"  \n",
    "grid_step：是否画格子，对于细胞数量过多的数据，运行时间会长，推荐填 True；不使用则填 False  \n",
    "n_：长和宽划分格子的数量，若 grid_step 参数填 True，则 n_必填"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f5eb5fc5-bc51-4c7b-b6e2-688c1c335774",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": [
     "parameters"
    ]
   },
   "outputs": [],
   "source": [
    "matrix_path = \"\"\n",
    "spatial_path = \"\"\n",
    "sample_name = \"\"\n",
    "image_path = \"\"\n",
    "celltype_path = \"\"\n",
    "spot_diameter_fullres = 50\n",
    "species = \"human\"\n",
    "lr_database_bool = True\n",
    "lr_database_path = \"\"\n",
    "n_cpus = 16\n",
    "cluster_name = \"celltype\"\n",
    "lr_pair = \"None\"\n",
    "grid_step = True\n",
    "n_ = 125"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "22366627-01eb-4e79-87cb-3656845bfec8",
   "metadata": {},
   "source": [
    "## 读取 SeekSpace 数据并进行标准化、降维、聚类"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c5e6bccf-cb68-4bce-98a7-e578bf606772",
   "metadata": {},
   "outputs": [],
   "source": [
    "adata = sc.read_10x_mtx(matrix_path)\n",
    "spatial = pd.read_csv(spatial_path,sep=\",\",index_col=0)\n",
    "spatial = spatial.loc[:,(\"x\",\"y\")]\n",
    "selected_rows = spatial.loc[spatial.index.isin(adata.obs_names)]\n",
    "selected_rows.columns = [\"imagecol\",\"imagerow\"]\n",
    "selected_rows = selected_rows.reindex(adata.obs_names)\n",
    "selected_rows = selected_rows*0.265385\n",
    "a = st.create_stlearn(count=adata.to_df(),spatial=selected_rows,library_id=sample_name, scale=1,image_path=image_path,spot_diameter_fullres=spot_diameter_fullres)\n",
    "a.layers[\"raw_count\"] = a.X\n",
    "# Preprocessing\n",
    "#st.pp.filter_genes(a,min_cells=3)\n",
    "st.pp.normalize_total(a)\n",
    "st.pp.log1p(a)\n",
    "# Keep raw data\n",
    "a.raw = a\n",
    "st.pp.scale(a)\n",
    "st.em.run_pca(a,n_comps=50,random_state=0)\n",
    "st.pp.neighbors(a,n_neighbors=25,use_rep='X_pca',random_state=0)\n",
    "st.tl.clustering.louvain(a,random_state=0)\n",
    "sc.tl.umap(a)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0900bd10-c2d5-47dc-8a0f-9bcab4c2fa67",
   "metadata": {},
   "source": [
    "## 将注释好的细胞类型添加到 adata 对象中"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3b908cb9-978b-4689-9464-e035d2545f85",
   "metadata": {},
   "outputs": [],
   "source": [
    "celltype = pd.read_csv(celltype_path,index_col=0)\n",
    "celltype.index = [re.sub(r'_9$', '', s) for s in celltype.index]\n",
    "celltype = celltype.loc[a.obs.index]\n",
    "a.obs[cluster_name] = celltype[\"celltype\"]\n",
    "a.obs[cluster_name] = a.obs[cluster_name].astype('category')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3ceef413-2f7b-4a23-a5eb-e8e4f152795e",
   "metadata": {},
   "source": [
    "## 将空间切片划分为长为 n_，宽为 n_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bd4f096e-a737-4e78-9227-d2223163e9fa",
   "metadata": {},
   "outputs": [],
   "source": [
    "if grid_step:\n",
    "    #n_ = 125\n",
    "    print(f'{n_} by {n_} has this many spots:\\n', n_*n_)\n",
    "    a = st.tl.cci.grid(a,n_row=n_, n_col=n_, use_label = cluster_name)\n",
    "    print(a.shape )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f817a0e2-492f-45e8-9126-eb8225ac3af8",
   "metadata": {},
   "source": [
    "## 加载受配体库，计算受配体表达的显著性"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d9fc42e7-3009-4ef6-95bb-ac2acb4545d7",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "if lr_database_bool:\n",
    "    lrs = st.tl.cci.load_lrs(['connectomeDB2020_lit'], species=species)\n",
    "else:\n",
    "    lrs = pd.read_csv(lr_database_path,names=[\"0\"])\n",
    "    lrs = np.array(list(lrs[\"0\"]))\n",
    "\n",
    "#lrs = st.tl.cci.load_lrs(['connectomeDB2020_lit'], species=species)\n",
    "st.tl.cci.run(a, lrs,\n",
    "                  min_spots = 20, #Filter out any LR pairs with no scores for less than min_spots\n",
    "                  distance=None, # None defaults to spot+immediate neighbours; distance=0 for within-spot mode\n",
    "                  n_pairs=100, # Number of random pairs to generate; low as example, recommend ~10,000\n",
    "                  n_cpus=8, # Number of CPUs for parallel. If None, detects & use all available.\n",
    "                  )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a12c0d91-97bb-436b-abad-4422465015f6",
   "metadata": {},
   "source": [
    "## 受配体表达显著性结果表格展示\n",
    "行：受配体对  \n",
    "第一列 n_spot：未经过 pval 过滤受配体对互作的细胞/spot 数量（总）  \n",
    "第二列 n_spots_sig：经过矫正后的 pval 过滤后的受配体对显著互作的细胞/spot 数量  \n",
    "第三列 n_spots_sig_pval：根据设定的 pval 过滤后的受配体对互作的细胞/spot 数量"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8aef1663-12c1-4020-bf41-82f33b14cd77",
   "metadata": {},
   "outputs": [],
   "source": [
    "lr_info = a.uns['lr_summary']\n",
    "lr_info"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "831c15f3-687c-488b-a2a2-e1aee8986bac",
   "metadata": {},
   "outputs": [],
   "source": [
    "st.tl.cci.adj_pvals(a, correct_axis='spot',\n",
    "                   pval_adj_cutoff=0.05, adj_method='fdr_bh')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3d8a9ad7-977e-494a-893c-ad5311b5e7fb",
   "metadata": {},
   "source": [
    "## 诊断图（基于 lrfeatures 结果作图）  \n",
    "受配体分析的一个关键方面是在确定显著 hotspot 时，控制受配体的表达水平和频率。  \n",
    "因此，诊断图应显示受配体对的热点与表达水平和表达频率之间几乎没有相关性。  \n",
    "以下诊断方法可以帮助检查和确认这一点；如果不是这样，可能表明需要更大数量的 Permutations。  \n",
    "* **左图：**  \n",
    "横轴为 n_spots_sig 的排序受配体对  \n",
    "纵轴为受配体对(非零)表达中位值  \n",
    "* **右图：**  \n",
    "横轴为 n_spots_sig 的排序受配体对  \n",
    "纵轴为受配体对表达值为 0 的细胞占所有细胞的比例"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2cbaff38-3bfa-44e5-b4e0-03c278023a6a",
   "metadata": {},
   "outputs": [],
   "source": [
    "st.pl.lr_diagnostics(a, figsize=(10,2.5))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "981486fc-9a43-4fa0-847d-8a84ff27ee43",
   "metadata": {},
   "source": [
    "基于 lr_summary 的结果作图：展示了 top500 和 top50 的受配体对  \n",
    "横轴代表 n_spots_sig 的排序受配体对  \n",
    "纵轴代表经过矫正后的 pval 过滤后的受配体对显著互作的细胞/spot 数量"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1007dea-dc5c-4c65-a0b7-881e239ee70d",
   "metadata": {},
   "outputs": [],
   "source": [
    "st.pl.lr_summary(a, n_top=500)\n",
    "st.pl.lr_summary(a, n_top=50, figsize=(10,3))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0ebc9a10-8e33-43bd-90e2-f63e2f83c85f",
   "metadata": {},
   "source": [
    "基于 lr_summary 的结果作图：展示了 top500 和 top50 的受配体对  \n",
    "横轴代表 n_spots_sig 的排序受配体对  \n",
    "纵轴代表未经过 pval 过滤受配体对互作的细胞/spot 数量（总）\n",
    "颜色代表显著和不显著受配体对"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "94e472bf-e61b-48b5-9ff7-62a287c2e81c",
   "metadata": {},
   "outputs": [],
   "source": [
    "st.pl.lr_n_spots(a, n_top=50, figsize=(11, 3),\n",
    "                    max_text=100)\n",
    "st.pl.lr_n_spots(a, n_top=500, figsize=(11, 3),\n",
    "                    max_text=100)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2486a936-2a49-452e-89da-c2f48a53ab2d",
   "metadata": {},
   "source": [
    "## 预测显著互作的细胞类型"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "53dcf6a3-7018-4885-befa-dd27f0cd9c20",
   "metadata": {},
   "outputs": [],
   "source": [
    "st.tl.cci.run_cci(a, cluster_name, # Spot cell information either in data.obs or data.uns\n",
    "                  min_spots=3, # Minimum number of spots for LR to be tested.\n",
    "                  sig_spots=True, # Only consider neighbourhoods of spots which had significant LR scores.\n",
    "                  n_perms=1000, # Permutations of cell information to get background, recommend ~1000\n",
    "                  n_cpus=16\n",
    "                 )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ec203cc8-39a4-465f-93e3-c99d36a6914b",
   "metadata": {},
   "source": [
    "## 诊断图：检查受配体互作和细胞类型的细胞数之间的相关性  \n",
    "如果 permutations 的数量足够，下面的图表应该显示几乎没有或没有相关性；否则，建议将 n_perms（置换次数）的值调高。  \n",
    "横轴代表细胞类型  \n",
    "柱状图对应的左纵轴代表细胞类型数量  \n",
    "折线对应的由纵轴代表细胞类型互作的受配体数量  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6d04411e-7146-452f-9966-90f66abab453",
   "metadata": {},
   "outputs": [],
   "source": [
    "st.pl.cci_check(a, cluster_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9ad09928-bea7-4042-ab6b-3af8f1c79544",
   "metadata": {},
   "source": [
    "## CCI 网络图\n",
    "图中展示对应受配体对的互作图，通过结果 per_lr_cci_*绘制的"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "692f781b-b3cb-4907-81a3-58df72fcde8d",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Visualising the no. of interactions between cell types across all LR pairs #\n",
    "pos_1 = st.pl.ccinet_plot(a, cluster_name, return_pos=True)\n",
    "\n",
    "# Just examining the cell type interactions between selected pairs #\n",
    "\n",
    "if lr_pair == None:\n",
    "    lr_pair=None\n",
    "else:\n",
    "    lr_pair=lr_pair.strip.split(\",\")\n",
    "\n",
    "if lr_pair is not None:\n",
    "    for best_lr in lr_pair:\n",
    "            st.pl.ccinet_plot(a, cluster_name, best_lr, min_counts=2,\n",
    "                              figsize=(10,7.5), pos=pos_1\n",
    "                             )\n",
    "else:\n",
    "    lr_pair = a.uns['lr_summary'].index.values[0:3]\n",
    "    for best_lr in lr_pair[0:3]:\n",
    "        st.pl.ccinet_plot(a, cluster_name, best_lr, min_counts=2,\n",
    "                          figsize=(10,7.5), pos=pos_1\n",
    "                         )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "89655fd0-eda3-47d8-813a-5435ff46b990",
   "metadata": {},
   "source": [
    "## CCI 和弦图\n",
    "图中展示对应受配体对的互作弦和图"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f63067ae-c2c8-4cf1-8789-a75feb91ca59",
   "metadata": {},
   "outputs": [],
   "source": [
    "st.pl.lr_chord_plot(a, cluster_name)\n",
    "for lr in lr_pair:\n",
    "    st.pl.lr_chord_plot(a, cluster_name, lr)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4abbc264-d8e4-4ffa-a485-4ce50dbfdccb",
   "metadata": {},
   "source": [
    "## CCI 热图\n",
    "图中展示对应受配体对的互作热图，横轴展示互作的细胞类型，纵轴展示互作的受配体，可通过参数 n_top_lrs 和 n_top_ccis 调整展示的个数，或者指定受配体对"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "37eb073f-647c-45e3-a455-fc16a0dceab0",
   "metadata": {},
   "outputs": [],
   "source": [
    "st.pl.lr_cci_map(a, cluster_name, lrs=None, min_total=100, figsize=(20,4))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "05d0bd4d-1f5e-46c6-8071-b629c8116af2",
   "metadata": {},
   "outputs": [],
   "source": [
    "st.pl.lr_cci_map(a, cluster_name, lrs=lr_pair, min_total=100, figsize=(20,4))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ef843cd8-0295-43b1-8fcb-10d0f32b15c6",
   "metadata": {},
   "source": [
    "图中展示对应受配体对的互作热图，横轴和纵轴展示细胞类型，可指定展示的受配体对"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0b9a130a-98c0-424e-b289-1beaa7cb509e",
   "metadata": {},
   "outputs": [],
   "source": [
    "st.pl.cci_map(a, cluster_name)\n",
    "\n",
    "lr_pair = a.uns['lr_summary'].index.values[0:3]\n",
    "for lr in lr_pair[0:3]:\n",
    "    st.pl.cci_map(a, cluster_name, lr)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d18956c8",
   "metadata": {},
   "source": [
    "## 文献案例解析\n",
    "* **文献一：**  \n",
    "文献《Spatially organized tumor-stroma boundary determines the efficacy of immunotherapy in colorectal cancer patients》使用 stlearn 去揭示肿瘤与基质区的交界处的细胞类型互作情况。  \n",
    "## 参考文献\n",
    "Pham, D., Tan, X., Balderson, B. et al. Robust mapping of spatiotemporal trajectories and cell–cell interactions in healthy and diseased tissues. Nat Commun 14, 7739 (2023).  \n",
    "Feng, Y., Ma, W., Zang, Y. et al. Spatially organized tumor-stroma boundary determines the efficacy of immunotherapy in colorectal cancer patients. Nat Commun 15, 10259 (2024)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "stlearn",
   "language": "python",
   "name": "stlearn"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.20"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
