
Data Preparation

Author: SeekGene
Updated: 2026-05-28
scMethyl + RNA-seq Data Preparation

Download Reference Database

bash
# Download human reference genome (GRCh38) and its MD5 checksum file
# (-c resumes a partially downloaded file instead of starting over)
wget -c -O human-reference-GRCh38.tar.gz "https://seekgene-public.oss-cn-beijing.aliyuncs.com/methy_demo/methy_exp/v1.1/human-reference-GRCh38.tar.gz"
wget -c -O human-reference-GRCh38.tar.gz.md5 "https://seekgene-public.oss-cn-beijing.aliyuncs.com/methy_demo/methy_exp/v1.1/human-reference-GRCh38.tar.gz.md5"

# Download mouse reference genome (GRCm39) and its MD5 checksum file
wget -c -O mouse-reference-GRCm39.tar.gz "https://seekgene-public.oss-cn-beijing.aliyuncs.com/methy_demo/methy_exp/v1.1/mouse-reference-GRCm39.tar.gz"
wget -c -O mouse-reference-GRCm39.tar.gz.md5 "https://seekgene-public.oss-cn-beijing.aliyuncs.com/methy_demo/methy_exp/v1.1/mouse-reference-GRCm39.tar.gz.md5"

# Extract reference genomes
tar -xzf human-reference-GRCh38.tar.gz
tar -xzf mouse-reference-GRCm39.tar.gz
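
The commands above download a .md5 file next to each archive but do not show how to use it. The following is a minimal sketch of one way to check an archive before extracting it. The function name verify_md5 is hypothetical, and the sketch assumes that the first whitespace-separated field of the .md5 file is the expected hex digest (which covers both a bare hash and the usual "hash  filename" layout) and that GNU md5sum is available.

```bash
#!/usr/bin/env bash
# Sketch only: compare an archive against its companion .md5 file.
# Assumption: the first field on the first line of the .md5 file is the digest.
verify_md5() {
    local archive="$1" md5_file="$2"
    local expected actual
    # Expected digest, lowercased so the comparison is case-insensitive.
    expected=$(awk 'NR==1 {print tolower($1)}' "$md5_file")
    # Digest computed from the downloaded archive.
    actual=$(md5sum "$archive" | awk '{print $1}')
    if [ -n "$expected" ] && [ "$expected" = "$actual" ]; then
        echo "OK: $archive"
    else
        echo "MISMATCH: $archive" >&2
        return 1
    fi
}

# Example calls (run in the download directory, before tar -xzf):
#   verify_md5 human-reference-GRCh38.tar.gz human-reference-GRCh38.tar.gz.md5
#   verify_md5 mouse-reference-GRCm39.tar.gz mouse-reference-GRCm39.tar.gz.md5
```

If a check reports MISMATCH, re-downloading the archive is usually the simplest fix, since a truncated transfer is the most common cause.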

Download Test Data (Optional)

To download the demo data, see the Datasets document.

Repository Layout

After cloning, the key Nextflow entry points and modules are:

  • nf/main.nf: Main workflow for transcriptome + methylation end-to-end processing.
  • nf/methy_only.nf: Workflow for methylation-only data.
  • nf/modules/: Step-wise process modules:
    • step1.nf: Preprocessing, QC, barcode extraction, and transcriptome analysis.
    • step2.nf: Bismark alignment and BAM sorting.
    • step3.nf: Per-cell BAM splitting, ALLC generation and merging, and multi-scale datasets.
    • step4.nf: Summaries, dimensionality reduction, and the joint report.
    • utils.nf: Helpers for the methylation-only workflow (read counting and cell estimation).
  • nf/bin/: Helper scripts and resources (e.g., barcode whitelists).
  • nf/nextflow.config: Executors and resource configuration.
  • sc_methy_workflow.sh: Shell script to run the dual-omics analysis pipeline.
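
As a quick sanity check after cloning, the layout listed above can be verified with a small script. This is a sketch rather than part of the pipeline: the function name check_layout is hypothetical, and it assumes that the step files sit directly under nf/modules/ and that sc_methy_workflow.sh is at the repository root, as the listing suggests.

```bash
#!/usr/bin/env bash
# Sketch only: report which of the listed entry points are missing from a checkout.
# Argument: path to the cloned repository (defaults to the current directory).
check_layout() {
    local repo_dir="${1:-.}"
    local missing=0 path
    for path in \
        nf/main.nf \
        nf/methy_only.nf \
        nf/modules/step1.nf \
        nf/modules/step2.nf \
        nf/modules/step3.nf \
        nf/modules/step4.nf \
        nf/modules/utils.nf \
        nf/bin \
        nf/nextflow.config \
        sc_methy_workflow.sh
    do
        if [ ! -e "$repo_dir/$path" ]; then
            echo "missing: $path" >&2
            missing=$((missing + 1))
        fi
    done
    # Succeed only when every listed path exists.
    [ "$missing" -eq 0 ]
}
```

Calling check_layout with the directory you cloned into prints one "missing:" line per absent path and returns a non-zero status if any are absent.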

We provide two methods for data analysis:

  1. Shell Script: Run the analysis pipeline directly via the sc_methy_workflow.sh script.
  2. Nextflow Pipeline: Run the Nextflow pipeline via nf/main.nf.

Details of both methods are provided below.
