

Author: SeekGene
Updated: 2026-05-12
scMethyl + RNA-seq Analysis Guide

Activate the Environment

bash
conda activate seeksoulmethyl

Run the Dual-omics Workflow with the Shell Script

bash
bash sc_methy_workflow.sh \
  /path/to/expression_R1.fastq.gz \
  /path/to/expression_R2.fastq.gz \
  /path/to/methy_R1.fastq.gz \
  /path/to/methy_R2.fastq.gz \
  --sample WTJW880 \
  --outdir /path/to/results \
  --database_dir /path/to/human-reference-GRCh38 \
  --chemistry DD-MET3 \
  --core 64 \
  --filter_ch 2

For a sample with multiple datasets, pass each of the four FASTQ arguments as a comma-separated list, and keep the files in the same order in all four lists.
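As a rough sketch of how matching lists might be assembled for a sample with two datasets, the snippet below joins each group of files with commas and checks that all four lists have the same number of entries. The file names and the helper functions join_by_comma and same_length are illustrative assumptions, not part of sc_methy_workflow.sh.

```shell
#!/usr/bin/env bash
# Join all arguments with commas, e.g. join_by_comma a b c prints a,b,c
join_by_comma() {
  local IFS=,
  echo "$*"
}

# Succeed only if every comma-separated list has the same number of items.
same_length() {
  local first_count="" list count items
  for list in "$@"; do
    IFS=, read -r -a items <<< "$list"
    count=${#items[@]}
    if [ -z "$first_count" ]; then
      first_count=$count
    elif [ "$count" -ne "$first_count" ]; then
      return 1
    fi
  done
  return 0
}

# Hypothetical file names for two datasets of one sample, in the same order.
expr_r1=$(join_by_comma /data/lib1_expression_R1.fastq.gz /data/lib2_expression_R1.fastq.gz)
expr_r2=$(join_by_comma /data/lib1_expression_R2.fastq.gz /data/lib2_expression_R2.fastq.gz)
methy_r1=$(join_by_comma /data/lib1_methy_R1.fastq.gz /data/lib2_methy_R1.fastq.gz)
methy_r2=$(join_by_comma /data/lib1_methy_R2.fastq.gz /data/lib2_methy_R2.fastq.gz)

if same_length "$expr_r1" "$expr_r2" "$methy_r1" "$methy_r2"; then
  echo "Lists match; first argument would be: $expr_r1"
else
  echo "FASTQ lists differ in length" >&2
fi
```

The four variables would then take the place of the four positional FASTQ arguments in the command above, with the remaining options unchanged.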

System Requirements for Shell Mode

Recommended resources for sc_methy_workflow.sh:

  • CPU: 64 cores
  • Memory: 128 GB RAM
  • Storage: at least 500 GB free space
  • Operating system: Linux, preferably Ubuntu 18.04+ or CentOS 7+
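A quick pre-flight check along these lines may help confirm a machine is close to the recommended resources before a long run. This is a minimal sketch for Linux only; the function names check_min and preflight are hypothetical, and the thresholds simply mirror the list above.

```shell
#!/usr/bin/env bash
# Print OK or WARN for one resource; return non-zero when below the minimum.
check_min() {
  local label=$1 actual=$2 required=$3
  if [ "$actual" -ge "$required" ]; then
    echo "OK   $label: $actual (recommended >= $required)"
  else
    echo "WARN $label: $actual (recommended >= $required)"
    return 1
  fi
}

# Compare this machine against the recommended resources.
# Argument: the directory that will hold results (defaults to the current one).
preflight() {
  local outdir=${1:-.} cores mem_gb free_gb status=0
  cores=$(nproc)
  # MemTotal is reported in kB; convert to whole GB.
  mem_gb=$(awk '/^MemTotal:/ { printf "%d", $2 / 1024 / 1024 }' /proc/meminfo)
  # Available space on the filesystem holding outdir, in whole GB.
  free_gb=$(df -Pk "$outdir" | awk 'NR == 2 { printf "%d", $4 / 1024 / 1024 }')
  check_min "CPU cores" "$cores" 64 || status=1
  check_min "Memory (GB)" "$mem_gb" 128 || status=1
  check_min "Free storage (GB)" "$free_gb" 500 || status=1
  return "$status"
}

# Example call (prints one line per resource):
#   preflight /path/to/results
```

The kernel typically reports slightly less memory than the nominal installed amount, so a memory warning on a 128 GB machine is best treated as advisory.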

Run the Workflow with Nextflow

Nextflow is the recommended method for batch processing, report generation, and workflow management.

Install Nextflow

bash
conda install -n seeksoulmethyl -c bioconda nextflow

Run the Main Workflow

bash
nextflow run -bg SeekSoulMethyl/nf/main.nf \
  --outdir /path/to/results/ \
  --samplesheet samplelist.csv \
  -w /path/to/results/work \
  -c SeekSoulMethyl/nf/nextflow.config \
  -profile aliyun_k8s \
  --database_dir /path/to/human-reference-GRCh38/ \
  --split_fastq 1 \
  --filter_ch 2 \
  --chemistry DD-MET3 > methy.log

Figure 1: SeekSoul™ Methyl Tools Nextflow pipeline workflow

Supported Nextflow Workflows

The v2.1.2 code base supports multiple workflows through --workflow:

  • rna_met: transcriptome plus methylation integrated analysis
  • methy_only: methylation-only workflow
  • force_cell: recompute or update methylation results based on previous outputs

Methylation-only Workflow

bash
nextflow run SeekSoulMethyl/nf/main.nf \
  --workflow methy_only \
  --outdir /path/to/results \
  --samplesheet samplelist.csv \
  -w /path/to/work \
  -c SeekSoulMethyl/nf/nextflow.config \
  -profile aliyun_k8s \
  --database_dir /path/to/reference \
  --split_fastq 4 \
  --filter_ch 2 \
  --chemistry DD-MET3

Notes on nextflow.config

Adjust SeekSoulMethyl/nf/nextflow.config for your infrastructure before running. The settings most likely to need changes are:

  • process.executor: choose local, slurm, pbs, k8s, awsbatch, or another supported executor
  • process.cpus, process.memory, process.time: default resource limits
  • workDir: working directory with enough writable storage
  • conda.enabled, docker.enabled, or singularity.enabled: select the environment strategy your platform supports

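Typical profiles include local execution, Slurm clusters, and Kubernetes environments. The examples in this guide use -profile aliyun_k8s; substitute the profile that matches your platform.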
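To make the settings above concrete, the following sketch writes a small site-specific config file from the shell. Everything in it is an assumption for illustration: the file name site.config, the profile names standalone and cluster, and all resource values are placeholders, and the shipped nextflow.config may already define its own profiles.

```shell
#!/usr/bin/env bash
# Write a site-specific Nextflow config. All values are placeholders to adapt.
SITE_CONFIG=${SITE_CONFIG:-site.config}

cat > "$SITE_CONFIG" <<'EOF'
// Working directory with enough writable storage
workDir = '/path/to/work'

profiles {
    // Hypothetical profile for a single workstation
    standalone {
        process.executor = 'local'
        process.cpus = 8
        process.memory = '32 GB'
        process.time = '24h'
        conda.enabled = true
    }
    // Hypothetical profile for a Slurm cluster
    cluster {
        process.executor = 'slurm'
        process.cpus = 16
        process.memory = '64 GB'
        process.time = '48h'
        singularity.enabled = true
    }
}
EOF

echo "Wrote $SITE_CONFIG"
```

A file like this could be passed to nextflow run with -c and selected with -profile; whether it should replace or complement SeekSoulMethyl/nf/nextflow.config depends on what that file already defines.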

Key Runtime Parameters

  • --database_dir: reference genome database directory
  • --chemistry: choose DD-MET3 or DD-MET5
  • --split_fastq: split methylation FASTQ files by the first n barcode bases to increase parallelism
  • --filter_ch: remove read pairs with more than n CH methylation sites; set to 0 to disable
  • -resume: rerun a failed or interrupted Nextflow run, reusing cached results from the same work directory
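To get a feel for how --split_fastq affects parallelism, the arithmetic below estimates an upper bound on the number of chunks. It assumes four possible bases (A, C, G, T) per barcode position and that each distinct prefix becomes its own chunk; how the pipeline actually groups prefixes is not described here, so treat the numbers as a ceiling rather than a guarantee. The function name max_chunks is hypothetical.

```shell
#!/usr/bin/env bash
# Upper bound on chunks when splitting by the first n barcode bases,
# assuming four possible bases per position: 4 to the power of n.
max_chunks() {
  local n=$1
  echo $(( 4 ** n ))
}

for n in 1 2 4; do
  echo "--split_fastq $n -> up to $(max_chunks "$n") chunks"
done
```

Under these assumptions, the value 1 used in the main workflow example gives at most 4 chunks, while the value 4 used in the methylation-only example gives at most 256.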

FAQ

  • Samplesheet parsing error: confirm the first column is exactly sample_id and use absolute file paths.
  • Missing ${sample}.mcds: check whether per-cell *_allc.gz files were generated and confirm chromosome-size resources are valid.
  • Bismark alignment stalls or fails: confirm the Bismark reference is present under --database_dir/fasta/ and visible to the execution environment.
  • Repeated runs: keep the same -w working directory and add -resume.
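The samplesheet checks mentioned in the first item can be scripted before launching a run. The sketch below verifies that the first header column is exactly sample_id and flags any field that looks like a relative path. The function name check_samplesheet is hypothetical, and since the remaining samplesheet columns are not specified here, the check treats any field containing a slash as a path.

```shell
#!/usr/bin/env bash
# Validate a CSV samplesheet: first header column must be sample_id,
# and any field that looks like a path must be absolute.
check_samplesheet() {
  local sheet=$1 header first
  if ! IFS= read -r header < "$sheet" && [ -z "$header" ]; then
    echo "samplesheet is empty or unreadable: $sheet" >&2
    return 1
  fi
  header=${header%$'\r'}   # tolerate Windows line endings
  first=${header%%,*}
  if [ "$first" != "sample_id" ]; then
    echo "first column is '$first', expected 'sample_id'" >&2
    return 1
  fi
  # From the second line on, report fields containing '/' that do not start with '/'.
  awk -F, '
    NR > 1 {
      sub(/\r$/, "")
      for (i = 2; i <= NF; i++)
        if ($i ~ /\// && $i !~ /^\//) {
          printf "line %d: relative path: %s\n", NR, $i
          bad = 1
        }
    }
    END { exit (bad ? 1 : 0) }
  ' "$sheet" >&2
}

# Example call:
#   check_samplesheet samplelist.csv && echo "samplesheet looks OK"
```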