Data Preparation
Updated: 2026-05-28
Download Reference Database
bash
# Download human reference genome (GRCh38)
wget -dc -O human-reference-GRCh38.tar.gz "https://seekgene-public.oss-cn-beijing.aliyuncs.com/methy_demo/methy_exp/v1.1/human-reference-GRCh38.tar.gz"
wget -dc -O human-reference-GRCh38.tar.gz.md5 "https://seekgene-public.oss-cn-beijing.aliyuncs.com/methy_demo/methy_exp/v1.1/human-reference-GRCh38.tar.gz.md5"
# Download mouse reference genome (GRCm39)
wget -dc -O mouse-reference-GRCm39.tar.gz "https://seekgene-public.oss-cn-beijing.aliyuncs.com/methy_demo/methy_exp/v1.1/mouse-reference-GRCm39.tar.gz"
wget -dc -O mouse-reference-GRCm39.tar.gz.md5 "https://seekgene-public.oss-cn-beijing.aliyuncs.com/methy_demo/methy_exp/v1.1/mouse-reference-GRCm39.tar.gz.md5"
# Extract reference genomes
tar -xzf human-reference-GRCh38.tar.gz
tar -xzf mouse-reference-GRCm39.tar.gz
Download Test Data (Optional)
For downloading demo data, please refer to the Datasets document.
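The reference download commands above also fetch a `.md5` file for each archive, which suggests the archives can be checked before extraction. The following is a minimal sketch of such a check, not part of the pipeline itself. `verify_md5` is a hypothetical helper name, and the sketch assumes that the first whitespace-separated field of each `.md5` file is the hex digest and that GNU `md5sum` is available (on macOS the equivalent tool is `md5`).

```bash
#!/usr/bin/env bash
# Sketch: verify a downloaded archive against its .md5 file before extracting.
# Assumption: the first field on the first line of the .md5 file is the hex digest.

verify_md5() {
    local archive="$1"
    local md5_file="${2:-$1.md5}"   # defaults to <archive>.md5
    local expected actual

    # Both files must exist before anything can be compared.
    [ -f "$archive" ] && [ -f "$md5_file" ] || {
        echo "missing $archive or $md5_file" >&2
        return 2
    }

    # Expected digest: first field of the .md5 file, lowercased.
    expected=$(awk 'NR==1 {print tolower($1)}' "$md5_file")
    # Actual digest: computed from the archive on disk.
    actual=$(md5sum "$archive" | awk '{print $1}')

    if [ "$expected" = "$actual" ]; then
        echo "OK: $archive"
    else
        echo "MISMATCH: $archive" >&2
        return 1
    fi
}

# Usage (uncomment after downloading):
# verify_md5 human-reference-GRCh38.tar.gz && tar -xzf human-reference-GRCh38.tar.gz
# verify_md5 mouse-reference-GRCm39.tar.gz && tar -xzf mouse-reference-GRCm39.tar.gz
```

Chaining the check with `&&` means an archive is only extracted when its digest matches. If the `.md5` files turn out to use the standard `md5sum` output format with a matching file name, `md5sum -c <file>.md5` would do the same job in one command.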
Repository Layout
After cloning, the key Nextflow entry points and modules are:
- nf/main.nf: Main workflow for transcriptome + methylation end-to-end processing.
- nf/methy_only.nf: Workflow for methylation-only data.
- nf/modules/: Step-wise process modules:
  - step1.nf: preprocessing, QC, barcode extraction, transcriptome analysis.
  - step2.nf: Bismark alignment and BAM sorting.
  - step3.nf: per-cell BAM splitting, ALLC generation/merge, multi-scale datasets.
  - step4.nf: summaries, dimensionality reduction, joint report.
  - utils.nf: helpers for the methylation-only workflow (read counting and cell estimation).
- nf/bin/: Helper scripts and resources (e.g., barcode whitelists).
- nf/nextflow.config: Executors and resource configuration.
- sc_methy_workflow.sh: Shell script to run the dual-omics analysis pipeline.
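As a quick sanity check after cloning, the layout listed above can be verified with a short script. This is only a sketch: `check_layout` is a hypothetical helper, not something shipped with the repository, and it assumes that the step modules sit directly under `nf/modules/` and that `sc_methy_workflow.sh` lives at the repository root.

```bash
#!/usr/bin/env bash
# Sketch: report which of the documented entry points are missing from a clone.
# check_layout is a hypothetical helper; the paths below are taken from the
# layout description and their exact locations are assumptions.

check_layout() {
    local root="${1:-.}"   # repository root, defaults to the current directory
    local missing=0
    local path

    for path in \
        nf/main.nf \
        nf/methy_only.nf \
        nf/modules/step1.nf \
        nf/modules/step2.nf \
        nf/modules/step3.nf \
        nf/modules/step4.nf \
        nf/modules/utils.nf \
        nf/bin \
        nf/nextflow.config \
        sc_methy_workflow.sh
    do
        if [ ! -e "$root/$path" ]; then
            echo "missing: $path" >&2
            missing=$((missing + 1))
        fi
    done

    # Succeed only when every expected path exists.
    [ "$missing" -eq 0 ]
}

# Usage: check_layout /path/to/clone && echo "layout looks complete"
```

The function prints one line per missing path to standard error and returns a non-zero status if anything is absent, so it can gate later steps in a wrapper script.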
We provide two methods for data analysis:
- Shell Script: Run the analysis pipeline directly via the sc_methy_workflow.sh script.
- Nextflow Pipeline: Run the Nextflow pipeline via nf/main.nf.
Details of both methods are provided below.
