ATAC + GE Cell Ranger Analysis

Author: Xueling Liu, Ruifeng Gao
Updated: 2025-08-08

Paired ATAC and GE Libraries

Overview

For both the ATAC and GE data, the SeekGene cell barcodes must be converted to barcodes from the whitelists supported by 10x Cell Ranger ARC.

NOTE

In 10x Cell Ranger ARC, the ATAC and GE libraries use different barcode whitelists that are linked by a mapping relationship. DD ATAC and GE data share a single whitelist: P3CB.barcode.txt.gz.

Step 1: fastp QC (optional)

bash
# QC parameters for 150:150 split raw outputs:
software/fastp \
  -i atac/demo_arc_S1_L001_R1_001.fastq.gz \
  -I atac/demo_arc_S1_L001_R2_001.fastq.gz \
  -o fastpYaml/demo/demo_arc_S1_L001_R1_001.fastq.gz \
  -O fastpYaml/demo/demo_arc_S1_L001_R2_001.fastq.gz \
  -j demo_atac_fastp.json \
  -h demo_atac_fastp.html \
  --cut_tail_window_size 1 --cut_tail_mean_quality 3 --cut_tail --length_required 60

software/fastp \
  -i atac/demo_GE_S1_L001_R1_001.fastq.gz \
  -I atac/demo_GE_S1_L001_R2_001.fastq.gz \
  -o fastpYaml/demo/demo_GE_S1_L001_R1_001.fastq.gz \
  -O fastpYaml/demo/demo_GE_S1_L001_R2_001.fastq.gz \
  -j demo_GE_fastp.json \
  -h demo_GE_fastp.html \
  --cut_tail_window_size 1 --cut_tail_mean_quality 3 --cut_tail --length_required 60
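
After QC it can be useful to confirm that the filtered R1 and R2 files still contain the same number of reads before moving on. The following is a minimal sketch, not part of the original pipeline; the helper names `count_reads` and `check_pair` are hypothetical, and it assumes standard four-line FASTQ records.

```bash
# Count reads in a gzipped FASTQ file (assumes 4 lines per record).
count_reads() {
  local lines
  lines=$(gzip -dc "$1" | wc -l)
  echo $(( lines / 4 ))
}

# Print both counts and succeed only if R1 and R2 hold the same number of reads.
check_pair() {
  local n1 n2
  n1=$(count_reads "$1")
  n2=$(count_reads "$2")
  echo "R1=${n1} R2=${n2}"
  [ "$n1" -eq "$n2" ]
}

# Usage (paths taken from the fastp commands above):
# check_pair fastpYaml/demo/demo_arc_S1_L001_R1_001.fastq.gz \
#            fastpYaml/demo/demo_arc_S1_L001_R2_001.fastq.gz
```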

Step 2: Data Conversion

Load environment and set parameters

The SeekSoul Tools 1.2.2 conda environment can be used:

bash
# Load environment
conda activate path/seeksoultools/seeksoultools.1.2.2/

# Parameters
outdir='path/SeekArcTools/arc_2_10x/demo/'
sample="demo"
gsample='demo_GE'
gfq1="demo_GE_S1_L001_R1_001.fastq.gz"
gfq2="demo_GE_S1_L001_R2_001.fastq.gz"
asample='demo_arc'
afq1="demo_arc_S1_L001_R1_001.fastq.gz"
afq2="demo_arc_S1_L001_R2_001.fastq.gz"
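# SeekGene raw outputs have no ATAC R3 file, but Cell Ranger ARC requires one;
# afq3 is the name of the R3 file written by the conversion step.
afq3="demo_arc_S1_L001_R3_001.fastq.gz"
# Directory containing the conversion scripts (code/) and barcode files (file/)
scrdir="path/arc_2_10x/to10Xcs/"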
# Resources
core='16'
memory='60'
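
Because the later commands read the raw FASTQ files from `${outdir}/Rawdata/`, a quick existence check can catch a mistyped file name before a long run starts. This is a small sketch under that assumption; `require_files` is a hypothetical helper, not part of SeekSoul Tools or the conversion scripts.

```bash
# Report every path that is missing or empty; return non-zero if any was found.
require_files() {
  local missing=0 f
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage with the parameters defined above (afq3 is omitted because it is
# only created later by the conversion step):
# require_files "${outdir}/Rawdata/${gfq1}" "${outdir}/Rawdata/${gfq2}" \
#               "${outdir}/Rawdata/${afq1}" "${outdir}/Rawdata/${afq2}" || exit 1
```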

Run conversion step 1: extract and correct DD barcodes

bash
# Step 1: specify P3CB.barcode.txt.gz to add GE barcodes onto read IDs
mkdir -p ${outdir}/${gsample}
python ${scrdir}/code/barcode.py \
  --fq1 ${outdir}/Rawdata/${gfq1} \
  --fq2 ${outdir}/Rawdata/${gfq2} \
  --samplename ${gsample} \
  --outdir ${outdir}/${gsample} \
  --barcode ${scrdir}/file/P3CB.barcode.txt.gz \
  --chemistry DD-Q \
  --core ${core}

# Step 1: specify P3CB.barcode.txt.gz to add ATAC barcodes onto read IDs
mkdir -p ${outdir}/${asample}
python ${scrdir}/code/atacbarcode.py \
  --fq1 ${outdir}/Rawdata/${afq1} \
  --fq2 ${outdir}/Rawdata/${afq2} \
  --samplename ${asample} \
  --outdir ${outdir}/${asample} \
  --barcode ${scrdir}/file/P3CB.barcode.txt.gz \
  --chemistry DD_AG \
  --core ${core}

Convert SeekGene barcodes to corresponding 10x barcodes

stcbto10x.py can be run in the SeekArcTools environment, or in any environment where the modules the script requires are installed.

bash
mkdir -p ${outdir}/${gsample}/to10xdata
mkdir -p ${outdir}/${asample}/to10xdata
python ${scrdir}/code/stcbto10x.py \
  --gfq1 ${outdir}/${gsample}/step1/${gsample}_1.fq.gz \
  --gfq2 ${outdir}/${gsample}/step1/${gsample}_2.fq.gz \
  --afq1 ${outdir}/${asample}/step1/${asample}_1.fq.gz \
  --afq2 ${outdir}/${asample}/step1/${asample}_2.fq.gz \
  --outgfq1 ${outdir}/${gsample}/to10xdata/${gfq1} \
  --outgfq2 ${outdir}/${gsample}/to10xdata/${gfq2} \
  --outafq1 ${outdir}/${asample}/to10xdata/${afq1} \
  --outafq2 ${outdir}/${asample}/to10xdata/${afq2} \
  --outafq3 ${outdir}/${asample}/to10xdata/${afq3} \
  --x10 ${scrdir}/file/merged_10xCB.csv \
  --cb_file ${outdir}/stcbto10x.csv

IMPORTANT

The ATAC data format required by Cell Ranger ARC differs from the DD platform outputs. ATAC libraries need R1, R2, and R3.

Data formats required by Cell Ranger ARC

GE data format

text
File name: [Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz
Format:
    I1: Dual index i7 read (optional)
    I2: Dual index i5 read (optional)
    R1: Read 1
    R2: Read 2

ATAC data format

text
File name: [Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz
Format option 1:
    I1: Dual index i7 read (optional)
    R1: Read 1
    R2: Dual index i5 read
    R3: Read 2
Format option 2:
    I1: Dual index i7 read (optional)
    R1: Read 1
    I2: Dual index i5 read
    R2: Read 2
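
The naming convention above can be checked mechanically before running Cell Ranger ARC. The sketch below is illustrative only: `is_10x_fastq_name` is a hypothetical helper, and the regular expression is an assumption based on the pattern shown above (sample name, `_S<n>`, three-digit lane, read type, `_001.fastq.gz`).

```bash
# Succeed if the file name follows <sample>_S<n>_L<lane>_<read type>_001.fastq.gz.
is_10x_fastq_name() {
  local name re
  name=$(basename "$1")
  re='^[A-Za-z0-9_-]+_S[0-9]+_L[0-9]{3}_(I1|I2|R1|R2|R3)_001\.fastq\.gz$'
  printf '%s\n' "$name" | grep -Eq "$re"
}

# Usage: flag any converted file whose name would not be recognized.
# for f in "${outdir}/${asample}/to10xdata/"*.fastq.gz; do
#   is_10x_fastq_name "$f" || echo "unexpected name: $f"
# done
```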

Step 3: Cell Ranger ARC count analysis

Refer to the official 10x documentation for detailed parameters:

bash
cellranger-arc count \
  --id=demo10Xsc \
  --reference=refdata-cellranger-arc-mm10-2020-A-2.0.0 \
  --libraries=library.csv \
  --localcores=8 \
  --localmem=64

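List the directories that contain the converted FASTQ files in the libraries file (library.csv) passed to --libraries. The 10x documentation describes the fastqs column as a fully qualified path, so data/ here stands in for the real location: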

text
# library.csv
fastqs,sample,library_type
data/,demo_GE,Gene Expression
data/,demo_arc,Chromatin Accessibility
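
One way to produce this file from the conversion outputs is sketched below. It is an illustration rather than part of the original workflow: `write_library_csv` is a hypothetical helper that resolves each FASTQ directory to an absolute path and writes the three-line CSV in the layout shown above.

```bash
# Arguments: GE FASTQ dir, ATAC FASTQ dir, GE sample, ATAC sample, output CSV.
write_library_csv() {
  local ge_dir atac_dir
  # Resolve both directories to absolute paths; fail if either does not exist.
  ge_dir=$(cd "$1" && pwd) || return 1
  atac_dir=$(cd "$2" && pwd) || return 1
  {
    echo "fastqs,sample,library_type"
    echo "${ge_dir},$3,Gene Expression"
    echo "${atac_dir},$4,Chromatin Accessibility"
  } > "$5"
}

# Usage with the variables from Step 2:
# write_library_csv "${outdir}/${gsample}/to10xdata" "${outdir}/${asample}/to10xdata" \
#                   "${gsample}" "${asample}" library.csv
```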

TIP

Converted GEX and ATAC libraries can also be analyzed separately:

  • ATAC: use cellranger-atac count with --chemistry=ARC-v1
  • GEX: use cellranger count with --chemistry=ARC-v1

For example, for the ATAC library:

bash
/cellranger-atac-2.0.0/cellranger-atac count \
  --id=LD758_ATAC \
  --reference=refdata-cellranger-arc-mm10-2020-A-2.0.0 \
  --fastqs=data/ \
  --localcores=8 \
  --localmem=64 \
  --sample=demo_arc \
  --chemistry=ARC-v1
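
The GEX counterpart is not shown in the original. A hedged sketch follows: the hypothetical helper `gex_count_cmd` only prints a `cellranger count` command for review rather than running it. The run id and the reference name `refdata-gex-mm10-2020-A` in the usage line are assumptions, and newer Cell Ranger releases may require additional options (for example `--create-bam`), so check the documentation for the installed version.

```bash
# Arguments: run id, transcriptome reference, FASTQ dir, sample name.
# Prints the command instead of executing it (a dry run).
gex_count_cmd() {
  printf 'cellranger count --id=%s --transcriptome=%s --fastqs=%s --sample=%s --chemistry=ARC-v1 --localcores=8 --localmem=64\n' \
    "$1" "$2" "$3" "$4"
}

# Usage (all argument values are placeholders):
# gex_count_cmd demo_GE_only refdata-gex-mm10-2020-A data/ demo_GE
```

Once the printed command looks right, it can be copied into the shell and run directly.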

Script download: Download