ATAC + GE Cell Ranger Analysis
Updated: 2025-08-08
Paired ATAC and GE Libraries
Overview
Before analysis with 10x Cell Ranger ARC, the cell barcodes in both the ATAC and GE data must be converted to barcodes from the whitelists that Cell Ranger ARC supports.
NOTE
In 10x data, ATAC and GE use different whitelists, with a fixed mapping between the two. DD ATAC and GE data share a single whitelist: P3CB.barcode.txt.gz.
Step 1: fastp QC (optional)
bash
# QC parameters for 150:150 split raw outputs:
software/fastp \
-i atac/demo_arc_S1_L001_R1_001.fastq.gz \
-I atac/demo_arc_S1_L001_R2_001.fastq.gz \
-o fastpYaml/demo/demo_arc_S1_L001_R1_001.fastq.gz \
-O fastpYaml/demo/demo_arc_S1_L001_R2_001.fastq.gz \
-j demo_atac_fastp.json \
-h demo_atac_fastp.html \
--cut_tail_window_size 1 --cut_tail_mean_quality 3 --cut_tail --length_required 60
software/fastp \
-i atac/demo_GE_S1_L001_R1_001.fastq.gz \
-I atac/demo_GE_S1_L001_R2_001.fastq.gz \
-o fastpYaml/demo/demo_GE_S1_L001_R1_001.fastq.gz \
-O fastpYaml/demo/demo_GE_S1_L001_R2_001.fastq.gz \
-j demo_GE_fastp.json \
-h demo_GE_fastp.html \
--cut_tail_window_size 1 --cut_tail_mean_quality 3 --cut_tail --length_required 60
Step 2: Data Conversion
Load environment and set parameters
The SeekSoul Tools 1.2.2 environment can be used:
bash
# Load environment
conda activate path/seeksoultools/seeksoultools.1.2.2/
# Parameters
outdir='path/SeekArcTools/arc_2_10x/demo/'
sample="demo"
gsample='demo_GE'
gfq1="demo_GE_S1_L001_R1_001.fastq.gz"
gfq2="demo_GE_S1_L001_R2_001.fastq.gz"
asample='demo_arc'
afq1="demo_arc_S1_L001_R1_001.fastq.gz"
afq2="demo_arc_S1_L001_R2_001.fastq.gz"
afq3="demo_arc_S1_L001_R3_001.fastq.gz"
# afq3 is an output name only: SeekGene raw outputs have no ATAC R3 file,
# but Cell Ranger ARC requires one, so the conversion writes it (--outafq3).
scrdir="path/arc_2_10x/to10Xcs/"
# Resources
core='16'
memory='60'
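Before running the conversion, it can help to confirm that every input file named in the parameters actually exists, since a mistyped file name would otherwise only surface partway through the run. A minimal sketch, assuming the raw FASTQ files sit in ${outdir}/Rawdata as in the commands on this page; check_inputs is a hypothetical helper, not part of SeekSoul Tools or the conversion scripts:

```bash
# Hypothetical helper: report every listed file that is missing or empty.
# Usage: check_inputs DIR FILE [FILE ...]
check_inputs() {
    local dir="$1" missing=0 f
    shift
    for f in "$@"; do
        if [ ! -s "${dir%/}/${f}" ]; then
            echo "missing or empty: ${dir%/}/${f}" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call with the parameters defined above (afq3 is an output, so it is not checked):
# check_inputs "${outdir}/Rawdata" "${gfq1}" "${gfq2}" "${afq1}" "${afq2}" || exit 1
```

The function returns non-zero if any file is absent, so it can gate the rest of a pipeline script.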
Run Step 1: extract and correct DD barcodes
bash
# Step 1: specify P3CB.barcode.txt.gz to add GE barcodes onto read IDs
mkdir -p ${outdir}/${gsample}
python ${scrdir}/code/barcode.py \
--fq1 ${outdir}/Rawdata/${gfq1} \
--fq2 ${outdir}/Rawdata/${gfq2} \
--samplename ${gsample} \
--outdir ${outdir}/${gsample} \
--barcode ${scrdir}/file/P3CB.barcode.txt.gz \
--chemistry DD-Q \
--core ${core}
# Step 1: specify P3CB.barcode.txt.gz to add ATAC barcodes onto read IDs
mkdir -p ${outdir}/${asample}
python ${scrdir}/code/atacbarcode.py \
--fq1 ${outdir}/Rawdata/${afq1} \
--fq2 ${outdir}/Rawdata/${afq2} \
--samplename ${asample} \
--outdir ${outdir}/${asample} \
--barcode ${scrdir}/file/P3CB.barcode.txt.gz \
--chemistry DD_AG \
--core ${core}
Convert SeekGene barcodes to corresponding 10x barcodes
stcbto10x.py can run in the SeekArcTools environment, or in any environment where the modules the script requires are installed.
bash
mkdir -p ${outdir}/${gsample}/to10xdata
mkdir -p ${outdir}/${asample}/to10xdata
python ${scrdir}/code/stcbto10x.py \
--gfq1 ${outdir}/${gsample}/step1/${gsample}_1.fq.gz \
--gfq2 ${outdir}/${gsample}/step1/${gsample}_2.fq.gz \
--afq1 ${outdir}/${asample}/step1/${asample}_1.fq.gz \
--afq2 ${outdir}/${asample}/step1/${asample}_2.fq.gz \
--outgfq1 ${outdir}/${gsample}/to10xdata/${gfq1} \
--outgfq2 ${outdir}/${gsample}/to10xdata/${gfq2} \
--outafq1 ${outdir}/${asample}/to10xdata/${afq1} \
--outafq2 ${outdir}/${asample}/to10xdata/${afq2} \
--outafq3 ${outdir}/${asample}/to10xdata/${afq3} \
--x10 ${scrdir}/file/merged_10xCB.csv \
--cb_file ${outdir}/stcbto10x.csv
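A quick sanity check after conversion is to confirm that the converted ATAC files hold the same number of reads, since paired FASTQ files are expected to contain matching records. The sketch below assumes standard 4-line FASTQ records and gzip-compressed files; count_reads and same_read_count are hypothetical helpers, not part of stcbto10x.py:

```bash
# Hypothetical helper: number of reads in a gzipped FASTQ (4 lines per record).
count_reads() {
    echo $(( $(gzip -cd "$1" | wc -l) / 4 ))
}

# Hypothetical helper: succeed only if all given FASTQ files hold the same number of reads.
same_read_count() {
    local first n f
    first=$(count_reads "$1")
    for f in "$@"; do
        n=$(count_reads "$f")
        echo "${f}: ${n} reads"
        [ "$n" -eq "$first" ] || return 1
    done
}

# Example call on the converted ATAC files:
# same_read_count \
#     "${outdir}/${asample}/to10xdata/${afq1}" \
#     "${outdir}/${asample}/to10xdata/${afq2}" \
#     "${outdir}/${asample}/to10xdata/${afq3}"
```

Equal counts do not prove the conversion is correct, but a mismatch is a clear sign that something went wrong before Cell Ranger ARC is started.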
IMPORTANT
The ATAC data format required by Cell Ranger ARC differs from the DD platform output: the DD platform produces only R1 and R2 for ATAC libraries, whereas Cell Ranger ARC needs R1, R2, and R3.
Data formats required by Cell Ranger ARC
GE data format
text
File name: [Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz
Format:
I1: Dual index i7 read (optional)
I2: Dual index i5 read (optional)
R1: Read 1
R2: Read 2
ATAC data format
text
File name: [Sample Name]_S1_L00[Lane Number]_[Read Type]_001.fastq.gz
Format option 1:
I1: Dual index i7 read (optional)
R1: Read 1
R2: Dual index i5 read
R3: Read 2
Or format option 2:
I1: Dual index i7 read (optional)
R1: Read 1
I2: Dual index i5 read
R2: Read 2
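The naming rule above can be checked mechanically before starting Cell Ranger ARC. A minimal sketch, assuming the underscore-separated form used by the file names on this page (for example demo_arc_S1_L001_R1_001.fastq.gz) and accepting any sample index after S, which is an assumption beyond the S1 shown here; is_10x_fastq_name is a hypothetical helper:

```bash
# Hypothetical helper: succeed if a file name follows
# [Sample Name]_S<n>_L00[Lane Number]_[Read Type]_001.fastq.gz
# with Read Type one of I1, I2, R1, R2, R3.
is_10x_fastq_name() {
    basename "$1" | grep -Eq '^.+_S[0-9]+_L00[0-9]_(I1|I2|R1|R2|R3)_001\.fastq\.gz$'
}

# Example: flag any converted file whose name does not match.
# for f in "${outdir}/${asample}/to10xdata/"*.fastq.gz; do
#     is_10x_fastq_name "$f" || echo "unexpected name: $f" >&2
# done
```

The check covers only the name, not the content: it does not verify that R2 really carries the barcode read or that R3 carries Read 2.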
Step 3: Cell Ranger ARC count analysis
Refer to the official 10x documentation for detailed parameters:
bash
cellranger-arc count \
--id=demo10Xsc \
--reference=refdata-cellranger-arc-mm10-2020-A-2.0.0 \
--libraries=library.csv \
--localcores=8 \
--localmem=64
List the directories that hold the converted FASTQ files in library.csv, the file passed to --libraries:
text
# library.csv
fastqs,sample,library_type
data/,demo_GE,Gene Expression
data/,demo_arc,Chromatin Accessibility
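The libraries file can also be generated from the conversion output directories instead of being written by hand. The sketch below resolves each directory to an absolute path, on the understanding that the 10x documentation describes the fastqs column as a fully qualified path; write_library_csv is a hypothetical helper, and the argument order is an assumption made for this example:

```bash
# Hypothetical helper: write a Cell Ranger ARC libraries CSV.
# Usage: write_library_csv GE_DIR GE_SAMPLE ATAC_DIR ATAC_SAMPLE OUT_CSV
write_library_csv() {
    local ge_dir atac_dir
    # Resolve both directories to absolute paths; fail if either does not exist.
    ge_dir=$(cd "$1" >/dev/null && pwd) || return 1
    atac_dir=$(cd "$3" >/dev/null && pwd) || return 1
    {
        echo "fastqs,sample,library_type"
        echo "${ge_dir},$2,Gene Expression"
        echo "${atac_dir},$4,Chromatin Accessibility"
    } > "$5"
}

# Example call with the parameters from Step 2:
# write_library_csv \
#     "${outdir}/${gsample}/to10xdata" "${gsample}" \
#     "${outdir}/${asample}/to10xdata" "${asample}" \
#     library.csv
```

The sample column must match the [Sample Name] prefix of the converted FASTQ files, which is why the example reuses ${gsample} and ${asample}.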
TIP
Converted GEX and ATAC libraries can also be analyzed separately:
- ATAC: use cellranger-atac count with --chemistry=ARC-v1
- GEX: use cellranger count with --chemistry=ARC-v1
bash
/cellranger-atac-2.0.0/cellranger-atac count \
--id=LD758_ATAC \
--reference=refdata-cellranger-arc-mm10-2020-A-2.0.0 \
--fastqs=data/ \
--localcores=8 \
--localmem=64 \
--sample=demo_arc \
--chemistry=ARC-v1
Script download: Download