
GFP Transgene Not Detected

Author: Ruifeng Gao
Updated: 2025-08-06
Tags: Build Reference Genome, Exogenous Gene



Problem Description

A customer reported that GFP expression was absent from the cloud platform analysis results, even though the GFP transgene had been added to the reference genome.


Troubleshooting Process

  1. Initial Confirmation:
    • The expression matrix contained the GFP gene, but the RDS file did not, so the gene was missing from the cloud platform analysis.
  2. Sample Verification:
    • In one sample, for example, reads in the BAM file aligned to the GFP sequence, and every alignment was a full 150-base match.
    • However, the XS tag on these reads was always XS:Z:Unassigned_NoFeatures.
    • The reads therefore aligned to GFP but were not assigned to any feature when featureCounts quantified them.
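shell
# Locate GFP in the feature list; grep -n prints its row number
zcat features.tsv.gz | grep -n GFP
# Take the first GFP hit as the matrix row of interest
gene_row=$(zcat features.tsv.gz | grep -n GFP | head -n 1 | cut -d: -f1)
# Count matrix entries for that row, skipping "%" comments and the dimension line
zcat matrix.mtx.gz | awk -v gene_row="$gene_row" '!/^%/ && ++n > 1 && $1 == gene_row { count++ } END { print count + 0 }'
# Show alignments that mention GFP, including their XS tags
samtools view sample_SortedByCoordinate_withTag.bam | grep "GFP"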

  3. Multiple Sample Review:
    • Other samples showed the same issue: the GFP gene was not quantified.
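
The XS tag check in step 2 can be summarized per sample instead of read by eye. The following is a minimal sketch, not part of the original workflow: `count_xs_status` is a hypothetical helper name, and it assumes the assignment status is stored as an `XS:Z:` tag, as in the reads shown above.

```shell
# Tally featureCounts assignment statuses (XS:Z: tag) from SAM text on stdin.
# count_xs_status is a hypothetical helper, not part of samtools or featureCounts.
count_xs_status() {
  awk -F'\t' '
    /^@/ { next }                      # skip SAM header lines, if present
    {
      status = "no_XS_tag"
      # optional tags start at column 12 in SAM
      for (i = 12; i <= NF; i++)
        if ($i ~ /^XS:Z:/) { status = substr($i, 6); break }
      tally[status]++
    }
    END { for (s in tally) print tally[s] "\t" s }
  ' | sort -k1,1nr
}

# Hypothetical usage (needs samtools and an indexed BAM; assumes the
# transgene contig is named GFP):
#   samtools view sample_SortedByCoordinate_withTag.bam GFP | count_xs_status
```

If every GFP read falls under Unassigned_NoFeatures, the alignment itself is fine and the annotation is the more likely culprit.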

Cause Analysis

IMPORTANT

featureCounts did not recognize the GFP gene mainly because the GTF entry for the exogenous gene was non-standard: it contained only exon information and lacked the gene and transcript lines. As a result, reads aligned to GFP could not be quantified correctly.
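
A quick way to spot this kind of gap is to list which feature types the GTF holds for the transgene. The sketch below is illustrative only: both function names are hypothetical, and it assumes the transgene is its own sequence named GFP in column 1 of the GTF, as in the example entry in this article.

```shell
# Count feature types (GTF column 3) for one sequence name (GTF column 1).
# Both helpers are hypothetical names used for illustration.
gtf_feature_types() {
  # $1 = GTF path, $2 = sequence name, e.g. GFP
  awk -F'\t' -v seq="$2" '
    /^#/ { next }                      # skip comment lines
    $1 == seq { n[$3]++ }
    END { for (t in n) print t "\t" n[t] }
  ' "$1" | sort
}

# Print each of the three expected feature types that is absent.
gtf_missing_types() {
  types=$(gtf_feature_types "$1" "$2" | cut -f1)
  for want in gene transcript exon; do
    printf '%s\n' "$types" | grep -qx "$want" || echo "$want"
  done
}
```

For a GTF whose GFP entry has only an exon line, `gtf_missing_types genes.gtf GFP` (with `genes.gtf` standing in for the actual annotation file) would print gene and transcript.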


Solution

TIP

Rebuild the GTF annotation for the GFP gene so that it contains all three feature lines: gene, transcript, and exon.

Commands That Generate a Standard GFP Gene GTF Entry:

shell
# Add gene line (coordinates 1-720 should span the GFP sequence in the reference FASTA)
echo -e 'GFP\tunknown\tgene\t1\t720\t.\t+\t.\tgene_id "GFP"; gene_name "GFP"; gene_biotype "protein_coding";' > GFP.gtf

# Add transcript line (append with >>)
echo -e 'GFP\tunknown\ttranscript\t1\t720\t.\t+\t.\tgene_id "GFP"; transcript_id "GFP"; gene_name "GFP"; gene_biotype "protein_coding";' >> GFP.gtf

# Add exon line (append with >>)
echo -e 'GFP\tunknown\texon\t1\t720\t.\t+\t.\tgene_id "GFP"; transcript_id "GFP"; gene_name "GFP"; gene_biotype "protein_coding"; exon_number 1; exon_id "GFP";' >> GFP.gtf

NOTE

After reconstructing the reference genome with the corrected GTF file and re-running quantification, the GFP gene can be properly detected and quantified.
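
To confirm the result after re-running, the matrix check from the troubleshooting steps can be wrapped into one function. This is a hedged sketch rather than the platform's own tooling: `count_gene_entries` is a hypothetical name, and it assumes a Matrix Market coordinate file in which column 1 is the feature row, matching the `features.tsv.gz` and `matrix.mtx.gz` files used earlier.

```shell
# Count barcodes with a non-zero entry for one gene.
# count_gene_entries is a hypothetical helper used for illustration.
count_gene_entries() {
  # $1 = features.tsv.gz, $2 = matrix.mtx.gz, $3 = gene ID or gene name
  row=$(gzip -dc "$1" | awk -F'\t' -v g="$3" '$1 == g || $2 == g { print NR; exit }')
  if [ -z "$row" ]; then
    echo "gene not found in features: $3" >&2
    return 1
  fi
  gzip -dc "$2" | awk -v r="$row" '
    /^%/ { next }                            # Matrix Market comment lines
    !seen_dims { seen_dims = 1; next }       # first data line holds the dimensions
    $1 == r && $3 > 0 { n++ }
    END { print n + 0 }
  '
}

# Hypothetical usage:
#   count_gene_entries features.tsv.gz matrix.mtx.gz GFP
```

Skipping the dimension line matters here because an appended transgene is often the last feature, so its row number can equal the feature count on that line.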
