Ultra-Rapid RNA-Seq Data Analysis
The DRAGEN Transcriptome (RNA-Seq) Pipeline performs Next Generation Sequencing (NGS) secondary analysis of RNA transcripts. The Transcriptome Pipeline offers multiple operating modes, including reference-only alignment and annotation-assisted alignment. DRAGEN transcriptome alignments are compatible with downstream tools for transcript assembly, novel transcript discovery, differential gene expression analysis, gene fusion detection, and other RNA-Seq applications.
The DRAGEN Transcriptome Pipeline accepts FASTQ, BAM, or CRAM input and produces aligned BAM or CRAM output. DRAGEN can optionally take a gene annotation file (GTF) as input to guide the spliced alignments. DRAGEN can also run in a “2-pass” mode, in which novel splice junctions detected in the first pass guide the mapping and alignment phase of the second pass.
Transcriptome Pipeline Speed
Alignment accuracy and splice junction discovery accuracy are shown below for both the reference-only and annotation-assisted alignment modes. Both pipelines were evaluated using the Engstrom Sim2 Dataset*.
*BEERS Sim2 datasets were obtained from "Systematic evaluation of spliced alignment programs for RNA-seq data," Nature Methods. doi:10.1038/nmeth.2722
Splice Junction Discovery Accuracy
Splice Junction Discovery
Cumulative counts of true and false junctions were computed over a range of thresholds for the number of supporting alignments. A point further to the left on a curve has a higher supporting alignment count threshold than a point to the right.
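The curve construction described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the evaluation code behind the figures: the function name `junction_curve`, the `(chrom, start, end)` junction representation, and the toy data are all assumptions made for the example.

```python
def junction_curve(support, truth, thresholds):
    """Return one (threshold, true_count, false_count) point per threshold.

    support: dict mapping a junction (chrom, start, end) to its number of
             supporting alignments (hypothetical representation).
    truth:   set of junctions known to be real, e.g. from a simulation.
    Points are ordered from the strictest threshold to the most permissive,
    which matches reading a curve from left to right.
    """
    points = []
    for t in sorted(thresholds, reverse=True):
        # Keep only junctions with at least t supporting alignments.
        called = [j for j, n in support.items() if n >= t]
        true_count = sum(1 for j in called if j in truth)
        points.append((t, true_count, len(called) - true_count))
    return points


# Toy data: four reported junctions, three of which are real.
support = {
    ("chr1", 100, 200): 12,
    ("chr1", 300, 400): 5,
    ("chr1", 500, 650): 1,  # not in the truth set, so a false junction
    ("chr2", 50, 90): 2,
}
truth = {("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 90)}

curve = junction_curve(support, truth, thresholds=[1, 2, 5, 10])
for threshold, n_true, n_false in curve:
    print(threshold, n_true, n_false)
```

Lowering the threshold can only add junctions, so both counts are non-decreasing from left to right. In the toy data the single false junction appears only at the most permissive threshold.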
Overall Read Alignment Accuracy*
Read Alignment Accuracy
Each bar plot shows the number of perfect alignments (all bases in the read aligned correctly), the number of partially correct alignments (at least one base, but not all, aligned correctly), and the number of totally incorrect alignments.
*Reference-only and annotation-assisted alignment pipelines were evaluated using the Engstrom Sim2 Dataset.
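The three alignment categories in the bar plots can be expressed as a small classification routine. The sketch below is hypothetical: it assumes each read is described by a per-base list of `(chrom, position)` tuples, with `None` for a base that was not aligned, and it is not the scoring code used to produce the figures.

```python
from collections import Counter


def classify_alignment(true_positions, aligned_positions):
    """Classify one read as 'perfect', 'partial', or 'incorrect'.

    Both arguments are per-base lists of (chrom, position) tuples.
    A base the aligner left unaligned is represented as None.
    """
    if not true_positions or len(true_positions) != len(aligned_positions):
        raise ValueError("expected two non-empty lists of equal length")
    correct = sum(
        1
        for t, a in zip(true_positions, aligned_positions)
        if a is not None and a == t
    )
    if correct == len(true_positions):
        return "perfect"    # every base placed correctly
    if correct > 0:
        return "partial"    # at least one base correct, but not all
    return "incorrect"      # no base placed correctly


# A 4-base read whose true placement is chr1:10-13.
truth = [("chr1", p) for p in (10, 11, 12, 13)]
reads = {
    "read_a": truth,                                      # all bases correct
    "read_b": truth[:2] + [None, None],                   # last two bases unaligned
    "read_c": [("chr2", p) for p in (10, 11, 12, 13)],    # wrong chromosome
}
tally = Counter(classify_alignment(truth, aligned) for aligned in reads.values())
print(tally)
```

Comparing per-base positions, rather than only the start coordinate, is what separates a partially correct spliced alignment from a perfect one.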
Alignment Accuracy for Gene Annotation Input
DRAGEN also offers annotation-assisted alignment, which takes a gene annotation file in GTF format as input. The GTF provides the pipeline with the precise locations of known splice junctions for a given species, which improves the sensitivity of splice junction discovery. The annotation-assisted alignment pipelines were also evaluated using the Engstrom Sim2 Dataset.
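To illustrate how a GTF encodes known splice junctions, the sketch below derives intron coordinates from the exon records of each transcript. It shows the general idea only and makes no claim about how DRAGEN processes annotations internally. The function name `junctions_from_gtf` and the sample records are assumptions; the only thing relied on is the standard GTF layout of nine tab-separated columns with 1-based, inclusive coordinates.

```python
import re
from collections import defaultdict

TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def junctions_from_gtf(lines):
    """Return a set of (chrom, intron_start, intron_end, strand) tuples.

    A junction is the intron between two consecutive exons of the same
    transcript, in 1-based inclusive coordinates.
    """
    exons = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon records define splice junctions here
        match = TRANSCRIPT_ID.search(fields[8])
        if match is None:
            continue
        key = (fields[0], fields[6], match.group(1))  # chrom, strand, transcript
        exons[key].append((int(fields[3]), int(fields[4])))

    junctions = set()
    for (chrom, strand, _transcript), spans in exons.items():
        spans.sort()
        for (_, left_end), (right_start, _) in zip(spans, spans[1:]):
            if right_start - left_end > 1:  # ignore abutting exons
                junctions.add((chrom, left_end + 1, right_start - 1, strand))
    return junctions


# Hypothetical three-exon transcript, for illustration only.
sample_gtf = [
    'chr1\tdemo\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tdemo\texon\t300\t400\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tdemo\texon\t500\t600\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
]
known_junctions = junctions_from_gtf(sample_gtf)
print(sorted(known_junctions))
```

Three exons in one transcript yield two junctions. Transcripts that share an intron contribute it only once because the result is a set.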
Splice Junction Discovery Accuracy*
Gene Annotation Input
Splice Junction Discovery: Annotations
Gene annotation input improves the sensitivity of splice junction discovery. Accurate gene annotations are available for a limited number of species at present.
Overall Read Alignment Accuracy
Gene Annotation Input
Read Alignment Accuracy: Annotations
With gene annotation input, DRAGEN perfectly aligns at least 10% more reads than STAR or TopHat.