Joint Genotyping Pipeline
Ultra-Rapid Multi-Genome Analysis
The DRAGEN Joint Genotyping Pipeline calls variants from multiple samples 25x faster than competing pipelines without compromising accuracy. It supports both pedigree and population variant calling from a cohort of samples. The Joint Genotyping pipeline handles up to ten samples at one time; for larger cohorts, the DRAGEN Population Calling pipeline handles many thousands of samples at once.
The combination of DRAGEN’s speed and hierarchical grouping of multiple samples provides the most computationally efficient analysis solution for joint genotyping.
DRAGEN Joint Genotyping Pipeline
The DRAGEN Joint Genotyping pipeline makes variant calls using information from multiple samples. DRAGEN first produces a gVCF file for each individual sample; each gVCF file provides a comprehensive record of every position in the genome. The gVCF files are then fed into the DRAGEN Joint Genotyper, which produces a single VCF for subsequent joint or family analysis. The Joint Genotyping pipeline handles up to ten samples at one time. The DRAGEN Population Calling pipeline handles sample sizes of many thousands at once.
Joint Calling from BCL
When joint calling samples that were sequenced on the same flow cell, users can take advantage of DRAGEN's ability to map and align multi-sample inputs simultaneously, which speeds up the overall joint calling process. DRAGEN processes BCL data directly, eliminating the FASTQ conversion step. The BCL data is fed straight into the pipeline, which produces a separate gVCF file for each sample. Intermediate BAM/CRAM files can be generated on demand.
Speeds: Joint Genotyping Pipeline*
Accuracy: Joint Genotyping Pipeline*
Ultra-Rapid Analysis: Number of Platinum Genome Trios Genotyped in 48 Hours*
*All DRAGEN results are compared against BWA-MEM 0.7.12 + GATK 3.1 running on comparable servers.
ROC Plots of Variants at 50x Coverage
ROC of SNPs
A SNP (single nucleotide polymorphism) occurs when a single base differs between two genomes, in this case the subject genome and the reference genome. Using the NIST Platinum Genome high-confidence call set makes it possible to compare performance between different pipelines. In this ROC plot, a higher count of true positive SNPs and a lower count of false positive SNPs indicate better performance.
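As a rough illustration of how a plot of this kind can be built, the sketch below sorts variant calls by a quality score and accumulates true positive and false positive counts against a truth set. It is a minimal sketch, not DRAGEN's own evaluation method: the function name `roc_points`, the `(chrom, pos, ref, alt)` tuple layout, and the use of exact-match comparison against the truth set are all assumptions made for the example.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical variant key for this sketch: (chrom, pos, ref, alt).
Variant = Tuple[str, int, str, str]


def roc_points(
    calls: Iterable[Tuple[Variant, float]],
    truth: Set[Variant],
) -> List[Tuple[int, int]]:
    """Return cumulative (false_positives, true_positives) pairs.

    Calls are visited from highest to lowest quality, so each point
    corresponds to accepting every call at or above that quality.
    """
    points: List[Tuple[int, int]] = []
    tp = 0
    fp = 0
    for variant, _quality in sorted(calls, key=lambda c: c[1], reverse=True):
        if variant in truth:
            tp += 1  # call matches the high-confidence call set
        else:
            fp += 1  # call is absent from the high-confidence call set
        points.append((fp, tp))
    return points
```

Plotting the second element of each pair against the first gives a curve where a pipeline that climbs higher while staying further left is performing better.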
ROC of INDELs
An INDEL (insertion or deletion) occurs when bases are inserted or deleted in the subject genome with respect to a reference genome. Using the NIST Platinum Genome high-confidence call set makes it possible to compare performance between different pipelines. In this ROC plot, a higher count of true positive INDELs and a lower count of false positive INDELs indicate better performance.
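The SNP and INDEL definitions above can be expressed as a small check on the reference and alternate alleles of a call. This is a simplified sketch under stated assumptions: `classify_variant` is a hypothetical helper, alleles are assumed to be plain uppercase base strings as in a VCF REF/ALT pair, and symbolic alleles (such as those found in gVCF records) are rejected rather than handled.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify a REF/ALT allele pair as "SNP", "INDEL", "OTHER", or "NONE".

    Assumes plain base strings; symbolic alleles raise ValueError.
    """
    bases = set("ACGTN")
    if not ref or not alt or not set(ref + alt) <= bases:
        raise ValueError(f"unsupported allele pair: {ref!r} -> {alt!r}")
    if ref == alt:
        return "NONE"  # no difference from the reference
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"  # a single base differs
    if len(ref) != len(alt):
        return "INDEL"  # bases were inserted or deleted
    return "OTHER"  # equal-length multi-base substitution


print(classify_variant("A", "G"))    # single-base change
print(classify_variant("A", "ATT"))  # insertion relative to the reference
print(classify_variant("ACG", "A"))  # deletion relative to the reference
```

Separating calls this way is what allows SNP and INDEL performance to be plotted independently.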