The World's First Bio-IT Processor

GATK Best Practices Workflow on DRAGEN

Accelerated and Cost-Effective

The GATK Best Practices Workflow for Germline SNPs and Indels in Whole Genomes and Exomes is available on the DRAGEN Platform for customers who hold a valid GATK license from the Broad Institute. Harnessing the processing power of the DRAGEN Bio-IT Platform, the GATK Best Practices Workflow on DRAGEN reduces the time required to analyze a 30x-coverage whole genome from FASTQ to VCF to about 22 minutes. This time saving translates directly into cost savings, both onsite and in the cloud. DRAGEN supports versions 3.1 and 3.6 of the GATK-HC VariantCaller.

Complete End-to-End Solution

The GATK Best Practices Workflow on DRAGEN is a complete end-to-end solution that includes all required analysis phases as specified by the Broad Institute. The workflow is fully configurable and ready for use out of the box. It ingests BCL or FASTQ files and produces BAM, VCF and/or gVCF files as well as depth-of-coverage metrics. GATK Best Practices on DRAGEN can be used onsite, in the cloud or as a seamless hybrid cloud that includes a fully functional and easy-to-use Web Portal and Workflow Management System.

GATK Best Practices on DRAGEN

GATK Best Practices describes the key principles of the processing and analysis steps required to go from raw reads coming off a sequencing instrument through to an appropriately filtered variant callset that can be used in downstream analyses. GATK Best Practices breaks the workflow into two required analysis phases and one optional phase that describes ways to handle or fine-tune the output VCF file.

Step 1. Pre-Processing

The first phase of the GATK Best Practices Workflow involves recommendations to properly prepare raw data files for analysis.

Map to Reference: The first step is mapping the sequenced reads to the reference genome to produce a coordinate-sorted file in SAM/BAM format. The workflow accepts BCL, FASTQ and BAM/CRAM input and can stream BCL data directly from sequencer storage, a solution unique to the DRAGEN Platform, enabling the customer to go directly from raw sequencing data to an output VCF. DRAGEN can also convert BCL to FASTQ or BAM/CRAM and then proceed with the GATK Best Practices Workflow.

Mark Duplicates: Once data has been mapped to the reference genome, the workflow marks duplicate reads, that is, reads from DNA fragments that may have been sequenced multiple times because of PCR amplification or optical duplicate artifacts. Marking duplicates mitigates biases introduced by the data generation steps. The duplicates are flagged so that downstream steps can ignore them, but the reads themselves are not removed.
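The flag-but-keep behavior described above can be illustrated with a minimal Python sketch. It is not DRAGEN's or GATK's implementation: the Read class, its fields and the grouping key (chromosome, start position, strand) are simplifying assumptions made for illustration, and real tools also consider mate positions, clipping and optical distance. The only detail taken from the SAM format is the duplicate flag bit, 0x400.

```python
from dataclasses import dataclass

DUPLICATE_FLAG = 0x400  # SAM flag bit: read is a PCR or optical duplicate


@dataclass
class Read:
    # Hypothetical, simplified read record for illustration only.
    name: str
    chrom: str
    pos: int                # alignment start (simplified)
    reverse: bool           # True if aligned to the reverse strand
    base_quality_sum: int   # used to pick the representative read
    flag: int = 0


def mark_duplicates(reads):
    """Flag every read except the best-quality one in each group.

    Reads are grouped by (chrom, pos, strand). No read is removed;
    duplicates only get the 0x400 bit set in their flag.
    """
    best = {}
    for read in reads:
        key = (read.chrom, read.pos, read.reverse)
        current = best.get(key)
        if current is None or read.base_quality_sum > current.base_quality_sum:
            best[key] = read
    for read in reads:
        key = (read.chrom, read.pos, read.reverse)
        if best[key] is not read:
            read.flag |= DUPLICATE_FLAG
    return reads


reads = [
    Read("r1", "chr1", 100, False, 900),
    Read("r2", "chr1", 100, False, 950),
    Read("r3", "chr1", 100, True, 800),
    Read("r4", "chr2", 500, False, 700),
]
marked = mark_duplicates(reads)
duplicates = [r.name for r in marked if r.flag & DUPLICATE_FLAG]
print(duplicates)
```

In this toy input, r1 and r2 share a position and strand, so the lower-quality read is flagged while all four reads remain in the output.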

Step 2. Variant Discovery

GATK Best Practices divides variant discovery into two separate steps: variant calling and variant filtering. The first step is designed to maximize sensitivity while the filtering step aims to deliver a level of specificity that can be customized for each project.

Call Variants: The workflow calls SNPs and Indels simultaneously via local de novo assembly of haplotypes in an active region. The workflow uses hardware-accelerated implementations of the Smith-Waterman and PairHMM algorithms for variant calling.
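For readers unfamiliar with Smith-Waterman, the following Python sketch shows the core of the algorithm: a dynamic-programming matrix in which every cell is floored at zero, so the best-scoring local alignment can start and end anywhere. It is a plain software illustration, not the hardware-accelerated DRAGEN implementation. The scoring values and the linear gap penalty are assumptions chosen for brevity; production variant callers typically use affine gap penalties.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, (i, j)) for the best local alignment of a and b.

    Uses a linear gap penalty (an assumption for this sketch). (i, j) is
    the 1-based end position of the alignment in a and b respectively.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of any local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best_score, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap      # gap in b
            left = H[i][j - 1] + gap    # gap in a
            H[i][j] = max(0, diag, up, left)  # zero floor makes the alignment local
            if H[i][j] > best_score:
                best_score, best_pos = H[i][j], (i, j)
    return best_score, best_pos


score, end = smith_waterman("TTACGTTT", "ACG")
print(score, end)
```

Filling the matrix takes time proportional to the product of the two sequence lengths, which is one reason the computation is a natural candidate for hardware acceleration.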

Joint Genotype: The workflow produces an output gVCF file for each individual sample, providing a comprehensive record of each position in the genome along with genotype likelihoods. The gVCF files are fed into the joint genotyping tool to produce a multisample VCF for subsequent joint or family analysis.

Filter Variants: GATK Best Practices recommends filtering the raw variant callset using variant quality score recalibration (VQSR), which uses machine learning to identify annotation profiles of variants that are likely to be real and assigns a VQSLOD score to each variant.
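Once each variant carries a VQSLOD score, filtering reduces to comparing that score against a cutoff. The Python sketch below shows only that final step, under stated assumptions: the variant records are plain dictionaries, and both the cutoff value and the "LowVQSLOD" filter label are hypothetical. In practice the cutoff is usually derived from a target sensitivity on a truth set rather than fixed by hand, and the model training that produces the scores is not shown.

```python
def apply_vqslod_filter(variants, min_vqslod):
    """Annotate each variant with a filter status; never drop a record.

    variants: list of dicts with a "vqslod" key (hypothetical layout).
    min_vqslod: hypothetical cutoff; higher scores mean the variant looks
    more like the known-good training variants.
    """
    annotated = []
    for variant in variants:
        status = "PASS" if variant["vqslod"] >= min_vqslod else "LowVQSLOD"
        # Copy the record so the caller's input is left unchanged.
        annotated.append({**variant, "filter": status})
    return annotated


raw_calls = [
    {"id": "var1", "vqslod": 12.4},
    {"id": "var2", "vqslod": -3.1},
    {"id": "var3", "vqslod": 0.0},
]
filtered = apply_vqslod_filter(raw_calls, min_vqslod=0.0)
passing = [v["id"] for v in filtered if v["filter"] == "PASS"]
print(passing)
```

Raising the cutoff trades sensitivity for specificity, which is how the filtering step can be tuned for each project.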

Comprehensive Platform for Genomic Data Analysis and Storage

The GATK Best Practices Workflow on DRAGEN plugs into the larger DRAGEN eco-system, a comprehensive platform for analyzing, reanalyzing, storing and archiving genomic data at the lowest cost and highest fidelity. The DRAGEN Platform is based on the highly reconfigurable DRAGEN Bio-IT Processor which uses a field-programmable-gate-array (FPGA) to provide hardware-accelerated implementations of genome pipeline algorithms, such as BCL conversion, compression, mapping, alignment, sorting, duplicate marking and haplotype variant calling. The highly flexible DRAGEN Platform allows users to effortlessly set-up and manage highly complex workflows and can be utilized onsite, in the cloud or as a seamless hybrid cloud that includes a fully functional and easy-to-use Web Portal and Workflow Management System. The Hybrid Cloud configuration provides users with the flexibility to scale up to the cloud during times of high capacity and return to onsite analysis when demand is reduced.

Workflow Management System

The Workflow Management System (WMS) is comprehensive in its functional capabilities: users can automate their workflows across any number of sequencing instruments of any type, automate their pipelines across any number of samples, and spread these samples across any number of lanes and flow cells. The WMS can integrate with any Laboratory Information Management System (LIMS).
Many of the largest genome sequencing centers in the world use this technology to run thousands of genomes in a year, without the need for widespread resources to facilitate and manage the work. The WMS is scalable in terms of file size, number of files, and/or number of DRAGENs. It enables production-scale, clinical-grade genomic analysis.

Drag and Drop Web Portal

The DRAGEN Web Portal enables customers to use a single interface to manage both the onsite and cloud platforms. Customers can manage the complete DRAGEN Platform through the Web Portal, including selecting or customizing genomic pipelines, syncing hybrid cloud analysis with the onsite DRAGEN server and FPGA platform, automating and scheduling concurrent single-sample, multi-sample and/or population calling pipelines, and compressing data files. The DRAGEN Web Portal features an information-rich dashboard that enables customers to manage everything at a glance, including the complex automation functionality of the Workflow Management System and all DRAGEN ultra-rapid analysis algorithms.

Accelerate your GATK Best Practices Workflow on the DRAGEN Platform