ABSTRACT

Numerous tools are available for variant calling. This chapter concentrates on the Genome Analysis Toolkit (GATK) and covers its installation, basic usage, and the many options it offers for variant calling. The initial variant call set produced by GATK is designed to maximize sensitivity at the cost of some false-positive calls. Accordingly, the results obtained from GATK, or from any other variant caller, need to be filtered according to their quality. The goal of both hard filtering and variant quality score recalibration (VQSR) is to reduce the number of false-positive calls without greatly reducing sensitivity. VQSR estimates the relationship between single nucleotide polymorphism (SNP) call annotations and the probability that a called variant is a true positive. It then calculates a variant quality score log-odds (VQSLOD) value for each variant and adds this value to the INFO field of the variant call format (VCF) file.
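As a minimal sketch of how the VQSLOD annotation might be used downstream, the Python snippet below reads the INFO column of tab-separated VCF records and keeps those whose VQSLOD meets a cutoff. The helper names (parse_info, passes_vqslod), the example records, and the cutoff of 0.0 are illustrative assumptions, not GATK functions or recommended settings.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'DP=35;VQSLOD=4.21;DB' into a dict."""
    info = {}
    for entry in info_field.split(";"):
        if not entry or entry == ".":
            continue
        key, sep, value = entry.partition("=")
        # Flag-style keys (no '=') are stored as True.
        info[key] = value if sep else True
    return info


def passes_vqslod(vcf_line, min_vqslod):
    """Return True if the record carries a VQSLOD value >= min_vqslod."""
    fields = vcf_line.rstrip("\n").split("\t")
    info = parse_info(fields[7])  # INFO is the eighth VCF column
    if "VQSLOD" not in info:
        return False  # assumption: unannotated records are dropped
    return float(info["VQSLOD"]) >= min_vqslod


# Hypothetical records: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO
records = [
    "chr1\t10177\t.\tA\tC\t812.77\t.\tDP=35;VQSLOD=4.21",
    "chr1\t10352\t.\tT\tA\t45.10\t.\tDP=12;VQSLOD=-2.87",
]

MIN_VQSLOD = 0.0  # illustrative cutoff only
kept = [r for r in records if passes_vqslod(r, MIN_VQSLOD)]
for record in kept:
    print(record.split("\t")[1])  # position of each retained variant
```

A higher VQSLOD indicates a variant that looks more like a true positive under the recalibration model, so raising the cutoff trades sensitivity for fewer false positives.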