Selecting variants of interest from a callset
This document describes why you might want to extract a subset of variants from a callset and how you would achieve this. Often, a VCF containing many samples and/or variants will need to be subset in...
Mutect2 does not call this variant
Hello GATK team, I'm trying to do some benchmarking with Mutect2 and a synthetic dataset. Even if the results are very good in terms of precision (~99%), recall or sensitivity can be better (~80%). In...
(How to) Map and clean up short read sequence data efficiently
If you are interested in emulating the methods used by the Broad Genomics Platform to pre-process your short read sequencing data, you have landed on the right page. The parsimonious operating...
minReadsPerAlignmentStart, ntc
Hi all, I have a couple of questions. I am working with GBS data and I am wondering whether lowering the value of minReadsPerAlignmentStart (from the default of 10 to something lower) will help me to call more...
Unable to parse header with error:...
Hi there, Today when I tried to perform Joint Genotyping by using GenomeAnalysisTK-3.4-46 with the parameters below, I got an error message like this: **ERROR MESSAGE: Unable to parse header with...
The GATK Best Practices for variant calling on RNAseq, in full detail
We’re excited to introduce our Best Practices recommendations for calling variants on RNAseq data. These recommendations are based on our classic DNA-focused Best Practices, with some key differences...
Extremely high depth of coverage
Dear all, I've run the DepthOfCoverage tool on 263 WGS samples and have found some unusual total and averages for some regions. Does it mean any sort of error on the alignment or I can just filtered...
GenotypeGVCFs error
Hi I am getting the following error. I ran the exact same samples/pipeline a couple of weeks ago using 3.6 and it worked fine, now with 3.7 I am getting an error: INFO 09:49:49,742 HelpFormatter -...
Files needed for variant recalibration for Arabidopsis thaliana
Dear all, I am analyzing different Arabidopsis thaliana ecotypes for the first time using GATK. I am following the best practices guidelines without any problem but when I reached the step for variant...
What are the standard resources for non-human genomes?
We're trying to put together some recommendations for folks who want to use GATK tools on non-human genomes. But we really don't have much experience with non-human genomes, so we're hoping that those...
Allele frequency and depth VCF produced by MuTect2
Hi all, From my understanding of the VCF output, the AF[format] field (Allele fraction of the event in the tumor) equals AD[format] / DP[format]. With AD being the depth of coverage of each allele...
I have all Allele frequency 0.5 or 1.0
I finished my RNA-seq variant calling using the GATK pipeline described in your workflow. But I realized that all the Allele Frequency (AF) values in the VCF files are 0.5 or 1.0. Is this normal?
VCF header with AD format 'Number=R' causes error in VQSR VariantRecalibrator...
I have a VCF header with the following number annotation for the AD field: ##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">...
I get very different MQ values when using GVCF vs BP_RESOLUTION
Hello! I had a question about the difference between using HaplotypeCaller's --emitRefConfidence GVCF vs BP_RESOLUTION. Maybe the answer is obvious or in the forum somewhere already but I couldn't spot...
Using CRAM files in Picard SamToFastq with Queue
Hi, I'm trying to convert CRAM files to FASTQ format with SamToFastq as part of a pipeline I have written in a Queue script. Is there any parameter for passing a reference file in the SamToFastq Queue...
Project level VCF
I am combining gVCFs that are processed exactly the same, I observed that most variants have this tag: "GT:AD:DP:GQ:PL". However for some variants, within the same variant, they can have this tag...
Mismatch between number of variants in the input and output of the genotypeGVCF
I am merging a few hundred samples for a project-level VCF. The following summarizes my steps: a) performed a combineGVCF on a set of gVCF (pVCF1) and then a combineGVCF on another set of gVCF (pVCF2)...
Regarding GenotypeGVCFs
Hi, This question may not be relevant, but I couldn't find an answer to it so I thought of posting it. I have been running Haplotyper and CombineGVCFs followed by GenotypeGVCFs but there is drastic...
Help us help you: a note to those asking questions
Here are some rules of thumb for posting questions: Post a new question instead of continuing an ongoing discussion thread. The exception to this is if your question relates directly to the discussion...
"Flag" type in fields of GenomicsDB vid_mapping_file
Hi folks, We're having a preliminary investigation of GenomicsDB and hoping to try importing some VCFs. In the Fields Information section of the vid_mapping_file, we need to define some INFO fields...