Variant Quality Score Recalibration (VQSR)
This document describes what Variant Quality Score Recalibration (VQSR) is designed to do, and outlines how it works under the hood. The first section is a high-level overview aimed at non-specialists....
Why are some indels being called when in the read mapping there are none?
Greetings, I have found some INDELS called that are not present in the read mapping and don't understand why it might happen or how to solve it. I have tried the calling with and without indel...
MQRankSum and ReadPosRankSum for SNPs in a haploid organism?
Hi, Apologies if this has been addressed previously. I'm working with genomic resequencing data for a haploid organism, and I have created a VCF file using GenotypeGVCFs from 33 gVCFs created using...
Indel Realigner - no Knowns
Hi all, I'm working through the best practices and am a bit of a novice. I'm working on a non-model organism with no known indel list. Is it still useful to use the Indel realigner?
How to extract homozygous SNPs from a variant VCF file
After extracting SNPs from the variant VCF file, how can I separate the SNPs into two categories: homozygous SNPs and biallelic heterozygous SNPs? Please share the GATK command.
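The split asked about here depends only on the GT field of each record, so it can be sketched without any GATK-specific flags. The following is a minimal Python sketch, not a GATK command: it assumes an uncompressed, tab-separated VCF with a GT key in the FORMAT column, and the function names `classify_genotype` and `split_biallelic_snps` are hypothetical helpers introduced for illustration.

```python
import re


def classify_genotype(gt):
    """Classify a VCF GT string such as '0/1', '1|1' or './.'."""
    alleles = re.split(r"[/|]", gt)
    if any(a == "." for a in alleles):
        return "missing"
    if len(set(alleles)) > 1:
        return "het"
    return "hom_ref" if alleles[0] == "0" else "hom_alt"


def split_biallelic_snps(vcf_lines, sample_index=0):
    """Return (hom_alt_records, het_records) for biallelic SNPs.

    Assumption: each data line has FORMAT in column 9 and samples after it.
    """
    hom, het = [], []
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        ref, alt = fields[3], fields[4]
        # Keep only biallelic SNPs: one REF base and one single-base ALT.
        if len(ref) != 1 or len(alt) != 1 or alt in (".", "*"):
            continue
        keys = fields[8].split(":")
        if "GT" not in keys:
            continue
        gt = fields[9 + sample_index].split(":")[keys.index("GT")]
        kind = classify_genotype(gt)
        if kind == "hom_alt":
            hom.append(line)
        elif kind == "het":
            het.append(line)
    return hom, het
```

Passing the lines of a single-sample VCF to `split_biallelic_snps` gives two lists that can be written out as separate files; indels, multi-allelic sites, and missing or homozygous-reference genotypes are skipped.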
Can I filter homozygous and heterozygous SNPs based on number of reads support...
If I separate homozygous and heterozygous SNPs into two separate files, can I then filter them based on the number of reads supporting these SNPs?
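One way to picture such a filter is to read the per-sample AD (allelic depth) values and keep records whose ALT-supporting read count meets a threshold. This is a rough Python sketch under stated assumptions: the VCF is uncompressed and tab-separated, AD is present in the FORMAT column as comma-separated REF and ALT counts, and the helper names and the default threshold of 5 reads are hypothetical choices for illustration, not recommended values.

```python
def alt_supporting_reads(line, sample_index=0):
    """Return the number of reads supporting ALT alleles, or None if unknown."""
    fields = line.rstrip("\n").split("\t")
    keys = fields[8].split(":")
    values = fields[9 + sample_index].split(":")
    if "AD" not in keys:
        return None
    idx = keys.index("AD")
    if idx >= len(values):
        return None  # trailing sample fields may be omitted
    depths = values[idx].split(",")
    if len(depths) < 2 or "." in depths:
        return None
    # The first value is the REF depth; the rest belong to ALT alleles.
    return sum(int(d) for d in depths[1:])


def filter_by_alt_support(vcf_lines, min_alt_reads=5, sample_index=0):
    """Keep header lines and records with at least min_alt_reads ALT reads."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        support = alt_supporting_reads(line, sample_index)
        if support is not None and support >= min_alt_reads:
            kept.append(line)
    return kept
```

The same function can be applied to the homozygous and heterozygous files separately, with a different `min_alt_reads` for each if that suits the data.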
GATK HaplotypeCaller produces no output
Hi, A colleague was experiencing a very long run-time for a GATK HaplotypeCaller run and asked me to look at it. I noticed that although it had been running for about 5 days, it hadn't even created a...
ClipReads
Hi, I am getting an error when trying to soft clip bases with ClipReads --clipSequence that the sequences I am trying to clip are not at the end of a read. I can see this is mentioned in the...
(How to) Generate an unmapped BAM from FASTQ or aligned BAM
Here we outline how to generate an unmapped BAM (uBAM) from either a FASTQ or aligned BAM file. We use Picard's FastqToSam to convert a FASTQ (Option A) or Picard's RevertSam to convert an aligned BAM...
GATK4 at Bio-IT: Luncheon with Intel and Q&A sessions
This is becoming a bit of a yearly tradition; next week we're heading over to Bio-IT World Expo in Boston (so a short hop across the Charles River) to announce the majorly rebooted version of GATK...
Can I use ContEst for a tumor-only sample?
Hi, I want to use ContEst to estimate contamination in my exome-seq tumor sample. We do not have the normal control. Can I use these two VCF files in the command line? Thanks. -B:pop,vcf...
Obtaining the data used in automated tests in a more automated manner
GATK has integration tests that depend on data, I presume all part of the GATK bundle. Is there anything in the build that downloads the expected data in a more automated manner than me manually...
How MuTect identifies candidate somatic mutations
Please note that this article refers to the original standalone version of MuTect. A new version is now available within GATK (starting at GATK 3.5) under the name MuTect2. This new version is able to...
BAM problem prevents a fix_misencoded_quality_scores step
I ran the following code: java -Xmx100g -jar /work/reecygroup/GATK/GenomeAnalysisTK.jar \ -T BaseRecalibrator --unsafe -nct 16 \ -R /work/reecygroup/index/bos_taurus/bos_taurus_all_dna.fa \ -I...
ReadBackedPhasing (HaplotypeCaller) outputs many more 0|1 than 1|0, why?
Hi! With the aim of phasing haplotypes from the SNPs of a single individual, I have used HaplotypeCaller, which performs ReadBackedPhasing automatically (accuracy of SNP calling is beyond the question)....
Version highlights for GATK version 3.3
Another season, another GATK release. Personally, Fall is my favorite season, and while I don’t want to play favorites with versions (though unlike with children, you’re allowed to say that the most...
Allele-specific annotation and filtering
Introduction and FAQs The current recalibration paradigm evaluates each position, and passes or filters all alleles at that position, regardless of how many alternate alleles occur there. This has...
HC step 1: Defining ActiveRegions by measuring data entropy
This document describes the procedure used by HaplotypeCaller to define ActiveRegions on which to operate as a prelude to variant calling. For more context information on how this fits into the overall...
Best practices for calling variants in RNA-seq data
Dear GATK team, Is there any link where I can find details about calling variants from RNA-seq data? Thanks
Missed variant calling for amplicon-based sequencing data using HaplotypeCaller
Hi all, I am using GATK v3.7 HaplotypeCaller to genotype 2000 dbSNP variants, including SNPs and INDELs, from amplicon-based sequencing data. --alleles is applied in HC, however, nearly 100 SNPs can...