Variant Quality Score Recalibration (VQSR)
This document describes what Variant Quality Score Recalibration (VQSR) is designed to do, and outlines how it works under the hood. The first section is a high-level overview aimed at non-specialists....
Convert bams to vcf files
Hi, I have bam files downloaded from the 1000 Genomes Project and I need to have fasta files as the reference files (for -R option) in order to turn my bams into vcf files. Can you please tell me where...
best practices for calling snps from RNAseq reads mapped to denovo transcriptome
Dear GATK community, I would like to use GATK to call SNPs from RNAseq reads (from 7 libraries) mapped to a de novo transcriptome assembly (no reference genome available). I am having trouble finding...
(How to) Generate an unmapped BAM from FASTQ or aligned BAM
Here we outline how to generate an unmapped BAM (uBAM) from either a FASTQ or aligned BAM file. We use Picard's FastqToSam to convert a FASTQ (Option A) or Picard's RevertSam to convert an aligned BAM...
SparkGenomeReadCounts - NumberFormatException
Looking at trying out the somatic CNV calling on some WGS tumour-normal pairs, trying to get the read counts with SparkGenomeReadCounts, and I get this error: [ameyner2@node2c15 read_counts]$...
GVCF vs Individual calling
Here are 2 scenarios involving 3 WGS samples with depth of coverage 5X-10X: A- Feed the 3 bam files into HaplotypeCaller and run in discovery mode; B- convert each bam file into a GVCF, and then...
Combining variants from different files into one
Solutions for combining variant callsets depending on purpose There are three main reasons why you might want to combine variants from different files into one, and the tool to use depends on what you...
[GATK 4 beta] read_position and clipping filters in FilterMutectCalls
Hello, I would like to understand the clipping and read_position filters better. Is the read_position filter useful because base quality gets worse toward the end of read in Illumina sequencing? And,...
When should I use -L to pass in a list of intervals?
The -L argument (short for --intervals) enables you to restrict your analysis to specific intervals instead of running over the whole genome. Using this argument can have important consequences for...
Code exception in GenotypeGVCFs
Hi all, I get an error while generating VCF files with GenotypeGVCFs; please help. Thanks a lot! My command: java -Xmx60g -jar GenomeAnalysisTK-nightly-2016-09-23-gfade77f/GenomeAnalysisTK.jar -T...
Sensible ways to split CombineGVCFs
Hi, At the moment I have a project which is going to require me to use HaplotypeCaller on about 13000 WES samples and I've been running a test batch of 800 through to test the methods. One hiccup...
Is GATK overestimating the heterozygous calls?
Hi, I have 24 genotypes distributed in 4 different populations. I used HaplotypeCaller with the option -ERC -GVCF and obtained the vcf file for each genotype. Then combined all the genotypes to a...
Annotation modules in HaplotypeCaller and GenotypeGVCFs
I am performing WGS using the GATK best practice guidelines for the '-ERC GVCF' cohort analysis workflow. If I ran HaplotypeCaller in default mode (i.e. without specifying any particular annotation...
Can I use an indel-realigned BAM file to extract SNPs?
Hi everybody, I'm following the pipeline for variant calling in RNAseq and I have some doubts. So far I've done: 1) Split'N'Trim and reassign mapping qualities (output: split.bam) 2) Indel Realignment: at...
VariantRecalibrator for bacterial genome annotation
Hello! I want to filter bad variants using VariantRecalibrator, but I am a bit lost as to which databases I can use as resources. My input file is a VCF from HaplotypeCaller. I would really...
Clock drift error
I'm trying to use GenotypeGVCFs on 350 small gVCF files (from bacterial genomes) and I'm getting this "clock drift" error: INFO 13:02:46,006 ProgressMeter - chr1:44001 0.0 3.0 h 15250.3 w 2.0% 6.3 d 6.2...
Missing version label for the downloaded GATK release...
Hello, I am currently developing a pipeline for genome assembly and annotation, in which GATK is one of many dependencies. Since the current version of GATK (3.7) still needs manual registration and...
Question: Mutect 2 and related vcf in GDC
There are 2 vcf files from the DNA sequencing available in GDC: one for the tumor and the other one for the normal sample. I'm a little confused about the presence of the normal vcf file. Is this file...
How do I create a metadata (.fam) file for the VariantsToBinaryPed function?
I am new to working with genomics data, and I haven't been able to find any helpful instruction on how to create a metadata file for the VariantsToBinaryPed function. It seems counter-intuitive that...
MuTect2 and VQSR: any way of calling VQSLOD for MuTect2?
Hello GATK Team! @Sheila @Geraldine_VdAuwera Since my last question (here), I am trying to build a workflow which can process all my samples with a snakemake workflow. From the previous question...