Mutect2 Sample name
Hi, I would like to use Mutect2 to call variants from WGS data. I understand that GATK only accepts BAM files that have read group information. Picard's AddOrReplaceReadGroups can add read group info to...
Read groups
There is no formal definition of what a read group is, but in practice, this term refers to a set of reads that were generated from a single run of a sequencing instrument. In the simple case where a...
MuTect2 is calling different variants when changing -minPruning value
Context: working with targeted sequencing data (amplicon gene panel), depth is high, so adjusting the -minPruning value seems relevant. -minPruning argument: Paths with fewer supporting kmers...
A SNP and deletion without DP in g.vcf file
I have noticed that sometimes g.vcf files will have calls where the DP field is completely missing from the FORMAT string, even though a variant is called with good GQ. When this happens, the INFO...
Which known sites files to select for BQSR from hg38 resource bundle?
Dear GATK team, I am following the GATK best practice guidelines for WGS. I have aligned to hg38 and I would like to know which files I should select from the hg38 resources for the BQSR step. My...
(How to) Mark duplicates with MarkDuplicates or MarkDuplicatesWithMateCigar
This tutorial updates Tutorial#2799. Here we discuss two tools, MarkDuplicates and MarkDuplicatesWithMateCigar, that flag duplicates. We provide example data and example commands for you to follow...
Building GATK 3.7 and a stack overflow building Queue Extensions
Occasionally when my CI Docker image build runs mvn verify, it gets a stack-overflow error when building GATK Queue Extensions Distribution. This does not happen 100% of the time and is hard to reproduce... sorry...
Which DP value should I use?
I used the HaplotypeCaller module to detect mutations. I noticed that the DP values were different in the INFO column and in the FORMAT column, like the picture: Why did this happen? Which DP value should I...
Exome best practice
Hi, I want to use MuTect 1.1.7 to detect SNVs in my exome sequence (the sequencing depth was greater than 200X). Can I directly use the BAM file from Picard's MarkDuplicates? The BaseRecalibrator...
Multisample vs single sample (paired sample MUT & WT separate BAM files)
Hello, I'm currently working with zebrafish mutants, and comparing phenotypically wild-type and mutant siblings for mutations. I have 3 different mutants, like 3 pairs of different mutant and wild type...
Why is HaplotypeCaller missing these two variants when they are homozygous?
I have some known variants called by a different caller, further confirmed by Sanger sequencing. I've noticed HaplotypeCaller (HC) will call variants at this location when they are in a heterozygous...
Queue custom job schedulers
Implementing a Queue JobRunner: The following Scala methods need to be implemented for a new JobRunner. See the implementations of GridEngine and LSF for concrete full examples. 1. class...
Too much memory and too many file handles required by GenotypeGVCFs
Hi. It seems that GenotypeGVCFs requires too much memory and too many file handles. Command line: java -XX:-UseCompressedOops -Xms1440g -XX:MinHeapFreeRatio=25 -XX:MaxHeapFreeRatio=50 -jar...
How to extract annotation for a single sample after genotyping a combined...
I have been running the GATK best practices for annotating numerous exome datasets. I have created gVCFs for each exome, combined into a single datafile, genotyped the datafile, and now have a combined...
Reference genome with millions of contigs: How to get HC or UG started
Good day everyone, I have started using GATK (v. 3.5) with a reference genome that contains 2.6 x 10^6 contigs. I managed to get the BAM processing steps (realignment) to work with this genome and its...
(howto) Install and run Oncotator for the first time
1. Download the Oncotator package, the default datasources package, and (recommended) transcript override list from the Downloads page. Please note: Broadies who wish to run the installed Oncotator on...
I am attempting to use the GATK indel calling but am facing an...
INFO 23:14:07,294 GenomeAnalysisEngine - Strictness is SILENT INFO 23:14:07,392 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 250 INFO 23:14:07,400...
Maximizing sensitivity of HaplotypeCaller for pooled sample
I'm attempting to call variants (primarily SNPs) with HaplotypeCaller from a pooled sample containing 95% wild-type and 5% polymorphic strain data (C. elegans, SE-50bp, 20-fold genomes). Per...
The GATK Best Practices for variant calling on RNAseq, in full detail
We’re excited to introduce our Best Practices recommendations for calling variants on RNAseq data. These recommendations are based on our classic DNA-focused Best Practices, with some key differences...
java.lang.NullPointerException for Picard SamToFastq OUTPUT_PER_RG=T on bam...
Hello, When running Picard's SamToFastq with OUTPUT_PER_RG=T on a bam file without read group information, you get the following error: Exception in thread "main" java.lang.NullPointerException at...