What is your opinion on the differences between these two MuTect2 outputs?
Hello, I am working on reproducing results from TCGA data on prostate cancer using MuTect2. I will state the main differences between their parameters and mine, and also some results: They used GRCh38...
How to use --setFilteredGtToNocall and --maxNOCALLfraction with SelectVariants
Hello, I have a VCF file on which I first use VariantFiltration (GATK v3.7) with various filter expressions, such as "QD < 2.0" and "MQ < 40.0", as well as various genotype filter expressions,...
Where to find CreateSequenceDictionary.jar?
I've downloaded the recent GATK 3.7-0 release, which does not include CreateSequenceDictionary.jar. Is the following command replaced by something new? java -jar CreateSequenceDictionary.jar R=...
A USER ERROR has occurred: couldn't write file because writing failed with...
The command I'm running: spark-submit $JAR gatk-package-4.alpha.2-1147-g1477e1c-SNAPSHOT-local.jar PrintReadsSpark \ -I gs://mybucket/newbam.sort.bam \ -O gs://mybucket/outputfinalspark.bam The error:...
Calling variants in RNAseq
Overview This document describes the details of the GATK Best Practices workflow for SNP and indel calling on RNAseq data. Please note that any command lines are only given as example of how the tools...
Using JEXL to apply hard filters or select variants based on annotation values
1. JEXL in a nutshell JEXL stands for Java EXpression Language. It's not a part of the GATK as such; it's a software library that can be used by Java-based programs like the GATK. It can be used for...
OverclippedReadFilter doesn't filter anything
Hello, I have 51bp reads and am trying to filter out those where about 20-30 bases have been soft-clipped. The command I have been using is: java -Xmx8g -jar $GATK_JAR -R $REFERENCE -T PrintReads -rf...
HaplotypeCaller and detection of large indels
Hi, I am wondering about the detection of large indels with the HaplotypeCaller. I have an example where, to my mind, there is quite clearly a large deletion (a couple of kb) in the sample, but it is not...
(howto) Evaluate a callset with CollectVariantCallingMetrics
Related Documents Evaluating the quality of a variant callset (howto) Evaluate a callset with VariantEval Context This document will walk you through use of Picard's CollectVariantCallingMetrics tool,...
ERROR MESSAGE: Fasta index file /home/debbie/GATKtest/5.C56_mark.bam.fai
Dear sir, When I run [debbie@server GATKtest]$ java -jar GenomeAnalysisTK.jar -T HaplotypeCaller -R mysample.bam -D ~/dbsnp_138.b37.vcf -o myfile.HapCall.snp.g.vcf I get the error below: INFO...
tabix
Dear friends, before using known variants like mills_vcf = "Mills_and_1000G_gold_standard.indels.hg38.vcf."; kgenomes_phase1_vcf = "1000G_phase1.snps.high_confidence.hg38.vcf"; kgenomes_omni_vcf =...
MuTect2 Beta-version Duration
Is there an estimation for how much longer MuTect2 will be in beta mode (and therefore not yet recommended for production)? Thanks!
why is local realignment removed from the best practice pipeline?
Dear madam or sir, I noticed that local realignment is removed from the updated GATK best practice pipeline, why? As for GATK v3.7, will it affect the results of HC calling without the local...
How should I select samples for a Panel of Normals for somatic analysis?
The Panel of Normals (PoN) plays two important roles in somatic variant analysis: Exclude germline variant sites that are found in the normals to avoid calling them as potential somatic variants in the...
Calling variants on cohorts of samples using the HaplotypeCaller in GVCF mode
This document describes the new approach to joint variant discovery that is available in GATK versions 3.0 and above. For a more detailed discussion of why it's better to perform joint discovery, see...
How does GATK4 Accept Sharded Data and SNP&Indels calling Pipeline
We found that many GATK4 commands accept an option to let them output "sharded" files. But we didn't find how those commands accept "sharded" data that was generated in the previous step. For example, gatk...
Statistical methods: Fisher’s Exact Test
Overview Fisher’s Exact Test is a statistical test that is used to analyze contingency tables, where contingency tables are matrices that contain the frequencies of the variables in play. According to...
Testing different capture reaction template
I did a test of capture sequencing: 96 libraries representing 56 individuals. For one individual we made 1, 2, 3 or 4 libraries. The libraries were pooled before capture reactions so that the same...
SelectVariants is not recognizing a sample by name
Hi, I'm having a strange error when using the SelectVariants tool; when I try to extract one sample from a multi sample vcf (which I'm sure includes the sample) I get the following error message:...
More alleles than chromosomes
My karyotype is 46,XY,22qs+/45,X,22qs+. The 22qs+ is an exceedingly rare but familial improperly-satellited autosome. The rest is just mosaicism. Some of my cells are XY (ordinarily male) and others...