Reference Genome Components
Document is in BETA. It may be incomplete and/or inaccurate. Post suggestions to the Comments section. This document defines several components of a reference genome. We use the human GRCh38/hg38...
GenotypeGVCFs on whole genomes taking too long
Dear GATK team members and forum users, I am analysing 200 germline whole genomes following the GATK best practises. I am experiencing issues with GenotypeGVCFs, whose runtime is increasing...
Questions about DepthOfCoverage
This discussion was created from comments split from: Using DepthOfCoverage to find out how much sequence data you have.
What are the variant types MIXED, MNP, SYMBOLIC, NO_VARIATION?
This is a straightforward question. I couldn't find any documentation explaining these various types of variants, but only how to select or to not select them.
How can I invoke read filters and their arguments?
Most GATK tools apply several read filters by default. You can look up exactly what are the defaults for each tool in their respective Technical Documentation pages. But sometimes you want to specify...
VQSR --maxGaussians parameter
Hi, I am performing VQSR (GATK 3.7) using the SNP model on individual chromosomes on hundreds of WGS data. However, for some chromosomes, it ran without a problem using the --maxGaussians default,...
Error ReorderSam
I have downloaded HG00152.7.M_120219_3.bam from ArrayExpress. I have validated it using: java -jar /scratch/ev250/bin/picard/picard-2.8.2.jar ValidateSamFile \ I=$filestart.rehead.bam \ MODE=SUMMARY No...
DepthOfCoverage yet again
Hi, I can't find an answer to my questions about one of the DepthOfCoverage output files, so I am asking you guys directly. Here're the top lines of one of my interval_summary files Target...
Questions about input files
This discussion was created from comments split from: What input files does the GATK accept / require?.
Using GATK as part of a pipeline (licensing?)
I have a pipeline for analysing sequencing data, and one of the steps is a variant calling step using GATK. I'm thinking about somehow making my pipeline easy to run and then make it publicly available...
CatVariants require all -V's to have the same INFO keys?
I called somatic SNVs with MuTect and somatic INDELs with MuTect2, and then attempted to use CatVariants to combine them into one file for downstream processing. However CatVariants will always break...
Filtration criteria
I want to filter a VCF file produced by GATK. My criteria are QUAL < 30, FS > 60, MQ < 30, DP < 10 and GQ < 30. Must the filterExpression in my GATK command line be as follows? GQ is in...
Error running CollectAlignmentSummaryMetrics on a bam generated from .maf file
Hello, I recently ran an alignment with the LAST tool (http://last.cbrc.jp/ - fasta aligner for long reads alignment), which produces a .maf file that I then converted to SAM (with...
How to assign memory and CPU cores for genotypeGVCF?
We followed the workflow in http://gatkforums.broadinstitute.org/gatk/discussion/3893/calling-variants-on-cohorts-of-samples-using-the-haplotypecaller-in-gvcf-mode to run GATK HaplotypeCaller for a...
Details on how Picard-Tools define duplicate reads
Dear colleagues, I am trying to implement a script to group duplicate reads into families and would like to understand which criteria Picard's MarkDuplicates uses. I've read that it compares the 5'...
LeftAlignAndTrimVariants --splitMultiallelics changes GT from known to unknown
I have a VCF file with this line (i.e. GT=0/1=G/T): 20 10120854 . G T,A 32175.56 . AC=399,18;AF=0.111,5.006e-03;AN=3596;BaseQRankSum=1...
Is there any tool that could select variants from .BAM and make a .fasta file?
Hi all! I'm new to this kind of analysis. I need to select several specific regions from a .BAM file and make a .fasta file from them to build a database in the future. Is there any tool that could help me? Thanks...
How to define the name of the genotype column for MuTect2 VCFs?
Hi, as far as I can see, HaplotypeCaller and MuTect1 read the name of the sample from the input BAM, specifically from the SM field of the RG line. How can it be assigned in the case of MuTect2? All I am getting is...
Haplotype caller (GATK 3.7) warning message for InbreedingCoeff
Hi, I am running HC with both .g.vcf and bamout as parameters. The command executes successfully, but with warning message: Annotation will not be calculated. InbreedingCoeff requires at least 10...
AD different with DP
Dear all, I use this command and version of mutect: GATKCommandLine=<ID=MuTect,Version=3.1-0-g72492bb,Date="Fri Feb 03 13 :47:15 CET 2017",Epoch=1486126035093,CommandLineOptions="analysis_type=...