Base Quality Score Recalibration (BQSR)
BQSR stands for Base Quality Score Recalibration. In a nutshell, it is a data pre-processing step that detects systematic errors made by the sequencer when it estimates the quality score of each base...
How to add sample names in VCF?
I am using GATK best practices for germline SNPs and Indels 4.1.2.0. After mapping and recalibration, I run HaplotypeCaller in GVCF mode. I am combining all VCF files (output from HaplotypeCaller)...
Is my pipeline correct to perform BQSR bootstrap with GATK4?
Hello gatk team, I'm confused because I have no idea about the exact pipeline to use to perform bqsr bootstrap in order to have my final_recalibrated.bam and do my variant calling. I was familiar with...
Access to the TCGA PoN
Hi there, I am authorized to access controlled data from TCGA, but I don't have access to the following file: controlled_access_token_pon_from_tcga8000.final_summed_tokens.hist.bin I have been told...
VariantRecalibrator resource known training and truth confusion
Running VariantRecalibrator on mouse data raw vcf file with the following command: gatk --java-options "-Xmx4g" VariantRecalibrator -R Mus_musculus.GRCm38.dna.primary_assembly_ordered.fa -V...
CNVPipeline stage 7 error (unable to read the list of metadata directories)
I ran into this issue with reading the metadata location when running CNVPipeline. I have 4 metadata directories, so I put their locations into a list called metadata.list. The error message below can be...
Extracting MQ and QUAL values for invariant sites in VCF files
I'm having problems getting mapping quality (MQ) values and PHRED called site quality scores (QUAL) for invariant sites in the VCF files generated by GATK, even when I specify that all sites should be...
What variants does Mutect2 --germline-resource filter out?
Hi everybody, I am new to analyzing WES and am trying the GATK4 workflow for detecting somatic variants. I run Mutect2 in tumor-only mode with these commands: gatk Mutect2 -R reference.ucsc.hg19.fasta -L...
Apparent difference between active region algorithm in mutect.pdf and...
Hello, First off, thank you so much for an excellent toolkit and brilliant forum. Both have and continue to help me out so much in my work. I am very grateful. My question relates to an apparent...
(How to) Call common and rare germline copy number variants
Document is in BETA. It may be incomplete and/or inaccurate. Post suggestions and read about updates in the Comments section. The tutorial outlines steps in detecting germline copy number variants...
How can I debug and develop the algorithm in GATK, such as HaplotypeCaller,...
Are there any papers or documents? And where can I download the test data, such as NA12878.HiSeq.b37.chr20.10_11mb.bam?
Query about wgs_calling_regions.hg38.interval_list from GRCh38 gatk bundle
Hi GATK team, We are using wgs_calling_regions.hg38.interval_list from gatk bundle to call variants. Could you please confirm the details about the removed/masked regions from the reference....
GenomeSTRip no genotype vcf
Dear all, I am calling SVs for WGS using the GenomeSTRiP tool. The calling finished successfully for some chromosomes, but only the discovery VCFs were generated for the other chromosomes. I run scripts...
Reference Genome Components
This document defines several components of a reference genome. We use the human GRCh38/hg38 assembly to illustrate. GRCh38/hg38 is the assembly of the human genome released December of 2013, that...
ECNT Value in Mutect2
Hello, I am using Mutect2 and FilterMutectCalls to call variants in mtDNA. According to the vcf file, the value recorded for ECNT is "Number of events in this haplotype". I am assuming that this is the...
CombineGVCFs performance
I've got 300 gvcfs as a result of a Queue pipeline that I want to combine. When I run CombineGVCFs (GATK v3.1-1), however, this seems fairly slow: INFO 15:24:22,100 ProgressMeter - Location...
Germline short variant discovery (SNPs + Indels)
Important: This document is currently being updated. Purpose: Identify germline short variants (SNPs and Indels) in one or more individuals to produce a joint callset in VCF format. Reference...
Filter by TLOD only in Mutect2
Hello I am using Mutect2 in GATK v4.1.4.0 to look for somatic variants in several tumor samples with matched germline. Because of the nature of the samples, I know I can trust variants with relatively...
Mutect2 stops running midway
Hi! I am analyzing WES data in a few samples. It had worked just fine until some days ago. However, for the remaining samples, it is not working anymore. I am running Mutect2 in all of them with the...
GenotypeGVCF stuck(?) after ProgressMeter - Starting traversal
I am running GenotypeGVCF on ~1700 samples. I use the intervals from...