Base Quality Score Recalibration (BQSR)
BQSR stands for Base Quality Score Recalibration. In a nutshell, it is a data pre-processing step that detects systematic errors made by the sequencer when it estimates the quality score of each base...
Clean version of dbSNP in the GATK resource bundle
Hi I understand that version 129 of dbSNP is considered clean and does not share data from other databases such as 1000G projects. What steps of the variant calling in WGS/WES analysis can be affected...
BQSR
To perform base recalibration, the documentation requires a VCF database of known polymorphic sites to mask out, such as dbSNP, as an input file. Which one of the following dbSNP files do you...
Split'N'Trim Errors
Hello all, I am having a problem during the Split'N'Trim phase of the RNAseq Best Practices. The script I have used is as follows: java -jar /data1/APPS/gatk/GenomeAnalysisTK.jar -T SplitNCigarReads -R...
trio pipeline
Dear friends, I am analyzing a trio. I have followed the pipeline described in Van der Auwera et al. 2013 on each person individually, up to HaplotypeCaller and VariantRecalibrator. Is there a pipeline I...
register GATK 3 day workshop
I would like to analyze NGS data with the GATK software. I am interested in the GATK Standard 3-day workshop (2017-11-08 | GATK workshop in Huntington, WV, USA). When can I start to register the...
Error: Unable to retrieve result, with "VariantRecalibrator"
My command line is as follows: java -Xmx8g -jar $CLASSPATH/GenomeAnalysisTK.jar \ -T VariantRecalibrator \ -R $GenomeReference \ -input $InputVCF \ -nt 6 \...
Which datasets should I use for reviewing or benchmarking purposes?
New WGS and WEx CEU trio BAM files: We have sequenced at the Broad Institute and released to the 1000 Genomes Project the following datasets for the three members of the CEU trio (NA12878, NA12891 and...
question about ‘gatk4beta’
While using 'gatk4beta', it always stops with nothing produced when running 'grep -i avx /proc/cpuinfo'. I can't avoid the problem, no matter which module I use.
VQSR annotations to include low coverage WGS
Hi team, 1- Intuition tells me that we should not include all the annotations listed below for VQSR of WGS with coverage < 1. Which annotations do you suggest trying? java -jar GenomeAnalysisTK.jar...
using MUTECT2 on tumor-only sample
Dear all, although this question was asked a long time ago, I hope you do not mind my asking it again, as I am looking for some updated workflows, strategies, ideas: "what would be the acceptable...
combineGVCFs with duplicate sample id?
I am performing the joint calling workflow on a large batch of samples and I have a handful that were sequenced twice, using two different capture kits. For these, the sample IDs in the GVCFs are the...
Variant in VCF of multiple samples called by HaplotypeCaller absent in their...
Hello GATK team, I followed GATK best practices and called variants with HaplotypeCaller in 6 exome samples. However, in 4 patients (total) I have a variant on Chr12 that is absent in the BAM file. The...
Picard LiftoverVcf erroring
Hi (java-jdk/1.8.0_92, picard/2.8.1), I'm trying to convert the coordinates of an exome VCF file (hg19) to a VCF with the latest reference's (GRCh38) coordinates. The command I use is below along with...
CollectRNASeqMetrics
Please help. Can you run CollectRNASeqMetrics on single-end reads? When I run it, it gives me stats for R1 and R2 transcript strand reads.
[GATK 4 beta] clustered_events in Mutect2/FilterMutectCalls
Hi, I have a question about filtering Mutect2 calls. A well-characterized SNV (vcf records below 17:7577120) is filtered out by clustered_events filter. It appears that an artificial haplotype is...
Reference Genome Components
Document is in BETA. It may be incomplete and/or inaccurate. Post suggestions to the Comments section. This document defines several components of a reference genome. We use the human GRCh38/hg38...
Spanning or overlapping deletions (* allele)
We use the term spanning deletion or overlapping deletion to refer to a deletion that spans a position of interest. The presence of a spanning deletion affects how we can represent genotypes at any...
what is the problem with mutect v2?
Hello, I have a question regarding the quality and functionality of mutect v2, because I encountered some problems with data analysis when using mutect v2 versus mutect v1. In fact, I am analysing...
picard liftovervcf parsing error
I am using Picard LiftoverVcf to align variants from hg38 to hg19. I am not sure what the error is referring to, though it seems to be an issue with the VCF (I have copied a few lines of the format)....