Speeding up VQSR for 2000+ WGS samples
Hi GATK folks, I am joint genotyping fairly large cohorts of 30x WGS data with 2000 to 5000 samples, following the Best Practices using GATK 3.5. By and large this works pretty well; however, one major...
Picard metrics definitions
Hi! I was taking a look at the documentation about the duplication metrics of MarkDuplicates and I have some doubts. Is the UNPAIRED_READS_EXAMINED field taking into account only primary alignments or...
ContEst: what if I do not have a genotype array of my normal samples?
Hi, I want to use ContEst to estimate the contamination levels of my patient-matched normal samples, but all my data are WGS data, and I do not have a genotype array for my normal samples. My code is...
Variant not called by GATK
Hi there, I am doing alignment using Novoalign and following the GATK Best Practices (version 3.5) to call variants. The sample is NA12878. One of my expected variants, at location chrX:31219271, is not called. I ran...
GenerateAltAlleleFasta
1. Introduction The GenerateAltAlleleFasta utility processes a VCF file to extract the sequences of the alternate alleles. For each structural variation record in the VCF, this utility will generate...
HaplotypeCaller: Alleles for a VariantContext must contain at least one...
Howdy. I'm playing with the 7-12 nightly to fix the HashMap iterator issue in http://gatkforums.broadinstitute.org/gatk/discussion/comment/30982#Comment_30982 When running HaplotypeCaller however, I...
Understanding and adapting the generic hard-filtering recommendations
This document aims to provide insight into the logic of the generic hard-filtering recommendations that we provide as a substitute for VQSR. Hopefully it will also serve as a guide for adapting these...
MuTect2 tumorOnly vs paired loses true variants
Hi GATK team! I have an issue with MuTect2. I'm using the latest GATK version (nightly build from the 16th of March) in a somatic context on an amplicon design. I have a variant that I know is a true one...
Sequence dictionaries differ after running CollectRnaSeqMetrics - what should...
Hello! I was trying to run the CollectRnaSeqMetrics from Picard tools. My command was: java -Xmx4g -jar picard.jar CollectRnaSeqMetrics \...
GenotypeGVCFs --includeNonVariantSites emits reference as symbolic
Hi, it seems that starting with GATK 3.6 (or at least sometime after GATK 3.5), when running GenotypeGVCFs and emitting all bases with --includeNonVariantSites, non-variant sites are now being emitted...
Identifying Rare SNPs
Hello, I have been reading through GATK's docs and forums trying to wrap my head around the best way to approach the problem of sequencing viral populations. In this case, you have sequenced a sample...
(How to) Fix a badly formatted BAM
Fix a BAM that is not indexed or not sorted, has not had duplicates marked, or is lacking read group information. The options on this page are listed in order of decreasing complexity. You may ask, is...
(howto) Prepare a reference for use with BWA and GATK
Objective: Prepare a reference sequence so that it is suitable for use with BWA and GATK. Prerequisites: Installed BWA; Installed SAMTools; Installed Picard. Steps: Generate the BWA index; Generate the Fasta...
Status for dealing with paired-end reads
Hi all, I'm new here and it's my first post, so sorry if it's not relevant enough. I have been thinking about the same problem that has occurred here twice already, and I cannot seem to come to a conclusion on...
Filtering variants by depth of coverage using SelectVariants
I am trying to manually filter the results from HaplotypeCaller because my organism is anything but model. I sequenced 60 samples and had an average coverage of ~60X per sample across the genome. As...
HaplotypeCaller does not generate FS, QD annotations even when specifically...
It seems that all recent versions of GATK's HaplotypeCaller (versions 3.2 to 3.6; I haven't tested older versions) fail to generate some annotations such as FS and QD (even when requested explicitly by...
Include VCF filename in the error message when VCF file format is not recognized
Hi, I ran RealignerTargetCreator using multiple '-known file.vcf' arguments. One of the files is causing a problem, and the error message is not very helpful: GenomeAnalysisTK-3.6/GenomeAnalysisTK.jar -T...
Changing name of g.vcf files after combineGVCFs
Dear Sir/Madam, I'm running CombineGVCFs in order to create batches of 100 GVCFs. The job runs perfectly, first reading all 100 files with the proper names, but once the combined...
picard error: picard.analysis.directed.CalculateHsMetrics
${JAVA_HOME}/java -jar ${Picard_PATH}/picard.jar CleanSam VALIDATION_STRINGENCY=SILENT INPUT=${EXOME_RESULT_DIR}/2-Mapping/BWA/${SAMPLE_ID}_output.bam OUTPUT=picard/${SAMPLE_ID}_output.fix.bam...
MuTect2 tumor only mode: empty VCFs
Hello, I am trying to run Mutect2 in tumor only mode without a matching normal. However, when I did this, MuTect2 produced output VCFs that had a full header, but had no variant calls. Here is a sample...