CombineGVCFs subsampling questions
Hi GATK! I want to merge ~3000 HC outputs into one large cohort. However, even if I run it directly by scattering over 30M genome chunks, it would still take a long time to compute. So I think I should first...
Errors when running Picard ValidateSamFile on BAM file from SplitNCigarReads
Hi, when I ran Picard ValidateSamFile on the BAM file produced by GATK (version 3.5) SplitNCigarReads, I got errors: HISTOGRAM java.lang.String Error Type Count ERROR:INVALID_CIGAR 1638...
bwa mem and GATK error
Hi, I have used bwa mem to align with the below command: bwa mem -R '@RG\tID:X\tLB:Y\tSM:Z\tPL:ILLUMINA' ref.fa seq1.fastq seq2.fastq | samtools view -bS - > alignment.bam Then used GATK latest...
A SNP and deletion without DP in g.vcf file
I have noticed that sometimes g.vcf files will have calls where the DP field is completely missing from the FORMAT string, even though a variant is called with good GQ. When this happens, the INFO...
Mutect2 run time vs. Mutect
Hi, I isolated a region of 20M bp in chr18, and ran both Mutect and Mutect2 on a tumor BAM (no matched normal BAM). The BAM has 9.8M reads; about 7M get filtered out due to duplicate marking, so we are...
Trouble Downloading GenomeStrip?
Hello, I can't seem to download GenomeStrip from the website: http://software.broadinstitute.org/software/genomestrip/download-genome-strip The issue is the registration prior to downloading, when I...
Two times of SNP Calling with the same raw data and process, BaseQRankSum...
I have done SNP calling twice with the same raw data and process. However, in the VCF file, BaseQRankSum values are different for a lot of sites. What's more, the QD and MQRankSum values for some sites appear...
VQSR with missing annotation fields
Hi, I am calling variants (non-model organism) following the best practice workflow. After HaplotypeCaller (with GVCF) and GenotypeGVCFs, I want to perform VQSR (separately for SNPs and INDELs) to the...
Picard RevertSam and OUTPUT_BY_READGROUP does not allow specifying output files
I am using a pipeline that takes BAM and CRAM files as input. The first step of this pipeline is RevertSam combined with the option OUTPUT_BY_READGROUP, and I ran into the fact that it is impossible to...
output.plots.R.pdf is empty
My output.plots.R.pdf is empty. I suppose there is an error in the Rscript. When I rerun the output.plots.R there is the following error. Any idea why that is happening? Warning message: Non Lab...
Recalibration tables for 1000 Genomes
I am certain that the GATK team & associates have computed Base Quality Score Recalibration tables for each of the BAM files in the 1000 Genomes dataset. Are these recalibration tables available...
MarkDuplicates---“did not start with a parseable number”
I was running RNA-seq data through MarkDuplicates in the Picard package for SNP calling and got the message: WARNING 2016-08-31 16:48:12 AbstractOpticalDuplicateFinderCommandLineProgram A field field...
Calculating the fraction of a cohort sharing the same germline variants
Hi, OBJECTIVE: I am trying to calculate the percentage of samples in my cohort that share identical variants. APPROACH: Being new to genomics, I followed the GATK pipeline for germline (and somatic)...
Is there a way to automatically get nightly builds
Dear GATK team, I maintain GATK on the NIH biowulf cluster (https://hpc.nih.gov/apps/GATK.html). We have all recent stable builds available. However, sometimes users ask for a nightly build b/c of...
java.util.NoSuchElementException exception in HaplotypeCaller
One out of my 200 samples failed, whether using a single thread or multiple threads. Here is the message: INFO 15:27:03,632 ProgressMeter - 1:148025863 8.6339635E7 3.5 m 2.0 s 5.7% 61.9 m 58.4 m INFO...
Question related to VQSR
Hi everyone, it might be very basic, but I just want some clarification of what I understand after applying the VQSR steps to WGS data. For SNPs I've set tranches as described in the best...
Storing HC phased PGT into GT
Dear team, I am very excited to be able to use phased genotypes for various genetic analyses. Based on GATK best practices, I have generated a VCF using HC/GenotypeGVCF. I noticed phased genotypes were...
HaplotypeCaller fails on joint calling of gVCFs
Hi, when I run HaplotypeCaller on a bunch of .g.vcf files, at the joint discovery stage, it creates the output vcf with the header, but doesn't output the variants. There are a couple of warnings I...
Mutect is not working
Dear Cancer team, I installed mvn, gatk-protected, and mutect. (https://github.com/broadinstitute/mutect) After that, I came upon the following error message: ERROR...
(How to) Map and clean up short read sequence data efficiently
If you are interested in emulating the methods used by the Broad Genomics Platform to pre-process your short read sequencing data, you have landed on the right page. The parsimonious operating...