CombineVariants in GATK4
Is it planned to add the CombineVariants tool to the GATK4.0 toolkit (it existed in previous GATK versions)? The only similar tool currently available in GATK4.0 Beta is GatherVCFs, which has very limited...
What should I use as known variants/sites for running tool X?
1. Notes on known sites Why are they important? Each tool uses known sites differently, but what is common to all is that they use them to help distinguish true variants from false positives, which is...
Using GenomicsDBImport to prepare GVCFs for input to GenotypeGVCFs in GATK4
In GATK4, the GenotypeGVCFs tool can only take a single input, so if you have GVCFs from multiple samples (which is usually the case) you will need to combine them before feeding them to GenotypeGVCFs....
ROD files out of FASTA? + other questions
Hey all, newbie here. tl;dr: I have a fasta file containing two sequences of my region of interest (~5.5 kbp), that differ in ~100 SNPs. What is the fastest way to generate a ROD file out of these...
Running genotypeGVCFs with ~4000 human exome data: stuck on "ProgressMeter -...
Hello, I am running genotypeGVCFs with ~4000 human exome data. To speed up the process, I have split exome.interval_list into sub_interval_list files, each of which contains ~100kb regions. Then I...
CatVariants mis-sorting input VCFs
GATK Team, When using CatVariants on my data, I noticed that it was consistently mis-sorting the output (when not using --assumeSorted; in my case, using --assumeSorted is not feasible as the input...
[ERROR] make_acnv_pon_config
Hi, I want to build a CNV PoN with "make_acnv_pon_config", but I got the following message. Any suggestions? Thanks a lot! I will really appreciate your help. Failures: message: Task...
VQSR: Bad input: Values for DP annotation not detected for ANY training...
Hi team, I have a VCF callset file generated using HaplotypeCaller in --emitRefConfidence GVCF mode with subsequent GenotypeGVCFs. I used the generated output.vcf file as input for...
ArrayIndexOutOfBoundsException in VariantsToBinaryPed
Hey GATK Team, I've encountered a GATK runtime error, which it says might be the result of a bug, but I tracked it down to a file suffix issue. I tried GATKv3.2-2 and GATKv2.7-2 and the "problem" seems...
VariantToVCF errors
Hey all, I've been having quite a few issues trying to convert hapmap formatted files to vcf using GATK and the error messages are quite opaque. I've spent a few days trudging through the forum but for...
VariantAnnotator error with Freebayes merge: Must initialize the cache of...
Main question (for the community): How to solve / avoid getting the error Must initialize the cache of allele anyploid indices for ploidy 1 with the VariantAnnotator (3.7-0) and freebayes1.1.0. The...
Should I analyze my samples alone or together?
Together is (almost always) better than alone We recommend performing variant discovery in a way that enables joint analysis of multiple samples, as laid out in our Best Practices workflow. That...
Fully parameterized example workflow for somatic variant calling
Hi again GATK folks, Y'all provide a lot of nice resources to get up and going with GATK, but what I can't find for the life of me is information on a functioning somatic variant calling pipeline where...
[E::bwa_idx_load_from_disk] fail to locate the index files
Hello, I tried to execute the command line in putty (linux) as follows: $ bwa mem -M -t 8 Homo_sapiens_assembly38.fasta SRR1517898_1.fastq SRR1517898_2.fastq > SRR1517898.sam but this appears:...
Variant calling using Single End 50 bp RNA-seq
Hi, I have a human cell line RNA-seq data set which consists of 50 bp single end reads. There are 60 million reads. Due to the reads being single end and short (50 bp) I was writing to ask is it...
GATK 3.7 and GATK 4 beta2
Dear team, I am using GATK 4 Beta2 for testing HaplotypeCaller for our NGS workflow. The command which I used is: time -p /gpfs/software/genomics/GATK/4b.2/gatk/gatk-launch HaplotypeCaller \...
Picard: refFlat file not accepted
Hi there, I'm having some issues getting Picard to work. I want to run the CollectRnaSeqMetrics function, but it requires the input of a refFlat file. I'm having some trouble getting this refFlat file...
Can't run any GATK4 tools on GCS dataproc
I'm having a strange issue running GATK4 tools on a dataproc cluster. I'm submitting from a Broad VM with an empty bash profile. As an example, here's what happens when I try to reproduce this...
Foolproof way to get reads which will pass Picard, without losing too many
Ok, after some time away, we are getting back to writing a remapping tool that can handle lots of different cases. The protocol suggested by the Broad is good, but fails in cases when the bam had...
Understanding and adapting the generic hard-filtering recommendations
This document aims to provide insight into the logic of the generic hard-filtering recommendations that we provide as a substitute for VQSR. Hopefully it will also serve as a guide for adapting these...