Channel: Recent Discussions — GATK-Forum

GATK 4.1.4 DenoiseReadCounts: Sample intervals must be identical to the...

Hi I've been getting failures in Terra from the most recent 2-CNV_Somatic_Pair workflow copied from help-gatk/Somatic-CNVs-GATK4 that has an error message: 18:58:30.763 INFO SVDDenoisingUtils -...

View Article


appropriate members for generating "known-sites" list

I have 46 complete genomes and a good reference genome. Two of the individuals are "outgroups" (two different species). The rest are the same species as the reference genome. One of the outgroups...

View Article


what is the current page for VCF hard-filtering?

Hi GATK people, I am preparing a training session using 4.1.4.1 and would need the current official list of parameters for hard filtering of hg38 VCF data. I also demonstrate the VQSR but for...

View Article

HaplotypeCaller successfully generated my g.vcf.gz, no error message but exit...

Hi GATK team, I used HaplotypeCaller to generate a VCF file. Everything went fine: I got no error message, and the vcf.gz and the vcf.gz.tbi were generated. However, the exit status was not '0' but...

View Article

Error of INDEL mode during VQSR process

Hello, I'm trying to do VQSR on exome data of 50,000 samples. Since this dataset is too big, I used GenomicsDBImport to merge. Whole exome merging is also slower than expected, so I did this per...

View Article


low GQ around Indels

I thought the low GQ around indels was fixed in v4, but I still see it. Here are the arguments and output I am getting. gatk-4.1.3.0/gatk HaplotypeCaller -I test.bam -ERC BP_RESOLUTION -R hg19_v6.fasta...

View Article


GQ zero in reference mode and strand bias

Why do we get GQ = 0 at chr6:42946358 ? Command used - gatk-4.1.3.0/gatk HaplotypeCaller -I all_PE_sortedRG.bam -R hg19.fasta -L 6:42946053-42946662 -ERC BP_RESOLUTION -O temp_pe.vcf...

View Article

Is it recommended to joint genotype samples in a mutation accumulation...

Hi, I have a general usage question. I am analyzing whole-genome sequence data from a yeast mutation accumulation experiment. The experimental design is as follows: 1 ancestor 48 "progenitor" lines...

View Article


Number of reads with PCR duplicates varies when checked manually and in...

This is the sam file post base recalibration "Drr.sam". I wanted to look for the number of lines with PCR duplicates. So at first, I checked the total number of lines in sam files then looked for lines...

View Article
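
For the duplicate-count question above, one way to cross-check a manual count is to test the duplicate bit in the SAM FLAG column directly. The following is a minimal sketch, not the poster's method: it relies on the SAM convention that FLAG bit 0x400 marks a PCR or optical duplicate, and the function name `count_duplicates` and the three example records are made up for illustration.

```python
import io

# SAM FLAG bit 0x400 (decimal 1024) marks a read as a PCR or optical duplicate.
DUPLICATE_FLAG = 0x400


def count_duplicates(sam_lines):
    """Return (total_records, duplicate_records) for an iterable of SAM lines.

    Header lines (starting with '@') and blank lines are skipped, so the
    total counts alignment records only, not every line in the file.
    """
    total = 0
    duplicates = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # FLAG is the second SAM column
        total += 1
        if flag & DUPLICATE_FLAG:
            duplicates += 1
    return total, duplicates


# Hypothetical records for illustration only (not taken from "Drr.sam").
example = io.StringIO(
    "@HD\tVN:1.6\tSO:coordinate\n"
    "r1\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tFFFF\n"
    "r2\t1123\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tFFFF\n"
    "r3\t1024\tchr1\t300\t60\t4M\t*\t0\t0\tACGT\tFFFF\n"
)
print(count_duplicates(example))
```

Matching on the flag bit rather than on a literal value such as 1024 matters because the duplicate bit is usually combined with other bits (1123 in the example is 1024 + 99), which is one plausible reason a text search and a tool's report can disagree.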


GenotypeGVCF and memory failure

Hello, I am working through GATK4 (4.1.4.0) for germline short variant discovery. I have been following the best practices, but this is my first time using the pipeline. So far, I have run...

View Article

GATK Support for files Stored in Azure Blob or Data Lake when processing in...

I was trying to use the GATK wrapper to run tasks on Apache Spark 2.2.1. However, my files are on Azure Blob and Azure Data Lake. I noticed that GATK has support for Google Cloud Storage. In my case I needed...

View Article

About NA12878 download page.

Hi team, I followed your website and found your download page. ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/ Is there a correct answer for NA12878 on this page? If not, could I ask where can I...

View Article


Mitochondrial WDL flow allele fraction (AF) is inconsistent with AD?

I had some mitochondrial samples and ran the GATK mitochondrial workflow using the GATK WDL. I then found that the allele fraction (AF) is inconsistent with the real alt fraction, that is, AD[2]/DP. For example, the AF of this record is...

View Article
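
The AD/DP ratio mentioned in the post above can be written out as a small helper. This is only a sketch of the arithmetic, under two labeled assumptions: that "AD[2]" in the post means the second entry of the AD field (the first alternate allele, index 1 when counting from zero), and that the example numbers below are invented rather than taken from the truncated record. It does not reproduce how the reported AF value is calculated.

```python
def alt_fraction(ad, dp, alt_index=1):
    """Return AD[alt_index] / DP for one sample.

    ad        -- allelic depths, reference first, then alternate alleles
    dp        -- total read depth for the sample
    alt_index -- zero-based position of the alternate allele in ad
                 (assumption: the post's "AD[2]" is the second entry, index 1)
    """
    if dp <= 0:
        raise ValueError("DP must be positive to compute a fraction")
    return ad[alt_index] / dp


# Invented numbers for illustration: 90 reference reads, 10 alternate reads.
print(alt_fraction((90, 10), 100))
```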


Variant filtering based on mapping score

Dear GATK team, I'm writing to you regarding variant filtering in a multi-sample VCF file. I have a multi-sample VCF file containing both SNPs and indels (from human whole-genome sequencing) produced...

View Article

HaplotypeCallerSpark throws error Unable to find class:...

Hi, I am trying to run HaplotypeCallerSpark on an Apache Spark cluster but the job failed with the error below. However, when I added the htsjdk-2.14.0.jar to Spark it throws a different error. Exception in...

View Article


Structural Variation Discovery Pipeline

GATK v4.1.3.0 Hi, do you have any general guidelines, even unofficial ones, about Structural Variation Discovery? I used only the tool reported above but I do not know if I have to use additional tools or...

View Article

CombineGVCFs quits 25% of the way through genome

Using CombineGVCFs, I get a java error message and it stops running on Chromosome 6 of 21. I have combined each flowcell and lane combination into their own cohort, so this step is combining all of...

View Article


Using GATK tool on my coordinate sorted bam file,...

I am using the tool on my coordinate-sorted BAM file and I keep getting an error: INFO 2019-12-16 10:46:24 MarkDuplicatesWithMateCigar Read 1,000,000 records. Elapsed time: 00:00:12s. Time for last...

View Article

GenomicsDBImport multi sample GVCF

Hello, I am new to using (or rather attempting to use) the GenomicsDBImport tool as part of the joint discovery workflow. I am not quite clear from reading the documentation, but I was wondering if...

View Article

Excessive heterozygosity in a dataset filtered by VQSLOD

Greetings TL;DR: I have lots of excess het even after filtering (I used VQSLOD and then filtered by InbreedingCoeff too). Why, and what do I do? See questions at end. Details: I have a WES+ dataset of...

View Article

Error: Input files reference and features have incompatible cont; while...

Hi All, While I am running the SelectVariants module to split SNPs and indels from 1000GP phase3 VCF files, it throws an error "A USER ERROR has occurred: Input files reference and features have...

View Article


How to build the variant recalibration bundles?

Hi GATK team, I want to ask how I can create or build my own resource bundle to feed the variant recalibrator model with my data for training or testing. Thank you very much!

View Article


Problem with g++ not being detected when GATK GermlineCNVCaller is running

Hello Team, I am trying to run GATK GermlineCNVCaller on a local cluster and getting this warning: 'WARNING (theano.configdefaults): g++ not detected ! Theano will be unable to execute optimized...

View Article

Adapter Trimming Vs Non Trimming in Mutect2

Dear GATK Team, we have performed mutation calling (using Mutect2, GATK 4.1.4.0) on samples where 1) adapter trimming was NOT performed and 2) adapter trimming was performed. We have observed some major...

View Article

PrintRead -BQSR gets stuck on a contig

Hi GATK team! I am analysing WGS data following GATK best practices. I am using GATK version 3.8-1-0, java version 1.8.0_191 and human genome version GRCh38 from ensembl. I downloaded from GATK bundle...

View Article


Illumina Pipeline

Hello everyone, I'm currently working with an Illumina pipeline where we do the following: 1. Align reads against reference genome using Isaac 2. Call structural variants using Manta 3. Call small...

View Article


Mutect2 indel alignment issues

Hi, I have recently begun testing Mutect2 for the analysis of single tumour samples. Most of the tools seem to work as expected; however, I have an issue with a particular indel. It is a 16 bp indel...

View Article

Homo_sapiens_assembly38.fasta.gz.fai does NOT exist

Hi, I downloaded the GATK bundle folder from the GATK website. It includes the reference genome file Homo_sapiens_assembly38.fasta.gz and its index file Homo_sapiens_assembly38.fasta.fai. Now when I...

View Article


Version 4.1.4.0 is the latest visible, but apparently version 4.1.4.1 is the...

When viewing the documentation, there is a notification at the top of the page that 4.1.4.1 is the latest version, but the dropdown only allows us to fetch docs for 4.1.4.0. Example:

View Article




gatk CombineGVCFs, java.lang.ArrayIndexOutOfBoundsException: Index -2 out of...

Hi, please see my GATK log file below. I am trying to use "gatk CombineGVCFs" to merge 3 GVCF files into one GVCF file. There is an error message of "java.lang.ArrayIndexOutOfBoundsException: Index -2 out of...

View Article

Getting an error running Mutect2 wdl (FilterAlignmentArtifacts missing...

Hi, I'm trying to run the Somatic SNVs and INDELs (mutect2.wdl - GATK 4.1.4.0; https://github.com/gatk-workflows/gatk4-somatic-snvs-indels/blob/master/mutect2.wdl) and I kept getting an error when the...

View Article

GATK somatic CNV pipeline NaNs

Dear GATK Team, I ran the GATK somatic CNV calling pipeline on Terra (v1.3.1 for the PoN and 1.3.0 for the somatic pair workflow) on unpaired canine tumor WGS data. I noticed that I am seeing runs of...

View Article

BI and BD Default Quality Scores?

After running BQSR recalibrate variants on WGS, I want to convert back the entries in the BI and BD tags back to numeric values. However, I cannot for the life of me find documentation online to help...

View Article


Convert output of CombineGVCFs to GenomicsDB

Hello, I have some combined (multi-sample) GVCF files from old projects, where their single sample GVCF files are no longer available. I'm wondering if there is a way to convert the combined GVCF...

View Article

GenomicsDBImport update database and MQ values

Dear GATK Team, I recently stumbled upon a problem, maybe you can help with that! I've recently done a multisample calling (around 1000 genomes) using the new GenomicsDBImport and then GenotypeGVCFs on...

View Article

GATK error by unexpected data type of SA_MAP_AF

Hello, FilterMutectCalls was shut down by a non-double value of 'NaN' for 'SA_MAP_AF'. The error message and the corresponding variant call by Mutect2 are below. GATK-4.0.6.0 chr19 501743 . T...

View Article


Is the gkl version 0.6.0 in Queue-3.8-1-0-gf15c1c3ef?

In the Queue-3.8-1-0-gf15c1c3ef package, the files libgkl_pairhmm_fpga.so, libgkl_pairhmm.so and libgkl_pairhmm_omp.so are all 2.0MB, but I downloaded the code of version 0.6.0 from the gkl GitHub, then I...

View Article


GenomeSTRiP error: partition.genotypes.map.dat file missing at Stage 12

The error below was generated in CNVDiscoveryStage12. I am using JDK/1.8.0_11 and svtoolkit_2.00.1918. How can I solve this? INFO 13:08:01,285 HelpFormatter - Program Args: -cp...

View Article

A ridiculous amount of memory

Hi, PICARD dies on me after using a ridiculous amount of memory. The command line was: /gpfs0/biores/home/erubin/jdk/jdk1.8.0_131/bin/java -jar /gpfs0/biores/users/erubin/jars/picard-2.11.0.jar...

View Article

GATK-4.1.3.0 GetPileupSummaries "No suitable codecs found" error

Hi, I am trying to run the contamination pipeline for GATK-4.1.3.0. If I use somatic-hg38_af-only-gnomad.hg38.vcf as reference, I get out of memory errors, exactly as detailed in a previous thread...

View Article

Why am I getting this error message?

I am running PICARD as follows. The program died on me after almost finishing. I ran the following command: bash tmp2.bash > & stdout.tmp2.bash & after setting the environmental variable for...

View Article


Do I have to adjust contig ploidy priors with mainly male/female samples in...

If my cohort used for model building consists mainly of male samples do I have to use priors different from CONTIG_NAME PLOIDY_PRIOR_0 PLOIDY_PRIOR_1 PLOIDY_PRIOR_2 PLOIDY_PRIOR_3 X 0.01 0.49 0.49 0.01...

View Article

Filter samples of bad quality before running GermlineCNVCaller

Do you filter out samples of bad quality (e.g. high variability in read counts) before constructing the model in GermlineCNVCaller cohort mode as it is known from other CNV calling methods? Which...

View Article



GATK GenotypeGVCFs misses a real deletion

I used GATK HaplotypeCaller in GVCF mode to call WES SNVs, then I found that a variant in a family had been filtered out by the GenotypeGVCFs step. sample1 and sample2 are a family. sample1 is the child and sample2 is...

View Article

GATK GermlineCNVCaller with GPU and CUDA

Hello, I am running GATK GermlineCNVCaller on a machine with an Nvidia GPU that seems to be configured correctly, as it runs a test program on the GPU. Without the GPU on 16 CPUs it can take 2.5 days to...

View Article


What is the difference between SplitVcfs (Picard) and SelectVariants?

Hi all, I usually split SNPs and INDELs from a VCF file using "gatk SelectVariants", with --select-type-to-include SNP for SNPs and --select-type-to-include INDEL for indels. By accident I came across the...

View Article

Percent progress in HaplotypeCaller GATK 4

Hi all, in GATK 3, HaplotypeCaller printed out a % progress and estimated time to finish. GATK 4 doesn't seem to do that anymore, is that correct? Or is there an option I can activate to do it? And if...

View Article

**Attention**: GATK Team out of office Dec 10th and 11th

We will be out of the office for a Broad Institute event from Dec 10th to Dec 11th 2019. We will be back to monitor the GATK forum on Dec 12th 2019. In the meantime we encourage you to help out other...

View Article

**Attention**: GATK Team is Out Of Office from Dec 24th to Jan 1st

Hi everyone, We will be out of office for the holidays from Dec 24th to Jan 1st 2020. We will be back to monitor the GATK forum starting Jan 2nd 2020. We cannot guarantee a response to any posts during...

View Article


Missing PS field in the VCF file produced by GenotypeGVCFs

Hello, I followed GATK best practices to produce a VCF file for 20 individuals. GATK version is 4.1.0.0. The BAM files were all verified by ValidateSamFile, no errors or warnings were detected....

View Article


