Hi,
I am using HaplotypeCaller in GVCF mode, followed by GenotypeGVCFs, to call variants in some high-depth targeted sequencing data. Some of the variants have been confirmed by Sanger sequencing, and I am using them as a gold standard to evaluate the pipeline. HaplotypeCaller is missing some of these variants and I would like to understand why.
Here is a record from the g.vcf file at a site that I thought should have been called as a variant:
15 75644465 . C <NON_REF> . . END=75644465 GT:DP:GQ:MIN_DP:PL 0/0:546:0:546:0,0,1860
The allele depth in the pileup is 335 C / 266 T. It looks like the variant should have been called, but it is not present in the final vcf file. I have attached a pileup at that site.
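To make the numbers concrete, here is a minimal Python sketch (not part of the pipeline; the variable names are only for illustration) that splits the FORMAT and sample columns of the record above and computes the T fraction from the pileup counts. It assumes the pileup counts are the ones quoted above, which do not have to match the DP that HaplotypeCaller reports.

```python
# FORMAT keys and sample values copied from the g.vcf record above
format_keys = "GT:DP:GQ:MIN_DP:PL".split(":")
sample_values = "0/0:546:0:546:0,0,1860".split(":")
call = dict(zip(format_keys, sample_values))

# PL holds one phred-scaled likelihood per genotype; 0 marks the most likely one
pl = [int(x) for x in call["PL"].split(",")]

# Allele counts read off the pileup (assumed: 335 C, 266 T)
ref_reads, alt_reads = 335, 266
alt_fraction = alt_reads / (ref_reads + alt_reads)

print("GT:", call["GT"], "GQ:", call["GQ"], "PL:", pl)
print(f"T fraction from pileup: {alt_fraction:.3f}")
```

The first two PL values are both 0 and GQ is 0, so the record itself does not favour 0/0 over the next genotype, while the pileup has roughly 44% T reads.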
Here are the commands I used with GATK 3.4-0. I followed the Best Practices pipeline except for duplicate removal, which I skipped because of the way the platform I am using works.
export REGION=15:75643465-75645465
$JAVA -Xmx2048m -jar $GATK -T HaplotypeCaller -R $REF -I $BAM --dbsnp $DBSNP --emitRefConfidence GVCF --variant_index_type LINEAR --variant_index_parameter 128000 -L $REGION -o $GVCF
$JAVA -Xmx2048m -jar $GATK -T GenotypeGVCFs -R $REF -L $REGION -o $OUT/sample.vcf --variant $GVCF --dbsnp $DBSNP
Do you know why this variant was not called heterozygous?