Hi,
I want to merge the g.vcf files produced by HaplotypeCaller using CombineGVCFs. When I do, the called genotypes vanish: every GT in the merged file is ./. Here is the record from one sample's g.vcf before merging:
NC_000001 13273 . G C,<NON_REF> 1712.77 .
BaseQRankSum=0.663;ClippingRankSum=0.000;DP=140;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=-1.797;RAW_MQ=216005.00;ReadPosRankSum=-0.178
GT:AD:DP:GQ:PL:SB
0/1:67,73,0:140:99:1741,0,1495,1941,1714,3655:32,35,38,35
And here is the same variant in the merged file (the sample above is the second one):
NC_000001 13273 . G C,<NON_REF> . .
BaseQRankSum=0.663;ClippingRankSum=0.00;DP=361;ExcessHet=3.01;MQRankSum=-1.797e+00;RAW_MQ=394720.00;ReadPosRankSum=-1.780e-01
GT:AD:DP:GQ:MIN_DP:PL:SB
./.:.:70:99:38:0,114,1190,114,1190,1190
./.:67,73,0:140:99:.:1741,0,1495,1941,1714,3655:32,35,38,35
./.:.:61:99:35:0,105,1028,105,1028,1028
./.:0,37,0:37:99:.:1183,111,0,1183,111,1183:0,0,23,14
./.:0,75,0:75:99:.:2109,223,0,2109,223,2109:0,0,33,42
./.:.:0:0:0:0,0,0,0,0,0
./.:.:54:99:35:0,102,1268,102,1268,1268
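To make clearer what I mean by the genotypes vanishing, here is a rough Python sketch (not part of GATK, and not a full VCF parser) that splits the sample columns of the merged record above by the FORMAT keys. The helper names parse_sample, genotype_order and best_genotype_from_pl are made up for this illustration. It assumes diploid samples and the standard VCF ordering of PL values (0/0, 0/1, 1/1, 0/2, 1/2, 2/2 for three alleles), and it simply picks the genotype with the lowest PL, which is much cruder than what a real genotyping step does.

```python
# FORMAT string and per-sample columns copied from the merged record above.
FORMAT = "GT:AD:DP:GQ:MIN_DP:PL:SB"
SAMPLES = [
    "./.:.:70:99:38:0,114,1190,114,1190,1190",
    "./.:67,73,0:140:99:.:1741,0,1495,1941,1714,3655:32,35,38,35",
    "./.:.:61:99:35:0,105,1028,105,1028,1028",
    "./.:0,37,0:37:99:.:1183,111,0,1183,111,1183:0,0,23,14",
    "./.:0,75,0:75:99:.:2109,223,0,2109,223,2109:0,0,33,42",
    "./.:.:0:0:0:0,0,0,0,0,0",
    "./.:.:54:99:35:0,102,1268,102,1268,1268",
]
N_ALLELES = 3  # G (REF), C, <NON_REF>


def parse_sample(fmt, column):
    """Map FORMAT keys to one sample's values (trailing fields may be absent)."""
    return dict(zip(fmt.split(":"), column.split(":")))


def genotype_order(n_alleles):
    """Diploid genotypes in VCF PL order: 0/0, 0/1, 1/1, 0/2, 1/2, 2/2, ..."""
    return [f"{j}/{k}" for k in range(n_alleles) for j in range(k + 1)]


def best_genotype_from_pl(pl_field, n_alleles):
    """Return the genotype with the lowest PL, or None if the minimum is tied."""
    pls = [int(x) for x in pl_field.split(",")]
    best = min(pls)
    if pls.count(best) != 1:
        return None  # ambiguous, e.g. a sample with no coverage (all PLs 0)
    return genotype_order(n_alleles)[pls.index(best)]


rows = [parse_sample(FORMAT, s) for s in SAMPLES]
for i, row in enumerate(rows, start=1):
    # GT is what the merged file reports; the second value is only an
    # illustration of what the retained PLs still point to.
    print(i, row["GT"], best_genotype_from_pl(row["PL"], N_ALLELES))
```

For the second sample this gives 0/1, the same call as in the single-sample g.vcf, so the likelihoods survive the merge even though every GT is written as ./.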
The GATK command line is:
/opt/gatk/4.0.0.0/gatk --java-options -Xmx32G CombineGVCFs \
-R GRCh38_latest_genomic_final.fa \
-V 17450281-WholeExome-171218_NS500396_0299_AHYV7NBGX3_raw_variants.g.vcf \
-V 17380470-WholeExome-171218_NS500396_0299_AHYV7NBGX3_raw_variants.g.vcf \
-V 17470830-WholeExome-171218_NS500396_0299_AHYV7NBGX3_raw_variants.g.vcf \
-V 17470788-WholeExome-171218_NS500396_0299_AHYV7NBGX3_raw_variants.g.vcf \
-V 17370765-WholeExome-171218_NS500396_0299_AHYV7NBGX3_raw_variants.g.vcf \
-V 17370767-WholeExome-171218_NS500396_0299_AHYV7NBGX3_raw_variants.g.vcf \
-V 17370768-WholeExome-171218_NS500396_0299_AHYV7NBGX3_raw_variants.g.vcf \
-O cohort.g.vcf
When I run ValidateVariants on the merged file, I get the following error:
A USER ERROR has occurred:
Input /srv/nfs/ngsdata/GATK/171218_NS500396_0299_AHYV7NBGX3/_gatk/cohort.g.vcf
fails strict validation: one or more of the ALT allele(s) for the record at position NC_000001:13273 are not observed at all in the sample genotypes of type
Any ideas?
Thanks and best regards,
Daniel