Hi,
I am trying to compare genotype calls between two multi-sample VCFs. I ran the GenotypeConcordance tools from both GATK and Picard (the Picard version for just one of the samples), but the genotype concordance values reported by the two tools are different.
The relevant portions of the results from the two tools are below.
*** Picard ***
VARIANT_TYPE                   SNP        INDEL
TRUTH_SAMPLE                   HG00109    HG00109
CALL_SAMPLE                    HG00109    HG00109
HET_SENSITIVITY                0.436364   0.285897
HET_PPV                        0.405042   0.233739
HET_SPECIFICITY                ?          ?
HOMVAR_SENSITIVITY             0.430914   0.310651
HOMVAR_PPV                     0.386074   0.251152
HOMVAR_SPECIFICITY             ?          ?
VAR_SENSITIVITY                0.434393   0.290306
VAR_PPV                        0.397989   0.236989
VAR_SPECIFICITY                0.612558   0.592559
GENOTYPE_CONCORDANCE           0.960685   0.66537
NON_REF_GENOTYPE_CONCORDANCE   0.960685   0.66537
*** GATK ***
Sample    Non-Reference Sensitivity    Non-Reference Discrepancy    Overall_Genotype_Concordance
ALL       0.394                        0.050                        0.991
HG00109   0.405                        0.042                        0.993
I understand that Picard reports the SNP and INDEL concordances separately. But since both values (0.961 for SNPs and 0.665 for INDELs) are lower than the overall concordance GATK reports for the same sample (0.993), I can't see any scenario where combining the two would equal the GATK value.
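To spell out that reasoning: if a combined concordance is a site-weighted average of the per-type values, it has to fall between the INDEL and SNP numbers. Here is a minimal Python sketch, assuming pooling works that way. The site counts are made up for illustration, and `pooled_concordance` is just a helper I wrote for this post, not part of either tool.

```python
def pooled_concordance(per_type):
    """Site-weighted mean of (concordance, n_sites) pairs.

    Assumption: each concordance is a fraction over its own disjoint
    set of sites, so pooling is a weighted average.
    """
    total_sites = sum(n for _, n in per_type)
    return sum(c * n for c, n in per_type) / total_sites


# GENOTYPE_CONCORDANCE values from the Picard output for HG00109
snp_gc = 0.960685
indel_gc = 0.66537
# Overall_Genotype_Concordance from the GATK output for HG00109
gatk_gc = 0.993

# A weighted average can never exceed its largest input
upper_bound = max(snp_gc, indel_gc)

# Hypothetical (SNP sites, INDEL sites) mixes, from INDEL-heavy to SNP-heavy
for n_snp, n_indel in [(1, 1_000_000), (1_000, 1_000), (1_000_000, 1)]:
    pooled = pooled_concordance([(snp_gc, n_snp), (indel_gc, n_indel)])
    print(f"SNP sites={n_snp:>9}  INDEL sites={n_indel:>9}  pooled={pooled:.6f}")

print(f"upper bound on pooled value: {upper_bound:.6f}")
print(f"GATK value exceeds that bound: {gatk_gc > upper_bound}")
```

Whatever counts I plug in, the pooled value stays at or below 0.960685, so the 0.993 from GATK presumably comes from the two tools counting something differently rather than from how SNPs and INDELs are combined.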
Any suggestions on what might cause the difference, and which tool is likely to be more accurate? Both tools were run with all default arguments, and the two VCFs being compared were both generated by HaplotypeCaller with nearly identical settings.
I found an old post suggesting that the Picard version is likely better and that the GATK version might be deprecated soon ( http://gatkforums.broadinstitute.org/gatk/discussion/5795/genotype-concordance-output ). But since that post is more than a year old, I don't know if it still holds true. The GATK version is slightly easier for my purposes (comparing multiple samples in one go), but I want to check before I start using it.
Thanks!
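In case it helps frame the question, this is roughly how I would script the per-sample Picard runs if that turns out to be the better option. It is only a sketch: the VCF file names, the jar path, the output prefix, and the second sample name are placeholders. It just builds and prints the command lines with Picard's TRUTH_VCF, CALL_VCF, TRUTH_SAMPLE, CALL_SAMPLE, and OUTPUT arguments rather than running anything.

```python
import shlex


def picard_gc_command(truth_vcf, call_vcf, sample, out_prefix,
                      picard_jar="picard.jar"):
    """Build one Picard GenotypeConcordance command for a single sample.

    The same sample name is used on both sides, since the two VCFs
    contain the same samples.
    """
    return [
        "java", "-jar", picard_jar, "GenotypeConcordance",
        f"TRUTH_VCF={truth_vcf}",
        f"CALL_VCF={call_vcf}",
        f"TRUTH_SAMPLE={sample}",
        f"CALL_SAMPLE={sample}",
        f"OUTPUT={out_prefix}.{sample}",
    ]


# HG00109 is the sample from the output above; SAMPLE_2 is a placeholder
samples = ["HG00109", "SAMPLE_2"]

# Placeholder file names, not the real paths
commands = [
    picard_gc_command("truth.vcf", "calls.vcf", s, "concordance")
    for s in samples
]

for cmd in commands:
    # Print a shell-safe command line; pass cmd to subprocess.run to execute it
    print(shlex.join(cmd))
```

That gives one set of Picard metrics files per sample, which is the extra bookkeeping I was hoping to avoid with the GATK tool.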