Hello,
I am calling variants with Mutect2 (default parameters) on bulk WGS Tumor/Normal pairs, following the Somatic SNV Best Practices. In the VCF outputs I am finding a lot of variants like this one (the last sample column is the tumor):
chr1 1037759 . CTT C . PASS ECNT=1;HCNT=1;MAX_ED=.;MIN_ED=.;NLOD=4.60;RPA=12,10;RU=T;STR;TLOD=26.17 GT:AD:AF:ALT_F1R2:ALT_F2R1:QSS:REF_F1R2:REF_F2R1 0/0:31,0:NaN:0:0:0,0:0:0 0/1:13,21:1.00:8:13:0,611:0:0
The genotype suggested for the tumor is heterozygous, yet the AF is 1.00. I also see that the QSS for the reference allele is 0, but I checked in IGV that the base and mapping qualities at this position are normal for both reference-supporting and alternate-supporting reads, and that the reads are primary alignments with their mates mapped. Up to 14% of my calls have AF=1.00, which seems very odd to me for this type of analysis.
It doesn't happen for all the deletions, though:
chr1 1128849 . CTT C . PASS ECNT=1;HCNT=2;MAX_ED=.;MIN_ED=.;NLOD=3.77;RPA=11,9;RU=T;STR;TLOD=20.27 GT:AD:AF:ALT_F1R2:ALT_F2R1:QSS:REF_F1R2:REF_F2R1 0/0:24,0:0.00:0:0:60,0:0:2 0/1:20,13:0.867:6:7:60,403:2:0
In this case QSS is not 0, and I suspect that QSS and AF are related: (13*403)/((20*60)+(13*403)) = 0.814, which is close to the reported AF.
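To make that arithmetic explicit, here is a minimal Python sketch of the ratio I computed from the tumor AD and QSS values. The function name `weighted_alt_ratio` is my own, and weighting AD by QSS is only my guess at a relationship, not something I know Mutect2 actually does.

```python
def weighted_alt_ratio(ad, qss):
    """Hypothetical ratio: (AD_alt * QSS_alt) / (AD_ref * QSS_ref + AD_alt * QSS_alt).

    This is a guess at how AF might relate to QSS, not Mutect2's documented formula.
    """
    ref = ad[0] * qss[0]
    alt = ad[1] * qss[1]
    total = ref + alt
    return alt / total if total else float("nan")


# Tumor (AD, QSS) pairs copied from the three records in this post.
records = {
    "chr1:1037759": ((13, 21), (0, 611)),   # reported AF = 1.00
    "chr1:1128849": ((20, 13), (60, 403)),  # reported AF = 0.867
    "chr2:28505882": ((2, 15), (0, 30)),    # reported AF = 1.00
}

for site, (ad, qss) in records.items():
    print(site, round(weighted_alt_ratio(ad, qss), 3))
```

Whenever the reference QSS is 0 this ratio is trivially 1.0, which matches the reported AF for the first and third records even though AD alone would give 21/34 and 15/17.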
Most of the cases are indels, but not all of them:
chr2 28505882 . A G . PASS ECNT=1;HCNT=8;MAX_ED=.;MIN_ED=.;NLOD=4.00;TLOD=23.46 GT:AD:AF:ALT_F1R2:ALT_F2R1:FOXOG:QSS:REF_F1R2:REF_F2R1 0/0:16,0:0.00:0:0:.:149,0:4:1 0/1:2,15:1.00:1:0:0.00:0,30:0:0
So in summary, I don't understand how AF is calculated. Am I misunderstanding the concept of AF, or the way it works for this type of sample? Or am I missing the reason why QSS=0? Should I use AD for my calculations instead?
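For context, this is roughly what I mean by using AD instead: a minimal Python sketch that pairs the FORMAT keys with the tumor column and takes alt / (ref + alt) from AD. The helper names `parse_sample` and `ad_fraction` are mine, and the sketch assumes a biallelic site with exactly two AD values.

```python
def parse_sample(format_field, sample_field):
    """Pair the colon-separated FORMAT keys with one sample's values."""
    return dict(zip(format_field.split(":"), sample_field.split(":")))


def ad_fraction(sample):
    """Alt read fraction from AD alone: alt / (ref + alt).

    Assumes a biallelic site, i.e. AD holds exactly "ref,alt".
    """
    ref, alt = (int(x) for x in sample["AD"].split(","))
    depth = ref + alt
    return alt / depth if depth else float("nan")


# FORMAT and tumor columns from the chr1:1037759 record above.
fmt = "GT:AD:AF:ALT_F1R2:ALT_F2R1:QSS:REF_F1R2:REF_F2R1"
tumor = parse_sample(fmt, "0/1:13,21:1.00:8:13:0,611:0:0")

print(tumor["AF"], round(ad_fraction(tumor), 3))  # prints: 1.00 0.618
```

For this record the AD-based fraction is about 0.62, which looks consistent with the 0/1 genotype, while the reported AF is 1.00.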
Thank you very much!
GATK version: 3.7-0-gcfedb67
Java version: 1.8.0_31
WGS paired end samples
Bulk Tumor/Normal pairs
Sequenced with HiSeqX using TruSeq Nano DNA (350) library kit