Hi, I'm using Mutect2 from GenomeAnalysisTK-3.8-0-ge9d806836. Everything looks good, but I noticed that the VCF header of the output contains this INFO line:
##INFO=<ID=PON,Number=1,Type=String,Description="Count from Panel of Normals">
I guess this field tells how many samples in the PON contain the variant, right? However, I never see the PON field used in the VCF records, even when a variant is marked with the "panel_of_normals" filter. For example:
chr1 186341 . T G . panel_of_normals;t_lod_fstar ECNT=1;HCNT=10;MAX_ED=.;MIN_ED=.;NLOD=14.65;TLOD=5.22 GT:AD:AF:ALT_F1R2:ALT_F2R1:FOXOG:QSS:REF_F1R2:REF_F2R1 0/1:50,3:0.057:2:1:0.667:1542,92:22:28 0/0:66,1:0.015:0:1:0.00:2068,29:41:25
Is this expected?
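In case it helps to reproduce the observation, here is a minimal sketch of how I checked it. It is plain Python, not part of GATK; the function name count_pon_usage is my own, and it assumes the standard VCF column order (FILTER in column 7, INFO in column 8). It counts records carrying the panel_of_normals filter and how many of those have a PON key in INFO.

```python
def count_pon_usage(vcf_lines):
    """Return (n_pon_filtered, n_with_pon_info) for an iterable of VCF lines.

    Hypothetical helper for illustration only; not a GATK function.
    """
    n_pon_filtered = 0
    n_with_pon_info = 0
    for line in vcf_lines:
        line = line.rstrip("\n")
        # Skip blank lines and header lines.
        if not line or line.startswith("#"):
            continue
        # Real VCFs are tab-delimited; fall back to any whitespace so that
        # records pasted into a forum post can be checked too.
        fields = line.split("\t") if "\t" in line else line.split()
        if len(fields) < 8:
            continue
        filters = fields[6].split(";")
        info_keys = {entry.split("=", 1)[0] for entry in fields[7].split(";")}
        if "panel_of_normals" in filters:
            n_pon_filtered += 1
            if "PON" in info_keys:
                n_with_pon_info += 1
    return n_pon_filtered, n_with_pon_info


# The record quoted above, as pasted (space-separated).
record = (
    "chr1 186341 . T G . panel_of_normals;t_lod_fstar "
    "ECNT=1;HCNT=10;MAX_ED=.;MIN_ED=.;NLOD=14.65;TLOD=5.22 "
    "GT:AD:AF:ALT_F1R2:ALT_F2R1:FOXOG:QSS:REF_F1R2:REF_F2R1 "
    "0/1:50,3:0.057:2:1:0.667:1542,92:22:28 "
    "0/0:66,1:0.015:0:1:0.00:2068,29:41:25"
)
print(count_pon_usage([record]))  # (1, 0): filtered by the PON, but no PON key in INFO
```

To run it on a whole file, pass an open file handle instead of the list, e.g. `count_pon_usage(open(path))`, where `path` points to the MuTect2 output VCF.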
And possibly related: the description of the "panel_of_normals" filter says "Seen in at least 2 samples in the panel of normals". However, I prepared the PON using -minN 1, which should activate the panel_of_normals filter when the variant is found in just one normal. Is this just an oversight in the description, or am I missing something?
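To spell out what I expect -minN to do: as I understand it, CombineVariants emits a site only if it is present in at least N of the input files. The sketch below is my own illustration of that rule in plain Python, not GATK code; the function name sites_passing_min_n, the (chrom, pos, ref, alt) site key, and the example sites are all made up for the illustration.

```python
from collections import Counter


def sites_passing_min_n(callsets, min_n):
    """Return the set of sites present in at least min_n of the call sets.

    Hypothetical helper: each call set is an iterable of
    (chrom, pos, ref, alt) tuples from one normal sample.
    """
    counts = Counter()
    for sites in callsets:
        # Use set() so a site counts once per sample even if listed twice.
        counts.update(set(sites))
    return {site for site, n in counts.items() if n >= min_n}


# Made-up example: three normals, one site shared by two of them.
normal_a = [("chr1", 186341, "T", "G"), ("chr1", 200000, "A", "C")]
normal_b = [("chr1", 200000, "A", "C")]
normal_c = []
normals = [normal_a, normal_b, normal_c]

print(sorted(sites_passing_min_n(normals, 1)))  # both sites end up in the PON
print(sorted(sites_passing_min_n(normals, 2)))  # only chr1:200000 A>C
```

With a threshold of 1, a site seen in a single normal already lands in the PON, which is why the "at least 2 samples" wording in the filter description looks inconsistent with my setup.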
Here's the synopsis of the relevant commands I used:
java -Xmx5g -jar ~/applications/gatk/GenomeAnalysisTK-3.8-0-ge9d806836/GenomeAnalysisTK.jar \
-T MuTect2 \
-R {params.ref} \
-I:tumor {input.tumour} \
-I:normal {input.normal} \
--normal_panel {input.pon} \
--dbsnp {params.dbsnp} \
--cosmic {params.cosmic} \
-L {params.chrom} \
--min_base_quality_score 20 \
--disable_auto_index_creation_and_locking_when_reading_rods \
-o {output.vcf}
And for the PON:
java -Xmx8g -jar ~/applications/gatk/GenomeAnalysisTK-3.8-0-ge9d806836/GenomeAnalysisTK.jar \
-T CombineVariants \
-R {params.ref} \
{params.variant_str} \
-minN 1 \
--setKey "null" \
--filteredAreUncalled \
--filteredrecordsmergetype KEEP_IF_ANY_UNFILTERED \
-o pon/panelOfNormals.tmp.vcf
Thank you!