Hello,
I have to admit that the way I used MuTect2 is quite unorthodox and probably not strictly supported, but the problem I ran into is one of inconsistent read depth, which I don't think has much to do with what I was attempting...
What I attempted was to run MuTect2 twice on the same pair of BAM files, with the TUMOR and NORMAL designations reversed in the second run. After some filtering and sample renaming, the two output files were combined with CombineVariants --assumeIdenticalSamples.
I then noticed the same position being called in both runs, with very different read depths. The following is an example:
chr1 180810202 chr1:180810202_A/G A G . alt_allele_in_normal ECNT=1;FG=7.81174;FS=0;HCNT=2;NLOD=166.03;TLOD=15.20 GT:AD:AF:ALT_F1R2:ALT_F2R1:FOXOG:FS:QSS:REF_F1R2:REF_F2R1 0/1:307,10:0.035:10:0:0:-0:11263:305:2 0/0:741,13:0.019:13:0:0:-0:25874:741:0
chr1 180810202 chr1:180810202_A/G A G . alt_allele_in_normal ECNT=1;FG=0;FS=4.82164e-16;HCNT=1;NLOD=106.40;TLOD=10.66 GT:AD:AF:ALT_F1R2:ALT_F2R1:FOXOG:FS:QSS:REF_F1R2:REF_F2R1 0/0:513,11:0.023:11:0:0:-0:19226:511:2 0/1:476,11:0.024:11:0:0:-0:16203:476:0
The read depths from IGV show that the allelic depths reported for each sample when it was run as TUMOR were closer to the IGV values (according to IGV, the first sample would have an AD of 314,8 and the second sample an AD of 483,10).
For reference, both runs used GATK 3.5-0 with similar parameters, for example:
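To make the discrepancy concrete, here is a minimal Python sketch that pulls the AD values out of the sample columns of the two records above and sets them next to the IGV counts. The helper names are my own, and the role assignment rests on one assumption: that the column with genotype 0/1 is the sample that was designated TUMOR in that run.

```python
# Sample columns copied verbatim from the two records above
# (column order: first sample, second sample).
FORMAT = "GT:AD:AF:ALT_F1R2:ALT_F2R1:FOXOG:FS:QSS:REF_F1R2:REF_F2R1"
RUN_1 = ["0/1:307,10:0.035:10:0:0:-0:11263:305:2",
         "0/0:741,13:0.019:13:0:0:-0:25874:741:0"]
RUN_2 = ["0/0:513,11:0.023:11:0:0:-0:19226:511:2",
         "0/1:476,11:0.024:11:0:0:-0:16203:476:0"]

# (ref, alt) counts read off IGV for the first and second sample.
IGV_AD = [(314, 8), (483, 10)]


def allelic_depth(sample_field, fmt=FORMAT):
    """Return (ref, alt) read counts from the AD entry of one sample column."""
    keys = fmt.split(":")
    values = sample_field.split(":")
    ref, alt = values[keys.index("AD")].split(",")
    return int(ref), int(alt)


def role(sample_field):
    """Assumption: the 0/1 column is the sample that was designated TUMOR."""
    return "TUMOR" if sample_field.split(":")[0] == "0/1" else "NORMAL"


def depth_table():
    """List (run, sample, designation, AD total, IGV total) for every column."""
    rows = []
    for run_name, columns in (("run 1", RUN_1), ("run 2", RUN_2)):
        for index, field in enumerate(columns):
            ref, alt = allelic_depth(field)
            igv_total = sum(IGV_AD[index])
            rows.append((run_name, f"sample {index + 1}", role(field),
                         ref + alt, igv_total))
    return rows


if __name__ == "__main__":
    for run_name, sample, designation, total, igv_total in depth_table():
        print(f"{run_name} {sample} as {designation}: "
              f"AD total {total}, IGV total {igv_total}, "
              f"ratio {total / igv_total:.2f}")
```

Under that assumption, the TUMOR-designated totals (317 and 487) are within a few reads of the IGV totals (322 and 493), while the NORMAL-designated totals for the same samples (524 and 754) are well above them.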
INFO 17:09:44,416 HelpFormatter - --------------------------------------------------------------------------------
INFO 17:09:44,419 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.5-0-g36282e4, Compiled 2015/11/25 04:03:56
INFO 17:09:44,419 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 17:09:44,419 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 17:09:44,422 HelpFormatter - Program Args: -T MuTect2 -dontUseSoftClippedBases -R /scratch/lym_myl_rsch/mma/mmu_genome/GRCm38.primary_assembly.genome.fa -I:normal [skip] -I:tumor [skip] -o [skip] --dbsnp /scratch/lym_myl_rsch/mma/mmu_genome/mgp.v5.merged.snps_all.dbSNP142.vcf.gz
INFO 17:09:44,429 HelpFormatter - Executing as [skip] on Linux 2.6.32-431.23.3.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_45-b14.
INFO 17:09:44,429 HelpFormatter - Date/Time: 2016/05/19 17:09:44
INFO 17:09:44,429 HelpFormatter - --------------------------------------------------------------------------------
If there is any difference other than the designation reversal, it is that the MuTect2 analysis that produced the first line of output was run on scattered intervals and then combined with CatVariants, while for the second line I ran the entire genome in a single run.
Any idea what the problem might be?
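In case the setup is unclear, the two invocations can be sketched roughly as below. This only builds the argument lists; it does not run anything. The flags are the ones shown in the Program Args line of the log, but the jar name, the BAM names and the output names are hypothetical placeholders (the real ones are skipped in the log), and the scatter intervals used for the first run are left out.

```python
REFERENCE = "/scratch/lym_myl_rsch/mma/mmu_genome/GRCm38.primary_assembly.genome.fa"
DBSNP = "/scratch/lym_myl_rsch/mma/mmu_genome/mgp.v5.merged.snps_all.dbSNP142.vcf.gz"


def mutect2_args(tumor_bam, normal_bam, out_vcf,
                 jar="GenomeAnalysisTK.jar"):  # hypothetical jar location
    """Build a GATK 3 MuTect2 argument list with the flags from the log."""
    return [
        "java", "-jar", jar,
        "-T", "MuTect2",
        "-dontUseSoftClippedBases",
        "-R", REFERENCE,
        "-I:normal", normal_bam,
        "-I:tumor", tumor_bam,
        "-o", out_vcf,
        "--dbsnp", DBSNP,
    ]


# Hypothetical file names standing in for the two BAM files.
BAM_1 = "sample_1.bam"
BAM_2 = "sample_2.bam"

# First run: sample 1 as TUMOR, sample 2 as NORMAL.
forward = mutect2_args(BAM_1, BAM_2, "forward.vcf")
# Second run: same BAM files, designations swapped.
reverse = mutect2_args(BAM_2, BAM_1, "reverse.vcf")

if __name__ == "__main__":
    print(" ".join(forward))
    print(" ".join(reverse))
```

The point of the sketch is that the only intended difference between the two runs is which BAM file carries the tumor tag and which carries the normal tag.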