Channel: Recent Discussions — GATK-Forum

Allele frequency and depth VCF produced by MuTect2

Hi all,

From my understanding of the VCF output, the AF[format] field (allele fraction of the event in the tumor) is equal to:
AD[format] / DP[format],
with AD being the depth of coverage of each allele per sample (we use the alt allele when calculating AF),
and DP being the "filtered" depth of coverage for each sample (we use the one computed from the tumor sample when calculating AF).
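
A minimal Python sketch of this reading of the fields, to make the arithmetic concrete. The FORMAT string and sample values are made up for illustration (they are not from a real MuTect2 run), and `ratio_from_sample` is a hypothetical helper, not part of GATK. It only computes alt AD / DP; whether MuTect2 actually derives AF this way is the open question here.

```python
def ratio_from_sample(format_keys: str, sample_value: str) -> float:
    """Return alt-allele AD divided by sample-level DP for one VCF sample column."""
    fields = dict(zip(format_keys.split(":"), sample_value.split(":")))
    # AD is "ref,alt[,alt2...]"; take the first alt allele.
    alt_depth = int(fields["AD"].split(",")[1])
    depth = int(fields["DP"])
    if depth == 0:
        raise ValueError("DP is 0, ratio undefined")
    return alt_depth / depth


# Made-up tumor sample column, for illustration only.
FORMAT = "GT:AD:AF:DP"
TUMOR = "0/1:90,10:0.100:100"

ratio = ratio_from_sample(FORMAT, TUMOR)
reported_af = float(dict(zip(FORMAT.split(":"), TUMOR.split(":")))["AF"])
print(f"AD_alt/DP = {ratio:.3f}, reported AF = {reported_af:.3f}")
```

Running the same comparison over real records would show how often the reported AF matches the plain ratio.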

After some further reading, I think I have worked out that:
AD[format] <=> all sample reads minus uninformative reads.
AD is computed with GATK DepthPerAlleleBySample.
DP[format] <=> all sample reads minus filtered reads (which are not the same as uninformative reads).
DP[info] <=> all site-level reads (T+N samples), with nothing subtracted.
DP is computed with GATK Coverage.

The GATK doc (http://gatkforums.broadinstitute.org/gatk/discussion/4721/using-depth-of-coverage-metrics-for-variant-evaluation) says the following:

The key difference is that the AD metric is based on unfiltered read counts while the sample-level DP is based on filtered read counts (see tool documentation for a list of read filters that are applied by default for each tool). As a result, they should be interpreted differently.

If AF is indeed AD[format]/DP[format], isn't it strange to compute AF by dividing an unfiltered-read depth by a filtered-read depth?
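
The concern can be illustrated with a small Python sketch. The numbers are invented, and `af_candidates` is a hypothetical helper: it only contrasts two possible denominators (the sum of AD versus the sample-level DP) and makes no claim about which one MuTect2 uses internally.

```python
def af_candidates(ad: list[int], dp: int) -> dict[str, float]:
    """Compute the alt-allele fraction with two different denominators."""
    alt = ad[1]  # depth of the first alt allele
    return {
        "alt_AD / sum(AD)": alt / sum(ad),
        "alt_AD / DP": alt / dp,
    }


# Invented values: AD and DP are counted over different read sets,
# so sum(AD) does not have to equal DP.
for name, value in af_candidates(ad=[80, 10], dp=100).items():
    print(f"{name}: {value:.3f}")
```

When the two read sets differ, the two ratios differ as well, which is why the choice of denominator matters for interpreting AF.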

PS: I tried to "verify" the DP[info] depth (computed inside the MuTect2 run) by running GATK DepthOfCoverage on the same input (non-marked_recalibrated T/N BAMs). For a given position, I find a higher depth with GATK DepthOfCoverage (501 vs 434). Is DP[info] really based on unfiltered reads? Or do GATK Coverage and GATK DepthOfCoverage have some minor differences?

