Hi GATK team!
I just finished my Master's thesis, titled "Detection of Somatic Mutations in RNAseq data from Glioblastoma". My professor was happy with the results we obtained and would like to publish them. Unfortunately, I am unsure how to describe MuTect2 in my methods section, as there is no paper for it yet.
The experiment involved somatic mutation discovery in glioblastoma, running MuTect2 on RNAseq and exome data in parallel (GATK 3.6, STAR 2-pass alignment for RNAseq, BWA for exome). Although the RNAseq calls seemed to contain more false positives, they included quite a lot of mutations that appear to be real (present in COSMIC, good SIFT/FATHMM scores, located in a set of GBM-related genes).
Anyway, I'm wondering how I should cite and describe MuTect2. After a few discussions on this forum, my understanding is that
1. Identify active regions
2. Reassemble the active regions through a De Bruijn-like graph
3. Determine the likelihoods of the haplotypes using the PairHMM algorithm
follow the HaplotypeCaller method, and
4. Assign the most likely genotype to the sample
uses the MuTect1 likelihood model with the TLOD/NLOD log odds scores.
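To make step 4 concrete, here is a minimal Python sketch of how I understand the MuTect1-style log odds. It is only my reading of the MuTect1 model, not the actual MuTect2 code: it works on a simple pileup of (base, error probability) pairs, whereas MuTect2 presumably uses the per-read likelihoods coming out of PairHMM. The function names are my own, and the thresholds 6.3 and 2.2 are what I believe the MuTect1 defaults to be, so please treat them as assumptions.

```python
import math

# Assumed thresholds (what I believe are the MuTect1 defaults).
TLOD_THRESHOLD = 6.3
NLOD_THRESHOLD = 2.2


def read_likelihood(base, ref, alt, err, f):
    """P(observed base | alt allele fraction f) for a single read.

    err is the base error probability and must lie strictly between 0 and 1.
    """
    if base == ref:
        return f * err / 3 + (1 - f) * (1 - err)
    if base == alt:
        return f * (1 - err) + (1 - f) * err / 3
    return err / 3  # neither ref nor alt: can only be a sequencing error


def log10_likelihood(pileup, ref, alt, f):
    """Sum of per-read log10 likelihoods for a pileup of (base, err) pairs."""
    return sum(math.log10(read_likelihood(b, ref, alt, e, f)) for b, e in pileup)


def tlod(tumor_pileup, ref, alt):
    """Tumor LOD: variant at the observed allele fraction vs. no variant (f = 0)."""
    if not tumor_pileup:
        return 0.0
    alt_count = sum(1 for b, _ in tumor_pileup if b == alt)
    f_hat = alt_count / len(tumor_pileup)
    return (log10_likelihood(tumor_pileup, ref, alt, f_hat)
            - log10_likelihood(tumor_pileup, ref, alt, 0.0))


def nlod(normal_pileup, ref, alt):
    """Normal LOD: reference (f = 0) vs. heterozygous germline variant (f = 0.5)."""
    return (log10_likelihood(normal_pileup, ref, alt, 0.0)
            - log10_likelihood(normal_pileup, ref, alt, 0.5))


def is_somatic(tumor_pileup, normal_pileup, ref, alt):
    """Call a site somatic when both log odds pass their (assumed) thresholds."""
    return (tlod(tumor_pileup, ref, alt) >= TLOD_THRESHOLD
            and nlod(normal_pileup, ref, alt) >= NLOD_THRESHOLD)


# Toy example: 5 ref and 5 alt reads in the tumor, 10 ref reads in the normal.
tumor = [("C", 0.001)] * 5 + [("T", 0.001)] * 5
normal = [("C", 0.001)] * 10
print(is_somatic(tumor, normal, ref="C", alt="T"))
```

In other words, TLOD asks whether the tumor reads are better explained by a variant than by sequencing errors, and NLOD asks whether the normal reads are better explained by the reference than by a germline heterozygous variant.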
Is this an accurate summary of MuTect2?
Thank you very much and have a great day!
Alexandre Coudray
EPFL (Lausanne, Switzerland)
Master Thesis at the University of Texas at Austin (Vishy Iyer lab)