Channel: Recent Discussions — GATK-Forum

Detecting called Indels with low read support on both sides?


Hi,

I am mapping chimpanzee samples to the human reference hg19. I mapped the samples using the standard protocol (BWA-MEM, duplicate removal, indel realignment) and called variants with GATK 3.7 HaplotypeCaller. After all variant filtering (hard filters, plus removal of duplicated and low-mappability regions using external BED files), I found an interesting insertion in one of my samples:

At chr2:48033272 there is a deletion of the sequence TTTTTGTTTTAATTCCT. The human reference has this sequence duplicated: GCA|TTTTTGTTTTAATTCCTT|TTTTGTTTTAATTCCTT|TG. This sample is called homozygous for the deletion.

A few bp after this, GATK calls an insertion:
chr2 48033352 . C CAACCGATGTTGCTTTTCTGTCCTAGCATTTTTGTTTTAATTCCTT 108.02 PASS

Long story short:

  • There are only 6 reads supporting this insertion.
  • Of these, only 3 contain the full "GATK-ALT-insertion", and all 3 extend at least 1 bp beyond it.
  • None of these reads reach the reference sequence on the 3' side of the insertion. It should be: CTTTAACAGGAAGAGGTAC ins TGCAACATTTGATGGG
  • I lied. One of these reads does have the full sequence:
    TAACAGGAAGAGGTAC | AACCGATGTTGCTTTTCTGTCCTAGCATTTTTGTTTTAATTCCTTTGAGTTACTTCCTTATGCATATTTTACTTTAACAGGAAGAGGTAC | TGCAACATTTGATGGGACAGCAATAGCAAATGCAGTTGTTAAAGA

So the real insertion is a duplication of the whole preceding sequence, including the deletion 80 bp upstream. I want to run functional analysis on the detected variants, and correcting this call changes it from a frameshift insertion to a non-frameshift insertion.
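The manual check described in the bullets above (does a read cover both ends of the insertion, with reference sequence on each side?) can be sketched in Python. This is only a rough illustration, not something GATK provides: the function names and the junction length `k` are made up for the example, matching is exact (no mismatches, forward strand only), and the sequences are copied from this post.

```python
# Sketch: does a read span BOTH junctions of a candidate insertion?
# Sequences are taken from the post; function names and k are hypothetical.

LEFT_FLANK = "CTTTAACAGGAAGAGGTAC"      # reference just 5' of the insertion
RIGHT_FLANK = "TGCAACATTTGATGGG"        # reference just 3' of the insertion
INSERTION = ("AACCGATGTTGCTTTTCTGTCCTAGCATTTTTGTTTTAATTCCTT"
             "TGAGTTACTTCCTTATGCATATTTTACTTTAACAGGAAGAGGTAC")  # from the one full read


def junction_support(read, left_flank, insertion, right_flank, k=8):
    """Return (left_ok, right_ok): whether the read contains the exact
    k+k bp sequence across the 5' and the 3' junction of the insertion."""
    left_junction = left_flank[-k:] + insertion[:k]
    right_junction = insertion[-k:] + right_flank[:k]
    return (left_junction in read, right_junction in read)


def count_full_support(reads, left_flank, insertion, right_flank, k=8):
    """Count reads that cover both junctions of the insertion."""
    return sum(
        all(junction_support(r, left_flank, insertion, right_flank, k))
        for r in reads
    )


# The single read that carries the whole insertion plus both flanks.
full_read = ("TAACAGGAAGAGGTAC" + INSERTION +
             "TGCAACATTTGATGGGACAGCAATAGCAAATGCAGTTGTTAAAGA")
# A read that ends inside the insertion, like the other supporting reads.
truncated_read = "TAACAGGAAGAGGTAC" + INSERTION[:45]

n_full = count_full_support([full_read, truncated_read],
                            LEFT_FLANK, INSERTION, RIGHT_FLANK)
```

Because the insertion here is a duplication, the last bases of the insertion are the same as the last bases of the left flank, so the 3' junction sequence on its own also occurs in reads that carry only the reference. That is why the sketch only counts reads where both junctions are present.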

OK, I caught this erroneous indel, but I am calling 1.9M indels in this dataset and 1.7M more in another one, and I am worried about reporting strong functional annotations for erroneous variants.

Is there any method I can use to detect this kind of indel (not enough reads supporting both ends of the insertion)? Or can I only filter by QD?
My hard filter removes QUAL<50 and QD<2. This particular variant has QUAL=108.02, QD=4.91 and Genotype Quality (GQ)=99.
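
For reference, the hard filter above amounts to something like the following Python sketch. It is only an illustration of the two thresholds, not the actual filtering command: the function names are hypothetical, the example record is the call from this post with an INFO column reduced to `QD=4.91` (the real INFO field has more annotations), and treating a missing QD as a failure is a choice made for the example.

```python
# Sketch of the QUAL/QD hard filter on a single tab-separated VCF record.
# Function names are hypothetical; thresholds are the ones given in the post.

def parse_info(info_field):
    """Turn 'QD=4.91;DP=22' into a dict; flag-only keys map to True."""
    info = {}
    for item in info_field.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        info[key] = value if sep else True
    return info


def passes_hard_filter(vcf_line, min_qual=50.0, min_qd=2.0):
    """Keep a record only if QUAL >= min_qual and QD >= min_qd.
    Records with no QUAL or no QD annotation are rejected (an assumption)."""
    fields = vcf_line.rstrip("\n").split("\t")
    qual_text = fields[5]                 # column 6 of a VCF record is QUAL
    info = parse_info(fields[7])          # column 8 is INFO
    if qual_text == "." or "QD" not in info:
        return False
    return float(qual_text) >= min_qual and float(info["QD"]) >= min_qd


# The insertion from this post, with a simplified INFO column (assumption).
example_record = "\t".join([
    "chr2", "48033352", ".", "C",
    "CAACCGATGTTGCTTTTCTGTCCTAGCATTTTTGTTTTAATTCCTT",
    "108.02", "PASS", "QD=4.91",
])

kept = passes_hard_filter(example_record)
```

With QUAL=108.02 and QD=4.91 the record clears both thresholds, which is exactly why this call survives the hard filter despite the weak read support.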

Any advice?

Thanks in advance,

Txema

