Quantcast
Channel: Recent Discussions — GATK-Forum
Viewing all articles
Browse latest Browse all 12345

Allele balance filtering

$
0
0

I am getting a number of incorrect/questionable genotype calls after proceeding through the "Best Practices" guidelines. These incorrect calls are resulting in a lot of false positives in my subsequent association analysis. After discussing these issues with a couple of people, I have been advised to try setting an AB filter of 0.2. There are a few references to an AB filter on the GATK site but none (that I can find) that come out and state the exact commands needed to create this filter. I tried using VariantFiltration:

java -Xmx10g -jar /shared/bin/gatk/GenomeAnalysisTK.jar -T VariantFiltration -R /mnt/DATA/Cores/hiseq2000/annie/reference/Homo_sapiens/GRCh37/b37/human_g1k_v37.fasta --variant july2017.biallelic.noindel.recode.vcf --filterExpression "AB < 0.2" --filterName ABfilter -o july2017.biallelic.noindel.filtered.vcf

This just resulted in numerous warnings of "Interpreter - ![0,2]: 'AB < 0.2;' undefined variable AB". From reading other forums, I believe that I somehow have to get the "AB" variable into my vcf before I can filter on the value. I have no idea how to do that and at what step in the best practices it should have been done.

The main issue seems to be true ref/ref genotypes being called heterozygous. We have sanger sequenced a dozen or so and they are all ref/ref.

6   32557529    rs1059547   C   G   35212   PASS    .   GT:AD:DP:FT:GQ:PGT:PID:PL:PP    0/1:132,30:162:PASS:99:.:.:943,0,5348:944,0,5353    0/1:129,38:167:PASS:99:.:.:1149,0,5328:1150,0,5333  0/0:120,9:129:lowGQ:1:0|1:32557529_C_G:0,2,5038:0,1,5040    0/1:120,25:145:PASS:99:.:.:520,0,5041:521,0,5046    0/1:148,33:181:PASS:99:.:.:836,0,5970:837,0,5975    0/0:148,12:160:PASS:50:.:.:0,51,6315:0,50,6319  0/1:163,22:185:PASS:99:0|1:32557529_C_G:367,0,6746:368,0,6751   0/1:64,10:74:PASS:99:.:.:221,0,2741:222,0,2746  0/1:115,25:140:PASS:99:.:.:549,0,4823:550,0,4828    0/0:163,0:163:PASS:99:.:.:0,120,1800:0,119,1804 0/1:88,8:96:lowGQ:10:.:.:9,0,3596:10,0,3601
6   57398118    rs147643173 G   T   32652.9 PASS    .   GT:AD:DP:FT:GQ:PGT:PID:PL:PP    0/1:54,21:75:PASS:99:0|1:57398117_T_TA:498,0,2153:500,0,2156    0/1:62,27:89:PASS:99:0|1:57398117_T_TA:935,0,2406:937,0,2409    0/1:34,8:42:PASS:99:0|1:57398117_T_TA:213,0,1398:215,0,1401 0/1:58,0:58:lowGQ:2:.:.:0,0,1045:2,0,1048   0/1:28,16:44:PASS:99:0|1:57398117_T_TA:588,0,1103:590,0,1106    0/1:33,17:50:PASS:99:0|1:57398117_T_TA:534,0,1336:536,0,1339    0/1:26,5:31:PASS:99:0|1:57398117_T_TA:132,0,1050:134,0,1053 0/1:35,14:49:PASS:99:0|1:57398117_T_TA:482,0,1421:484,0,1424    0/1:50,10:60:PASS:99:0|1:57398117_T_TA:185,0,2171:187,0,2174    0/1:40,22:62:PASS:99:0|1:57398117_T_TA:768,0,1600:770,0,1603    0/1:61,19:80:PASS:99:0|1:57398117_T_TA:595,0,2540:597,0,2543

Ideally I would like to apply something on a per site/per sample basis that would set any heterozygous call that does not meet AB threshold of 0.2 to be called as ./. I don't know if this is even possible. I am open to any other suggestion. I know I can set the ones with the lowGQ flag to ./. if I wanted but not all of the problem sites indicate lowGQ. For reference I am using GATK Version=3.4-46-gbc02625.

Thanks,

Annie


Viewing all articles
Browse latest Browse all 12345

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>