Hi everyone,
I'm trying to find a way to filter heterozygous genotypes that may have been misassigned because of PCR or sequencing errors and that show a very unrealistic allele balance, like the following (a bias greater than 1/3, or sometimes greater than 1/4):
#CHROM | POS | ID | REF | ALT | QUAL | FILTER | INFO | FORMAT | RCS12 |
---|---|---|---|---|---|---|---|---|---|
6 | 26088662 | . | T | G | 21188.57 | PASS | AC=24;AF=0.480;AN=50;AS_BaseQRankSum=-3.650;AS_FS=6.798;AS_InbreedingCoeff=-0.0015;AS_MQ=43.72;AS_MQRankSum=-0.700;AS_QD=14.87;AS_ReadPosRankSum=-0.050;AS_SOR=0.588;BaseQRankSum=-2.573e+00;DP=1845;ExcessHet=2.7216;FS=6.798;InbreedingCoeff=-0.0015;MLEAC=25;MLEAF=0.481;MQ=43.75;MQRankSum=-6.640e-01;NDA=1;QD=14.87;ReadPosRankSum=-1.400e-02;SOR=0.588 | GT:AD:DP:FT:GQ:PL | 0/1:5,20:25:PASS:74:418,0,74 |
6 | 26090224 | . | T | C | 8949.70 | PASS | AC=15;AF=0.288;AN=52;AS_BaseQRankSum=-2.300;AS_FS=1.761;AS_InbreedingCoeff=0.1568;AS_MQ=38.03;AS_MQRankSum=-0.800;AS_QD=12.24;AS_ReadPosRankSum=0.600;AS_SOR=0.547;BaseQRankSum=-1.452e+00;DP=1588;ExcessHet=0.9921;FS=1.761;InbreedingCoeff=0.1568;MLEAC=15;MLEAF=0.288;MQ=38.52;MQRankSum=-6.890e-01;NDA=1;QD=12.24;ReadPosRankSum=0.345;SOR=0.547 | GT:AD:DP:GQ:PL | 0/1:17,4:21:48:48,0,432 |
6 | 26090951 | . | C | G | 6430.41 | PASS | AC=7;AF=0.135;AN=52;AS_BaseQRankSum=0.400;AS_FS=0.000;AS_InbreedingCoeff=-0.1556;AS_MQ=43.86;AS_MQRankSum=-0.500;AS_QD=10.31;AS_ReadPosRankSum=1.700;AS_SOR=0.637;BaseQRankSum=0.221;DP=1697;ExcessHet=5.0213;FS=0.000;InbreedingCoeff=-0.1556;MLEAC=7;MLEAF=0.135;MQ=43.85;MQRankSum=-4.380e-01;NDA=1;QD=10.31;ReadPosRankSum=-1.650e-01;SOR=0.637 | GT:AD:DP:GQ:PL | 0/1:9,37:46:99:918,0,140 |
I found that GATK used to have the AlleleBalanceBySample annotation, which would be very useful for removing the genotypes assigned in such situations.
I know that I can do this with JEXL expressions, sample by sample. However, when several samples are genotyped together, iterating over the samples is time-consuming.
Do you know a faster way of doing this?
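To make clear what I mean, here is a rough sketch in plain Python of the per-genotype check, applied to every sample column of a record in one pass. It is not a GATK feature. The function names `minor_allele_fraction` and `mask_unbalanced_hets` and the 0.25 cutoff are my own placeholders, and it assumes tab-separated VCF data lines with both GT and AD in the FORMAT column.

```python
def minor_allele_fraction(gt, ad):
    """Return the minor-allele read fraction of a het genotype, else None."""
    alleles = gt.replace("|", "/").split("/")
    if len(alleles) != 2 or "." in alleles or alleles[0] == alleles[1]:
        return None  # not a diploid heterozygous call
    try:
        # AD is indexed by allele number: 0 = REF, 1 = first ALT, ...
        depths = [int(ad.split(",")[int(a)]) for a in alleles]
    except (ValueError, IndexError):
        return None  # AD missing or malformed
    total = sum(depths)
    return min(depths) / total if total else None


def mask_unbalanced_hets(vcf_line, min_fraction=0.25):
    """Set GT to ./. for every het sample whose minor-allele fraction
    is below min_fraction (hypothetical cutoff, tune as needed)."""
    fields = vcf_line.rstrip("\n").split("\t")
    keys = fields[8].split(":")
    if "GT" not in keys or "AD" not in keys:
        return "\t".join(fields)
    gt_i, ad_i = keys.index("GT"), keys.index("AD")
    for col in range(9, len(fields)):  # sample columns start at index 9
        values = fields[col].split(":")
        if len(values) <= max(gt_i, ad_i):
            continue  # trailing sub-fields were dropped for this sample
        frac = minor_allele_fraction(values[gt_i], values[ad_i])
        if frac is not None and frac < min_fraction:
            values[gt_i] = "./."
            fields[col] = ":".join(values)
    return "\t".join(fields)


# First sample is RCS12 from the first record above (INFO shortened);
# the second sample column is made up for illustration.
record = "\t".join([
    "6", "26088662", ".", "T", "G", "21188.57", "PASS", "AC=24",
    "GT:AD:DP:FT:GQ:PL",
    "0/1:5,20:25:PASS:74:418,0,74",
    "0/1:10,12:22:PASS:99:300,0,280",
])
print(mask_unbalanced_hets(record))
```

Note that this only rewrites the GT sub-field; INFO values such as AC, AF and AN would not be recalculated and would no longer match the remaining genotypes.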
Thank you in advance.
Best regards,
Miguel