Channel: Recent Discussions — GATK-Forum

GATK4 results for PrecisionFDA Consistency challenge data


Dear team,

I ran GATK 4.0.0 with Cromwell (30.2), using the WDLs at https://github.com/gatk-workflows/gatk4-data-processing and https://github.com/gatk-workflows/gatk4-germline-snps-indels. I already had BWA-aligned, deduplicated BAMs, so I modified "processing-for-variant-discovery-gatk4.wdl" to start from BQSR; otherwise I used the published WDLs with minimal modifications.

The results for the public PrecisionFDA datasets (https://precision.fda.gov/) are interesting. Recall and precision were great for the Truth challenge datasets (HiSeq 2500, PCR-free, ~50x), but not for the Consistency challenge datasets (HiSeq X, PCR+, ~30x). For indels in particular, recall and precision on the Consistency challenge datasets were far worse than the GATK3 results available for these datasets: ~92% and ~79%, respectively, for the Garvan dataset and ~89% and 83% for the HLI dataset after VQSR filtration.
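For anyone comparing their own numbers: I am assuming the usual definitions here, recall = TP / (TP + FN) and precision = TP / (TP + FP), with counts taken from a comparison of the call set against the truth set. A minimal sketch of that arithmetic follows; the function name and the counts are hypothetical and only illustrate the calculation, they are not taken from any of the datasets above.

```python
from typing import Tuple


def recall_precision(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (recall, precision) from true-positive, false-positive
    and false-negative variant counts."""
    # Recall: fraction of truth-set variants that were called.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Precision: fraction of called variants that are in the truth set.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision


# Hypothetical counts, chosen only to illustrate the arithmetic.
recall, precision = recall_precision(tp=90, fp=10, fn=30)
print(f"recall={recall:.1%} precision={precision:.1%}")
```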

Do these numbers match what you normally get for PCR+ HiSeq X WGS datasets at depths of ~35x? If not, are there any parameters that I need to change?

Also, I think it would be very helpful to the community if the team made its GATK4 results for these popular public datasets publicly available.

Best,

Sangtae

