I've been following the guidelines for generating a known-sites SNP database for an organism that doesn't already have one. I used my entire sample set (66 samples) to generate a SNP VCF file, then hard-filtered it. I used this filtered file as the known-sites input (dbsnp.vcf), after removing the last 66 columns (the per-sample columns) to save a lot of processing time. I think only the SNP positions are used, but I'm not sure about that.
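To make the column-stripping step concrete, here is a minimal sketch in Python, not the exact command used above. It assumes a tab-delimited, uncompressed VCF and keeps only the eight fixed site-level columns (CHROM through INFO). The function name `strip_genotype_columns` is hypothetical. Note that this drops the FORMAT column as well as the sample columns, which is one more column than "the last 66"; a VCF without genotype data normally has no FORMAT column either.

```python
def strip_genotype_columns(lines):
    """Yield VCF lines reduced to the 8 fixed site-level columns.

    Hypothetical helper for illustration: assumes tab-delimited,
    uncompressed VCF text, one record per line.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith("##"):
            # Meta-information lines are passed through unchanged.
            yield line
            continue
        # The "#CHROM" header line and all data lines: keep
        # CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO only,
        # dropping FORMAT and every per-sample column.
        fields = line.split("\t")
        yield "\t".join(fields[:8])


if __name__ == "__main__":
    example = [
        "##fileformat=VCFv4.2\n",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n",
        "chr1\t100\t.\tA\tG\t50\tPASS\tDP=10\tGT\t0/1\t1/1\n",
    ]
    for out_line in strip_genotype_columns(example):
        print(out_line)
```

For a real file, the same function could be fed an open file handle and its output written line by line to a new file.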
I then ran the base quality score recalibration (BQSR) protocol on a subset of my samples. I've attached the plots for one of them, which is representative of the rest. I am concerned because the mean quality score dropped by a whopping 10, which puts most of my bases below Q20, so they wouldn't pass most filters that require Q30. Should I be worried about this? If so, what might be causing it?
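To put the size of that drop in perspective: Phred-scaled quality is defined as Q = -10 * log10(p), where p is the estimated probability that the base call is wrong, so a drop of 10 means the estimated error rate is ten times higher. A small sketch of the conversion (the function names are just for illustration):

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred-scaled quality score to an error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert an error probability to a Phred-scaled quality score."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Q30 corresponds to 1 error in 1,000 calls; Q20 to 1 error in 100.
    for q in (30, 20):
        print(f"Q{q}: error probability {phred_to_error_prob(q):.4f}")
```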
Thanks,
Alex