Hi,
I am a little confused about the best practices for running HaplotypeCaller to call variants, given the pooled nature of my study. Any feedback is super appreciated!
I have 10 replicates of pooled RNAseq data for each of two samples (10 replicates for Sample A, 10 replicates for Sample B). By pooled, I mean each replicate contains mRNA from 20 individuals, all mixed together with no barcoding (population genetics study).
I had planned to just merge the BAM files of these replicates, which have RGSMs of SampleA and SampleB, and simply run HaplotypeCaller once for Sample A and once for Sample B. However, that would mean setting ploidy = 2 x 200 = 400 (10 replicates x 20 individuals, two chromosome copies each). This seems very high!
Would it be better to run HaplotypeCaller on each replicate separately, without merging the BAM files, setting ploidy = 2 x 20 = 40, and then use a tool such as CombineVariants to combine my VCF files into two samples for downstream comparisons?
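To make the ploidy arithmetic for the two options explicit, here is a quick sketch. The function name and its parameters are just my own illustration (not anything from GATK), and it assumes every individual is diploid and every pool really contains 20 individuals:

```python
# Hypothetical helper for the arithmetic only; not part of GATK.
# Assumes diploid individuals (2 chromosome copies each).
def pool_ploidy(individuals_per_pool, pools_merged=1, chromosome_copies=2):
    """Total chromosome copies represented in one call set."""
    return chromosome_copies * individuals_per_pool * pools_merged

# Option 1: merge all 10 replicates of a sample into one BAM.
merged = pool_ploidy(20, pools_merged=10)

# Option 2: call each replicate on its own.
per_replicate = pool_ploidy(20)

print(merged, per_replicate)
```

So the choice is between one call at ploidy 400 per sample, or ten calls at ploidy 40 per sample that then need combining.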
Any advice?
Regards!
Chris