Hi,
Following advice seen elsewhere on the forum, I performed variant calling on whole-genome resequencing data on a per-scaffold basis. I now need to merge about 30,000 individual VCFs, and I am using CatVariants with the following command:
java -Xmx5G -cp GenomeAnalysisTK-3.7-0/GenomeAnalysisTK.jar org.broadinstitute.gatk.tools.CatVariants -R $ref.fasta -out $out.vcf -assumeSorted -V $allvcfs.list
Unfortunately, it appears to be very slow (only about 5,000 regions processed after more than 24 hours), so I am wondering whether this is the expected behavior and whether there is a way to speed it up.
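In case the order of the files in the list matters when -assumeSorted is set, here is a minimal sketch of how such a list could be built in reference order from the FASTA index. It assumes a samtools-style .fai index next to the reference and one VCF per scaffold named <scaffold>.vcf; the file naming and the function name ordered_vcf_list are hypothetical and only for illustration.

```python
from pathlib import Path


def ordered_vcf_list(fai_path, vcf_dir, suffix=".vcf"):
    """Return per-scaffold VCF paths in the order scaffolds appear in the .fai index.

    Assumption: each scaffold has its own file named <scaffold><suffix> in vcf_dir.
    Scaffolds without a matching file are skipped.
    """
    paths = []
    with open(fai_path) as fai:
        for line in fai:
            if not line.strip():
                continue
            # The first tab-separated column of a .fai line is the sequence name.
            scaffold = line.split("\t", 1)[0].strip()
            vcf = Path(vcf_dir) / f"{scaffold}{suffix}"
            if vcf.exists():
                paths.append(str(vcf))
    return paths
```

Writing the returned paths one per line to a file gives a list in the same scaffold order as the reference, which could then be passed to -V.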
Thanks
JM