Quantcast
Channel: Recent Discussions — GATK-Forum
Viewing all articles
Browse latest Browse all 12345

GenotypeGVCFs resource problems

$
0
0

Hello,

I am using GATK 4.0 to run GenotypeGVCFs on a cohort of 50 samples. After a number of runs which alternately overran assigned cpus or memory (up to ncpus=20, --java-options "-Xmx10g" ), a system administrator of our computing grid suggested that I include

-XX:+UseSerialGC -XX:-BackgroundCompilation

in the java options.

Now GenotypeGVCFs runs happily on the one cpu that I assign to the job, but it still terminates with memory overruns.
The last job had the following full command line

java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 -Xmx20g -XX:+UseSerialGC -XX:-BackgroundCompilation -jar /.../gatk/4.0.1.0/gatk-package-4.0.1.0-local.jar GenotypeGVCFs -R genome.fna -V cohort.g.vcf.gz --heterozygosity 0.00144 --heterozygosity-stdev 0.0273 --indel-heterozygosity 2.1E-4 -O all_samples_gatk4_test.vcf.gz

The process ran at the memory limit for most of the time and terminated after 19:23 hrs with this message:

Runtime.totalMemory()=20759052288
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

The output file contained only the header lines.

I could, of course, increase memory even more. I am puzzled though, because I previously analysed the same dataset with GATK 3.7 and GenotypeGVCFs ran successfully with 8gb of memory.

Is there any way I can force GenotypeGVCFs to complete the job with a 'reasonable' amount of memory?

Kind regards,
Beate


Viewing all articles
Browse latest Browse all 12345

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>