Hi,
I am trying to run the CNV discovery pipeline using the newest release of Genome STRiP on ~400 BAM files, but I keep getting the following error at various stages of the pipeline, shown here for stage 6:
ERROR 14:30:20,951 FunctionEdge - Error: 'java' '-Xmx2048m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/u/nobackup/eeskin2/alden/bipolar_sv/svtoolkit/cleaned_scripts/.queue/tmp' '-cp' '/u/home/a/alden/svtoolkit/lib/SVToolkit.jar:/u/home/a/alden/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/u/home/a/alden/svtoolkit/lib/gatk/Queue.jar' '-cp' '/u/home/a/alden/svtoolkit/lib/SVToolkit.jar:/u/home/a/alden/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/u/home/a/alden/svtoolkit/lib/gatk/Queue.jar' 'org.broadinstitute.sv.apps.ExtractBAMSubset' '-I' '/u/home/a/alden/eeskin2/bipolar_sv/svtoolkit/cleaned_scripts/cnvdiscovery_batch1/bam_headers/merged_headers.bam' '-O' '/u/home/a/alden/eeskin2/bipolar_sv/svtoolkit/cleaned_scripts/cnvdiscovery_batch1/cnv_stage6/seq_chr2/seq_chr2.merged_headers.bam' '-L' 'NONE' '-sample' '/u/home/a/alden/eeskin2/bipolar_sv/svtoolkit/cleaned_scripts/cnvdiscovery_batch1/cnv_stage5/eval/DiscoverySamples.list'
ERROR 14:30:20,960 FunctionEdge - Contents of /u/home/a/alden/eeskin2/bipolar_sv/svtoolkit/cleaned_scripts/cnvdiscovery_batch1/cnv_stage6/seq_chr2/logs/CNVDiscoveryStage6-1.out:
INFO 14:29:38,959 HelpFormatter - ---------------------------------------------------------
INFO 14:29:38,962 HelpFormatter - Program Name: org.broadinstitute.sv.apps.ExtractBAMSubset
INFO 14:29:38,966 HelpFormatter - Program Args: -I /u/home/a/alden/eeskin2/bipolar_sv/svtoolkit/cleaned_scripts/cnvdiscovery_batch1/bam_headers/merged_headers.bam -O /u/home/a/alden/eeskin2/bipolar_sv/svtoolkit/cleaned_scripts/cnvdiscovery_batch1/cnv_stage6/seq_chr2/seq_chr2.merged_headers.bam -L NONE -sample /u/home/a/alden/eeskin2/bipolar_sv/svtoolkit/cleaned_scripts/cnvdiscovery_batch1/cnv_stage5/eval/DiscoverySamples.list
INFO 14:29:38,971 HelpFormatter - Executing as alden@n7261 on Linux 2.6.32-573.26.1.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_77-b03.
INFO 14:29:38,972 HelpFormatter - Date/Time: 2016/10/31 14:29:38
INFO 14:29:38,972 HelpFormatter - ---------------------------------------------------------
INFO 14:29:38,972 HelpFormatter - ---------------------------------------------------------
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:421)
at java.lang.StringBuilder.append(StringBuilder.java:136)
at htsjdk.samtools.SAMTextHeaderCodec.advanceLine(SAMTextHeaderCodec.java:131)
at htsjdk.samtools.SAMTextHeaderCodec.decode(SAMTextHeaderCodec.java:86)
at htsjdk.samtools.SAMFileHeader.clone(SAMFileHeader.java:355)
at org.broadinstitute.sv.util.sam.SAMUtils.filterHeaderToSampleSet(SAMUtils.java:151)
at org.broadinstitute.sv.util.sam.SAMUtils.getMergedSAMFileHeader(SAMUtils.java:89)
at org.broadinstitute.sv.apps.ExtractBAMSubset.run(ExtractBAMSubset.java:86)
at org.broadinstitute.sv.commandline.CommandLineProgram.execute(CommandLineProgram.java:54)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
at org.broadinstitute.sv.commandline.CommandLineProgram.runAndReturnResult(CommandLineProgram.java:29)
at org.broadinstitute.sv.commandline.CommandLineProgram.run(CommandLineProgram.java:25)
at org.broadinstitute.sv.apps.ExtractBAMSubset.main(ExtractBAMSubset.java:56)
)
I think the obvious fix is to raise the maximum heap size using the -Xmx flag, which I have set in the Queue script that I am using to run the CNV discovery portion of Genome STRiP (I have attached my *.sh file to this post as a text file; it was basically pilfered from the installtest example).
However, this value does not seem to be passed on to the downstream Java commands launched by the pipeline: in the output above, the failing job runs with -Xmx2048m (2g) instead of the 4g I set.
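In case it helps anyone check the same thing on their own run, here is a minimal sketch of how the heap limit could be pulled out of the logged command lines. It assumes the log contains the quoted java command as in the FunctionEdge line above; the function name `heap_limits_mib` and the sample string are my own, not part of Genome STRiP or Queue.

```python
import re

# Matches a JVM max-heap flag such as -Xmx2048m or -Xmx4g.
XMX_PATTERN = re.compile(r"-Xmx(\d+)([kKmMgG]?)")

# Conversion factors from the flag's unit suffix to MiB (no suffix means bytes).
_UNIT_TO_MIB = {"": 1 / (1024 * 1024), "k": 1 / 1024, "m": 1, "g": 1024}


def heap_limits_mib(log_text):
    """Return every -Xmx value found in log_text, converted to MiB."""
    limits = []
    for value, unit in XMX_PATTERN.findall(log_text):
        limits.append(int(value) * _UNIT_TO_MIB[unit.lower()])
    return limits


# Hypothetical sample: a shortened copy of the logged command line.
sample = "FunctionEdge - Error: 'java' '-Xmx2048m' '-XX:+UseParallelOldGC'"
print(heap_limits_mib(sample))
```

Running this over the whole Queue log would show whether any downstream job picks up a value other than 2048 MiB.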
How can I raise this value?
Thanks so much for your attention,
alden