I'm trying to run the BaseRecalibrator tool on a BAM file, but the program stops before completing. The messages returned by the tool were not enough for me to fix the error on my own. I am running GATK4 version 4.1.2.0.
Here is the complete message:
```
16:09:12.733 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/data2/home/pamesl/miniconda3/envs/smk_env/share/gatk4-4.1.2.0-1/gatk-package-4.1.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Jul 11, 2019 4:09:14 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
16:09:14.487 INFO BaseRecalibrator - ------------------------------------------------------------
16:09:14.488 INFO BaseRecalibrator - The Genome Analysis Toolkit (GATK) v4.1.2.0
16:09:14.488 INFO BaseRecalibrator - For support and documentation go to
16:09:14.488 INFO BaseRecalibrator - Executing as pamesl@NODE01 on Linux v2.6.32-573.7.1.el6.x86_64 amd64
16:09:14.489 INFO BaseRecalibrator - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
16:09:14.489 INFO BaseRecalibrator - Start Date/Time: 11 juillet 2019 16:09:12 CEST
16:09:14.489 INFO BaseRecalibrator - ------------------------------------------------------------
16:09:14.489 INFO BaseRecalibrator - ------------------------------------------------------------
16:09:14.490 INFO BaseRecalibrator - HTSJDK Version: 2.19.0
16:09:14.490 INFO BaseRecalibrator - Picard Version: 2.19.0
16:09:14.490 INFO BaseRecalibrator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
16:09:14.491 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
16:09:14.491 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
16:09:14.491 INFO BaseRecalibrator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
16:09:14.491 INFO BaseRecalibrator - Deflater: IntelDeflater
16:09:14.491 INFO BaseRecalibrator - Inflater: IntelInflater
16:09:14.491 INFO BaseRecalibrator - GCS max retries/reopens: 20
16:09:14.491 INFO BaseRecalibrator - Requester pays: disabled
16:09:14.492 INFO BaseRecalibrator - Initializing engine
16:09:15.263 INFO FeatureManager - Using codec VCFCodec to read file file:///data1/scratch/pamesl/projet_cbf/data/dbSNP/dbsnp_138.hg19.vcf.gz
16:09:15.411 INFO FeatureManager - Using codec VCFCodec to read file file:///data1/scratch/pamesl/projet_cbf/data/mills_1000G/Mills_and_1000G_gold_standard.indels.hg19.sites.vcf
16:09:15.428 INFO BaseRecalibrator - Shutting down engine
[11 juillet 2019 16:09:15 CEST] org.broadinstitute.hellbender.tools.walkers.bqsr.BaseRecalibrator done. Elapsed time: 0.05 minutes.
Runtime.totalMemory()=2224553984
java.lang.NullPointerException
at org.broadinstitute.hellbender.utils.SequenceDictionaryUtils.getContigNames(SequenceDictionaryUtils.java:463)
at org.broadinstitute.hellbender.utils.SequenceDictionaryUtils.getCommonContigsByName(SequenceDictionaryUtils.java:457)
at org.broadinstitute.hellbender.utils.SequenceDictionaryUtils.compareDictionaries(SequenceDictionaryUtils.java:234)
at org.broadinstitute.hellbender.utils.SequenceDictionaryUtils.validateDictionaries(SequenceDictionaryUtils.java:150)
at org.broadinstitute.hellbender.utils.SequenceDictionaryUtils.validateDictionaries(SequenceDictionaryUtils.java:98)
at org.broadinstitute.hellbender.engine.GATKTool.validateSequenceDictionaries(GATKTool.java:760)
at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:702)
at org.broadinstitute.hellbender.engine.ReadWalker.onStartup(ReadWalker.java:50)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:137)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
at org.broadinstitute.hellbender.Main.main(Main.java:291)
Using GATK jar /data2/home/pamesl/miniconda3/envs/smk_env/share/gatk4-4.1.2.0-1/gatk-package-4.1.2.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /data2/home/pamesl/miniconda3/envs/smk_env/share/gatk4-4.1.2.0-1/gatk-package-4.1.2.0-local.jar BaseRecalibrator -I /data1/scratch/pamesl/projet_cbf/data/bam/SJCBF016_G-C0DG1ACXX.5_marked_duplicates.bam -R /data1/scratch/pamesl/projet_cbf/data/hg19_data/reference_hg19/ucsc.hg19.fasta.gz --known-sites /data1/scratch/pamesl/projet_cbf/data/dbSNP/dbsnp_138.hg19.vcf.gz --known-sites /data1/scratch/pamesl/projet_cbf/data/mills_1000G/Mills_and_1000G_gold_standard.indels.hg19.sites.vcf -O /data1/scratch/pamesl/projet_cbf/data/bam/recal_data_SJCBF016_G-C0DG1ACXX.5.table
```
I checked the validity of the BAM file SJCBF016_G-C0DG1ACXX.5_marked_duplicates.bam using the ValidateSamFile tool and got the following result:
```
No errors found
Tool returned:
0
```
I have a feeling that the problem comes from one of my other input files: Mills_and_1000G_gold_standard.indels.hg19.sites.vcf, dbsnp_138.hg19.vcf.gz, or my reference file ucsc.hg19.fasta.gz, but I don't know where to start looking.
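Since the stack trace fails in SequenceDictionaryUtils.getContigNames during validateSequenceDictionaries, one possible first check is whether each input has its sequence dictionary and index files next to it. Below is a minimal Python sketch of that check. The helper names (`expected_companions`, `missing_companions`) are hypothetical, and the naming conventions are assumptions rather than something confirmed by the log: a `.dict` file named after the FASTA without its extension, a `.fai` (plus `.gzi` for a compressed FASTA) appended to the FASTA name, `.tbi` for a `.vcf.gz`, and `.idx` for a plain `.vcf`.

```python
import sys
from pathlib import Path

# FASTA extensions considered here (assumption: longest suffixes listed first).
FASTA_EXTS = (".fasta.gz", ".fa.gz", ".fasta", ".fa")


def expected_companions(path):
    """Return the companion files assumed to sit next to a reference or VCF."""
    p = str(path)
    for ext in FASTA_EXTS:
        if p.endswith(ext):
            stem = p[: -len(ext)]
            companions = [stem + ".dict", p + ".fai"]
            if ext.endswith(".gz"):
                # A compressed FASTA is assumed to also need a block index.
                companions.append(p + ".gzi")
            return companions
    if p.endswith(".vcf.gz"):
        return [p + ".tbi"]
    if p.endswith(".vcf"):
        return [p + ".idx"]
    return []


def missing_companions(paths):
    """Map each input path to the expected companion files that do not exist."""
    return {
        str(p): [c for c in expected_companions(p) if not Path(c).exists()]
        for p in paths
    }


if __name__ == "__main__":
    # Usage: python check_inputs.py ref.fasta.gz known1.vcf.gz known2.vcf
    for path, missing in missing_companions(sys.argv[1:]).items():
        status = "missing: " + ", ".join(missing) if missing else "ok"
        print(f"{path}\t{status}")
```

If something is reported missing, the tools usually used to regenerate these files are `samtools faidx` and GATK's CreateSequenceDictionary for the reference, and GATK's IndexFeatureFile for the VCFs. A present file is not proof that its contents are correct, so this only narrows the search.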
Edit: I will run ValidateVariants on each VCF file and post the results tomorrow.
Best regards,
Paul-Arthur