Hey GATK Devs!
I'm using GATK (v3.7-0-gcfedb67, Compiled 2016/12/12 11:21:18) to call SNPs and indels on a single sample. The reads were aligned with BWA (v0.7.12) and duplicates were marked with Picard MarkDuplicates (v2.0.1). HaplotypeCaller runs to completion without issue (command-line parameters are given below); however, when I merge the -bamout output BAM files with samtools (v1.3.1), samtools issues the following warning:
[bam_translate] PG tag "MarkDuplicates" on read "HS1:152:C0HBWACXX:5:1202:17823:110576" encountered with no corresponding entry in header, tag lost. Unknown tags are only reported once per input file for each tag ID.
The output BAM files indeed lack any @PG lines in their headers, but preserve the PG:Z: tags on the reads. Is this intended?
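In case it helps anyone reproduce this, here is a minimal sketch of how the mismatch could be checked on SAM text (for example, the output of samtools view -h on one of the -bamout files). The function name orphan_pg_ids is made up for illustration and is not part of GATK, Picard or samtools; it only assumes the standard tab-separated SAM layout, with optional tags starting at the 12th column.

```python
def orphan_pg_ids(sam_text):
    """Return PG IDs used on reads (PG:Z:...) that have no @PG header line.

    Hypothetical helper for illustration; expects SAM-formatted text.
    """
    declared, used = set(), set()
    for line in sam_text.splitlines():
        if not line:
            continue
        fields = line.split("\t")
        if line.startswith("@"):
            # Header section: collect the ID of every @PG record.
            if fields[0] == "@PG":
                for field in fields[1:]:
                    if field.startswith("ID:"):
                        declared.add(field[3:])
            continue
        # Alignment record: optional tags follow the 11 mandatory columns.
        for field in fields[11:]:
            if field.startswith("PG:Z:"):
                used.add(field[5:])
    return used - declared
```

On my files I would expect this to report MarkDuplicates, since the reads carry that PG tag while the header has no matching @PG entry.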
The options I passed to GATK were:
-T HaplotypeCaller -R genome.fasta -L genome.00001.bed -mmq 25 -mbq 30 -ERC BP_RESOLUTION -I proper.srt.mdup.bam -o proper.00001.g.vcf.gz -bamout proper.00001.g.bam
Thanks!