Hi,

While preparing some BAM files with Picard MarkDuplicates, I got a SAM validation error (Padding operator not between real operators in CIGAR) for a few reads. I figured I'd remove them using FilterSamReads, so I ran ValidateSamFile and put the names of the offending reads into a text file to use as the read list file. However, when I run the filtering step, FilterSamReads stops with the same exception as MarkDuplicates, complaining about the very reads I'm trying to remove. I'm probably missing something here, but I'm stuck right now.

Picard version: 2.16.0-1-g763d98e-SNAPSHOT
Java: OpenJDK 64-Bit Server VM 1.8.0_131-8u131-b11-1~bpo8+1-b11

The command is:
java -jar /home/picard/build/libs/picard.jar FilterSamReads I=input.bam FILTER=excludeReadList O=output.bam USE_JDK_INFLATER=true USE_JDK_DEFLATER=true RLF=input.bam.brokenreads
The exception thrown is:
ERROR 2017-12-12 14:48:43 FilterSamReads Failed to filter 1724-0121-WholeExome_S1_L001_R1_001_paired.bam
htsjdk.samtools.SAMFormatException: SAM validation error: ERROR: Read name NS500396:228:HKL5MBGX2:1:12210:18362:13274_2:N:0:TAAGGCGA, Padding operator not between real operators in CIGAR
at htsjdk.samtools.SAMUtils.processValidationErrors(SAMUtils.java:454)
at htsjdk.samtools.BAMRecord.getCigar(BAMRecord.java:253)
at htsjdk.samtools.SAMRecord.getAlignmentEnd(SAMRecord.java:606)
at htsjdk.samtools.SAMRecord.computeIndexingBin(SAMRecord.java:1575)
at htsjdk.samtools.SAMRecord.isValid(SAMRecord.java:2087)
at htsjdk.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:811)
at htsjdk.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:797)
at htsjdk.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:765)
at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:576)
at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:548)
at picard.sam.FilterSamReads.writeReadsFile(FilterSamReads.java:193)
at picard.sam.FilterSamReads.doWork(FilterSamReads.java:213)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:268)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:98)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:108)
input.bam.brokenreads looks like this:
NS500396:228:HKL5MBGX2:1:12210:18362:13274_2:N:0:TAAGGCGA
NS500396:228:HKL5MBGX2:4:23505:12020:5193_2:N:0:TAAGGCGA
NS500396:228:HKL5MBGX2:1:13203:21100:6035_1:N:0:TAAGGCGA
etc.
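
For context, the read names can be pulled out of the validation messages along these lines. This is only a minimal sketch, assuming each message has the same form as the error above ("... Read name <name>, <message>"); the function name extract_broken_reads and the file name validate_output.txt are made up for illustration.

```shell
# Hypothetical helper: read validation messages on stdin and print the unique
# names of reads flagged with the padding-operator CIGAR error.
# Assumes each message contains "Read name <NAME>, <message text>".
extract_broken_reads() {
    grep 'Padding operator not between real operators in CIGAR' \
        | sed -n 's/.*Read name \([^,]*\),.*/\1/p' \
        | sort -u
}

# Usage (file names are assumptions):
# extract_broken_reads < validate_output.txt > input.bam.brokenreads
```

The pattern relies on the read names containing no commas, which holds for the names listed above.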
Thanks for any help!