Hello,
Recently I ran an alignment with the LAST tool (http://last.cbrc.jp/, a FASTA aligner for long reads). It produces a .maf file, which I converted to SAM (with maf-convert, http://last.cbrc.jp/doc/maf-convert.html) and then to BAM (with Picard). Up to that point everything looked fine. Next I tried to run Picard CollectAlignmentSummaryMetrics on the BAM file, and it throws this error:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
at picard.analysis.AlignmentSummaryMetricsCollector$GroupAlignmentSummaryMetricsPerUnitMetricCollector$IndividualAlignmentSummaryMetricsCollector.collectQualityData(AlignmentSummaryMetricsCollector.java:329)
at picard.analysis.AlignmentSummaryMetricsCollector$GroupAlignmentSummaryMetricsPerUnitMetricCollector$IndividualAlignmentSummaryMetricsCollector.addRecord(AlignmentSummaryMetricsCollector.java:195)
at picard.analysis.AlignmentSummaryMetricsCollector$GroupAlignmentSummaryMetricsPerUnitMetricCollector.acceptRecord(AlignmentSummaryMetricsCollector.java:127)
at picard.analysis.AlignmentSummaryMetricsCollector$GroupAlignmentSummaryMetricsPerUnitMetricCollector.acceptRecord(AlignmentSummaryMetricsCollector.java:93)
at picard.metrics.MultiLevelCollector$AllReadsDistributor.acceptRecord(MultiLevelCollector.java:192)
at picard.metrics.MultiLevelCollector.acceptRecord(MultiLevelCollector.java:315)
at picard.analysis.AlignmentSummaryMetricsCollector.acceptRecord(AlignmentSummaryMetricsCollector.java:89)
at picard.analysis.CollectAlignmentSummaryMetrics.acceptRead(CollectAlignmentSummaryMetrics.java:147)
at picard.analysis.SinglePassSamProgram.makeItSo(SinglePassSamProgram.java:138)
at picard.analysis.SinglePassSamProgram.doWork(SinglePassSamProgram.java:77)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:208)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)
Here is the head of the BAM file:
0034a196-edbc-429f-89c4-b5280a486760_Basecall_2D_2d 0 burn-in 1 100 4H21=1D18=.....1D17=2D1X6=6D1X11=2I1X31=19H * 0 0 GGGCGGCGACCTCGCGGGT.....AGCATGCCACG * NM:i:152 AS:i:10909
06c0ff36-09df-4bb3-b952-146fca6f60ae_Basecall_2D_2d 0 burn-in 1 100 8H21=1D3=......2D1=1I57=2D68=1D1=2D42=29H * 0 0 GGGCGGCGACCTCGCGGG...........GCAAGCGTGA * NM:i:402 AS:i:33419
I deleted values in the middle of SEQ and CIGAR strings because they are very long.
Running ValidateSamFile on this BAM file does not report any problem that looks relevant:
HISTOGRAM java.lang.String
Error Type Count
ERROR:MISSING_READ_GROUP 1
WARNING:RECORD_MISSING_READ_GROUP 2441
For the same sequencing run I also had FASTQ files, which I aligned with bwa. When I ran CollectAlignmentSummaryMetrics on the BAM file from that workflow, it worked fine. Here is the head of the BAM file from that workflow (alignment with bwa using FASTQ):
0034a196-edbc-429f-89c4-b5280a486760_Basecall_2D_2d 0 burn-in 1 60 4S18M1D1....M6D32M19S * 0 0 TGCTGG...TGTTTGA /)6-,(-.../9/)0,*, MD:Z:18^T..A11G31 NM:i:138 AS:i:1920 XS:i:0
06c0ff36-09df-4bb3-b952-146fca6f60ae_Basecall_2D_2d 0 burn-in 1 60 8S18M1D1...D1M2D42M29S * 0 0 GTATTGC...ATGTGTTTC =.01-)**)./....'-.+*+ MD:Z:18^.^A1^AA42 NM:i:371 AS:i:5836 XS:i:0
Same as before, I removed the characters in the middle of the long strings.
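One difference I can see between the two heads is the QUAL column (field 11). It is "*" in the LAST/maf-convert records, while the bwa records carry a quality string. The stack trace fails inside collectQualityData, so this may be related, but I have not confirmed that it is the cause. Below is a minimal Python sketch for counting how many records have no base qualities. The function name and the demo records are made up for illustration. It assumes tab-separated SAM text, for example the output of samtools view.

```python
def count_missing_qual(sam_lines):
    """Return (records_without_qual, total_records) for an iterable of SAM text lines."""
    missing = 0
    total = 0
    for line in sam_lines:
        # Skip blank lines and header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A SAM alignment record has at least 11 mandatory fields.
        if len(fields) < 11:
            continue
        total += 1
        # Field 11 (index 10) is QUAL; '*' means qualities are not stored.
        if fields[10] == "*":
            missing += 1
    return missing, total


# Hypothetical, shortened records for illustration only (not my real data).
demo = [
    "@HD\tVN:1.4",
    "read1\t0\tburn-in\t1\t100\t4=\t*\t0\t0\tGGGC\t*\tNM:i:0",
    "read2\t0\tburn-in\t1\t60\t4M\t*\t0\t0\tTGCT\t/)6-\tNM:i:0",
]
print(count_missing_qual(demo))  # (1, 2)
```

To run it on real data, the BAM file can be converted to SAM text first (for example with samtools view) and the open file handle passed to the function in place of the demo list.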
I hope you can help me with this problem.
Thanks and have a great day.