Hello,
I've been running a Drop-seq experiment, and the second-to-last command fails:
java -Xmx4000m -jar 3rdParty/picard/picard.jar MergeBamAlignment REFERENCE_SEQUENCE=mm10/mm10.fasta UNMAPPED_BAM=unaligned_mc_tagged_polyA_filtered.bam ALIGNED_BAM=aligned.sorted.bam INCLUDE_SECONDARY_ALIGNMENTS=false PAIRED_RUN=false OUTPUT=merged.bam
which generates this not-so-helpful error message:
Exception in thread "main" java.lang.NullPointerException
at picard.sam.AbstractAlignmentMerger.createNewCigarIfMapsOffEndOfReference(AbstractAlignmentMerger.java:631)
at picard.sam.AbstractAlignmentMerger.createNewCigarsIfMapsOffEndOfReference(AbstractAlignmentMerger.java:654)
at picard.sam.AbstractAlignmentMerger.updateCigarForTrimmedOrClippedBases(AbstractAlignmentMerger.java:686)
at picard.sam.AbstractAlignmentMerger.transferAlignmentInfoToFragment(AbstractAlignmentMerger.java:514)
at picard.sam.AbstractAlignmentMerger.mergeAlignment(AbstractAlignmentMerger.java:410)
at picard.sam.SamAlignmentMerger.mergeAlignment(SamAlignmentMerger.java:138)
at picard.sam.MergeBamAlignment.doWork(MergeBamAlignment.java:248)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:206)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)
I thought I had solved this in http://gatkforums.broadinstitute.org/gatk/discussion/8627/picard-sam-mergebamalignment-fails but I hadn't.
This command requires two files, which I've examined with Picard's ValidateSamFile.
For unaligned_mc_tagged_polyA_filtered.bam, the error is:
## HISTOGRAM java.lang.String
Error Type Count
ERROR:MISSING_PLATFORM_VALUE 1
For aligned.sorted.bam, the errors and warnings are:
## HISTOGRAM java.lang.String
Error Type Count
ERROR:MISSING_READ_GROUP 1
WARNING:MISSING_TAG_NM 11720805
WARNING:RECORD_MISSING_READ_GROUP 11720805
I think this can be fixed with Picard's AddOrReplaceReadGroups (http://broadinstitute.github.io/picard/command-line-overview.html#AddOrReplaceReadGroups), but this command requires several options whose values I don't know (the default options didn't work), i.e.
RGID=4 \
RGLB=lib1 \
RGPL=illumina \
RGPU=unit1 \
RGSM=20
Illumina is the obvious choice for RGPL, but I don't know what to put for the other options for a Drop-seq experiment. How can I find these values for Picard's AddOrReplaceReadGroups?
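For concreteness, the call I have in mind would look roughly like the sketch below. The read-group values are just the placeholders listed above, not values known to be correct for Drop-seq. The output name aligned.sorted.rg.bam and the DRY_RUN switch are hypothetical, and the jar path is the one from the failing command. By default the script only prints the two commands; the second one re-runs ValidateSamFile in summary mode to check whether the read-group errors are gone.

```shell
#!/bin/sh
# Sketch only: read-group values are placeholders, not verified for Drop-seq.
PICARD_JAR=3rdParty/picard/picard.jar   # same path as in the failing command
IN_BAM=aligned.sorted.bam
OUT_BAM=aligned.sorted.rg.bam           # hypothetical output name

# Placeholder read-group fields: ID, library, platform, platform unit, sample.
RG_ARGS="RGID=4 RGLB=lib1 RGPL=illumina RGPU=unit1 RGSM=20"

# Step 1: attach a read group to every record of the aligned BAM.
ADD_RG_CMD="java -Xmx4000m -jar $PICARD_JAR AddOrReplaceReadGroups I=$IN_BAM O=$OUT_BAM $RG_ARGS"
# Step 2: re-validate the result in summary mode.
VALIDATE_CMD="java -Xmx4000m -jar $PICARD_JAR ValidateSamFile I=$OUT_BAM MODE=SUMMARY"

# Always show what would be run.
echo "$ADD_RG_CMD"
echo "$VALIDATE_CMD"

# Execute only when DRY_RUN=0 is set (hypothetical switch; default is print-only).
if [ "${DRY_RUN:-1}" = "0" ]; then
    $ADD_RG_CMD && $VALIDATE_CMD
fi
```

Running it as `DRY_RUN=0 sh add_rg.sh` (the script name is arbitrary) would execute both steps; the open question is still which RGID, RGLB, RGPU and RGSM values are appropriate.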