Hello,
I aligned my reads to a reference genome using Bowtie2, with both paired-end and unpaired sequences. I then used samtools view to convert my sam files to bam files. When validating the bam files with picard, I get the following errors:
HISTOGRAM java.lang.String
Error Type Count
ERROR:MISMATCH_FLAG_MATE_NEG_STRAND 65513
ERROR:MISSING_READ_GROUP 1
WARNING:RECORD_MISSING_READ_GROUP 4715744
...
Furthermore, when I try to list the read group fields in the .bam files with this command, which I found on the read groups page of the GATK site:
samtools view -H sample.bam | grep '@RG'
I get this output, with no read group fields:
@SQ SN:scaffold_505 LN:1031
@SQ SN:scaffold_506 LN:1007
@SQ SN:scaffold_507 LN:1005
@PG ID:bowtie2 PN:bowtie2 VN:2.2.9 CL:"/opt/bowtie2/2.2.9/gcc/bowtie2-align-s --wrapper basic-0 -p 12 -x /scratch/mauro/Sit_Index/Sit_312v2 -S ./BC5align2/B01_TCCAG.sam -1 B01_TCCAG.1.fq -2 B01_TCCAG.2.fq -U B01_TCCAG.rem.1.fq,B01_TCCAG.rem.2.fq"
@HD VN:1.0 SO:unsorted
This is just an example of the last section of the output for one of my bam files. I have both .bam and .sam files in the same folder. I tried with the sam files and I get exactly the same output.
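A minimal sketch of how the header could be checked for read groups, assuming the header text is piped in on standard input. The helper name count_read_groups is hypothetical and not part of samtools. A count of 0 would mean the header has no @RG lines at all, which would match the MISSING_READ_GROUP error above.

```shell
# Hypothetical helper: count @RG lines in a SAM/BAM header read from stdin.
count_read_groups() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when the count is 0, so "|| true" keeps the function's status clean.
    grep -c '^@RG' || true
}

# Usage (assumes samtools is installed and sample.bam exists):
# samtools view -H sample.bam | count_read_groups
```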
Does this mean my sam and bam files are in the same format, so I do not really have bam files?
Is there a problem in doing the alignment with Bowtie2 and then converting the sam files to bam files with view from samtools?
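One way to check the file format directly, as a rough sketch: bam files are BGZF-compressed, so they start with the gzip magic bytes 0x1f 0x8b, while sam files are plain text. samtools view -H decodes the header to text for either format, so identical header output does not by itself show that the files are the same format. The function names below are hypothetical, and the check only tells compressed from uncompressed; it does not validate the file.

```shell
# Hypothetical helper: true if the file starts with the gzip/BGZF magic bytes.
is_bgzf() {
    # Read the first two bytes and print them as hex without spaces.
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Hypothetical helper: describe a file as BAM-like or SAM-like.
describe_alignment_file() {
    if is_bgzf "$1"; then
        echo "compressed (BAM-like)"
    else
        echo "plain text (SAM-like)"
    fi
}

# Usage (file names are examples):
# describe_alignment_file sample.bam
# describe_alignment_file sample.sam
```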
I understand I can correct a bam file using the AddOrReplaceReadGroups tool or the FixMateInformation tool from picard, but since I can't even get the read group fields, I am lost.
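For reference, a sketch of what an AddOrReplaceReadGroups call might look like. The function below only prints the command so it can be reviewed first. Everything here is an assumption for illustration: the helper name, the location of picard.jar, and the read group values (lib1, ILLUMINA, unit1, and using the sample name as RGID) are placeholders, not values taken from my data.

```shell
# Hypothetical helper: print a picard AddOrReplaceReadGroups command.
# Arguments: input bam, output bam, sample name.
# RGLB, RGPL and RGPU are placeholder values; picard.jar is assumed
# to be in the current directory.
picard_add_rg_cmd() {
    in_bam=$1
    out_bam=$2
    sample=$3
    printf 'java -jar picard.jar AddOrReplaceReadGroups I=%s O=%s RGID=%s RGLB=lib1 RGPL=ILLUMINA RGPU=unit1 RGSM=%s\n' \
        "$in_bam" "$out_bam" "$sample" "$sample"
}

# Usage (prints the command without running it; file names are examples):
# picard_add_rg_cmd B01_TCCAG.bam B01_TCCAG.rg.bam B01_TCCAG
```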
Any suggestions?
Hoping to hear from you,
Margarita