Hi, I am not very familiar with bioinformatics or SNP genotyping. I am trying to identify SNPs in a sample from the SRA database, and this is the pipeline I am following:
I - Mapping with the STAR aligner
II - 2-pass mapping using SJ.out.tab
III - SAM to BAM conversion, sorting and indexing
IV - Mark duplicates
V - SplitNTrim
VI - Base recalibration using a known VCF file
When I use the SplitNTrim output for base recalibration with the VCF file, I get this error: ERROR MESSAGE: The platform (platform) associated with read group GATKSAMReadGroupRecord @RG:id is not a recognized platform. Allowable options are ILLUMINA,SLX,SOLEXA,SOLID ,454,LS454,COMPLETE,PACBIO,IONTORRENT,CAPILLARY,HELICOS,UNKNOWN
To fix this I need to add read group (RG) information to the file, but I have not been able to work out what it should be. When I checked the STAR mapping output, there is no @RG information in the SAM file.
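To make the check concrete, here is a minimal sketch of how the @RG lines of a SAM header could be inspected for a valid platform. The function names are hypothetical, the allowed values are copied from the error message above, and the case-insensitive comparison is an assumption rather than confirmed GATK behaviour.

```python
# Allowed platform values, copied from the GATK error message quoted above.
VALID_PLATFORMS = {
    "ILLUMINA", "SLX", "SOLEXA", "SOLID", "454", "LS454",
    "COMPLETE", "PACBIO", "IONTORRENT", "CAPILLARY", "HELICOS", "UNKNOWN",
}


def read_group_platforms(header_lines):
    """Return {read group ID: PL value or None} for every @RG header line."""
    platforms = {}
    for line in header_lines:
        if not line.startswith("@RG"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs after the record type.
        fields = dict(
            field.split(":", 1)
            for field in line.rstrip("\n").split("\t")[1:]
            if ":" in field
        )
        platforms[fields.get("ID")] = fields.get("PL")
    return platforms


def invalid_read_groups(header_lines):
    """List read group IDs whose PL tag is missing or not an allowed value."""
    # Assumption: the comparison is done case-insensitively.
    return [
        rg_id
        for rg_id, platform in read_group_platforms(header_lines).items()
        if platform is None or platform.upper() not in VALID_PLATFORMS
    ]


# Illustrative header lines (made up for this example).
example_header = [
    "@HD\tVN:1.4\tSO:coordinate",
    "@RG\tID:rg1\tSM:20\tPL:illumina",
    "@RG\tID:rg2\tSM:20",
]
print(invalid_read_groups(example_header))
```

The header lines could come from `samtools view -H` on the BAM file. A header with no @RG lines at all, as in my STAR output, gives an empty dictionary from `read_group_platforms`.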
Second, according to the SRA database this is a single Illumina run, so I used RGPL=illumina RGLB=lib1 RGPU=unit1 RGSM=20 to add read groups to the SplitNTrim output BAM file, and I got this error:
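For reference, the read group step can be sketched as a Picard AddOrReplaceReadGroups call assembled in Python. This is only a sketch: the file names, the `picard.jar` path and the RGID value are hypothetical placeholders, and the function only builds the argument list without running anything.

```python
def add_read_group_command(in_bam, out_bam, rgid, rglb, rgpl, rgpu, rgsm,
                           picard_jar="picard.jar"):
    """Build the argument list for Picard AddOrReplaceReadGroups (legacy KEY=VALUE syntax)."""
    return [
        "java", "-jar", picard_jar, "AddOrReplaceReadGroups",
        f"I={in_bam}",    # input BAM
        f"O={out_bam}",   # output BAM with @RG added
        f"RGID={rgid}",   # read group ID (placeholder value below)
        f"RGLB={rglb}",   # library
        f"RGPL={rgpl}",   # platform, e.g. illumina
        f"RGPU={rgpu}",   # platform unit
        f"RGSM={rgsm}",   # sample name
    ]


# Values from the post, plus hypothetical file names and RGID.
command = add_read_group_command(
    "split.bam", "split.rg.bam",
    rgid="id1", rglb="lib1", rgpl="illumina", rgpu="unit1", rgsm="20",
)
print(" ".join(command))
```

The list can be passed to `subprocess.run(command, check=True)` once the paths point at real files.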
Exception in thread "main" htsjdk.samtools.SAMFormatException: SAM validation error: ERROR: Read name SRR2183534.2625035, No real operator (M|I|D|N) in CIGAR
The run also produces an output file that is smaller than the input file.
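To find which reads trigger this validation error, a small sketch like the one below could scan SAM text for CIGAR strings with no M, I, D or N operator. The set of "real" operators is taken from the error message; the function names and the example reads are made up, and skipping reads with a `*` CIGAR is an assumption about what the validator ignores.

```python
import re

# One CIGAR element: a length followed by an operator from the SAM specification.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")
# "Real" operators as named in the validation error message.
REAL_OPERATORS = set("MIDN")


def has_real_operator(cigar):
    """Return True if the CIGAR string contains at least one M, I, D or N element."""
    operators = [op for _length, op in CIGAR_ELEMENT.findall(cigar)]
    return any(op in REAL_OPERATORS for op in operators)


def reads_without_real_operator(sam_lines):
    """Return names of reads whose CIGAR has no real operator."""
    flagged = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # not a complete alignment record
        name, cigar = fields[0], fields[5]
        if cigar == "*":
            continue  # assumption: reads with no CIGAR are not reported
        if not has_real_operator(cigar):
            flagged.append(name)
    return flagged


# Illustrative records (made up; only the first six SAM columns are shown).
example_records = [
    "@HD\tVN:1.4",
    "readA\t0\tchr1\t100\t255\t76M",
    "readB\t0\tchr1\t200\t255\t10S",
    "readC\t0\tchr1\t300\t255\t5S20M1000N51M",
    "readD\t4\t*\t0\t0\t*",
]
print(reads_without_real_operator(example_records))
```

The input lines could come from `samtools view -h` on the SplitNTrim output, which would show whether SRR2183534.2625035 is the only affected read.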
Am I doing this correctly? If so, how can I add RG to the BAM file, and where do I get this information? Or can I run the step without it?
Sorry for the long post.