I am trying to move from the home-grown UMID tools I wrote to the new fgbio UMID toolset, but I can't work out how to do this without double counting variation in the overlapping region. All of my read pairs overlap. Currently I assemble the read pairs with PEAR, create SAM files from the resulting FASTQ files, extract the UMIDs myself, and move on from there. I can't find a way to "assemble" the SAM/BAM files, that is, to combine the paired reads into a SAM/BAM in which the overlap is represented only once. I can merge them, but I don't think that is what I want.
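To make the double-counting concern concrete, here is a minimal sketch in plain Python. It does not use fgbio or PEAR. The coordinates and the function names `naive_depth` and `fragment_depth` are made up for illustration, and it only models coverage, not base calls or qualities.

```python
from collections import Counter

def naive_depth(reads):
    """Count every read independently, so overlap bases are counted twice."""
    depth = Counter()
    for start, end in reads:  # half-open [start, end) reference intervals
        for pos in range(start, end):
            depth[pos] += 1
    return depth

def fragment_depth(pairs):
    """Count each read pair (one original fragment) once per covered position."""
    depth = Counter()
    for (s1, e1), (s2, e2) in pairs:
        covered = set(range(s1, e1)) | set(range(s2, e2))
        for pos in covered:
            depth[pos] += 1
    return depth

# Hypothetical pair: R1 covers 100-249 and R2 covers 200-349,
# so positions 200-249 are covered by both mates.
r1, r2 = (100, 250), (200, 350)
naive = naive_depth([r1, r2])
frag = fragment_depth([(r1, r2)])

print(naive[150], naive[225])  # 1 2  (overlap counted twice)
print(frag[150], frag[225])    # 1 1  (overlap counted once)
```

The second behaviour is what I get today by assembling the pairs before alignment, and what I would like to keep after switching tools.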
Any help? Thanks