Dear colleagues,
I am trying to implement a script that groups duplicate reads into families, and I would like to understand which criteria Picard's MarkDuplicates uses to identify duplicates. I've read that it compares the 5' ends of reads (for both single-end and paired-end data), but I haven't found much detail beyond that. Is there a page or publication that describes these details?
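To make the question concrete, here is a minimal Python sketch of one possible grouping key for single-end reads. It assumes that reads belong to the same family when they share reference name, strand, and 5' coordinate after adding back clipped bases. This is only an assumption about the general approach, not Picard's actual implementation; `Read`, `five_prime_position`, and `group_into_families` are hypothetical names, and paired-end reads, libraries, and UMIs are ignored.

```python
import re
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical minimal read record (not a Picard or pysam type)."""
    name: str
    chrom: str
    pos: int          # 1-based leftmost aligned position (SAM POS field)
    cigar: str        # SAM CIGAR string, e.g. "3S47M"
    is_reverse: bool  # True if the read maps to the reverse strand


CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def five_prime_position(read: Read) -> int:
    """Return the 5' reference coordinate with clipped bases added back."""
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(read.cigar)]
    if not read.is_reverse:
        # Forward strand: the 5' end is the alignment start minus leading clips.
        leading = 0
        for n, op in ops:
            if op not in "SH":
                break
            leading += n
        return read.pos - leading
    # Reverse strand: the 5' end is the alignment end plus trailing clips.
    ref_len = sum(n for n, op in ops if op in "MDN=X")
    trailing = 0
    for n, op in reversed(ops):
        if op not in "SH":
            break
        trailing += n
    return read.pos + ref_len - 1 + trailing


def group_into_families(reads):
    """Group read names by the assumed key (chrom, 5' position, strand)."""
    families = defaultdict(list)
    for r in reads:
        families[(r.chrom, five_prime_position(r), r.is_reverse)].append(r.name)
    return dict(families)


reads = [
    Read("a", "chr1", 100, "50M", False),
    Read("b", "chr1", 103, "3S47M", False),  # soft clip hides the same 5' end
    Read("c", "chr1", 100, "50M", True),
    Read("d", "chr1", 100, "45M5S", True),   # clipped at its 5' end (reverse)
]
families = group_into_families(reads)
```

Under this assumed key, reads `a` and `b` fall into one family and `c` and `d` into another. Whether MarkDuplicates applies exactly these rules is what I would like to confirm.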
Thanks!