Hi,
I have produced a set of VCF files with UnifiedGenotyper, using a custom BED file. I then used Picard LiftoverVcf to lift these VCF files over from hg38 to hg19:
/jdk/current/bin/java -jar picard.jar LiftoverVcf I=input.vcf O=output.vcf CHAIN=hg38ToHg19.over.chain REJECT=rejected.vcf R=ucsc.hg19.fasta
While inspecting the resulting VCFs, I realised that the lifted-over VCF files contain a few duplicated genomic positions, some of which have different base counts. I checked the corresponding genomic positions in the original VCFs, but there are no such duplications there.
#Before lift-over, hg38
chr21 43107642 . G . . . BaseCounts=0,0,6,0;DP=6;LowMQ=1.0000,1.0000,6;MQ=0.00;MQ0=6;PercentNBase=0.0000;VariantType=NO_VARIATION GT ./.
#After lift-over, hg19
chr21 44527752 . G . . PASS BaseCounts=0,0,5,0;DP=5;LowMQ=1.0000,1.0000,5;MQ=0.00;MQ0=5;PercentNBase=0.0000;VariantType=NO_VARIATION GT ./.
chr21 44527752 . G . . PASS BaseCounts=0,0,6,0;DP=6;LowMQ=1.0000,1.0000,6;MQ=0.00;MQ0=6;PercentNBase=0.0000;VariantType=NO_VARIATION GT ./.
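For anyone who wants to check their own lifted-over files for the same issue, here is a minimal sketch of how duplicated positions like the one above can be listed. It assumes an uncompressed, plain-text VCF; the function name find_duplicate_positions is hypothetical and not part of Picard or GATK.

```python
from collections import defaultdict


def find_duplicate_positions(vcf_lines):
    """Return {(chrom, pos): [record lines]} for positions that occur more than once.

    vcf_lines is any iterable of VCF text lines (for example an open file handle).
    Header lines starting with '#' and blank lines are skipped.
    """
    records = defaultdict(list)
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        # Split on any whitespace so that tab-delimited files and
        # space-separated snippets pasted from a forum both work.
        fields = line.split(None, 2)
        if len(fields) < 2:
            continue  # not a usable record line
        chrom, pos = fields[0], int(fields[1])
        records[(chrom, pos)].append(line)
    # Keep only the positions that were seen more than once.
    return {key: recs for key, recs in records.items() if len(recs) > 1}
```

Passing an open handle on output.vcf to this function returns a dictionary keyed by (chromosome, position), with every record found at each duplicated position, so the differing INFO fields (such as BaseCounts and DP) can be compared side by side.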
I do not understand why LiftoverVcf produces these duplicates. Could you please help me with this?