Quantcast
Viewing all articles
Browse latest Browse all 12345

Picard Sort Vcf Error

Hello.

I am using GATK version 3.6, picard-2.8.2.jar

I downloaded hapmap_3.3.hg38.vcf from gatk resource bundle. I then used the below command to remove chr notation.
awk '{gsub(/^chr/,""); print}' hapmap_3.3.hg38.vcf > no_chr_hapmap_3.3.hg38.vcf.vcf

Before (hapmap_3.3.hg38.vcf)
chr1 2242065 rs263526 T C . PASS AC=724;AF=0.259;AN=2792
chr1 2242417 rs16824926 C . . PASS AN=530
chr1 2242880 rs11581436 A . . PASS AN=540

After (no_chr_hapmap_3.3.hg38.vcf.vcf)
1 6421563 rs4908891 G A . PASS AC=1086;AF=0.389;AN=2792
1 6421782 rs4908892 A G . PASS AC=1692;AF=0.606;AN=2792
1 6421856 rs12078257 T C . PASS AC=368;AF=0.132;AN=2790

Then, use Picard SortVcf to sort the no_chr_hapmap_3.3.hg38.vcf.vcf
java -jar picard-2.8.2.jar SortVcf I=removedChr_HapMap.vcf O=sortedHapMap.vcf SEQUENCE_DICTIONARY=hg38.dict

hg38.dict
@SQ SN:1 LN:248956422 UR:file:/media/ubuntu/Elements/TOOL/hg38.fa M5:2648ae1bacce4ec4b6cf337dcae37816
@SQ SN:10 LN:133797422 UR:file:/media/ubuntu/Elements/TOOL/hg38.fa M5:907112d17fcb73bcab1ed1c72b97ce68
@SQ SN:11 LN:135086622 UR:file:/media/ubuntu/Elements/TOOL/hg38.fa M5:1511375dc2dd1b633af8cf439ae90cec
@SQ SN:12 LN:133275309 UR:file:/media/ubuntu/Elements/TOOL/hg38.fa M5:e81e16d3f44337034695a29b97708fce

I have then encountered this error:

Exception in thread "main" java.lang.IllegalArgumentException: java.lang.AssertionError: SAM dictionaries are not the same: SAMSequenceRecord(name=chr1,length=248956422,dict_index=0,assembly=20) was found when SAMSequenceRecord(name=1,length=248956422,dict_index=0,assembly=null) was expected.
at picard.vcf.SortVcf.collectFileReadersAndHeaders(SortVcf.java:126)
at picard.vcf.SortVcf.doWork(SortVcf.java:95)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:205)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:94)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:104)
Caused by: java.lang.AssertionError: SAM dictionaries are not the same: SAMSequenceRecord(name=chr1,length=248956422,dict_index=0,assembly=20) was found when SAMSequenceRecord(name=1,length=248956422,dict_index=0,assembly=null) was expected.
at htsjdk.samtools.SAMSequenceDictionary.assertSameDictionary(SAMSequenceDictionary.java:170)
at picard.vcf.SortVcf.collectFileReadersAndHeaders(SortVcf.java:124)
... 4 more

I have tried a lot of times but still getting back the same error. Kindly do advise how can I solve this problem.

I would then like to perform SelectVariants to extract variants that missed in HapMap but present in my dataset.

Thank you so much in advance.

Cheers,
Moon


Viewing all articles
Browse latest Browse all 12345

Trending Articles