Hi, I'm trying to use LiftoverVcf to lift the gnomAD structural variants over from hg19 to hg38. I get MismatchedRefAllele in the rejected VCF file, as well as output messages that look like this:
" failed to match chain 2 because intersection length 44949950 < minMatchSize 4.5788481E7 (0.98168695 < 1.0)"
Could you help me figure out what the issue is please?
My command is below (here I am running on a subset of the input, but I get this error for all variants in the full set):
/n/app/java/jdk-1.8u112/bin/java -Xms4G -Xmx4G -jar /n/app/picard/2.8.0/bin/picard-2.8.0.jar LiftoverVcf I=${FOLDER}/gnomad_v2_sv.sites_test.vcf O=${PATH}/gnomad_v2_sv.sites_test_hg38.vcf CHAIN=${FOLDER}/hg19ToHg38.over.chain REJECT=rejected_variants.vcf R=${FOLDER}/Homo_sapiens_assembly38.fasta
My reference is in this format:
head ${FOLDER}/Homo_sapiens_assembly38.fasta
>chr1 AC:CM000663.2 gi:568336023 LN:248956422 rl:Chromosome M5:6aef897c3d6ff0c78aff06ac189178dd AS:GRCh38
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
This is what the chain file looks like (obtained from http://hgdownload.soe.ucsc.edu/goldenPath/hg19/liftOver/):
head ${FOLDER}/hg19ToHg38.over.chain
chain 20851231461 chr1 249250621 + 10000 249240621 chr1 248956422 + 10000 248946422 2
167376 50041 80290
40302 253649 288020
1044699 1 2
3716 0 3
1134 4 18
3377 0 1
7258 1 1
27 1 1
1275 1680 5595
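In case it helps to read the chain header line above, here is a minimal Python sketch of how I understand its fields, assuming the UCSC chain layout `chain score tName tSize tStrand tStart tEnd qName qSize qStrand qStart qEnd id`. The `ChainHeader` type and `parse_chain_header` helper are names I made up for illustration; they are not part of Picard or the UCSC tools.

```python
from typing import NamedTuple


class ChainHeader(NamedTuple):
    score: int
    t_name: str    # source assembly contig (hg19 in an hg19ToHg38 chain, as far as I understand)
    t_size: int
    t_strand: str
    t_start: int
    t_end: int
    q_name: str    # destination assembly contig (hg38 here)
    q_size: int
    q_strand: str
    q_start: int
    q_end: int
    chain_id: int  # the number that appears as "chain N" in the LiftOver log


def parse_chain_header(line: str) -> ChainHeader:
    """Split one 'chain ...' header line into typed fields."""
    fields = line.split()
    if len(fields) != 13 or fields[0] != "chain":
        raise ValueError(f"not a chain header line: {line!r}")
    (_, score, t_name, t_size, t_strand, t_start, t_end,
     q_name, q_size, q_strand, q_start, q_end, chain_id) = fields
    return ChainHeader(
        int(score), t_name, int(t_size), t_strand, int(t_start), int(t_end),
        q_name, int(q_size), q_strand, int(q_start), int(q_end), int(chain_id),
    )


# First line of my chain file, copied from the head output above.
header = parse_chain_header(
    "chain 20851231461 chr1 249250621 + 10000 249240621 "
    "chr1 248956422 + 10000 248946422 2"
)
print(header.chain_id, header.t_name, header.t_size, header.q_name, header.q_size)
```

Read this way, the first chain in the file has id 2 (the same "chain 2" named in the message I quoted), and its source contig size 249250621 matches the chromosome 1 length in my VCF `##contig` header.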
VCF input file (I added the chr prefix to the chromosome names):
##fileformat=VCFv4.2
##contig=<ID=1,length=249250621>
##contig=<ID=2,length=243199373>
##contig=<ID=3,length=198022430>
##contig=<ID=4,length=191154276>
##contig=<ID=5,length=180915260>
##contig=<ID=6,length=171115067>
##contig=<ID=7,length=159138663>
##contig=<ID=8,length=146364022>
##contig=<ID=9,length=141213431>
##contig=<ID=10,length=135534747>
##contig=<ID=11,length=135006516>
##contig=<ID=12,length=133851895>
##contig=<ID=13,length=115169878>
##contig=<ID=14,length=107349540>
##contig=<ID=15,length=102531392>
##contig=<ID=16,length=90354753>
##contig=<ID=17,length=81195210>
##contig=<ID=18,length=78077248>
##contig=<ID=19,length=59128983>
##contig=<ID=20,length=63025520>
##contig=<ID=21,length=48129895>
##contig=<ID=22,length=51304566>
##contig=<ID=X,length=155270560>
##contig=<ID=Y,length=59373566>
##ALT=<ID=BND,Description="Translocation">
##ALT=<ID=CPX,Description="Complex SV">
##ALT=<ID=CTX,Description="Reciprocal chromosomal translocation">
##ALT=<ID=DEL,Description="Deletion">
##ALT=<ID=DUP,Description="Duplication">
##ALT=<ID=INS,Description="Insertion">
##ALT=<ID=INS:ME,Description="Mobile element insertion of unspecified ME class">
##ALT=<ID=INS:ME:ALU,Description="Alu element insertion">
##ALT=<ID=INS:ME:LINE1,Description="LINE1 element insertion">
##ALT=<ID=INS:ME:SVA,Description="SVA element insertion">
##ALT=<ID=INS:UNK,Description="Sequence insertion of unspecified origin">
##ALT=<ID=INV,Description="Inversion">
##FILTER=<ID=MULTIALLELIC,Description="Multiallelic site">
##FILTER=<ID=PASS,Description="All filters passed">
##FILTER=<ID=PCRPLUS_ENRICHED,Description="Site enriched for non-reference genotypes among PCR+ samples. Likely reflects technical batch effects. All PCR- samples have been assigned null GTs for these sites.>">
##FILTER=<ID=PREDICTED_GENOTYPING_ARTIFACT,Description="Site is predicted to be a genotyping false-positive based on analysis of minimum GQs prior to GQ filtering.">
##FILTER=<ID=UNRESOLVED,Description="Variant is unresolved">
##FILTER=<ID=VARIABLE_ACROSS_BATCHES,Description="Site appears at variable frequencies across batches. Likely reflects technical batch effects.>">
##INFO=<ID=AC,Number=A,Type=Integer,Description="Number of non-reference alleles observed (for biallelic sites) or individuals at each copy state (for multiallelic sites).">
##INFO=<ID=AFR_AC,Number=A,Type=Integer,Description="Number of non-reference AFR alleles observed (for biallelic sites) or AFR individuals at each copy state (for multiallelic sites).">
##INFO=<ID=AFR_AF,Number=A,Type=Float,Description="AFR allele frequency (for biallelic sites) or AFR copy-state frequency (for multiallelic sites).">
##INFO=<ID=AFR_AN,Number=1,Type=Integer,Description="Total number of AFR alleles genotyped (for biallelic sites) or AFR individuals with copy-state estimates (for multiallelic sites).">
##INFO=<ID=AFR_FREQ_HET,Number=1,Type=Float,Description="AFR heterozygous genotype frequency (biallelic sites only).">
##INFO=<ID=AFR_FREQ_HOMALT,Number=1,Type=Float,Description="AFR homozygous alternate genotype frequency (biallelic sites only).">
##INFO=<ID=AFR_FREQ_HOMREF,Number=1,Type=Float,Description="AFR homozygous reference genotype frequency (biallelic sites only).">
##INFO=<ID=AFR_N_BI_GENOS,Number=1,Type=Integer,Description="Total number of AFR individuals with complete genotypes (biallelic sites only).">
##INFO=<ID=AFR_N_HET,Number=1,Type=Integer,Description="Number of AFR individuals with heterozygous genotypes (biallelic sites only).">
##INFO=<ID=AFR_N_HOMALT,Number=1,Type=Integer,Description="Number of AFR individuals with homozygous alternate genotypes (biallelic sites only).">
##INFO=<ID=AFR_N_HOMREF,Number=1,Type=Integer,Description="Number of AFR individuals with homozygous reference genotypes (biallelic sites only).">
##INFO=<ID=AF,Number=A,Type=Float,Description="Allele frequency (for biallelic sites) or copy-state frequency (for multiallelic sites).">
##INFO=<ID=ALGORITHMS,Number=.,Type=String,Description="Source algorithms">
##INFO=<ID=AMR_AC,Number=A,Type=Integer,Description="Number of non-reference AMR alleles observed (for biallelic sites) or AMR individuals at each copy state (for multiallelic sites).">
##INFO=<ID=AMR_AF,Number=A,Type=Float,Description="AMR allele frequency (for biallelic sites) or AMR copy-state frequency (for multiallelic sites).">
##INFO=<ID=AMR_AN,Number=1,Type=Integer,Description="Total number of AMR alleles genotyped (for biallelic sites) or AMR individuals with copy-state estimates (for multiallelic sites).">
##INFO=<ID=AMR_FREQ_HET,Number=1,Type=Float,Description="AMR heterozygous genotype frequency (biallelic sites only).">
##INFO=<ID=AMR_FREQ_HOMALT,Number=1,Type=Float,Description="AMR homozygous alternate genotype frequency (biallelic sites only).">
##INFO=<ID=AMR_FREQ_HOMREF,Number=1,Type=Float,Description="AMR homozygous reference genotype frequency (biallelic sites only).">
##INFO=<ID=AMR_N_BI_GENOS,Number=1,Type=Integer,Description="Total number of AMR individuals with complete genotypes (biallelic sites only).">
##INFO=<ID=AMR_N_HET,Number=1,Type=Integer,Description="Number of AMR individuals with heterozygous genotypes (biallelic sites only).">
##INFO=<ID=AMR_N_HOMALT,Number=1,Type=Integer,Description="Number of AMR individuals with homozygous alternate genotypes (biallelic sites only).">
##INFO=<ID=AMR_N_HOMREF,Number=1,Type=Integer,Description="Number of AMR individuals with homozygous reference genotypes (biallelic sites only).">
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles genotyped (for biallelic sites) or individuals with copy-state estimates (for multiallelic sites).">
##INFO=<ID=CHR2,Number=1,Type=String,Description="Chromosome for END coordinate">
##INFO=<ID=CPX_INTERVALS,Number=.,Type=String,Description="Genomic intervals constituting complex variant.">
##INFO=<ID=CPX_TYPE,Number=1,Type=String,Description="Class of complex variant.">
##INFO=<ID=EAS_AC,Number=A,Type=Integer,Description="Number of non-reference EAS alleles observed (for biallelic sites) or EAS individuals at each copy state (for multiallelic sites).">
##INFO=<ID=EAS_AF,Number=A,Type=Float,Description="EAS allele frequency (for biallelic sites) or EAS copy-state frequency (for multiallelic sites).">
##INFO=<ID=EAS_AN,Number=1,Type=Integer,Description="Total number of EAS alleles genotyped (for biallelic sites) or EAS individuals with copy-state estimates (for multiallelic sites).">
##INFO=<ID=EAS_FREQ_HET,Number=1,Type=Float,Description="EAS heterozygous genotype frequency (biallelic sites only).">
##INFO=<ID=EAS_FREQ_HOMALT,Number=1,Type=Float,Description="EAS homozygous alternate genotype frequency (biallelic sites only).">
##INFO=<ID=EAS_FREQ_HOMREF,Number=1,Type=Float,Description="EAS homozygous reference genotype frequency (biallelic sites only).">
##INFO=<ID=EAS_N_BI_GENOS,Number=1,Type=Integer,Description="Total number of EAS individuals with complete genotypes (biallelic sites only).">
##INFO=<ID=EAS_N_HET,Number=1,Type=Integer,Description="Number of EAS individuals with heterozygous genotypes (biallelic sites only).">
##INFO=<ID=EAS_N_HOMALT,Number=1,Type=Integer,Description="Number of EAS individuals with homozygous alternate genotypes (biallelic sites only).">
##INFO=<ID=EAS_N_HOMREF,Number=1,Type=Integer,Description="Number of EAS individuals with homozygous reference genotypes (biallelic sites only).">
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the structural variant">
##INFO=<ID=EUR_AC,Number=A,Type=Integer,Description="Number of non-reference EUR alleles observed (for biallelic sites) or EUR individuals at each copy state (for multiallelic sites).">
##INFO=<ID=EUR_AF,Number=A,Type=Float,Description="EUR allele frequency (for biallelic sites) or EUR copy-state frequency (for multiallelic sites).">
##INFO=<ID=EUR_AN,Number=1,Type=Integer,Description="Total number of EUR alleles genotyped (for biallelic sites) or EUR individuals with copy-state estimates (for multiallelic sites).">
##INFO=<ID=EUR_FREQ_HET,Number=1,Type=Float,Description="EUR heterozygous genotype frequency (biallelic sites only).">
##INFO=<ID=EUR_FREQ_HOMALT,Number=1,Type=Float,Description="EUR homozygous alternate genotype frequency (biallelic sites only).">
##INFO=<ID=EUR_FREQ_HOMREF,Number=1,Type=Float,Description="EUR homozygous reference genotype frequency (biallelic sites only).">
##INFO=<ID=EUR_N_BI_GENOS,Number=1,Type=Integer,Description="Total number of EUR individuals with complete genotypes (biallelic sites only).">
##INFO=<ID=EUR_N_HET,Number=1,Type=Integer,Description="Number of EUR individuals with heterozygous genotypes (biallelic sites only).">
##INFO=<ID=EUR_N_HOMALT,Number=1,Type=Integer,Description="Number of EUR individuals with homozygous alternate genotypes (biallelic sites only).">
##INFO=<ID=EUR_N_HOMREF,Number=1,Type=Integer,Description="Number of EUR individuals with homozygous reference genotypes (biallelic sites only).">
##INFO=<ID=EVIDENCE,Number=.,Type=String,Description="Classes of random forest support.">
##INFO=<ID=FREQ_HET,Number=1,Type=Float,Description="Heterozygous genotype frequency (biallelic sites only).">
##INFO=<ID=FREQ_HOMALT,Number=1,Type=Float,Description="Homozygous alternate genotype frequency (biallelic sites only).">
##INFO=<ID=FREQ_HOMREF,Number=1,Type=Float,Description="Homozygous reference genotype frequency (biallelic sites only).">
##INFO=<ID=N_BI_GENOS,Number=1,Type=Integer,Description="Total number of individuals with complete genotypes (biallelic sites only).">
##INFO=<ID=N_HET,Number=1,Type=Integer,Description="Number of individuals with heterozygous genotypes (biallelic sites only).">
##INFO=<ID=N_HOMALT,Number=1,Type=Integer,Description="Number of individuals with homozygous alternate genotypes (biallelic sites only).">
##INFO=<ID=N_HOMREF,Number=1,Type=Integer,Description="Number of individuals with homozygous reference genotypes (biallelic sites only).">
##INFO=<ID=OTH_AC,Number=A,Type=Integer,Description="Number of non-reference OTH alleles observed (for biallelic sites) or OTH individuals at each copy state (for multiallelic sites).">
##INFO=<ID=OTH_AF,Number=A,Type=Float,Description="OTH allele frequency (for biallelic sites) or OTH copy-state frequency (for multiallelic sites).">
##INFO=<ID=OTH_AN,Number=1,Type=Integer,Description="Total number of OTH alleles genotyped (for biallelic sites) or OTH individuals with copy-state estimates (for multiallelic sites).">
##INFO=<ID=OTH_FREQ_HET,Number=1,Type=Float,Description="OTH heterozygous genotype frequency (biallelic sites only).">
##INFO=<ID=OTH_FREQ_HOMALT,Number=1,Type=Float,Description="OTH homozygous alternate genotype frequency (biallelic sites only).">
##INFO=<ID=OTH_FREQ_HOMREF,Number=1,Type=Float,Description="OTH homozygous reference genotype frequency (biallelic sites only).">
##INFO=<ID=OTH_N_BI_GENOS,Number=1,Type=Integer,Description="Total number of OTH individuals with complete genotypes (biallelic sites only).">
##INFO=<ID=OTH_N_HET,Number=1,Type=Integer,Description="Number of OTH individuals with heterozygous genotypes (biallelic sites only).">
##INFO=<ID=OTH_N_HOMALT,Number=1,Type=Integer,Description="Number of OTH individuals with homozygous alternate genotypes (biallelic sites only).">
##INFO=<ID=OTH_N_HOMREF,Number=1,Type=Integer,Description="Number of OTH individuals with homozygous reference genotypes (biallelic sites only).">
##INFO=<ID=PCRPLUS_DEPLETED,Number=0,Type=Flag,Description="Site depleted for non-reference genotypes among PCR+ samples. Likely reflects technical batch effects. All PCR+ samples have been assigned null GTs for these sites.">
##INFO=<ID=PESR_GT_OVERDISPERSION,Number=0,Type=Flag,Description="PESR genotyping data is overdispersed. Flags sites where genotypes are likely noisier.">
##INFO=<ID=POPMAX_AF,Number=1,Type=Float,Description="Maximum allele frequency across any population (biallelic sites only).">
##INFO=<ID=PROTEIN_CODING__COPY_GAIN,Number=.,Type=String,Description="Gene(s) on which the SV is predicted to have a copy-gain effect.">
##INFO=<ID=PROTEIN_CODING__DUP_LOF,Number=.,Type=String,Description="Gene(s) on which the SV is predicted to have a loss-of-function effect via intragenic exonic duplication.">
##INFO=<ID=PROTEIN_CODING__DUP_PARTIAL,Number=.,Type=String,Description="Gene(s) which are partially overlapped by an SV's duplication, such that an unaltered copy is preserved.">
##INFO=<ID=PROTEIN_CODING__INTERGENIC,Number=0,Type=Flag,Description="SV does not overlap coding sequence.">
##INFO=<ID=PROTEIN_CODING__INTRONIC,Number=.,Type=String,Description="Gene(s) where the SV was found to lie entirely within an intron.">
##INFO=<ID=PROTEIN_CODING__INV_SPAN,Number=.,Type=String,Description="Gene(s) which are entirely spanned by an SV's inversion.">
##INFO=<ID=PROTEIN_CODING__LOF,Number=.,Type=String,Description="Gene(s) on which the SV is predicted to have a loss-of-function effect.">
##INFO=<ID=PROTEIN_CODING__MSV_EXON_OVR,Number=.,Type=String,Description="Gene(s) on which the multiallelic SV would be predicted to have a LOF, DUP_LOF, COPY_GAIN, or DUP_PARTIAL annotation if the SV were biallelic.">
##INFO=<ID=PROTEIN_CODING__NEAREST_TSS,Number=.,Type=String,Description="Nearest transcription start site to intragenic variants.">
##INFO=<ID=PROTEIN_CODING__PROMOTER,Number=.,Type=String,Description="Genes whose promoter sequence (1 kb) was disrupted by SV.">
##INFO=<ID=PROTEIN_CODING__UTR,Number=.,Type=String,Description="Gene(s) for which the SV is predicted to disrupt a UTR.">
##INFO=<ID=SOURCE,Number=1,Type=String,Description="Source of inserted sequence.">
##INFO=<ID=STRANDS,Number=1,Type=String,Description="Breakpoint strandedness [++,+-,-+,--]">
##INFO=<ID=SVLEN,Number=1,Type=Integer,Description="SV length">
##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of structural variant">
##INFO=<ID=UNRESOLVED_TYPE,Number=1,Type=String,Description="Class of unresolved variant.">
##CPX_TYPE_CCR="Complex chromosomal rearrangement, involving two or more chromosomes and multiple SV signatures."
##CPX_TYPE_INS_iDEL="Insertion with deletion at insertion site."
##CPX_TYPE_INVdel="Complex inversion with 3' flanking deletion."
##CPX_TYPE_INVdup="Complex inversion with 3' flanking duplication."
##CPX_TYPE_dDUP="Dispersed duplication."
##CPX_TYPE_dDUP_iDEL="Dispersed duplication with deletion at insertion site."
##CPX_TYPE_delINVdel="Complex inversion with 5' and 3' flanking deletions."
##CPX_TYPE_delINVdup="Complex inversion with 5' flanking deletion and 3' flanking duplication."
##CPX_TYPE_delINV="Complex inversion with 5' flanking deletion."
##CPX_TYPE_dupINVdel="Complex inversion with 5' flanking duplication and 3' flanking deletion."
##CPX_TYPE_dupINVdup="Complex inversion with 5' and 3' flanking duplications."
##CPX_TYPE_dupINV="Complex inversion with 5' flanking duplication."
##CPX_TYPE_piDUP_FR="Palindromic inverted tandem duplication, forward-reverse orientation."
##CPX_TYPE_piDUP_RF="Palindromic inverted tandem duplication, reverse-forward orientation."
#CHROM POS ID REF ALT QUAL FILTER INFO
chr1 720320 gnomAD_v2_INS_1_11 N <INS:ME:SVA> 680 PASS END=720340;SVTYPE=INS;CHR2=1;SVLEN=968;ALGORITHMS=melt;EVIDENCE=SR;PROTEIN_CODING__NEAREST_TSS=AL645608.2;PROTEIN_CODING__INTERGENIC;AN=21476;AC=1;AF=4.7e-05;N_BI_GENOS=10738;N_HOMREF=10737;N_HET=1;N_HOMALT=0;FREQ_HOMREF=0.999907;FREQ_HET=9.31272e-05;FREQ_HOMALT=0;AFR_AN=9480;AFR_AC=0;AFR_AF=0;AFR_N_BI_GENOS=4740;AFR_N_HOMREF=4740;AFR_N_HET=0;AFR_N_HOMALT=0;AFR_FREQ_HOMREF=1;AFR_FREQ_HET=0;AFR_FREQ_HOMALT=0;AMR_AN=1784;AMR_AC=0;AMR_AF=0;AMR_N_BI_GENOS=892;AMR_N_HOMREF=892;AMR_N_HET=0;AMR_N_HOMALT=0;AMR_FREQ_HOMREF=1;AMR_FREQ_HET=0;AMR_FREQ_HOMALT=0;EAS_AN=2226;EAS_AC=0;EAS_AF=0;EAS_N_BI_GENOS=1113;EAS_N_HOMREF=1113;EAS_N_HET=0;EAS_N_HOMALT=0;EAS_FREQ_HOMREF=1;EAS_FREQ_HET=0;EAS_FREQ_HOMALT=0;EUR_AN=7598;EUR_AC=1;EUR_AF=0.000132;EUR_N_BI_GENOS=3799;EUR_N_HOMREF=3798;EUR_N_HET=1;EUR_N_HOMALT=0;EUR_FREQ_HOMREF=0.999737;EUR_FREQ_HET=0.000263227;EUR_FREQ_HOMALT=0;OTH_AN=388;OTH_AC=0;OTH_AF=0;OTH_N_BI_GENOS=194;OTH_N_HOMREF=194;OTH_N_HET=0;OTH_N_HOMALT=0;OTH_FREQ_HOMREF=1;OTH_FREQ_HET=0;OTH_FREQ_HOMALT=0;POPMAX_AF=0.000132
chr1 818807 gnomAD_v2_INS_1_17 N <INS:ME:ALU> 381 PASS END=818820;SVTYPE=INS;CHR2=1;SVLEN=281;ALGORITHMS=melt;EVIDENCE=SR;PROTEIN_CODING__INTRONIC=AL645608.2;AN=21476;AC=11;AF=0.000512;N_BI_GENOS=10738;N_HOMREF=10727;N_HET=11;N_HOMALT=0;FREQ_HOMREF=0.998976;FREQ_HET=0.0010244;FREQ_HOMALT=0;AFR_AN=9480;AFR_AC=1;AFR_AF=0.000105;AFR_N_BI_GENOS=4740;AFR_N_HOMREF=4739;AFR_N_HET=1;AFR_N_HOMALT=0;AFR_FREQ_HOMREF=0.999789;AFR_FREQ_HET=0.00021097;AFR_FREQ_HOMALT=0;AMR_AN=1784;AMR_AC=0;AMR_AF=0;AMR_N_BI_GENOS=892;AMR_N_HOMREF=892;AMR_N_HET=0;AMR_N_HOMALT=0;AMR_FREQ_HOMREF=1;AMR_FREQ_HET=0;AMR_FREQ_HOMALT=0;EAS_AN=2226;EAS_AC=0;EAS_AF=0;EAS_N_BI_GENOS=1113;EAS_N_HOMREF=1113;EAS_N_HET=0;EAS_N_HOMALT=0;EAS_FREQ_HOMREF=1;EAS_FREQ_HET=0;EAS_FREQ_HOMALT=0;EUR_AN=7598;EUR_AC=10;EUR_AF=0.001316;EUR_N_BI_GENOS=3799;EUR_N_HOMREF=3789;EUR_N_HET=10;EUR_N_HOMALT=0;EUR_FREQ_HOMREF=0.997368;EUR_FREQ_HET=0.00263227;EUR_FREQ_HOMALT=0;OTH_AN=388;OTH_AC=0;OTH_AF=0;OTH_N_BI_GENOS=194;OTH_N_HOMREF=194;OTH_N_HET=0;OTH_N_HOMALT=0;OTH_FREQ_HOMREF=1;OTH_FREQ_HET=0;OTH_FREQ_HOMALT=0;POPMAX_AF=0.001316
chr1 823258 gnomAD_v2_INS_1_18 N <INS:ME:SVA> 295 PASS END=823274;SVTYPE=INS;CHR2=1;SVLEN=1205;ALGORITHMS=melt;EVIDENCE=SR;PROTEIN_CODING__NEAREST_TSS=SAMD11;PROTEIN_CODING__INTERGENIC;AN=21476;AC=1;AF=4.7e-05;N_BI_GENOS=10738;N_HOMREF=10737;N_HET=1;N_HOMALT=0;FREQ_HOMREF=0.999907;FREQ_HET=9.31272e-05;FREQ_HOMALT=0;AFR_AN=9480;AFR_AC=1;AFR_AF=0.000105;AFR_N_BI_GENOS=4740;AFR_N_HOMREF=4739;AFR_N_HET=1;AFR_N_HOMALT=0;AFR_FREQ_HOMREF=0.999789;AFR_FREQ_HET=0.00021097;AFR_FREQ_HOMALT=0;AMR_AN=1784;AMR_AC=0;AMR_AF=0;AMR_N_BI_GENOS=892;AMR_N_HOMREF=892;AMR_N_HET=0;AMR_N_HOMALT=0;AMR_FREQ_HOMREF=1;AMR_FREQ_HET=0;AMR_FREQ_HOMALT=0;EAS_AN=2226;EAS_AC=0;EAS_AF=0;EAS_N_BI_GENOS=1113;EAS_N_HOMREF=1113;EAS_N_HET=0;EAS_N_HOMALT=0;EAS_FREQ_HOMREF=1;EAS_FREQ_HET=0;EAS_FREQ_HOMALT=0;EUR_AN=7598;EUR_AC=0;EUR_AF=0;EUR_N_BI_GENOS=3799;EUR_N_HOMREF=3799;EUR_N_HET=0;EUR_N_HOMALT=0;EUR_FREQ_HOMREF=1;EUR_FREQ_HET=0;EUR_FREQ_HOMALT=0;OTH_AN=388;OTH_AC=0;OTH_AF=0;OTH_N_BI_GENOS=194;OTH_N_HOMREF=194;OTH_N_HET=0;OTH_N_HOMALT=0;OTH_FREQ_HOMREF=1;OTH_FREQ_HET=0;OTH_FREQ_HOMALT=0;POPMAX_AF=0.000105
chr1 849315 gnomAD_v2_INS_1_25 N <INS:ME:ALU> 892 PASS END=849331;SVTYPE=INS;CHR2=1;SVLEN=280;ALGORITHMS=melt;EVIDENCE=SR;PROTEIN_CODING__NEAREST_TSS=SAMD11;PROTEIN_CODING__INTERGENIC;AN=21476;AC=1;AF=4.7e-05;N_BI_GENOS=10738;N_HOMREF=10737;N_HET=1;N_HOMALT=0;FREQ_HOMREF=0.999907;FREQ_HET=9.31272e-05;FREQ_HOMALT=0;AFR_AN=9480;AFR_AC=0;AFR_AF=0;AFR_N_BI_GENOS=4740;AFR_N_HOMREF=4740;AFR_N_HET=0;AFR_N_HOMALT=0;AFR_FREQ_HOMREF=1;AFR_FREQ_HET=0;AFR_FREQ_HOMALT=0;AMR_AN=1784;AMR_AC=0;AMR_AF=0;AMR_N_BI_GENOS=892;AMR_N_HOMREF=892;AMR_N_HET=0;AMR_N_HOMALT=0;AMR_FREQ_HOMREF=1;AMR_FREQ_HET=0;AMR_FREQ_HOMALT=0;EAS_AN=2226;EAS_AC=0;EAS_AF=0;EAS_N_BI_GENOS=1113;EAS_N_HOMREF=1113;EAS_N_HET=0;EAS_N_HOMALT=0;EAS_FREQ_HOMREF=1;EAS_FREQ_HET=0;EAS_FREQ_HOMALT=0;EUR_AN=7598;EUR_AC=1;EUR_AF=0.000132;EUR_N_BI_GENOS=3799;EUR_N_HOMREF=3798;EUR_N_HET=1;EUR_N_HOMALT=0;EUR_FREQ_HOMREF=0.999737;EUR_FREQ_HET=0.000263227;EUR_FREQ_HOMALT=0;OTH_AN=388;OTH_AC=0;OTH_AF=0;OTH_N_BI_GENOS=194;OTH_N_HOMREF=194;OTH_N_HET=0;OTH_N_HOMALT=0;OTH_FREQ_HOMREF=1;OTH_FREQ_HET=0;OTH_FREQ_HOMALT=0;POPMAX_AF=0.000132
chr1 918618 gnomAD_v2_INS_1_27 N <INS:ME:SVA> 949 PASS END=46707098;SVTYPE=INS;CHR2=1;SVLEN=793;ALGORITHMS=delly,melt;EVIDENCE=SR;PROTEIN_CODING__NEAREST_TSS=C1orf170;PROTEIN_CODING__INTERGENIC;AN=21476;AC=1;AF=4.7e-05;N_BI_GENOS=10738;N_HOMREF=10737;N_HET=1;N_HOMALT=0;FREQ_HOMREF=0.999907;FREQ_HET=9.31272e-05;FREQ_HOMALT=0;AFR_AN=9480;AFR_AC=1;AFR_AF=0.000105;AFR_N_BI_GENOS=4740;AFR_N_HOMREF=4739;AFR_N_HET=1;AFR_N_HOMALT=0;AFR_FREQ_HOMREF=0.999789;AFR_FREQ_HET=0.00021097;AFR_FREQ_HOMALT=0;AMR_AN=1784;AMR_AC=0;AMR_AF=0;AMR_N_BI_GENOS=892;AMR_N_HOMREF=892;AMR_N_HET=0;AMR_N_HOMALT=0;AMR_FREQ_HOMREF=1;AMR_FREQ_HET=0;AMR_FREQ_HOMALT=0;EAS_AN=2226;EAS_AC=0;EAS_AF=0;EAS_N_BI_GENOS=1113;EAS_N_HOMREF=1113;EAS_N_HET=0;EAS_N_HOMALT=0;EAS_FREQ_HOMREF=1;EAS_FREQ_HET=0;EAS_FREQ_HOMALT=0;EUR_AN=7598;EUR_AC=0;EUR_AF=0;EUR_N_BI_GENOS=3799;EUR_N_HOMREF=3799;EUR_N_HET=0;EUR_N_HOMALT=0;EUR_FREQ_HOMREF=1;EUR_FREQ_HET=0;EUR_FREQ_HOMALT=0;OTH_AN=388;OTH_AC=0;OTH_AF=0;OTH_N_BI_GENOS=194;OTH_N_HOMREF=194;OTH_N_HET=0;OTH_N_HOMALT=0;OTH_FREQ_HOMREF=1;OTH_FREQ_HET=0;OTH_FREQ_HOMALT=0;POPMAX_AF=0.000105
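To make the five test records easier to compare, here is a small Python sketch that prints the POS-to-END span next to SVLEN for each one. It assumes the span is counted 1-based inclusive (END - POS + 1); the `sv_span` helper is a name I made up, and the INFO column is cut down to the three keys used here (the values are copied from the records above).

```python
def sv_span(vcf_line: str):
    """Return (ID, POS, END, span, SVLEN) for one site-only VCF record."""
    fields = vcf_line.split(None, 7)  # whitespace-split; INFO has no spaces
    pos = int(fields[1])
    info = {}
    for item in fields[7].split(";"):
        key, _, value = item.partition("=")  # flags end up with an empty value
        info[key] = value
    end = int(info["END"])
    return fields[2], pos, end, end - pos + 1, int(info["SVLEN"])


# The five test records, with INFO shortened to END, SVTYPE and SVLEN.
records = [
    "chr1 720320 gnomAD_v2_INS_1_11 N <INS:ME:SVA> 680 PASS END=720340;SVTYPE=INS;SVLEN=968",
    "chr1 818807 gnomAD_v2_INS_1_17 N <INS:ME:ALU> 381 PASS END=818820;SVTYPE=INS;SVLEN=281",
    "chr1 823258 gnomAD_v2_INS_1_18 N <INS:ME:SVA> 295 PASS END=823274;SVTYPE=INS;SVLEN=1205",
    "chr1 849315 gnomAD_v2_INS_1_25 N <INS:ME:ALU> 892 PASS END=849331;SVTYPE=INS;SVLEN=280",
    "chr1 918618 gnomAD_v2_INS_1_27 N <INS:ME:SVA> 949 PASS END=46707098;SVTYPE=INS;SVLEN=793",
]

spans = [sv_span(r) for r in records]
for variant_id, pos, end, span, svlen in spans:
    print(f"{variant_id}\t{pos}\t{end}\tspan={span}\tSVLEN={svlen}")
```

The first four insertions span 14 to 21 bases, while gnomAD_v2_INS_1_27 spans 45788481 bases because its END is 46707098. That is the interval chr1:918618-46707098 that shows up in the output.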
Output:
INFO 2019-06-13 16:53:15 LiftoverVcf Loading up the target reference genome.
INFO 2019-06-13 16:55:26 LiftoverVcf Lifting variants over and sorting.
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 1227 because intersection length 200 < minMatchSize 4.5788481E7 (4.367911E-6 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 2293 because intersection length 27 < minMatchSize 4.5788481E7 (5.89668E-7 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 1032141 because intersection length 1397 < minMatchSize 4.5788481E7 (3.0509858E-5 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 2693354 because intersection length 55 < minMatchSize 4.5788481E7 (1.2011755E-6 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 1769 because intersection length 11594 < minMatchSize 4.5788481E7 (2.532078E-4 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 240 because intersection length 39222 < minMatchSize 4.5788481E7 (8.56591E-4 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 237546 because intersection length 127 < minMatchSize 4.5788481E7 (2.7736235E-6 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 1953 because intersection length 931 < minMatchSize 4.5788481E7 (2.0332625E-5 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 2306473 because intersection length 686 < minMatchSize 4.5788481E7 (1.4981934E-5 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 883954 because intersection length 78 < minMatchSize 4.5788481E7 (1.7034853E-6 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 1579610 because intersection length 747 < minMatchSize 4.5788481E7 (1.6314147E-5 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 28292 because intersection length 85 < minMatchSize 4.5788481E7 (1.8563621E-6 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 94258 because intersection length 4619 < minMatchSize 4.5788481E7 (1.008769E-4 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 1282 because intersection length 808 < minMatchSize 4.5788481E7 (1.764636E-5 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 2 because intersection length 44949950 < minMatchSize 4.5788481E7 (0.98168695 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 3340 because intersection length 322 < minMatchSize 4.5788481E7 (7.0323367E-6 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 778 because intersection length 78610 < minMatchSize 4.5788481E7 (0.0017168074 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 5498870 because intersection length 145 < minMatchSize 4.5788481E7 (3.1667355E-6 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 2997720 because intersection length 153 < minMatchSize 4.5788481E7 (3.341452E-6 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 392 because intersection length 580 < minMatchSize 4.5788481E7 (1.2666942E-5 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 339 because intersection length 23647 < minMatchSize 4.5788481E7 (5.1643996E-4 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 939 because intersection length 69035 < minMatchSize 4.5788481E7 (0.0015076937 < 1.0)
INFO 2019-06-13 16:55:26 LiftOver Interval chr1:918618-46707098 failed to match chain 14906583 because intersection length 52 < minMatchSize 4.5788481E7 (1.1356568E-6 < 1.0)
INFO 2019-06-13 16:55:26 LiftoverVcf Processed 5 variants.
INFO 2019-06-13 16:55:26 LiftoverVcf 1 variants failed to liftover.
INFO 2019-06-13 16:55:26 LiftoverVcf 4 variants lifted over but had mismatching reference alleles after lift over.
INFO 2019-06-13 16:55:26 LiftoverVcf 100.0000% of variants were not successfully lifted over and written to the output.
INFO 2019-06-13 16:55:26 LiftoverVcf Writing out sorted records to final VCF.
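For what it's worth, the numbers in the chain 2 message can be reproduced by hand. This is a minimal Python sketch, assuming that minMatchSize is the interval length (1-based inclusive) multiplied by a minimum match fraction of 1.0, and that the value in parentheses is the intersection length divided by the interval length. Those are my readings of the log, not something I have confirmed in the Picard source.

```python
# Interval and chain 2 values copied from the log above.
start, end = 918618, 46707098
intersection_length = 44949950

# Assumption: the "1.0" in the log is the required fraction of the interval
# that has to be covered by a single chain.
min_match_fraction = 1.0

interval_length = end - start + 1                  # 1-based inclusive span
min_match_size = min_match_fraction * interval_length
matched_fraction = intersection_length / interval_length
uncovered_bases = interval_length - intersection_length

print(f"interval length  : {interval_length}")
print(f"minMatchSize     : {min_match_size:.7E}")
print(f"matched fraction : {matched_fraction:.6f}")
print(f"uncovered bases  : {uncovered_bases}")

# Summary lines: 1 failed + 4 mismatched out of 5 processed.
processed, failed, mismatched = 5, 1, 4
print(f"not lifted       : {100.0 * (failed + mismatched) / processed:.4f}%")
```

So the 4.5788481E7 in every message is just the length of chr1:918618-46707098, and chain 2 covers about 98.17% of it, leaving 838531 bases uncovered, which is below the required 100%.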
cat rejected_variants.vcf
##fileformat=VCFv4.2
##ALT=<ID=BND,Description="Translocation">
##ALT=<ID=CPX,Description="Complex SV">
##ALT=<ID=CTX,Description="Reciprocal chromosomal translocation">
##ALT=<ID=DEL,Description="Deletion">
##ALT=<ID=DUP,Description="Duplication">
##ALT=<ID=INS,Description="Insertion">
##ALT=<ID=INS:ME,Description="Mobile element insertion of unspecified ME class">
##ALT=<ID=INS:ME:ALU,Description="Alu element insertion">
##ALT=<ID=INS:ME:LINE1,Description="LINE1 element insertion">
##ALT=<ID=INS:ME:SVA,Description="SVA element insertion">
##ALT=<ID=INS:UNK,Description="Sequence insertion of unspecified origin">
##ALT=<ID=INV,Description="Inversion">
##CPX_TYPE_CCR="Complex chromosomal rearrangement, involving two or more chromosomes and multiple SV signatures."
##CPX_TYPE_INS_iDEL="Insertion with deletion at insertion site."
##CPX_TYPE_INVdel="Complex inversion with 3' flanking deletion."
##CPX_TYPE_INVdup="Complex inversion with 3' flanking duplication."
##CPX_TYPE_dDUP="Dispersed duplication."
##CPX_TYPE_dDUP_iDEL="Dispersed duplication with deletion at insertion site."
##CPX_TYPE_delINV="Complex inversion with 5' flanking deletion."
##CPX_TYPE_delINVdel="Complex inversion with 5' and 3' flanking deletions."
##CPX_TYPE_delINVdup="Complex inversion with 5' flanking deletion and 3' flanking duplication."
##CPX_TYPE_dupINV="Complex inversion with 5' flanking duplication."
##CPX_TYPE_dupINVdel="Complex inversion with 5' flanking duplication and 3' flanking deletion."
##CPX_TYPE_dupINVdup="Complex inversion with 5' and 3' flanking duplications."
##CPX_TYPE_piDUP_FR="Palindromic inverted tandem duplication, forward-reverse orientation."
##CPX_TYPE_piDUP_RF="Palindromic inverted tandem duplication, reverse-forward orientation."
##FILTER=<ID=MULTIALLELIC,Description="Multiallelic site">
##FILTER=<ID=MismatchedRefAllele,Description="Reference allele does not match reference genome sequence after liftover.">
##FILTER=<ID=NoTarget,Description="Variant could not be lifted between genome builds.">
##FILTER=<ID=PASS,Description="All filters passed">
##FILTER=<ID=PCRPLUS_ENRICHED,Description="Site enriched for non-reference genotypes among PCR+ samples. Likely reflects technical batch effects. All PCR- samples have been assigned null GTs for these sites.>">
##FILTER=<ID=PREDICTED_GENOTYPING_ARTIFACT,Description="Site is predicted to be a genotyping false-positive based on analysis of minimum GQs prior to GQ filtering.">
##FILTER=<ID=ReverseComplementedIndel,Description="Indel falls into a reverse complemented region in the target genome.">
##FILTER=<ID=UNRESOLVED,Description="Variant is unresolved">
##FILTER=<ID=VARIABLE_ACROSS_BATCHES,Description="Site appears at variable frequencies across batches. Likely reflects technical batch effects.>">
##INFO=<ID=AC,Number=A,Type=Integer,Description="Number of non-reference alleles observed (for biallelic sites) or individuals at each copy state (for multiallelic sites).">
##INFO=<ID=AF,Number=A,Type=Float,Description="Allele frequency (for biallelic sites) or copy-state frequency (for multiallelic sites).">
##INFO=<ID=AFR_AC,Number=A,Type=Integer,Description="Number of non-reference AFR alleles observed (for biallelic sites) or AFR individuals at each copy state (for multiallelic sites).">
##INFO=<ID=AFR_AF,Number=A,Type=Float,Description="AFR allele frequency (for biallelic sites) or AFR copy-state frequency (for multiallelic sites).">
##INFO=<ID=AFR_AN,Number=1,Type=Integer,Description="Total number of AFR alleles genotyped (for biallelic sites) or AFR individuals with copy-state estimates (for multiallelic sites).">
##INFO=<ID=AFR_FREQ_HET,Number=1,Type=Float,Description="AFR heterozygous genotype frequency (biallelic sites only).">
##INFO=<ID=AFR_FREQ_HOMALT,Number=1,Type=Float,Description="AFR homozygous alternate genotype frequency (biallelic sites only).">
##INFO=<ID=AFR_FREQ_HOMREF,Number=1,Type=Float,Description="AFR homozygous reference genotype frequency (biallelic sites only).">
##INFO=<ID=AFR_N_BI_GENOS,Number=1,Type=Integer,Description="Total number of AFR individuals with complete genotypes (biallelic sites only).">
##INFO=<ID=AFR_N_HET,Number=1,Type=Integer,Description="Number of AFR individuals with heterozygous genotypes (biallelic sites only).">
##INFO=<ID=AFR_N_HOMALT,Number=1,Type=Integer,Description="Number of AFR individuals with homozygous alternate genotypes (biallelic sites only).">
##INFO=<ID=AFR_N_HOMREF,Number=1,Type=Integer,Description="Number of AFR individuals with homozygous reference genotypes (biallelic sites only).">
##INFO=<ID=ALGORITHMS,Number=.,Type=String,Description="Source algorithms">
##INFO=<ID=AMR_AC,Number=A,Type=Integer,Description="Number of non-reference AMR alleles observed (for biallelic sites) or AMR individuals at each copy state (for multiallelic sites).">
##INFO=<ID=AMR_AF,Number=A,Type=Float,Description="AMR allele frequency (for biallelic sites) or AMR copy-state frequency (for multiallelic sites).">
##INFO=<ID=AMR_AN,Number=1,Type=Integer,Description="Total number of AMR alleles genotyped (for biallelic sites) or AMR individuals with copy-state estimates (for multiallelic sites).">
##INFO=<ID=AMR_FREQ_HET,Number=1,Type=Float,Description="AMR heterozygous genotype frequency (biallelic sites only).">
##INFO=<ID=AMR_FREQ_HOMALT,Number=1,Type=Float,Description="AMR homozygous alternate genotype frequency (biallelic sites only).">
##INFO=<ID=AMR_FREQ_HOMREF,Number=1,Type=Float,Description="AMR homozygous reference genotype frequency (biallelic sites only).">
##INFO=<ID=AMR_N_BI_GENOS,Number=1,Type=Integer,Description="Total number of AMR individuals with complete genotypes (biallelic sites only).">
##INFO=<ID=AMR_N_HET,Number=1,Type=Integer,Description="Number of AMR individuals with heterozygous genotypes (biallelic sites only).">
##INFO=<ID=AMR_N_HOMALT,Number=1,Type=Integer,Description="Number of AMR individuals with homozygous alternate genotypes (biallelic sites only).">
##INFO=<ID=AMR_N_HOMREF,Number=1,Type=Integer,Description="Number of AMR individuals with homozygous reference genotypes (biallelic sites only).">
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles genotyped (for biallelic sites) or individuals with copy-state estimates (for multiallelic sites).">
##INFO=<ID=CHR2,Number=1,Type=String,Description="Chromosome for END coordinate">
##INFO=<ID=CPX_INTERVALS,Number=.,Type=String,Description="Genomic intervals constituting complex variant.">
##INFO=<ID=CPX_TYPE,Number=1,Type=String,Description="Class of complex variant.">
##INFO=<ID=EAS_AC,Number=A,Type=Integer,Description="Number of non-reference EAS alleles observed (for biallelic sites) or EAS individuals at each copy state (for multiallelic sites).">
##INFO=<ID=EAS_AF,Number=A,Type=Float,Description="EAS allele frequency (for biallelic sites) or EAS copy-state frequency (for multiallelic sites).">
##INFO=<ID=EAS_AN,Number=1,Type=Integer,Description="Total number of EAS alleles genotyped (for biallelic sites) or EAS individuals with copy-state estimates (for multiallelic sites).">
##INFO=<ID=EAS_FREQ_HET,Number=1,Type=Float,Description="EAS heterozygous genotype frequency (biallelic sites only).">
##INFO=<ID=EAS_FREQ_HOMALT,Number=1,Type=Float,Description="EAS homozygous alternate genotype frequency (biallelic sites only).">
##INFO=<ID=EAS_FREQ_HOMREF,Number=1,Type=Float,Description="EAS homozygous reference genotype frequency (biallelic sites only).">
##INFO=<ID=EAS_N_BI_GENOS,Number=1,Type=Integer,Description="Total number of EAS individuals with complete genotypes (biallelic sites only).">
##INFO=<ID=EAS_N_HET,Number=1,Type=Integer,Description="Number of EAS individuals with heterozygous genotypes (biallelic sites only).">
##INFO=<ID=EAS_N_HOMALT,Number=1,Type=Integer,Description="Number of EAS individuals with homozygous alternate genotypes (biallelic sites only).">
##INFO=<ID=EAS_N_HOMREF,Number=1,Type=Integer,Description="Number of EAS individuals with homozygous reference genotypes (biallelic sites only).">
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the structural variant">
##INFO=<ID=EUR_AC,Number=A,Type=Integer,Description="Number of non-reference EUR alleles observed (for biallelic sites) or EUR individuals at each copy state (for multiallelic sites).">
##INFO=<ID=EUR_AF,Number=A,Type=Float,Description="EUR allele frequency (for biallelic sites) or EUR copy-state frequency (for multiallelic sites).">
##INFO=<ID=EUR_AN,Number=1,Type=Integer,Description="Total number of EUR alleles genotyped (for biallelic sites) or EUR individuals with copy-state estimates (for multiallelic sites).">
##INFO=<ID=EUR_FREQ_HET,Number=1,Type=Float,Description="EUR heterozygous genotype frequency (biallelic sites only).">
##INFO=<ID=EUR_FREQ_HOMALT,Number=1,Type=Float,Description="EUR homozygous alternate genotype frequency (biallelic sites only).">
##INFO=<ID=EUR_FREQ_HOMREF,Number=1,Type=Float,Description="EUR homozygous reference genotype frequency (biallelic sites only).">
##INFO=<ID=EUR_N_BI_GENOS,Number=1,Type=Integer,Description="Total number of EUR individuals with complete genotypes (biallelic sites only).">
##INFO=<ID=EUR_N_HET,Number=1,Type=Integer,Description="Number of EUR individuals with heterozygous genotypes (biallelic sites only).">
##INFO=<ID=EUR_N_HOMALT,Number=1,Type=Integer,Description="Number of EUR individuals with homozygous alternate genotypes (biallelic sites only).">
##INFO=<ID=EUR_N_HOMREF,Number=1,Type=Integer,Description="Number of EUR individuals with homozygous reference genotypes (biallelic sites only).">
##INFO=<ID=EVIDENCE,Number=.,Type=String,Description="Classes of random forest support.">
##INFO=<ID=FREQ_HET,Number=1,Type=Float,Description="Heterozygous genotype frequency (biallelic sites only).">
##INFO=<ID=FREQ_HOMALT,Number=1,Type=Float,Description="Homozygous alternate genotype frequency (biallelic sites only).">
##INFO=<ID=FREQ_HOMREF,Number=1,Type=Float,Description="Homozygous reference genotype frequency (biallelic sites only).">
##INFO=<ID=N_BI_GENOS,Number=1,Type=Integer,Description="Total number of individuals with complete genotypes (biallelic sites only).">
##INFO=<ID=N_HET,Number=1,Type=Integer,Description="Number of individuals with heterozygous genotypes (biallelic sites only).">
##INFO=<ID=N_HOMALT,Number=1,Type=Integer,Description="Number of individuals with homozygous alternate genotypes (biallelic sites only).">
##INFO=<ID=N_HOMREF,Number=1,Type=Integer,Description="Number of individuals with homozygous reference genotypes (biallelic sites only).">
##INFO=<ID=OTH_AC,Number=A,Type=Integer,Description="Number of non-reference OTH alleles observed (for biallelic sites) or OTH individuals at each copy state (for multiallelic sites).">
##INFO=<ID=OTH_AF,Number=A,Type=Float,Description="OTH allele frequency (for biallelic sites) or OTH copy-state frequency (for multiallelic sites).">
##INFO=<ID=OTH_AN,Number=1,Type=Integer,Description="Total number of OTH alleles genotyped (for biallelic sites) or OTH individuals with copy-state estimates (for multiallelic sites).">
##INFO=<ID=OTH_FREQ_HET,Number=1,Type=Float,Description="OTH heterozygous genotype frequency (biallelic sites only).">
##INFO=<ID=OTH_FREQ_HOMALT,Number=1,Type=Float,Description="OTH homozygous alternate genotype frequency (biallelic sites only).">
##INFO=<ID=OTH_FREQ_HOMREF,Number=1,Type=Float,Description="OTH homozygous reference genotype frequency (biallelic sites only).">
##INFO=<ID=OTH_N_BI_GENOS,Number=1,Type=Integer,Description="Total number of OTH individuals with complete genotypes (biallelic sites only).">
##INFO=<ID=OTH_N_HET,Number=1,Type=Integer,Description="Number of OTH individuals with heterozygous genotypes (biallelic sites only).">
##INFO=<ID=OTH_N_HOMALT,Number=1,Type=Integer,Description="Number of OTH individuals with homozygous alternate genotypes (biallelic sites only).">
##INFO=<ID=OTH_N_HOMREF,Number=1,Type=Integer,Description="Number of OTH individuals with homozygous reference genotypes (biallelic sites only).">
##INFO=<ID=PCRPLUS_DEPLETED,Number=0,Type=Flag,Description="Site depleted for non-reference genotypes among PCR+ samples. Likely reflects technical batch effects. All PCR+ samples have been assigned null GTs for these sites.">
##INFO=<ID=PESR_GT_OVERDISPERSION,Number=0,Type=Flag,Description="PESR genotyping data is overdispersed. Flags sites where genotypes are likely noisier.">
##INFO=<ID=POPMAX_AF,Number=1,Type=Float,Description="Maximum allele frequency across any population (biallelic sites only).">
##INFO=<ID=PROTEIN_CODING__COPY_GAIN,Number=.,Type=String,Description="Gene(s) on which the SV is predicted to have a copy-gain effect.">
##INFO=<ID=PROTEIN_CODING__DUP_LOF,Number=.,Type=String,Description="Gene(s) on which the SV is predicted to have a loss-of-function effect via intragenic exonic duplication.">
##INFO=<ID=PROTEIN_CODING__DUP_PARTIAL,Number=.,Type=String,Description="Gene(s) which are partially overlapped by an SV's duplication, such that an unaltered copy is preserved.">
##INFO=<ID=PROTEIN_CODING__INTERGENIC,Number=0,Type=Flag,Description="SV does not overlap coding sequence.">
##INFO=<ID=PROTEIN_CODING__INTRONIC,Number=.,Type=String,Description="Gene(s) where the SV was found to lie entirely within an intron.">
##INFO=<ID=PROTEIN_CODING__INV_SPAN,Number=.,Type=String,Description="Gene(s) which are entirely spanned by an SV's inversion.">
##INFO=<ID=PROTEIN_CODING__LOF,Number=.,Type=String,Description="Gene(s) on which the SV is predicted to have a loss-of-function effect.">
##INFO=<ID=PROTEIN_CODING__MSV_EXON_OVR,Number=.,Type=String,Description="Gene(s) on which the multiallelic SV would be predicted to have a LOF, DUP_LOF, COPY_GAIN, or DUP_PARTIAL annotation if the SV were biallelic.">
##INFO=<ID=PROTEIN_CODING__NEAREST_TSS,Number=.,Type=String,Description="Nearest transcription start site to intragenic variants.">
##INFO=<ID=PROTEIN_CODING__PROMOTER,Number=.,Type=String,Description="Genes whose promoter sequence (1 kb) was disrupted by SV.">
##INFO=<ID=PROTEIN_CODING__UTR,Number=.,Type=String,Description="Gene(s) for which the SV is predicted to disrupt a UTR.">
##INFO=<ID=SOURCE,Number=1,Type=String,Description="Source of inserted sequence.">
##INFO=<ID=STRANDS,Number=1,Type=String,Description="Breakpoint strandedness [++,+-,-+,--]">
##INFO=<ID=SVLEN,Number=1,Type=Integer,Description="SV length">
##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of structural variant">
##INFO=<ID=UNRESOLVED_TYPE,Number=1,Type=String,Description="Class of unresolved variant.">
##contig=<ID=1,length=249250621>
##contig=<ID=2,length=243199373>
##contig=<ID=3,length=198022430>
##contig=<ID=4,length=191154276>
##contig=<ID=5,length=180915260>
##contig=<ID=6,length=171115067>
##contig=<ID=7,length=159138663>
##contig=<ID=8,length=146364022>
##contig=<ID=9,length=141213431>
##contig=<ID=10,length=135534747>
##contig=<ID=11,length=135006516>
##contig=<ID=12,length=133851895>
##contig=<ID=13,length=115169878>
##contig=<ID=14,length=107349540>
##contig=<ID=15,length=102531392>
##contig=<ID=16,length=90354753>
##contig=<ID=17,length=81195210>
##contig=<ID=18,length=78077248>
##contig=<ID=19,length=59128983>
##contig=<ID=20,length=63025520>
##contig=<ID=21,length=48129895>
##contig=<ID=22,length=51304566>
##contig=<ID=X,length=155270560>
##contig=<ID=Y,length=59373566>
#CHROM POS ID REF ALT QUAL FILTER INFO
chr1 720320 gnomAD_v2_INS_1_11 N <INS:ME:SVA> 680 MismatchedRefAllele AC=1;AF=4.7e-05;AFR_AC=0;AFR_AF=0;AFR_AN=9480;AFR_FREQ_HET=0;AFR_FREQ_HOMALT=0;AFR_FREQ_HOMREF=1;AFR_N_BI_GENOS=4740;AFR_N_HET=0;AFR_N_HOMALT=0;AFR_N_HOMREF=4740;ALGORITHMS=melt;AMR_AC=0;AMR_AF=0;AMR_AN=1784;AMR_FREQ_HET=0;AMR_FREQ_HOMALT=0;AMR_FREQ_HOMREF=1;AMR_N_BI_GENOS=892;AMR_N_HET=0;AMR_N_HOMALT=0;AMR_N_HOMREF=892;AN=21476;CHR2=1;EAS_AC=0;EAS_AF=0;EAS_AN=2226;EAS_FREQ_HET=0;EAS_FREQ_HOMALT=0;EAS_FREQ_HOMREF=1;EAS_N_BI_GENOS=1113;EAS_N_HET=0;EAS_N_HOMALT=0;EAS_N_HOMREF=1113;END=720340;EUR_AC=1;EUR_AF=0.000132;EUR_AN=7598;EUR_FREQ_HET=0.000263227;EUR_FREQ_HOMALT=0;EUR_FREQ_HOMREF=0.999737;EUR_N_BI_GENOS=3799;EUR_N_HET=1;EUR_N_HOMALT=0;EUR_N_HOMREF=3798;EVIDENCE=SR;FREQ_HET=9.31272e-05;FREQ_HOMALT=0;FREQ_HOMREF=0.999907;N_BI_GENOS=10738;N_HET=1;N_HOMALT=0;N_HOMREF=10737;OTH_AC=0;OTH_AF=0;OTH_AN=388;OTH_FREQ_HET=0;OTH_FREQ_HOMALT=0;OTH_FREQ_HOMREF=1;OTH_N_BI_GENOS=194;OTH_N_HET=0;OTH_N_HOMALT=0;OTH_N_HOMREF=194;POPMAX_AF=0.000132;PROTEIN_CODING__INTERGENIC;PROTEIN_CODING__NEAREST_TSS=AL645608.2;SVLEN=968;SVTYPE=INS
chr1 818807 gnomAD_v2_INS_1_17 N <INS:ME:ALU> 381 MismatchedRefAllele AC=11;AF=0.000512;AFR_AC=1;AFR_AF=0.000105;AFR_AN=9480;AFR_FREQ_HET=0.00021097;AFR_FREQ_HOMALT=0;AFR_FREQ_HOMREF=0.999789;AFR_N_BI_GENOS=4740;AFR_N_HET=1;AFR_N_HOMALT=0;AFR_N_HOMREF=4739;ALGORITHMS=melt;AMR_AC=0;AMR_AF=0;AMR_AN=1784;AMR_FREQ_HET=0;AMR_FREQ_HOMALT=0;AMR_FREQ_HOMREF=1;AMR_N_BI_GENOS=892;AMR_N_HET=0;AMR_N_HOMALT=0;AMR_N_HOMREF=892;AN=21476;CHR2=1;EAS_AC=0;EAS_AF=0;EAS_AN=2226;EAS_FREQ_HET=0;EAS_FREQ_HOMALT=0;EAS_FREQ_HOMREF=1;EAS_N_BI_GENOS=1113;EAS_N_HET=0;EAS_N_HOMALT=0;EAS_N_HOMREF=1113;END=818820;EUR_AC=10;EUR_AF=0.001316;EUR_AN=7598;EUR_FREQ_HET=0.00263227;EUR_FREQ_HOMALT=0;EUR_FREQ_HOMREF=0.997368;EUR_N_BI_GENOS=3799;EUR_N_HET=10;EUR_N_HOMALT=0;EUR_N_HOMREF=3789;EVIDENCE=SR;FREQ_HET=0.0010244;FREQ_HOMALT=0;FREQ_HOMREF=0.998976;N_BI_GENOS=10738;N_HET=11;N_HOMALT=0;N_HOMREF=10727;OTH_AC=0;OTH_AF=0;OTH_AN=388;OTH_FREQ_HET=0;OTH_FREQ_HOMALT=0;OTH_FREQ_HOMREF=1;OTH_N_BI_GENOS=194;OTH_N_HET=0;OTH_N_HOMALT=0;OTH_N_HOMREF=194;POPMAX_AF=0.001316;PROTEIN_CODING__INTRONIC=AL645608.2;SVLEN=281;SVTYPE=INS
chr1 823258 gnomAD_v2_INS_1_18 N <INS:ME:SVA> 295 MismatchedRefAllele AC=1;AF=4.7e-05;AFR_AC=1;AFR_AF=0.000105;AFR_AN=9480;AFR_FREQ_HET=0.00021097;AFR_FREQ_HOMALT=0;AFR_FREQ_HOMREF=0.999789;AFR_N_BI_GENOS=4740;AFR_N_HET=1;AFR_N_HOMALT=0;AFR_N_HOMREF=4739;ALGORITHMS=melt;AMR_AC=0;AMR_AF=0;AMR_AN=1784;AMR_FREQ_HET=0;AMR_FREQ_HOMALT=0;AMR_FREQ_HOMREF=1;AMR_N_BI_GENOS=892;AMR_N_HET=0;AMR_N_HOMALT=0;AMR_N_HOMREF=892;AN=21476;CHR2=1;EAS_AC=0;EAS_AF=0;EAS_AN=2226;EAS_FREQ_HET=0;EAS_FREQ_HOMALT=0;EAS_FREQ_HOMREF=1;EAS_N_BI_GENOS=1113;EAS_N_HET=0;EAS_N_HOMALT=0;EAS_N_HOMREF=1113;END=823274;EUR_AC=0;EUR_AF=0;EUR_AN=7598;EUR_FREQ_HET=0;EUR_FREQ_HOMALT=0;EUR_FREQ_HOMREF=1;EUR_N_BI_GENOS=3799;EUR_N_HET=0;EUR_N_HOMALT=0;EUR_N_HOMREF=3799;EVIDENCE=SR;FREQ_HET=9.31272e-05;FREQ_HOMALT=0;FREQ_HOMREF=0.999907;N_BI_GENOS=10738;N_HET=1;N_HOMALT=0;N_HOMREF=10737;OTH_AC=0;OTH_AF=0;OTH_AN=388;OTH_FREQ_HET=0;OTH_FREQ_HOMALT=0;OTH_FREQ_HOMREF=1;OTH_N_BI_GENOS=194;OTH_N_HET=0;OTH_N_HOMALT=0;OTH_N_HOMREF=194;POPMAX_AF=0.000105;PROTEIN_CODING__INTERGENIC;PROTEIN_CODING__NEAREST_TSS=SAMD11;SVLEN=1205;SVTYPE=INS
chr1 849315 gnomAD_v2_INS_1_25 N <INS:ME:ALU> 892 MismatchedRefAllele AC=1;AF=4.7e-05;AFR_AC=0;AFR_AF=0;AFR_AN=9480;AFR_FREQ_HET=0;AFR_FREQ_HOMALT=0;AFR_FREQ_HOMREF=1;AFR_N_BI_GENOS=4740;AFR_N_HET=0;AFR_N_HOMALT=0;AFR_N_HOMREF=4740;ALGORITHMS=melt;AMR_AC=0;AMR_AF=0;AMR_AN=1784;AMR_FREQ_HET=0;AMR_FREQ_HOMALT=0;AMR_FREQ_HOMREF=1;AMR_N_BI_GENOS=892;AMR_N_HET=0;AMR_N_HOMALT=0;AMR_N_HOMREF=892;AN=21476;CHR2=1;EAS_AC=0;EAS_AF=0;EAS_AN=2226;EAS_FREQ_HET=0;EAS_FREQ_HOMALT=0;EAS_FREQ_HOMREF=1;EAS_N_BI_GENOS=1113;EAS_N_HET=0;EAS_N_HOMALT=0;EAS_N_HOMREF=1113;END=849331;EUR_AC=1;EUR_AF=0.000132;EUR_AN=7598;EUR_FREQ_HET=0.000263227;EUR_FREQ_HOMALT=0;EUR_FREQ_HOMREF=0.999737;EUR_N_BI_GENOS=3799;EUR_N_HET=1;EUR_N_HOMALT=0;EUR_N_HOMREF=3798;EVIDENCE=SR;FREQ_HET=9.31272e-05;FREQ_HOMALT=0;FREQ_HOMREF=0.999907;N_BI_GENOS=10738;N_HET=1;N_HOMALT=0;N_HOMREF=10737;OTH_AC=0;OTH_AF=0;OTH_AN=388;OTH_FREQ_HET=0;OTH_FREQ_HOMALT=0;OTH_FREQ_HOMREF=1;OTH_N_BI_GENOS=194;OTH_N_HET=0;OTH_N_HOMALT=0;OTH_N_HOMREF=194;POPMAX_AF=0.000132;PROTEIN_CODING__INTERGENIC;PROTEIN_CODING__NEAREST_TSS=SAMD11;SVLEN=280;SVTYPE=INS
chr1 918618 gnomAD_v2_INS_1_27 N <INS:ME:SVA> 949 NoTarget AC=1;AF=4.7e-05;AFR_AC=1;AFR_AF=0.000105;AFR_AN=9480;AFR_FREQ_HET=0.00021097;AFR_FREQ_HOMALT=0;AFR_FREQ_HOMREF=0.999789;AFR_N_BI_GENOS=4740;AFR_N_HET=1;AFR_N_HOMALT=0;AFR_N_HOMREF=4739;ALGORITHMS=delly,melt;AMR_AC=0;AMR_AF=0;AMR_AN=1784;AMR_FREQ_HET=0;AMR_FREQ_HOMALT=0;AMR_FREQ_HOMREF=1;AMR_N_BI_GENOS=892;AMR_N_HET=0;AMR_N_HOMALT=0;AMR_N_HOMREF=892;AN=21476;CHR2=1;EAS_AC=0;EAS_AF=0;EAS_AN=2226;EAS_FREQ_HET=0;EAS_FREQ_HOMALT=0;EAS_FREQ_HOMREF=1;EAS_N_BI_GENOS=1113;EAS_N_HET=0;EAS_N_HOMALT=0;EAS_N_HOMREF=1113;END=46707098;EUR_AC=0;EUR_AF=0;EUR_AN=7598;EUR_FREQ_HET=0;EUR_FREQ_HOMALT=0;EUR_FREQ_HOMREF=1;EUR_N_BI_GENOS=3799;EUR_N_HET=0;EUR_N_HOMALT=0;EUR_N_HOMREF=3799;EVIDENCE=SR;FREQ_HET=9.31272e-05;FREQ_HOMALT=0;FREQ_HOMREF=0.999907;N_BI_GENOS=10738;N_HET=1;N_HOMALT=0;N_HOMREF=10737;OTH_AC=0;OTH_AF=0;OTH_AN=388;OTH_FREQ_HET=0;OTH_FREQ_HOMALT=0;OTH_FREQ_HOMREF=1;OTH_N_BI_GENOS=194;OTH_N_HET=0;OTH_N_HOMALT=0;OTH_N_HOMREF=194;POPMAX_AF=0.000105;PROTEIN_CODING__INTERGENIC;PROTEIN_CODING__NEAREST_TSS=C1orf170;SVLEN=793;SVTYPE=INS
Is the MismatchedRefAllele rejection caused by the REF allele being N in these records?
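To make the rejected records easier to inspect, here is a minimal sketch (plain Python, not part of Picard; the function name `summarize_rejects` is my own) that tallies the FILTER column and computes the POS-to-END span of each record. It assumes the eight columns are whitespace-separated as pasted above, and the sample lines keep only the END and SVTYPE fields of INFO for brevity.

```python
from collections import Counter

def summarize_rejects(lines):
    """Tally FILTER values and compute POS..END spans for rejected VCF records."""
    reasons = Counter()
    spans = {}
    for line in lines:
        # Skip blank lines and header lines (## metadata and #CHROM).
        if not line.strip() or line.startswith("#"):
            continue
        # Split into the 8 fixed VCF columns; INFO stays in one piece.
        chrom, pos, vid, ref, alt, qual, filt, info = line.split(None, 7)
        reasons[filt] += 1
        end = None
        for field in info.strip().split(";"):
            if field.startswith("END="):
                end = int(field[4:])
        if end is not None:
            # Inclusive interval length from POS to INFO/END.
            spans[vid] = end - int(pos) + 1
    return reasons, spans

# Two of the records above, with INFO trimmed to END and SVTYPE (illustration only).
sample = [
    "#CHROM POS ID REF ALT QUAL FILTER INFO",
    "chr1 720320 gnomAD_v2_INS_1_11 N <INS:ME:SVA> 680 MismatchedRefAllele END=720340;SVTYPE=INS",
    "chr1 918618 gnomAD_v2_INS_1_27 N <INS:ME:SVA> 949 NoTarget END=46707098;SVTYPE=INS",
]

reasons, spans = summarize_rejects(sample)
for vid, span in spans.items():
    print(vid, span)
```

For gnomAD_v2_INS_1_27 the span is 46707098 - 918618 + 1 = 45788481, which is the same number as the minMatchSize 4.5788481E7 in the log message. So that message seems to refer to the whole POS-to-END interval of the NoTarget record, not to a single base, although I have not confirmed this against the Picard source.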
Thank you