I have had my whole genome sequenced by Veritas, and they have provided VCF.GZ and BAM files with the data. I want to run my data together with the 1000 Genomes data in admixture.
So I downloaded the snps.genotypes VCF.GZ and VCF.GZ.TBI files from here:
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/supporting/hd_genotype_chip/
Using vcftools, I converted them to PED files. Then I did the same with my own VCF.GZ file, converting it to a PED file. I then merged the two datasets in plink and stored the result as BED files.
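A sketch of what these conversion and merge steps typically look like on the command line, wrapped in Python so the commands are easy to adapt. The file names and output prefixes (`my_genome.vcf.gz`, `kg_chip.vcf.gz`, `me`, `kg`, `merged`) are placeholders, and the exact options used may have differed. The script only builds and prints the commands; it does not execute anything.

```python
import shlex


def build_commands(my_vcf, ref_vcf, my_prefix="me", ref_prefix="kg",
                   merged_prefix="merged"):
    """Return the vcftools/plink commands for convert-then-merge.

    All file names and prefixes are hypothetical placeholders.
    """
    return [
        # Convert the 1000 Genomes chip VCF to PED/MAP.
        ["vcftools", "--gzvcf", ref_vcf, "--plink", "--out", ref_prefix],
        # Convert the personal VCF to PED/MAP.
        ["vcftools", "--gzvcf", my_vcf, "--plink", "--out", my_prefix],
        # Merge the two PED/MAP filesets and write BED/BIM/FAM.
        ["plink", "--file", ref_prefix,
         "--merge", my_prefix + ".ped", my_prefix + ".map",
         "--make-bed", "--out", merged_prefix],
    ]


if __name__ == "__main__":
    for cmd in build_commands("my_genome.vcf.gz", "kg_chip.vcf.gz"):
        print(shlex.join(cmd))
```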
I then ran the merged files in admixture, which returned the following error:
Error: detected that all genotypes are missing for an individual.
Please apply quality-control filters to remove such individuals.
So I ran the data through plink again, this time with the "--mind" flag, and it removed 1 individual. The problem is, the individual it removed is ME! So the merged dataset contained no genotype data for me!
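To make the "--mind" filter concrete: it drops any individual whose fraction of missing genotype calls exceeds a threshold (0.1 by default in plink 1.9). Below is a minimal sketch of that calculation on PED lines, assuming the default PED layout (six leading columns, then two alleles per SNP, with `0` as the missing allele code). The sample IDs and genotypes are made up for illustration.

```python
def missing_rate(ped_line):
    """Fraction of SNPs with a missing genotype in one PED line.

    Assumes the default PED layout: FID, IID, PAT, MAT, SEX, PHENO,
    followed by two allele columns per SNP, with "0" meaning missing.
    """
    alleles = ped_line.split()[6:]
    n_snps = len(alleles) // 2
    if n_snps == 0:
        return 1.0  # no genotype columns at all
    missing = sum(
        1
        for i in range(n_snps)
        if alleles[2 * i] == "0" or alleles[2 * i + 1] == "0"
    )
    return missing / n_snps


if __name__ == "__main__":
    # Hypothetical example lines, three SNPs each.
    lines = [
        "FAM1 ME 0 0 0 -9 0 0 0 0 0 0",
        "FAM2 SAMPLE2 0 0 0 -9 A A A G 0 0",
    ]
    for line in lines:
        iid = line.split()[1]
        print(iid, missing_rate(line))
```

An individual whose line looks like the first one has a missing rate of 1.0, which is exactly the case admixture complains about.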
It seems this is because my VCF file doesn't contain genotype data, and I'd like to fix that. Unfortunately, as I work on this I keep running into terms like "SNP calling", which seem vague and open-ended, and it's unclear how they help me reach my goal. I just want to get my genotype data into the BED file with all the other participants and run it in admixture. My understanding is that GATK can help with this, so what should I do?
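One way to test the assumption that the VCF lacks genotype data is to count how many records carry a GT field with an actual call. This is a minimal sketch, assuming a single-sample VCF with tab-separated columns as in the VCF specification; the file name in the usage note is a placeholder, and the example records are invented.

```python
import gzip
import re


def summarize_genotypes(lines):
    """Count records, records with a GT field, and called genotypes.

    Only the first sample column (column 10) is inspected.
    """
    total = with_gt = called = 0
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        total += 1
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 10:
            continue  # no FORMAT/sample columns on this record
        fmt = cols[8].split(":")
        if "GT" not in fmt:
            continue
        with_gt += 1
        sample = cols[9].split(":")
        idx = fmt.index("GT")
        gt = sample[idx] if idx < len(sample) else "."
        alleles = re.split(r"[/|]", gt)
        # A call counts only if every allele is present (not ".").
        if gt and all(a not in (".", "") for a in alleles):
            called += 1
    return {"records": total, "with_gt": with_gt, "called": called}


def summarize_vcf_gz(path):
    """Run the summary on a gzip-compressed VCF file."""
    with gzip.open(path, "rt") as handle:
        return summarize_genotypes(handle)


if __name__ == "__main__":
    # Invented example records: one call, one missing call, one without GT.
    example = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tME",
        "1\t100\trs1\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:30",
        "1\t200\trs2\tC\tT\t50\tPASS\t.\tGT:DP\t./.:0",
        "1\t300\trs3\tG\tA\t50\tPASS\t.",
    ]
    print(summarize_genotypes(example))
```

Calling `summarize_vcf_gz("my_genome.vcf.gz")` on the real file would show whether genotypes are present at all. If `called` is large, the genotypes exist and the missing data in the merged set has some other cause, which would need separate investigation.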