Hello GATK community,
I would like your comments/suggestions on my strategy.
I have F0 samples with two different phenotypes.
I have F2 samples with unknown phenotype.
I would like to create a library with the F0 genotypes and then genotype my F2 samples using the previously created library.
STRATEGY:
I already have pre-processed BAM files (I have all the raw data if required).
Create genotype library with F0 samples:
GATK HaplotypeCaller for each of the two F0 phenotype samples: java -Xmx30g -jar GenomeAnalysisTK_3-8.jar -nct 16 -T HaplotypeCaller -R GENOME --emitRefConfidence GVCF -I INPUT.bam -o OUTPUT.g.vcf
Merge the results: java -Xmx16g -jar GenomeAnalysisTK_3-8.jar -nt 16 -T GenotypeGVCFs -R GENOME --variant F0Variant1.g.vcf --variant F0Variant2.g.vcf -o Results_Merge_F0.vcf
Then I used a homemade script to keep only the positions where both F0 samples are homozygous and their genotypes differ (for example 1/1 for one F0 sample and 0/0 for the other): Results_Merge_F0_filtered.vcf
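To make the filtering rule concrete, here is a minimal sketch of this kind of filter. It is not the actual script: the function names are hypothetical, and it assumes a two-sample VCF where the two F0 samples sit in columns 10 and 11 and the FORMAT column contains GT.

```python
def parse_gt(sample_field, format_field):
    """Return the GT alleles as a tuple of strings, e.g. ('0', '1')."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    gt = values[keys.index("GT")]
    # Treat phased (|) and unphased (/) genotypes the same way.
    return tuple(gt.replace("|", "/").split("/"))


def is_informative(vcf_line):
    """True if both F0 samples are called, homozygous, and carry different alleles."""
    fields = vcf_line.rstrip("\n").split("\t")
    # Assumption: columns 10 and 11 hold the two F0 samples.
    fmt, sample_a, sample_b = fields[8], fields[9], fields[10]
    gt_a = parse_gt(sample_a, fmt)
    gt_b = parse_gt(sample_b, fmt)
    for gt in (gt_a, gt_b):
        # Reject no-calls (./.) and heterozygous genotypes.
        if "." in gt or len(set(gt)) != 1:
            return False
    return gt_a[0] != gt_b[0]


def filter_vcf(lines):
    """Yield header lines unchanged and keep only informative records."""
    for line in lines:
        if line.startswith("#") or is_informative(line):
            yield line
```

With this rule, a record with genotypes 1/1 and 0/0 is kept, while 0/1 and 0/0, or ./. and 0/0, are dropped.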
Genotype F2 samples with the library:
GATK HaplotypeCaller: java -Xmx30g -jar GenomeAnalysisTK_3-8.jar -nct 16 -T HaplotypeCaller -R GENOME --emitRefConfidence GVCF -I INPUT.bam -o OUTPUT.g.vcf -L $4 Results_Merge_F0_filtered.vcf
Then I used a homemade script to identify, at each position, whether the F2 genotype matches one (or the other) F0 phenotype.
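As an illustration of the assignment step, here is a minimal sketch, not the actual script. The function names and category labels are hypothetical. It assumes that, for each library position, the allele carried by each homozygous F0 sample is already known (for example "1" for phenotype 1 and "0" for phenotype 2).

```python
from collections import Counter


def classify_f2(p1_allele, p2_allele, f2_gt):
    """Classify one F2 genotype string (e.g. '0/1') against the two F0 alleles."""
    alleles = set(f2_gt.replace("|", "/").split("/"))
    if "." in alleles:
        return "missing"
    if alleles == {p1_allele}:
        return "phenotype1"   # homozygous for the phenotype 1 F0 allele
    if alleles == {p2_allele}:
        return "phenotype2"   # homozygous for the phenotype 2 F0 allele
    if alleles == {p1_allele, p2_allele}:
        return "het"          # one allele from each F0 parent
    return "other"            # allele not seen in either F0 sample


def tally(sites):
    """Count categories over (p1_allele, p2_allele, f2_gt) tuples."""
    return Counter(classify_f2(*site) for site in sites)
```

Keeping "missing" and "other" as separate categories makes it easier to see whether uncalled or unexpected genotypes are being mistaken for homozygous calls.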
BUUUUUUT
At this last step, I mostly got homozygous SNPs for my F2 samples...
I should get around 25% phenotype 1 -- 25% phenotype 2 -- 50% phenotype 1/2 (heterozygous).
I am missing something, but I don't know what.
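For reference, the gap between the observed counts and the expected 25% / 50% / 25% split can be quantified with a chi-square goodness-of-fit statistic. This is a minimal sketch: the function name is hypothetical and the counts in the usage note are made-up numbers for illustration, not my results.

```python
def chi_square_1_2_1(n_p1, n_het, n_p2):
    """Chi-square statistic of observed genotype counts against a 1:2:1 ratio."""
    total = n_p1 + n_het + n_p2
    observed = (n_p1, n_het, n_p2)
    expected = (0.25 * total, 0.5 * total, 0.25 * total)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

With 2 degrees of freedom, a statistic above about 5.99 rejects the 1:2:1 ratio at the 5% level. For hypothetical counts of 45 phenotype 1, 10 heterozygous, and 45 phenotype 2 sites, the statistic is 64, far above that threshold.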