Hello,
I have two different types of data: WES data for 100 cases and WGS data for 200 controls.
I genotyped each set separately with GenotypeGVCFs (producing one WES.vcf and one WGS.vcf) and then ran VQSR and genotype refinement on each set separately (WES.refined.vcf, WGS.refined.vcf).
I processed cases and controls separately because the two data types (WES and WGS) are very different.
After that, I merged the two files (WES+WGS.refined.vcf) and ran an association analysis. The problem is that case-specific variants (those found only in WES.refined.vcf) were set to "./.:.:.:." (missing, with no annotation info) for the control samples. Should I look these variants up in the control file, WGS.refined.vcf, to see whether each one is missing or homozygous reference? Or is there a more convenient way to check this?
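To make the manual check concrete, this is roughly what I have in mind. It is only a minimal sketch in plain Python: the function names (`classify_genotype`, `site_status`) and the demo records are made up for illustration, the input is assumed to be an uncompressed, tab-delimited VCF, and sites are matched on chromosome and position only (alleles are not compared). A site reported as "absent" simply has no record in the file, so from the VCF alone I cannot tell whether it is homozygous reference or not covered.

```python
def classify_genotype(gt):
    """Classify a GT string such as '0/0', './.' or '0|1'."""
    alleles = gt.replace("|", "/").split("/")
    if any(a == "." for a in alleles):
        return "no_call"
    if all(a == "0" for a in alleles):
        return "hom_ref"
    return "variant"


def site_status(vcf_lines, sites):
    """For each (chrom, pos) in sites, return per-sample genotype classes,
    or the string 'absent' if the VCF has no record at that position."""
    wanted = set(sites)
    status = {site: "absent" for site in wanted}
    samples = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        key = (fields[0], int(fields[1]))
        if key not in wanted:
            continue
        gt_index = fields[8].split(":").index("GT")
        status[key] = {
            name: classify_genotype(col.split(":")[gt_index])
            for name, col in zip(samples, fields[9:])
        }
    return status


# Made-up demo records standing in for the control VCF.
demo = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tctrl1\tctrl2\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/0:30\t./.:.\n",
]
result = site_status(demo, [("chr1", 100), ("chr1", 200)])
print(result)
```

On the real data I would pass an open file handle for WGS.refined.vcf instead of `demo`, with the list of case-specific positions taken from WES.refined.vcf.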
Also, in a case like this where the two data types are very different, would you suggest running VQSR and genotype refinement separately, as I did? Or should I combine all the gVCF files into one VCF and then run VQSR and refinement?