Posted 2 hours ago by rimgubaev (Russia/Moscow/Skoltech)

Hello everyone!

I have run into a problem similar to one discussed previously. However, I still do not understand what the common or generally accepted practice is for dealing with a suspiciously high frequency of heterozygous SNPs in a multisample VCF file for inbred lines. To me, there is at least one obvious option: simply remove the heterozygous positions, for example all SNPs that are heterozygous in more than 25% of the individuals. I know this is a rather crude approach, but it seems the easiest and most intuitive. Maybe there are GWAS papers that use a similar approach; if you know of any, please suggest them.
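To make the 25% idea concrete, here is a minimal sketch in plain Python of what I have in mind. It assumes diploid calls, an uncompressed VCF with GT as the first FORMAT field (as the VCF specification requires when GT is present), and it computes the fraction over called genotypes only. The function names and the `max_het` parameter are my own, not from any tool.

```python
def het_fraction(sample_fields):
    """Fraction of called samples whose GT is heterozygous."""
    called = 0
    het = 0
    for field in sample_fields:
        # GT is the first colon-separated subfield; treat phased as unphased.
        gt = field.split(":", 1)[0].replace("|", "/")
        alleles = gt.split("/")
        if "." in alleles:
            continue  # skip missing genotypes
        called += 1
        if len(set(alleles)) > 1:
            het += 1
    return het / called if called else 0.0


def filter_vcf_lines(lines, max_het=0.25):
    """Keep header lines and sites where at most max_het of calls are het."""
    kept = []
    for line in lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        cols = line.rstrip("\n").split("\t")
        # Columns 0-8 are the fixed VCF fields plus FORMAT; samples follow.
        if het_fraction(cols[9:]) <= max_het:
            kept.append(line)
    return kept


example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\tS4",
    "1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/0\t1/1\t0/1\t0/0",
    "1\t200\t.\tC\tT\t50\tPASS\t.\tGT\t0/1\t0/1\t1/1\t./.",
]
filtered = filter_vcf_lines(example)
```

In this toy input the first site has one heterozygote out of four calls and is kept, while the second has two out of three and is dropped. Whether 25% is a sensible cutoff is exactly what I am unsure about.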

Furthermore, maybe SNP filtering based on the PL (Phred-scaled genotype likelihood) field written by GATK should be considered. However, as a biologist, I find it hard to understand which filters should be applied to PL in order to remove potentially false heterozygotes. There is a page on the GATK website explaining how PL is calculated, but it does not discuss how to use these values to remove heterozygotes in a multisample VCF.
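As far as I understand, at a biallelic site PL holds three values in the order 0/0, 0/1, 1/1, normalised so that the most likely genotype gets 0. One thing I could imagine, sketched below, is setting a heterozygous call to missing when the best homozygous alternative is less than some gap away. The threshold of 20 and the function name are my own assumptions, not a GATK recommendation, and the sketch only handles biallelic diploid calls.

```python
def mask_weak_het(fmt, sample, min_gap=20):
    """Return the sample field with GT set to ./. if a het call is weakly
    supported, i.e. the smaller homozygous PL is below min_gap."""
    keys = fmt.split(":")
    values = sample.split(":")
    fields = dict(zip(keys, values))
    if "GT" not in fields or "PL" not in fields:
        return sample  # nothing to evaluate
    alleles = fields["GT"].replace("|", "/").split("/")
    if "." in alleles or len(set(alleles)) == 1:
        return sample  # missing or homozygous: leave untouched
    if "." in fields["PL"]:
        return sample  # PL not available for this sample
    pls = [int(x) for x in fields["PL"].split(",")]
    if len(pls) != 3:
        return sample  # multiallelic site: outside this sketch
    # pls[0] is hom-ref, pls[2] is hom-alt; both should be far from 0.
    if min(pls[0], pls[2]) >= min_gap:
        return sample
    values[keys.index("GT")] = "./."
    return ":".join(values)


fmt = "GT:AD:DP:GQ:PL"
weak = mask_weak_het(fmt, "0/1:5,1:6:8:8,0,150")
strong = mask_weak_het(fmt, "0/1:10,9:19:99:300,0,280")
hom = mask_weak_het(fmt, "1/1:0,12:12:36:400,36,0")
```

Here only the first call is masked, because the hom-ref likelihood is just 8 Phred units worse than the het one. I do not know whether this kind of gap is the right criterion for inbred lines, which is part of my question.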

If you know of tutorials and/or papers devoted to this problem, or have faced it yourself, please share.


