Hi all,

I want to know why we separate multi-allelic variants from bi-allelic variants in our analyses.

In other words, why do most researchers separate the multi-allelic variants from the bi-allelic ones in their analyses, and keep mostly the bi-allelic variants?




One answer is that multi-allelic calls are difficult to manage from an analysis perspective. In an association test, as in GWAS, each site (variant) is analysed independently, so it makes no difference to split a multi-allelic call into separate bi-allelic calls, one per alternate allele, and test each separately.
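To make the splitting idea concrete, here is a minimal Python sketch, not a replacement for a proper tool (in practice something like `bcftools norm -m -any` is the usual choice). The function names are made up for illustration. It assumes a tab-separated VCF data line whose FORMAT column is GT only, copies INFO unchanged (per-allele INFO fields would need splitting too), and, as an arbitrary choice, marks any other alternate allele as missing (`.`) in the split genotypes.

```python
import re


def remap_genotype(gt, alt_index):
    """Rewrite a GT string (e.g. '1/2') for the record that keeps only ALT number alt_index.

    Assumption: alleles other than REF and the kept ALT are set to missing ('.').
    """
    tokens = re.split(r"([/|])", gt)  # keep the '/' or '|' separators
    remapped = []
    for token in tokens:
        if token in ("/", "|", ".", "0"):
            remapped.append(token)      # separator, missing, or reference allele
        elif int(token) == alt_index:
            remapped.append("1")        # the ALT kept in this record
        else:
            remapped.append(".")        # a different ALT allele
    return "".join(remapped)


def split_multiallelic(vcf_line):
    """Split one VCF data line into one bi-allelic line per ALT allele."""
    fields = vcf_line.rstrip("\n").split("\t")
    alts = fields[4].split(",")         # column 5 is ALT
    if len(alts) == 1:
        return ["\t".join(fields)]      # already bi-allelic
    records = []
    for alt_index, alt in enumerate(alts, start=1):
        new_fields = list(fields)
        new_fields[4] = alt
        for col in range(9, len(fields)):   # sample columns start at column 10
            new_fields[col] = remap_genotype(fields[col], alt_index)
        records.append("\t".join(new_fields))
    return records


# Hypothetical tri-allelic site with three samples
example = "1\t100\t.\tA\tC,T\t.\tPASS\t.\tGT\t0/1\t1/2\t2/2"
for record in split_multiallelic(example):
    print(record)
```

With this choice, a sample with genotype 1/2 becomes 1/. in the record for the first ALT and ./1 in the record for the second. Other conventions exist, for example recoding the other ALT as reference, and which one is appropriate depends on the downstream test.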

Also, in a germline sample, it makes little sense biologically for a multi-allelic call to be present at all, since a diploid individual carries at most two alleles at a site, unless the VCF contains data from more than one individual. On the other hand, in a cancer context, considering a bulk tumour biopsy sample in which cells from different subclones are mixed, we would expect many multi-allelic calls.

