2 hours ago by igorm


I have a multiallelic record in a VCF file:

1 15274 rs62636497 A G,T . PASS DR2=0.08,0.08;AF=0.454,0.5454;IMP GT:DS:AP1:AP2:GP 1|2:0.91,1.09:0.6,0.4:0.31,0.69:0,0,0.19,0,0.54,0.28

From the record I can see that the genotype is G|T.

Then I split the multiallelic record into biallelic records with bcftools norm and get:

1 15274 rs62636497 A G . PASS DR2=0.08,0.08;AF=0.454;IMP GT:DS:AP1:AP2:GP 1|0:0.91:0.6:0.31:0,0,0.19

1 15274 rs62636497 A T . PASS DR2=0.08,0.08;AF=0.5454;IMP GT:DS:AP1:AP2:GP 0|1:1.09:0.4:0.69:0,0,0.28

To recover the G|T genotype, I need to read the non-reference allele from both biallelic records and combine them. Is this the rule I should apply? Or is the genotype information lost in such cases? (I noticed that gatk LeftAlignAndTrimVariants gives me ./. for both records for this SNP when I use --split-multi-allelics on the multiallelic record.)
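
The combining rule described above can be sketched as follows. This is only a minimal illustration, assuming phased genotypes and a split where alleles belonging to the other record are written as 0, as in the bcftools norm output shown here. The function name `combine_split_genotypes` and its input layout are hypothetical, not part of any tool.

```python
def combine_split_genotypes(ref, records):
    """Rebuild per-haplotype alleles from split biallelic records at one site.

    ref     -- the REF allele, e.g. "A"
    records -- list of (alt, gt) pairs, one per biallelic record,
               e.g. [("G", "1|0"), ("T", "0|1")]
    Assumes phased GT strings ("|"), so haplotype positions are comparable
    across records. Hypothetical helper for illustration only.
    """
    ploidy = None
    calls = []       # ALT allele chosen for each haplotype, or None
    missing = []     # True if any record reports "." for that haplotype
    for alt, gt in records:
        if "|" not in gt:
            raise ValueError("only phased genotypes are handled: %r" % gt)
        codes = gt.split("|")
        if ploidy is None:
            ploidy = len(codes)
            calls = [None] * ploidy
            missing = [False] * ploidy
        elif len(codes) != ploidy:
            raise ValueError("ploidy differs between records")
        for i, code in enumerate(codes):
            if code == "1":
                if calls[i] is not None:
                    # Two records claim the same haplotype: inconsistent split.
                    raise ValueError("haplotype %d has two ALT alleles" % i)
                calls[i] = alt
            elif code == ".":
                missing[i] = True
    out = []
    for call, miss in zip(calls, missing):
        if call is not None:
            out.append(call)   # an ALT allele was called on this haplotype
        elif miss:
            out.append(".")    # no ALT call and at least one missing value
        else:
            out.append(ref)    # 0 in every record: reference allele
    return "|".join(out)


# The two split records from the example above.
print(combine_split_genotypes("A", [("G", "1|0"), ("T", "0|1")]))
```

With records where both haplotypes come back as `.` (as in the GATK output mentioned above), the sketch returns `.|.`, meaning the genotype cannot be recovered from the split records alone.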


