Generating Positional List from VCF
I'd like to generate a 1-based position list from a VCF file for all variants.
I believe that, by VCF convention, the position listed in the
POS column is the substituted base itself for a single nucleotide substitution, but the base preceding the event for both insertions and deletions.
So, to give the position of each variant as a
start-end range, I thought a script could take the position
N provided by the VCF and convert it as follows:
Insertion: N to N+1
SNP: N to N
Deletion: N+1 to N+length(REF)-1
So for the following sample:
CHROM  POS       REF    ALT
11     66091886  T      TTTC
11     66108375  T      G
11     67180763  GTATT  G

I would get:

CHROM  START     END
11     66091886  66091887
11     66108375  66108375
11     67180764  67180767
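A minimal Python sketch of the conversion I have in mind, in case it helps to check the logic. The function name `variant_span` is just something I made up, and it assumes one ALT allele per record and simple indels where REF and ALT share only the single leading base given at POS. Anything else (multi-allelic sites, MNPs, complex events) is rejected rather than guessed at.

```python
def variant_span(pos, ref, alt):
    """Return (start, end), 1-based inclusive, for a simple VCF record.

    Assumes a single ALT allele and, for indels, that POS is the
    anchor base immediately before the inserted or deleted sequence.
    """
    if len(ref) == 1 and len(alt) == 1:
        # SNP: POS is the substituted base itself.
        return pos, pos
    if len(ref) == 1 and len(alt) > 1:
        # Insertion: new bases sit between N and N+1.
        return pos, pos + 1
    if len(ref) > 1 and len(alt) == 1:
        # Deletion: removed bases run from N+1 to N+len(REF)-1.
        return pos + 1, pos + len(ref) - 1
    raise ValueError(f"unsupported record: {pos} {ref} {alt}")


# The three sample records from above.
records = [
    ("11", 66091886, "T", "TTTC"),
    ("11", 66108375, "T", "G"),
    ("11", 67180763, "GTATT", "G"),
]

print("CHROM", "START", "END", sep="\t")
for chrom, pos, ref, alt in records:
    start, end = variant_span(pos, ref, alt)
    print(chrom, start, end, sep="\t")
```

Running this on the three sample records reproduces the START and END values in the table above.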
Have I gone about this correctly, and would this method in fact specify where in my alignment each variant itself occurs?