I have some SARS-CoV-2 sequencing data that I'm trying to annotate with SnpEff. However, SnpEff doesn't appear to recognize multiple adjacent SNPs within the same codon and annotate them together as a single MNP (multiple-nucleotide polymorphism).

I'm using the GenBank reference MN908947.3. Here are the relevant lines of the annotated VCF file:

MN908947.3  28280   .   G   C   2911.06 PASS    AC=2;AF=1.00;AN=2;DP=65;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;QD=31.56;SOR=8.380;ANN=C|missense_variant|MODERATE|N|Gene_28273_29532|transcript|QHD43423.2|protein_coding|1/1|c.7G>C|p.Asp3His|7/1260|7/1260|3/419||,C|upstream_gene_variant|MODIFIER|ORF10|Gene_29557_29673|transcript|QHI42199.1|protein_coding||c.-1278G>C|||||1278|,C|downstream_gene_variant|MODIFIER|S|Gene_21562_25383|transcript|QHD43416.1|protein_coding||c.*2896G>C|||||2896|,C|downstream_gene_variant|MODIFIER|ORF3a|Gene_25392_26219|transcript|QHD43417.1|protein_coding||c.*2060G>C|||||2060|,C|downstream_gene_variant|MODIFIER|E|Gene_26244_26471|transcript|QHD43418.1|protein_coding||c.*1808G>C|||||1808|,C|downstream_gene_variant|MODIFIER|M|Gene_26522_27190|transcript|QHD43419.1|protein_coding||c.*1089G>C|||||1089|,C|downstream_gene_variant|MODIFIER|ORF6|Gene_27201_27386|transcript|QHD43420.1|protein_coding||c.*893G>C|||||893|,C|downstream_gene_variant|MODIFIER|ORF7a|Gene_27393_27758|transcript|QHD43421.1|protein_coding||c.*521G>C|||||521|,C|downstream_gene_variant|MODIFIER|ORF8|Gene_27893_28258|transcript|QHD43422.1|protein_coding||c.*21G>C|||||21| GT:AD:DP:GQ:PL  1/1:0,65:65:99:2925,196,0
MN908947.3  28281   .   A   T   2912.06 PASS    AC=2;AF=1.00;AN=2;DP=78;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;QD=26.91;SOR=8.380;ANN=T|missense_variant|MODERATE|N|Gene_28273_29532|transcript|QHD43423.2|protein_coding|1/1|c.8A>T|p.Asp3Val|8/1260|8/1260|3/419||,T|upstream_gene_variant|MODIFIER|ORF10|Gene_29557_29673|transcript|QHI42199.1|protein_coding||c.-1277A>T|||||1277|,T|downstream_gene_variant|MODIFIER|S|Gene_21562_25383|transcript|QHD43416.1|protein_coding||c.*2897A>T|||||2897|,T|downstream_gene_variant|MODIFIER|ORF3a|Gene_25392_26219|transcript|QHD43417.1|protein_coding||c.*2061A>T|||||2061|,T|downstream_gene_variant|MODIFIER|E|Gene_26244_26471|transcript|QHD43418.1|protein_coding||c.*1809A>T|||||1809|,T|downstream_gene_variant|MODIFIER|M|Gene_26522_27190|transcript|QHD43419.1|protein_coding||c.*1090A>T|||||1090|,T|downstream_gene_variant|MODIFIER|ORF6|Gene_27201_27386|transcript|QHD43420.1|protein_coding||c.*894A>T|||||894|,T|downstream_gene_variant|MODIFIER|ORF7a|Gene_27393_27758|transcript|QHD43421.1|protein_coding||c.*522A>T|||||522|,T|downstream_gene_variant|MODIFIER|ORF8|Gene_27893_28258|transcript|QHD43422.1|protein_coding||c.*22A>T|||||22| GT:AD:DP:GQ:PL  1/1:0,65:65:99:2926,196,0
MN908947.3  28282   .   T   A   2912.06 PASS    AC=2;AF=1.00;AN=2;DP=78;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;QD=28.71;SOR=8.380;ANN=A|missense_variant|MODERATE|N|Gene_28273_29532|transcript|QHD43423.2|protein_coding|1/1|c.9T>A|p.Asp3Glu|9/1260|9/1260|3/419||,A|upstream_gene_variant|MODIFIER|ORF10|Gene_29557_29673|transcript|QHI42199.1|protein_coding||c.-1276T>A|||||1276|,A|downstream_gene_variant|MODIFIER|S|Gene_21562_25383|transcript|QHD43416.1|protein_coding||c.*2898T>A|||||2898|,A|downstream_gene_variant|MODIFIER|ORF3a|Gene_25392_26219|transcript|QHD43417.1|protein_coding||c.*2062T>A|||||2062|,A|downstream_gene_variant|MODIFIER|E|Gene_26244_26471|transcript|QHD43418.1|protein_coding||c.*1810T>A|||||1810|,A|downstream_gene_variant|MODIFIER|M|Gene_26522_27190|transcript|QHD43419.1|protein_coding||c.*1091T>A|||||1091|,A|downstream_gene_variant|MODIFIER|ORF6|Gene_27201_27386|transcript|QHD43420.1|protein_coding||c.*895T>A|||||895|,A|downstream_gene_variant|MODIFIER|ORF7a|Gene_27393_27758|transcript|QHD43421.1|protein_coding||c.*523T>A|||||523|,A|downstream_gene_variant|MODIFIER|ORF8|Gene_27893_28258|transcript|QHD43422.1|protein_coding||c.*23T>A|||||23| GT:AD:DP:GQ:PL  1/1:0,65:65:99:2926,196,0

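These three positions (28280-28282) make up codon 3 of the N gene. Taken together they change the codon from GAT to CTA, so they should be annotated as a single MNP, p.Asp3Leu. Instead, they've been annotated separately as three individual SNPs (p.Asp3His, p.Asp3Val and p.Asp3Glu), none of which is the actual amino acid change.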
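To spell out why I expect p.Asp3Leu, here is a quick sanity check in Python. It is only a sketch and not part of my pipeline: the codon table covers just the codons involved here, and the CDS start of 28274 is my assumption, inferred from SnpEff reporting position 28280 as c.7.

```python
# Sanity check: apply the three SNPs to codon 3 of N one at a time and all together.
# Assumption: c.1 of N is genomic position 28274 (inferred from c.7 at 28280).
CDS_START = 28274

# Partial codon table, only the codons that occur in this example.
CODON_TABLE = {"GAT": "Asp", "CAT": "His", "GTT": "Val", "GAA": "Glu", "CTA": "Leu"}

# (POS, REF, ALT) taken from the three VCF records.
SNPS = [(28280, "G", "C"), (28281, "A", "T"), (28282, "T", "A")]


def codon_number(pos):
    """1-based codon number for a 1-based genomic position inside the CDS."""
    return (pos - CDS_START) // 3 + 1


def codon_start(pos):
    """Genomic position of the first base of the codon containing pos."""
    return CDS_START + (codon_number(pos) - 1) * 3


def apply_snps(ref_codon, start, snps):
    """Return the codon obtained by substituting every SNP into ref_codon."""
    bases = list(ref_codon)
    for pos, ref, alt in snps:
        offset = pos - start
        if ref_codon[offset] != ref:
            raise ValueError(f"REF mismatch at {pos}")
        bases[offset] = alt
    return "".join(bases)


def protein_change(ref_codon, start, snps):
    """HGVS-style protein change for the SNPs applied jointly to one codon."""
    alt_codon = apply_snps(ref_codon, start, snps)
    number = codon_number(start)
    return f"p.{CODON_TABLE[ref_codon]}{number}{CODON_TABLE[alt_codon]}"


# Reference codon reconstructed from the REF alleles of the three records.
REF_CODON = "".join(ref for _, ref, _ in SNPS)
START = codon_start(SNPS[0][0])

for snp in SNPS:
    print(snp, protein_change(REF_CODON, START, [snp]))  # what SnpEff reports
print("combined", protein_change(REF_CODON, START, SNPS))  # what I expect
```

Applied one at a time, the SNPs give the three annotations SnpEff reported; applied together, they give p.Asp3Leu.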

I'm using the latest version of SnpEff (v5), and according to the manual it should be able to do this. What am I doing wrong?


