Damianos P. Melidis, Leibniz University Hannover

Dear all,

For an analysis, I have gsvar files containing whole-genome variants called against GRCh37.
My approach is to combine the VEP coordinates with transcript.coding_sequence from pyensembl.
For one particular deletion:

chr     start       end         ref     obs

chr1    248616705   248616711   TGCTGCG -

The VEP column of the same variant row reads:

OR2T2:ENST00000342927:frameshift_variant:HIGH:exon1/1:c.612_618del:p.Cys204Ter:PF13853 [Olfactory receptor]

Using pyensembl, the coding sequence at the VEP coordinates is:

>>>seq = "ATGGGCATGGAGGGTCTTCTCCAGAACTCCACTAACTTCGTCCTCACAGGCCTCATCACCCATCCTGCCTTCCCCGGGCTTCTCTTTGCAATAGTCTTCTCCATCTTTGTGGTGGCTATAACAGCCAACTTGGTCATGATTCTGCTCATCCACATGGACTCCCGCCTCCACACACCCATGTACTTCTTGCTCAGCCAGCTCTCCATCATGGATACCATCTACATCTGTATCACTGTCCCCAAGATGCTCCAGGACCTCCTGTCCAAGGACAAGACCATTTCCTTCCTGGGCTGTGCAGTTCAGATCTTCCTCTACCTGACCCTGATTGGAGGGGAATTCTTCCTGCTGGGTCTCATGGCCTATGACCGCTATGTGGCTGTGTGCAACCCTCTACGGTACCCTCTCCTCATGAACCGCAGGGTTTGCTTATTCATGGTGGTCGGCTCCTGGGTTGGTGGTTCCTTGGATGGGTTCATGCTGACTCCTGTCACTATGAGTTTCCCCTTCTGTAGATCCCGAGAGATCAATCACTTTTTCTGTGAGATCCCAGCCGTGCTGAAGTTGTCTTGCACAGACACGTCACTCTATGAGACCCTGATGTATGCCTGCTGCGTGCTGATGCTGCTTATCCCTCTATCTGTCATCTCTGTCTCCTACACGCACATCCTCCTGACTGTCCACAGGATGAACTCTGCTGAGGGCCGGCGCAAAGCCTTTGCTACGTGTTCCTCCCACATTATGGTGGTGAGCGTTTTCTACGGGGCAGCCTTCTACACCAACGTGCTGCCCCACTCCTACCACACTCCAGAGAAAGATAAAGTGGTGTCTGCCTTCTACACCATCCTCACCCCCATGCTCAACCCACTCATCTACAGCTTGAGGAATAAAGATGTGGCTGCAGCTCTGAGGAAAGTACTAGGGAGATGTGGTTCCTCCCAGAGCATCAGGGTGGCGACTGTGATCAGGAAGGGCTAG"
>>>seq[612-1:618]
'CGTGCTG'
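
The slice above converts the 1-based, inclusive HGVS coordinates from c.612_618del into Python's 0-based, end-exclusive indexing. A minimal sketch of that conversion follows. The helper name `hgvs_del_to_slice` is hypothetical, and it only handles simple coding deletions of the form c.N_Mdel or c.Ndel (no intronic or UTR offsets):

```python
import re


def hgvs_del_to_slice(hgvs_c):
    """Convert a simple HGVS coding deletion (e.g. 'c.612_618del') to a Python slice.

    HGVS positions are 1-based and inclusive; Python slices are 0-based
    and end-exclusive, so only the start needs to be shifted by one.
    """
    match = re.fullmatch(r"c\.(\d+)(?:_(\d+))?del", hgvs_c)
    if match is None:
        raise ValueError(f"unsupported HGVS description: {hgvs_c!r}")
    start = int(match.group(1))
    # A single-base deletion ('c.612del') has no end position.
    end = int(match.group(2)) if match.group(2) else start
    return slice(start - 1, end)


# Toy example: positions 3..5 of a 10-letter string.
print("ABCDEFGHIJ"[hgvs_del_to_slice("c.3_5del")])
```

With the full coding sequence, `seq[hgvs_del_to_slice("c.612_618del")]` is the same lookup as `seq[612-1:618]`.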

You can see that this sequence is not the same as the ref column in gsvar (TGCTGCG).
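
One thing worth checking is whether the two descriptions denote the same deletion placed at different positions inside a repeat. HGVS nomenclature shifts a variant to its most 3' position, while variant lists are often left-aligned. The sketch below is not a confirmed diagnosis. It uses a short excerpt copied from the coding sequence above and assumes that excerpt position 15 corresponds to c.612 (so `CDS_OFFSET = 597`) and that the transcript lies on the forward strand. The function names are hypothetical.

```python
# Excerpt copied from the coding sequence quoted in the post.
EXCERPT = "ATGTATGCCTGCTGCGTGCTGATGCTGCTTATC"
# Assumption: CDS position = excerpt position + 597 (excerpt position 15 is c.612).
CDS_OFFSET = 597


def delete_span(seq, start, end):
    """Return seq with the 1-based, inclusive positions start..end removed."""
    return seq[:start - 1] + seq[end:]


def equivalent_deletions(seq, start, end):
    """List every same-length span whose deletion yields the same sequence."""
    length = end - start + 1
    target = delete_span(seq, start, end)
    return [
        (s, s + length - 1)
        for s in range(1, len(seq) - length + 2)
        if delete_span(seq, s, s + length - 1) == target
    ]


# c.612_618del expressed in excerpt coordinates.
spans = equivalent_deletions(EXCERPT, 612 - CDS_OFFSET, 618 - CDS_OFFSET)
for s, e in spans:
    # Print CDS coordinates and the bases that would be removed.
    print(s + CDS_OFFSET, e + CDS_OFFSET, EXCERPT[s - 1:e])
```

Under these assumptions, the left-most equivalent span is c.607_613 with deleted bases TGCTGCG (the gsvar ref), and the right-most is c.612_618 with CGTGCTG (the VEP description). Both give the same sequence after deletion.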

Has anyone encountered the same case?

I would be grateful for any ideas on how to resolve such cases.


Thank you and keep safe!

Damianos


