I have what is probably a very simple question, but I have not found a good solution yet.
We are comparing a number of bacterial genomes using several approaches. One of these is calculating the average nucleotide identity (ANI), which is used to determine whether two bacteria belong to the same species.
We noticed that some of the genomes are very similar but encode a large number of strain-specific mobile elements scattered all over the genome.
We would like to delete these mobile elements from the DNA sequences and then perform the ANI analysis on the genomes stripped of the mobile elements.
We have the sequences of the mobile elements as FASTA files, and we can also easily create lists of their locations in the genomes, i.e. the start and end base of each element.
There are 50-100+ of these in each genome, so finding and deleting them manually would be very time-consuming.
My question is: is there an easy way (I am sure there is) to remove a number of DNA segments from a genome?
I first thought about deleting a range from ZZZ to XXX in a loop, but then realised that the start and end positions of the remaining elements would no longer match once the first mobile element had been deleted, because everything downstream shifts.
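To illustrate the coordinate problem, and one possible way around it: instead of deleting ranges one at a time, the sequence can be rebuilt from the stretches between the mobile elements, so the original coordinates never shift. The following is only a minimal sketch in plain Python. The function name `remove_regions` is made up for illustration, and it assumes the coordinates are 1-based and inclusive (as in GenBank/GFF-style annotations) and that the genome is already available as a single string.

```python
def remove_regions(sequence, regions):
    """Return `sequence` with every (start, end) region cut out.

    Assumption: coordinates are 1-based and inclusive, and refer to the
    ORIGINAL sequence. Regions may be unsorted and may overlap.
    """
    kept = []    # pieces of sequence that lie outside all regions
    cursor = 0   # 0-based index of the first base not yet handled
    for start, end in sorted(regions):
        if start < 1 or end < start:
            raise ValueError(f"invalid region: {start}-{end}")
        begin = start - 1  # convert 1-based start to a 0-based index
        if begin > cursor:
            # keep everything between the previous region and this one
            kept.append(sequence[cursor:begin])
        # skip past this region (max() handles overlapping regions)
        cursor = max(cursor, end)
    kept.append(sequence[cursor:])  # tail after the last region
    return "".join(kept)


# Toy example: remove bases 4-6 and 10-12 from a 12-base sequence.
genome = "AAACCCGGGTTT"
print(remove_regions(genome, [(4, 6), (10, 12)]))  # AAAGGG
```

Because every slice is taken against the original string, the order of deletion no longer matters. An equivalent trick would be to delete the ranges in a loop from the highest start position down to the lowest, so that earlier coordinates are unaffected by later deletions.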
Maybe this is so simple that software for it already exists?
Any hints would be appreciated.