I'm designing complementary capture probes for two strains (type1 and type2) of the same virus. Most of their sequence is shared, but each strain has unique regions. I've tiled both genomes into probes, and I'd like to subtract one probe set from the other, leaving only the sequences that are unique to the type2 virus. The idea is that by designing probes for all of type1 plus the unique regions of type2, we can save money by not making "redundant" probes for sequences that are conserved in both strains.
Is anyone aware of software that can do this type of subtraction? The probes are 120 bp long, and I want to identify which probes differ by >= 5% (>= 6 bp). Kind of like a reverse BLAST. Right now I'm BLASTing each probe and remaking them manually, but there are dozens that need to be remade, as well as entire insertions that the current process doesn't account for. Any help would be appreciated. Please feel free to ask clarifying questions as well.
Examples of the probes:
Sufficiently matching pair:
>strain1:600-720bp
TTGTGGCGGCATCATGTTTTTGGCATGTGTACTTGTCCTCATCGTCGACGCTGTTTTGCAGCTGAGTCCCCTCCTTGGAGCTGTAACTGTGGTTTCCATGACGCTGCTGCTACTGGCTTT
>strain2:600-720bp
TTGTGGCGGCATCATGTTTTTGGCATGTGTACTTGTCCTTATCGTCGACGCTGTTTTGCAGCTGAGTCCCCTCCTTGGAGCTGTAACTGTGGTTTCCATGACGCTGCTGCTACTGGCTTT
Mismatching pair that I would need to remake:
>strain1:5760-5880bp
CCCTCCTCAGAAAACTCTGCATGGAGAAGCTGGACGTGAACCTCCCCCCCAGACCTGTGTGCTGTATTTACAAACACTACAATAAACCCAATGTGCAAATGTGGTTTGTATGGCTACTTT
>strain2:5760-5880bp
CCCTCCTCAGAAAACTCTGCATGGAGAAGCTGGACGTGAACCTTCCCCCCCCCCCCGACCTGTGTGCTGTATTTACAAACACTACAATAAACCCAATGTGCAAATGTGGTTTGTATGGCT
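To make the ">= 6 bp" criterion concrete, here is a rough Python sketch of the comparison I have in mind, applied to the two example pairs above. It is only an illustration under assumptions, not an existing tool: the function names (`edit_distance`, `subtract_probes`), the dictionary layout keyed by window coordinates, and the choice of plain Levenshtein edit distance are all mine. It only compares probes that share the same window, so coordinates that drift after a larger insertion would still need a proper alignment against the type1 genome.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest substitutions, insertions and deletions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def subtract_probes(type1: dict, type2: dict, min_diff: int = 6) -> dict:
    """Keep type2 probes that differ from the type1 probe at the same window
    by at least min_diff edits, or that have no type1 counterpart at all.

    Assumption: both dicts map a window label (e.g. "600-720") to a probe.
    """
    unique = {}
    for window, seq2 in type2.items():
        seq1 = type1.get(window)
        if seq1 is None or edit_distance(seq1, seq2) >= min_diff:
            unique[window] = seq2
    return unique


# The two example pairs from this post, keyed by their window coordinates.
type1 = {
    "600-720": "TTGTGGCGGCATCATGTTTTTGGCATGTGTACTTGTCCTCATCGTCGACGCTGTTTTGCAGCTGAGTCCCCTCCTTGGAGCTGTAACTGTGGTTTCCATGACGCTGCTGCTACTGGCTTT",
    "5760-5880": "CCCTCCTCAGAAAACTCTGCATGGAGAAGCTGGACGTGAACCTCCCCCCCAGACCTGTGTGCTGTATTTACAAACACTACAATAAACCCAATGTGCAAATGTGGTTTGTATGGCTACTTT",
}
type2 = {
    "600-720": "TTGTGGCGGCATCATGTTTTTGGCATGTGTACTTGTCCTTATCGTCGACGCTGTTTTGCAGCTGAGTCCCCTCCTTGGAGCTGTAACTGTGGTTTCCATGACGCTGCTGCTACTGGCTTT",
    "5760-5880": "CCCTCCTCAGAAAACTCTGCATGGAGAAGCTGGACGTGAACCTTCCCCCCCCCCCCGACCTGTGTGCTGTATTTACAAACACTACAATAAACCCAATGTGCAAATGTGGTTTGTATGGCT",
}

to_remake = subtract_probes(type1, type2)
for window in sorted(to_remake):
    print(window, edit_distance(type1[window], to_remake[window]))
```

Because every probe is a fixed 120 bp, an insertion inside a window also pushes bases off the end of the probe, and edit distance counts those too. That makes the score stricter than a pure mismatch count, which seems acceptable here since such probes need to be remade anyway.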