Joining blast hits together into single fasta sequence

1

I have a blast output file output.txt with multiple hits in it.

 Score = 885 bits (479),  Expect = 0.0
 Identities = 613/676 (91%), Gaps = 16/676 (2%)
 Strand=Plus/Minus



   Query  1       ATGATAATTGATACGACAGAAGTACAAACTATCAATTCTTTTTCTATATTAGAATCCTTA  60
                   |||||||||||||||||||||||||||||||| ||||||||||||  |||||||||||||
    Sbjct  116345  ATGATAATTGATACGACAGAAGTACAAACTATTAATTCTTTTTCTGGATTAGAATCCTTA  116286

    Query  61      AAAGAAGTCTATGGACTCATATGGATTTTTGTCCCCATTTTCACCCTTGTCTTAGGAATC  120
                   |||||||| |||||||||||||||||||||||||||||||| ||||||||||||||||||
    Sbjct  116285  AAAGAAGTATATGGACTCATATGGATTTTTGTCCCCATTTTAACCCTTGTCTTAGGAATC  116226

    Query  121     ACAATGGGGGTATTAGTAATTGTGTGGTTAGAAAGAAAAATATCCGCAGCAATACAACAA  180
                   ||||||||| ||||||| |||||||| ||||||||||||||||| ||| ||| |||||||
    Sbjct  116225  ACAATGGGG-TATTAGTCATTGTGTGATTAGAAAGAAAAATATCTGCAACAACACAACAA  116167

    Query  181     CGTATTGGACCTGAATATGCCGGCCCATTAGGAATTCTTCAAGCTTTAGCGGATGGGACC  240
                    |||||||||||||||| ||||||||||||||||||||||||||||||||||||||||| 
    Sbjct  116166  TGTATTGGACCTGAATAGGCCGGCCCATTAGGAATTCTTCAAGCTTTAGCGGATGGGACG  116107

    Query  241     AAACTATTTTTGAAGGAGGATCTTCTTCCTTCTAGAGGGAATATTCGTTTGTTTAGCGTC  300
                   |||||| ||||||||||| || ||||||| |||||||| |||||||| || ||||  |||
    Sbjct  116106  AAACTACTTTTGAAGGAGAATATTCTTCCGTCTAGAGGTAATATTCGCTTATTTAAGGTC  116047

    Query  301     GGACCTTCTATAGGGGTTATATCAATTCTACTAAGTTATTTAGTAATTCCTTTTGGATAT  360
                   ||||| ||||||||| |||||||||| ||||||||||          |||||||||||||
    Sbjct  116046  GGACCCTCTATAGGGTTTATATCAATCCTACTAAGTT----------TCCTTTTGGATAT  115997

    Query  361     CACCTTGTTTTAGCTGATCTCAGTATAGGTGtttttttATGGATTGCCATTTCAAGTATT  420
                   |||||||||||||||||| ||||||||||||||||||||||||||||| |||||||||||
    Sbjct  115996  CACCTTGTTTTAGCTGATTTCAGTATAGGTGTTTTTTTATGGATTGCCTTTTCAAGTATT  115937

    Query  421     GTCCCCATTGGTCTTCTTATGTCAGGATATGGATCAAATAATAAGTATTCCTTTTCAGGC  480
                   ||||| ||||||||||||||||||||||||| ||||||||||||||||||||||||||||
    Sbjct  115936  GTCCCTATTGGTCTTCTTATGTCAGGATATGAATCAAATAATAAGTATTCCTTTTCAGGC  115877

    Query  481     GGTCTACGAGCTGCAGCTCAATCGATTAGTTATGAAATACCATTAACTCTATGTGTGTTA  540
                   ||||||||||||| || |||||  ||||||||||||||||||||||||||||||||||||
    Sbjct  115876  GGTCTACGAGCTGTAGATCAATAAATTAGTTATGAAATACCATTAACTCTATGTGTGTTA  115817

    Query  541     GCAATATCTCTACGTGCGATTCGTTTGAACATGAACtttttttCTCTATTTTCTAGAAAA  600
                   |||||||||||||||| |||||||| ||||||||||| |||| |||  ||||||||||||
    Sbjct  115816  GCAATATCTCTACGTGTGATTCGTTAGAACATGAACTCTTTT-CTC--TTTTCTAGAAAA  115760

    Query  601     GAgaaaagaaatgaattgaaatttcaatacaatataaatagaattcaatatgtaaatatg  660
                   |||| |||||||||||| ||||  ||||| |||| | ||| ||| |||||||||||||||
    Sbjct  115759  GAGATAAGAAATGAATTTAAATA-CAATAAAATAGAGATATAATGCAATATGTAAATATG  115701

    Query  661     aa-ataaaaaaaaaGA  675
                   || || ||  ||||||
    Sbjct  115700  AATATGAACGAAAAGA  115685


     Score = 845 bits (457),  Expect = 0.0
     Identities = 603/670 (90%), Gaps = 24/670 (4%)
     Strand=Plus/Minus

    Query  678     ttttttATTCAACATTTCAGTTCGATGAGTTAAACCAGATAGTTATATGAGTGAAA-CAA  736
                   ||||||||| || |||| ||||||||||||||||||||| ||||||||||||||||  ||
    Sbjct  115566  TTTTTTATTAAAAATTTTAGTTCGATGAGTTAAACCAGAGAGTTATATGAGTGAAAAAAA  115507

    Query  737     AACTGCTCCTCAATTTGCAGTAAAACAAGAAAAATCTCATTCCCTAGGTACAAGAATGAA  796
                   |||||||||||||||||||||||||||| ||||||||||||||||||||||||||| |||
    Sbjct  115506  AACTGCTCCTCAATTTGCAGTAAAACAATAAAAATCTCATTCCCTAGGTACAAGAA-GAA  115448

    Query  797     A-TTGAAGTAAACATAAGTTGTTTACCCCAAGATTGAGATTCTTTGATTAGTCGTCATAT  855
                   | ||||||||||||||||||||||||||||| ||||||||| |||  |||||||||||||
    Sbjct  115447  ATTTGAAGTAAACATAAGTTGTTTACCCCAATATTGAGATTATTTTCTTAGTCGTCATAT  115388

    Query  856     CTTGAAGCGGATGCAAAAGATCAACTGTATTTATTACTATACTGGGGATCAATCAAAAAG  915
                   |||||||||||||||||||||| ||| |||||||||||||||||| ||||||||||||||
    Sbjct  115387  CTTGAAGCGGATGCAAAAGATCCACTTTATTTATTACTATACTGGAGATCAATCAAAAAG  115328

    Query  916     AAGTGGGTAGTTAGGAACACCAAAGTACACAAAGGATGAGTAATGGAAATAATGTAAGGT  975
                   |||||   |||||||||||||||||||| |||||||||||||||| |||||||||||| |
    Sbjct  115327  AAGTGAC-AGTTAGGAACACCAAAGTACGCAAAGGATGAGTAATGAAAATAATGTAAGAT  115269

    Query  976     ATCaaa-a-aa-aGGG---GTT-TTTG--CATAAAACTTTGCATAAAACGAATCATAAT-  1025
                   |||||| | || |      ||| |||   ||||||||||| ||||||||||||| |||| 
    Sbjct  115268  ATCAAAGATAACAAAAAAAGTTATTTTTTCATAAAACTTTCCATAAAACGAATCCTAATT  115209

    Query  1026    AAGGGCTTGAAGTTGGTAGAAATGATCAAGCAGTACTTCCCCACGATTCCAATCTAGAGT  1085
                   ||||||||| | |||||||||||||||||||||||||| ||||||||| | |||||||||
    Sbjct  115208  AAGGGCTTGTAATTGGTAGAAATGATCAAGCAGTACTTTCCCACGATTACGATCTAGAGT  115149

    Query  1086    ATGCTACTATTCGCTGATTAAAGAAATGACTATCAAGAACGAATTAATCCTTTATTTTAT  1145
                   |||||||||||||||||||||| ||||||||||||||||| ||| ||||||||||||| |
    Sbjct  115148  ATGCTACTATTCGCTGATTAAATAAATGACTATCAAGAACAAATGAATCCTTTATTTTCT  115089

    Query  1146    TTCCtttttttttttAGTTTTCagaaagaagaacaggaacaagacaaatagaatgcaata  1205
                   |  |||||||||||||||||||||||  ||||||||||||||||||||||||||||||||
    Sbjct  115088  TGACTTTTTTTTTTTAGTTTTCAGAAGAAAGAACAGGAACAAGACAAATAGAATGCAATA  115029

    Query  1206    caataatagaataaaa--aagaataaaacgggaataataagaaaataTTTAGTTCTTCGT  1263
                    ||||||| |||||||  |||||||||||||||||||||||||||||||| ||| | | |
    Sbjct  115028  TAATAATATAATAAAATAAAGAATAAAACGGGAATAATAAGAAAATATTT-GTT-T-C-T  114973

    Query  1264    TTCTTCATACATATGCATATGGGAATTCTTATCATGATTCATTAACTAATGCCCAATTCT  1323
                   |     |||||||||||||||| |||| ||||||||||||||||||||||| ||||||||
    Sbjct  114972  TA----ATACATATGCATATGGAAATTTTTATCATGATTCATTAACTAATGTCCAATTCT  114917

    Query  1324    TTTTATTTAT  1333
                   ||||||||||
    Sbjct  114916  TTTTATTTAT  114907


     Score = 721 bits (390),  Expect = 0.0
     Identities = 446/473 (94%), Gaps = 4/473 (1%)
     Strand=Plus/Minus

    Query  1737    GTGGCGTCAGCCCATAGGGTTTCTAGTTTTTCTAATGTCTTCTCTAGCAGAATGTGAAAG  1796
                   ||||||||| |||||||||||||||||| |||||||||||||||||||||||||||| ||
    Sbjct  114885  GTGGCGTCAACCCATAGGGTTTCTAGTTCTTCTAATGTCTTCTCTAGCAGAATGTGAGAG  114826

    Query  1797    ATTACCCTTTGATTTACCGGAAGCAGAGGAGGAATTAGTAGCAGGTTATCAAACCGAATA  1856
                   |||||| ||| ||||||| |||| |||||||  |||||||||||||||||||||| ||||
    Sbjct  114825  ATTACCTTTTAATTTACCAGAAGTAGAGGAGATATTAGTAGCAGGTTATCAAACCAAATA  114766

    Query  1857    TTCAGGTATAAAATACGGGTTATTTTATCTTGCTTCTTACCTAAATCTATTAGTTTCTTC  1916
                   ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
    Sbjct  114765  TTCAGGTATAAAATACGGGTTATTTTATCTTGCTTCTTACCTAAATCTATTAGTTTCTTC  114706

    Query  1917    ATT----ATTTGTAACAGTTCTTTACTTAGGTGGGTGGAATTTTTCTATTCCGTACATAT  1972
                   |||    ||||||||||||||||||||||||| |||||||||| ||||||||||||||||
    Sbjct  114705  ATTCATTATTTGTAACAGTTCTTTACTTAGGTAGGTGGAATTTCTCTATTCCGTACATAT  114646

    Query  1973    CTATTACTGAACTTTTTGGAATAAATAAAATGTTTAGAGTCTTTGTAATAGCAATTGGTA  2032
                   || ||||||||||||||||||||||||||||||| |||||||||||||||||||||  ||
    Sbjct  114645  CTCTTACTGAACTTTTTGGAATAAATAAAATGTTGAGAGTCTTTGTAATAGCAATTAATA  114586

    Query  2033    TCTTTATTACATTAGCTAAAGCTTATTTGTTTCTGTTCATTCCTATCACAACAAGATGGA  2092
                   ||||||| |||||||||||||||||||||||||||||||||||||| |||||||||||||
    Sbjct  114585  TCTTTATCACATTAGCTAAAGCTTATTTGTTTCTGTTCATTCCTATAACAACAAGATGGA  114526

    Query  2093    CTTTACCTAGGATGAGAATGGATCAGTTATTAAATCTTGGATGGAAATTTCTTTTACCTA  2152
                   ||||||||||||||||||||||||| ||||||||||||||||| ||||||| ||||||||
    Sbjct  114525  CTTTACCTAGGATGAGAATGGATCAATTATTAAATCTTGGATGAAAATTTCCTTTACCTA  114466

    Query  2153    TTTCTCTAGGTAATCTATTATTGACAACTTCTTCTCAACTTGTTTCACTATAA  2205
                   |||||||||||||||||||||| |||||||||| |||||||||||||||||||
    Sbjct  114465  TTTCTCTAGGTAATCTATTATTAACAACTTCTTTTCAACTTGTTTCACTATAA  114413

The hits from the subject sequence are quite close together, so I would like to join the hits from the subject sequences together into one long sequence from 116345 to 114413 bp in the subject sequence together, with the gaps in between the hips marked by n or - characters

Does anyone know of a method to concatenate multiple blast hits into one long sequence?


sequence


genomics


blast

• 80 views

BLAST is a local aligner (L in its name stands for local) and you seem to need a global aligner. You can find one by going to the BLAST web site and selecting Global Alignment. If you paste your subject and query sequences, it will give you the output you want.


Login
before adding your answer.

Traffic: 2183 users visited in the last hour



Source link