Extraction of specific sequences from a FASTA file

I have downloaded the complete set of hairpin sequences from the miRBase website. The first few lines look like this:

cel-let-7 MI0000001 Caenorhabditis elegans let-7 stem-loop
UACACUGUGGAUCCGGUGAGGUAGUAGGUUGUAUAGUUUGGAAUAUUACCACCGGUGAAC
UAUGCAAUUUUCUACCUUACCGGAGACAGAACUCUUCGA
cel-lin-4 MI0000002 Caenorhabditis elegans lin-4 stem-loop
AUGCUUCCGGCCUGUUCCCUGAGACCUCAAGUGUGAGUGUACUAUUGAUGCUUCACACCU
GGGCUCUCCGGGUACCAGGACGGUUUGAGCAGAU
cel-mir-1 MI0000003 Caenorhabditis elegans miR-1 stem-loop
AAAGUGACCGUACCGAGCUGCAUACUUCCUUACAUGCCCAUACUAUAUCAUAAAUGGAUA

I want to extract only the human hairpin sequences from this file.
I used grep as follows:

 grep hsa-mir hairpin.fa > human_hairpin.fa

However, this extracts only the header lines. I also need the sequence lines that follow each header, like this:

hsa-mir-548ab MI0016752 Homo sapiens miR-548ab stem-loop
AUGUUGGUGCAAAAGUAAUUGUGGAUUUUGCUAUUACUUGUAUUUAUUUGUAAUGCAAAA
CCCGCAAUUAGUUUUGCACCAACC

Which command should I use?
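
One possible approach is sketched below. It assumes the header lines in hairpin.fa start with ">" as in standard FASTA (the ">" is not visible in the excerpt above), and that each record's sequence may wrap over any number of lines. The function name extract_fasta_by_prefix is hypothetical, chosen only for illustration. The idea is to set a flag at every header line and print lines only while the flag is on.

```shell
# Hypothetical helper: print every FASTA record whose header begins
# with ">" followed by the given prefix (e.g. "hsa-").
# Assumes header lines start with ">" (standard FASTA).
extract_fasta_by_prefix() {
    prefix="$1"   # ID prefix to keep, without the leading ">"
    file="$2"     # input FASTA file
    awk -v p=">$prefix" '
        # At each header, decide whether this record should be kept.
        # index() does a literal match, so the prefix is not a regex.
        /^>/ { keep = (index($0, p) == 1) }
        # Print the header and all following sequence lines while keep is set.
        keep
    ' "$file"
}
```

It could then be called as extract_fasta_by_prefix hsa- hairpin.fa > human_hairpin.fa. Matching on "hsa-" rather than "hsa-mir" would also keep human entries named hsa-let-7..., which the grep pattern above would skip. If only the mir entries are wanted, "hsa-mir" can be passed instead.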


Genomics
