Shortening protein FASTA file headers


Dear all,

I want to shorten the headers in my FASTA file, which look like the two example sequences below.
At the same time, I want to keep all the sequences exactly as they are.

>lcl|VSMA01000001.1_prot_KAB5584702.1_1 [locus_tag=GE09DRAFT_1165795] [db_xref=InterPro:IPR002198,JGIDB:Conioc1_1165795] [protein=tetrahydroxynaphthalene reductase] [protein_id=KAB5584702.1] [location=join(1826..1931,1988..2458,2736..2863,2927..3064)] [gbkey=CDS]
MPGLTTNTGKYDQIPGPLGLASASLEGKVALVTGAGRGIGREMAQELGRRGAKVIVNYANSQESAEEVVQAIKKSGSDAA
SIKANVSDVDQIVRMFDEAVKVFGKLDIVCSNSGVVSFGHVKDVTPEEFDRVFNINTRGQFFVAREAYKHLEVGGRLILM
GSITGQAKGVPKHAVYSGSKGTIETFVRCMAIDFGDKKITVNAVAPGGIKTDMYHAVCREYIPNGINLTDDEVDEYACTW
SPLHRVGLPIDIARVVCFLASQDGEWINGKVLGIDGAACM
>lcl|VSMA01000001.1_prot_KAB5584703.1_2 [locus_tag=GE09DRAFT_1165796] [db_xref=InterPro:IPR021840,JGIDB:Conioc1_1165796] [protein=hypothetical protein] [protein_id=KAB5584703.1] [location=complement(join(3193..3215,3871..4374,4440..5628,5725..5886,5941..5989,6050..6066,6130..6234,6286..6495,6547..6561,6622..6728,6843..7103,7155..7719))] [gbkey=CDS]
MFHPSRRRAEQTAYEYNIQATEDHEHDHGVVNLSAEKRRRPRGKRPNYKPTALKWPFIVAQILVLVIAMGLIIWAEKAMP
DSDSTAIIDPLPSKGLPERSVKPEFGKHFRRDNTSGVVETATSQLDVQETTLTGGDGLITPGLGSTNGPADNVKTAVTDD

I only want to keep the locus tag in each header, like this:

GE09DRAFT_1165795
MPGLTTNTGKYDQIPGPLGLASASLEGKVALVTGAGRGIGREMAQELGRRGAKVIVNYANSQESAEEVVQAIKKSGSDAA
SIKANVSDVDQIVRMFDEAVKVFGKLDIVCSNSGVVSFGHVKDVTPEEFDRVFNINTRGQFFVAREAYKHLEVGGRLILM
GSITGQAKGVPKHAVYSGSKGTIETFVRCMAIDFGDKKITVNAVAPGGIKTDMYHAVCREYIPNGINLTDDEVDEYACTW
SPLHRVGLPIDIARVVCFLASQDGEWINGKVLGIDGAACM
GE09DRAFT_1165796
MFHPSRRRAEQTAYEYNIQATEDHEHDHGVVNLSAEKRRRPRGKRPNYKPTALKWPFIVAQILVLVIAMGLIIWAEKAMP
DSDSTAIIDPLPSKGLPERSVKPEFGKHFRRDNTSGVVETATSQLDVQETTLTGGDGLITPGLGSTNGPADNVKTAVTDD

I would be super grateful for any help.

Thanks,
Yanfang
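
One possible approach, as a minimal Python sketch: pull the value of the `[locus_tag=...]` field out of each header line with a regular expression and pass every sequence line through unchanged. It assumes that header lines start with `>`, that each header carries one `[locus_tag=...]` field, and that the leading `>` should be kept so the output is still valid FASTA. The function names (`shorten_header`, `shorten_fasta`, `shorten_file`) are made up for this illustration; headers without a locus tag are left as they are.

```python
import re

# Matches the [locus_tag=...] field seen in the example headers.
LOCUS_TAG = re.compile(r"\[locus_tag=([^\]]+)\]")


def shorten_header(line):
    """Return '>locus_tag' for a header line, or the line unchanged if no tag is found."""
    match = LOCUS_TAG.search(line)
    if match is None:
        return line
    return ">" + match.group(1)


def shorten_fasta(lines):
    """Yield FASTA lines with shortened headers; sequence lines are not modified."""
    for line in lines:
        line = line.rstrip("\r\n")  # drop the line ending only
        if line.startswith(">"):
            yield shorten_header(line)
        else:
            yield line


def shorten_file(in_path, out_path):
    """Read a FASTA file and write a copy with shortened headers."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in shorten_fasta(src):
            dst.write(line + "\n")
```

For example, `shorten_file("proteins.faa", "proteins.short.faa")` would write the shortened copy next to the original; both file names are placeholders. Writing to a new file rather than editing in place makes it easy to compare the sequences before and after.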

