genomax, United States, 1 hour ago

Download this file from dbCAN2 here. This link was provided by an answer found here: Download CAZy database

Once you download the file, pull out the sequences for GH29 family using the following code (fasta linearization code by @Pierre):

awk '/^>/ {printf("%s%s\t",(N>0?"\n":""),$0);N++;next;} {printf("%s",$0);} END {printf("\n");}' < CAZyDB.07312019.fa | grep GH29 | tr "\t" "\n" > GH29_seq.fa
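The same extraction can also be sketched as a single awk pass that skips the linearization step. This is only a rough alternative, not part of the original answer: the function name extract_family is made up for illustration, and it assumes the family name appears somewhere in the FASTA header line.

```shell
# Hypothetical helper (name is an assumption): print every FASTA record
# whose header line contains the given family string.
# Usage: extract_family GH29 CAZyDB.07312019.fa > GH29_seq.fa
extract_family() {
    awk -v fam="$1" '
        # Header line: decide whether the record that starts here is kept.
        /^>/ { keep = (index($0, fam) > 0) }
        # Print the header and all following sequence lines of kept records.
        keep
    ' "$2"
}
```

Because the match is a plain substring test on the header, it behaves like the grep step above, but multi-line sequences are passed through unchanged instead of being joined first.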

If you want them nicely folded every 60 characters:

awk '/^>/ {printf("%s%s\t",(N>0?"\n":""),$0);N++;next;} {printf("%s",$0);} END {printf("\n");}' < CAZyDB.07312019.fa | grep GH29 | tr "\t" "\n" | fold -w 60 > GH29_seq.fa
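One caveat: fold -w 60 wraps every line it sees, so a header longer than 60 characters would be split as well. If that matters for your file, a small awk wrapper along these lines should leave headers alone. The name fold_fasta is hypothetical, and the sketch assumes each record has its sequence on a single line, as it does right after the tr step.

```shell
# Hypothetical helper (name is an assumption): wrap sequence lines to a
# fixed width while printing header lines untouched.
# Usage: ... | tr "\t" "\n" | fold_fasta 60 > GH29_seq.fa
fold_fasta() {
    awk -v w="${1:-60}" '
        # Header lines are printed as they are.
        /^>/ { print; next }
        # Sequence lines are cut into chunks of at most w characters.
        { for (i = 1; i <= length($0); i += w) print substr($0, i, w) }
    '
}
```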
