I wish to sort the genes/proteins from my genome of interest (Vibrio cholerae) into categories.
One way is to use COGs. The eggNOG database is very nice, but I want to sort the entire genome.
I am working with a similar genome that is already included in the 2020 database.
Is there a way to obtain/download the COG list, with IDs and categories, for the entire genome from eggNOG or the NCBI server?
I found an example of what I would like to end up with, but it is not for my genome. It contains the COG ID, the category and the gene name. If I could find the same thing for Vibrio cholerae, that would be incredible.
61 ||||||||--|||-|-|-|||||||||---|||||||||-||-|||||||||||||--||------ 48 H HemL COG0001 Glutamate-1-semialdehyde aminotransferase
There is another link that has sorted 678 proteins of the V. cholerae genome, but I need the entire genome. I manually checked some of the genes that weren't sorted into clusters by looking them up in UniProt and eggNOG, and they do have COGs.
VCA0906 [NT] COG0840 (578) Methyl-accepting chemotaxis protein
There should be a way to get this for the entire genome, as it is already sorted. I looked through the NCBI FTP site and was able to extract the V. cholerae (Vch) COG IDs along with the gene names. However, I'm not sure how to get the corresponding categories.
How can I get the COG category and id for the entire genome?
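To make the missing step concrete, this is roughly the join I have in mind once I have a table that maps each COG ID to its category letters. It is only a sketch. The three-column, tab-separated layout (COG ID, category letters, description) is my assumption, not a confirmed NCBI or eggNOG file format, and `load_cog_defs`, `annotate` and `category_counts` are names I made up. The two real rows are taken from the examples above, and `geneX`/`COG9999` is a placeholder for a gene whose COG is not in the table.

```python
import csv
import io
from collections import Counter

# Assumed layout: COG ID <tab> category letters <tab> description.
# In practice this would be read from a downloaded file instead of a string.
COG_DEFS = (
    "COG0001\tH\tGlutamate-1-semialdehyde aminotransferase\n"
    "COG0840\tNT\tMethyl-accepting chemotaxis protein\n"
)

# Gene -> COG ID pairs, as extracted from the FTP site.
# "geneX"/"COG9999" is a placeholder with no entry in the table above.
GENE_TO_COG = [
    ("VCA0906", "COG0840"),
    ("HemL", "COG0001"),
    ("geneX", "COG9999"),
]


def load_cog_defs(handle):
    """Return {cog_id: (category_letters, description)} from a TSV handle."""
    defs = {}
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) < 3 or row[0].startswith("#"):
            continue  # skip short lines and comment lines
        defs[row[0]] = (row[1], row[2])
    return defs


def annotate(gene_to_cog, defs):
    """Attach category letters and description to each (gene, cog_id) pair."""
    annotated = []
    for gene, cog_id in gene_to_cog:
        categories, description = defs.get(cog_id, ("", ""))
        annotated.append((gene, cog_id, categories, description))
    return annotated


def category_counts(annotated):
    """Count genes per single-letter category; 'NT' counts for both N and T."""
    counts = Counter()
    for _gene, _cog_id, categories, _description in annotated:
        counts.update(categories)
    return counts


defs = load_cog_defs(io.StringIO(COG_DEFS))
annotated = annotate(GENE_TO_COG, defs)
counts = category_counts(annotated)

for row in annotated:
    print("\t".join(row))
```

Genes whose COG ID is missing from the table keep an empty category, so they are easy to spot and check manually afterwards.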