Replace DrugBank IDs with drug names [R]
I have a dataset of genes associated with drugs derived from DrugBank.
I simply want to translate all the DrugBank IDs into human-readable drug names.
The DrugBank vocabulary (.csv) looks like this:
DB00003 Dornase alfa
DB00004 Denileukin diftitox
My dataset (.csv) has 15 columns but the important ones are:
LDLR DB09270; DB11251; DB14003
ALB DB00070; DB00137; DB00159; DB00162; DB00214;
As you can see, some genes are linked to multiple or even hundreds of drugs. The multiple drug IDs are separated by semicolons within the same comma-delimited "column".
The match and merge functions in R only work for the first identifier in each cell, so the remaining IDs in that cell are effectively dropped.
Any advice is welcome, thanks in advance!
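One possible approach, sketched in base R: split each cell on the semicolons, look every ID up in a named vector built from the vocabulary, and paste the names back together. The column names (`drugbank_id`, `drug_name`, `gene`, `drug_ids`), the gene labels, and the ID `DB99999` are placeholders invented for illustration; the real files would be loaded with `read.csv()` and their actual column names substituted.

```r
# Hypothetical stand-ins for the two CSV files; column names are assumptions.
vocab <- data.frame(
  drugbank_id = c("DB00003", "DB00004"),
  drug_name   = c("Dornase alfa", "Denileukin diftitox"),
  stringsAsFactors = FALSE
)
genes <- data.frame(
  gene     = c("GENE1", "GENE2"),
  drug_ids = c("DB00003; DB00004", "DB00004; DB99999;"),
  stringsAsFactors = FALSE
)

# Named lookup vector: names are DrugBank IDs, values are drug names.
id_to_name <- setNames(vocab$drug_name, vocab$drugbank_id)

# Translate one semicolon-separated cell of IDs into drug names.
translate_ids <- function(cell) {
  ids <- trimws(strsplit(cell, ";", fixed = TRUE)[[1]])
  ids <- ids[nzchar(ids)]                  # drop empties left by stray semicolons
  drug_names <- unname(id_to_name[ids])
  missing <- is.na(drug_names)
  drug_names[missing] <- ids[missing]      # keep the raw ID if it is not in the vocabulary
  paste(drug_names, collapse = "; ")
}

# Apply to every cell and store the result in a new column.
genes$drug_names <- vapply(genes$drug_ids, translate_ids, character(1),
                           USE.NAMES = FALSE)
print(genes)
```

IDs that are absent from the vocabulary are left unchanged rather than turned into NA, so nothing is silently lost. If a long format is preferable (one row per gene-drug pair), the same split step could feed a regular `merge()` instead.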