Posted 3 hours ago by danielcgingerich

I need to align a dataset mapped to GRCh38.p2 (Ensembl 79) with a dataset mapped to GRCh38.p13 (Ensembl 98).
The first dataset (Ensembl 79) has gene names and Entrez IDs. The second dataset (Ensembl 98) has gene names and ENSG IDs. I want to convert the Ensembl 79 Entrez IDs to ENSG IDs. When I query with biomaRt, almost half of the genes are not found. I have tried both "external_gene_name" and "entrezgene" as filters, and I have tried both the most recent mart and archived marts (Ensembl 77-80).

FYI: approximately 25,000 genes were not found, and about 10,000 of these are pseudogenes.

Code below:

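library(biomaRt)

# Connect to the human gene dataset and list the available filters and attributes
biomart <- useMart("ensembl", host = "", dataset = "hsapiens_gene_ensembl")
filters <- listFilters(biomart)
attributes <- listAttributes(biomart)

# Look up Ensembl gene IDs for the Entrez IDs from the Ensembl 79 dataset
m1.biomart <- getBM(filters = "entrezgene", attributes = c("ensembl_gene_id", "entrezgene", "external_gene_name", "hgnc_symbol"), values = m1.entrez.ids$entrez_id, mart = biomart)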

[1] 50281

[1] 25987

[1] 28701
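
For reference, here is a minimal sketch of one way to count the unmatched IDs from a result like the one above. It runs offline on toy data: `query_ids` and `bm_result` are hypothetical stand-ins for `m1.entrez.ids$entrez_id` and the data frame returned by `getBM()`, and the gene IDs in it are placeholders, not real Ensembl identifiers.

```r
# Hypothetical stand-in for m1.entrez.ids$entrez_id
query_ids <- c(1, 2, 9, 10, 12)

# Hypothetical stand-in for the getBM() result (placeholder gene IDs)
bm_result <- data.frame(
  ensembl_gene_id = c("ENSG_placeholder_1", "ENSG_placeholder_2",
                      "ENSG_placeholder_3", "ENSG_placeholder_4"),
  entrezgene      = c(1, 2, 2, 10),
  stringsAsFactors = FALSE
)

# Compare as character so numeric vs. character ID types cannot hide matches
query_chr  <- unique(as.character(query_ids))
mapped_chr <- unique(as.character(bm_result$entrezgene))

# Entrez IDs that were queried but never returned
not_found <- setdiff(query_chr, mapped_chr)

# Entrez IDs that map to more than one Ensembl gene
per_id <- table(as.character(bm_result$entrezgene))
multi_mapped <- names(per_id)[per_id > 1]

length(query_chr)   # number of distinct IDs queried
length(not_found)   # number of IDs with no match
```

Listing `not_found` and `multi_mapped` separately makes it easier to tell genuinely missing IDs apart from one-to-many mappings, since both change the row counts.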
