I need to align a dataset mapped to GRCh38.p2 (Ensembl 79) with a dataset mapped to GRCh38.p13 (Ensembl 98).
The first dataset (Ensembl 79) has gene names and Entrez IDs. The second dataset (Ensembl 98) has gene names and ENSG IDs. I want to convert the Ensembl 79 Entrez IDs to ENSG IDs. When I query with biomaRt, almost half of the genes are not found. I have tried both "external_gene_name" and "entrezgene" as filters, and I have tried both the most recent mart and archived marts (Ensembl 77-80).
FYI: approximately 25000 genes were not found, and about 10000 of those are pseudogenes.
    library(biomaRt)

    listEnsemblArchives()

    # Ensembl 77 archive (October 2014)
    biomart <- useMart("ensembl",
                       host = "https://oct2014.archive.ensembl.org",
                       dataset = "hsapiens_gene_ensembl")
    filters <- listFilters(biomart)
    attributes <- listAttributes(biomart)

    m1.biomart <- getBM(filters = "entrezgene",
                        attributes = c("ensembl_gene_id", "entrezgene",
                                       "external_gene_name", "hgnc_symbol"),
                        values = m1.entrez.ids$entrez_id,
                        mart = biomart)

    length(unique(m1.entrez.ids$entrez_id))    # 50281 (Entrez IDs queried)
    length(unique(m1.biomart$entrezgene))      # 25987 (Entrez IDs with a match)
    length(unique(m1.biomart$ensembl_gene_id)) # 28701 (ENSG IDs returned)
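
For reference, here is a minimal sketch of how the unmatched IDs can be pulled out of the result, and how to check whether some Entrez IDs map to several ENSG IDs (which would explain why more ENSG IDs than Entrez IDs come back). It uses base R only. The two small data frames are made-up stand-ins for `m1.entrez.ids` and `m1.biomart`, and the `ENSG_*` strings are placeholders, not real identifiers.

```r
# Toy stand-ins for the real objects; all values are invented for illustration.
m1.entrez.ids <- data.frame(entrez_id = c(1, 2, 3, 4, 5))
m1.biomart <- data.frame(
  ensembl_gene_id = c("ENSG_A", "ENSG_B", "ENSG_C", "ENSG_D"),
  entrezgene      = c(1, 1, 2, 4)
)

queried <- unique(m1.entrez.ids$entrez_id)
mapped  <- unique(m1.biomart$entrezgene)

# Entrez IDs for which the query returned no Ensembl gene at all.
unmapped <- setdiff(queried, mapped)

# Entrez IDs that map to more than one distinct Ensembl gene.
pairs        <- unique(m1.biomart[, c("entrezgene", "ensembl_gene_id")])
n_per_entrez <- table(pairs$entrezgene)
multi        <- names(n_per_entrez)[n_per_entrez > 1]

print(unmapped)
print(multi)
```

With the real objects, `unmapped` would be the list of IDs to inspect further (for example, to see how many are pseudogenes), and `multi` the Entrez IDs with one-to-many mappings.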