Hi all,

I've been using entrez_fetch from the R package rentrez (v1.2.2) to extract nucleotide sequences in FASTA format for a large number of GIDs. For a small minority of them, entrez_fetch returns a string containing nothing but a newline character. Example below.

> entrez_fetch(db = "nuccore", id = "108597802", rettype="fasta_cds_na")
[1] "\n"

I get the same result using the accession rather than the GID.

> entrez_fetch(db = "nuccore", id = "DQ640652.1", rettype="fasta_cds_na")
[1] "\n"
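To pin down which IDs in a large batch come back blank, something along these lines may help. It is only a minimal sketch: find_blank_ids() is a hypothetical helper (not part of rentrez), and it assumes that "blank" means nothing is left once whitespace is stripped from the response.

```r
# find_blank_ids() is a hypothetical helper, not part of rentrez.
# `fetch` is any function that takes a single ID and returns one string;
# in practice it would wrap rentrez::entrez_fetch().
find_blank_ids <- function(ids, fetch) {
  # Fetch one record per ID, expecting a single string each time.
  results <- vapply(ids, fetch, character(1))
  # Treat a response as blank if only whitespace (e.g. "\n") was returned.
  blank <- !nzchar(trimws(results))
  unname(ids[blank])
}

# Example usage (needs network access and the rentrez package):
# find_blank_ids(
#   c("108597802", "DQ640652.1"),
#   function(id) rentrez::entrez_fetch(db = "nuccore", id = id,
#                                      rettype = "fasta_cds_na")
# )
```

Passing the fetch function in as an argument keeps the check itself independent of the network, so it can be tried out on canned strings first.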

The exact same call works for most other GIDs/accessions I feed it, and it also works for these records if I request an alternative rettype, e.g.

> entrez_fetch(db = "nuccore", id = "108597802", rettype="gb")
[1] "LOCUS       DQ640652               29746 bp    RNA     linear   VRL 12-JUN-2006\nDEFINITION  SARS coronavirus GDH-BJH01, complete genome.\nACCESSION   DQ640652\nVERSION     DQ640652.1\nKEYWORDS    .\nSOURCE      SARS coronavirus GDH-BJH01\n  ORGANISM  SARS coronavirus GDH-BJH01\n            Viruses; Riboviria; Nidovirales; Cornidovirineae; Coronaviridae;\n            Orthocoronavirinae; Betacoronavirus; Sarbecovirus.\nREFERENCE   1  (bases 1 to 29746)\n  AUTHORS   Cai,J.-P., Hei,A.-L., Hu,J.-H., Wang,S.-K., Zhang,C.-B., Dai,D.-P.,\n            Shen,Z.-Y., Guo,J., Li,M., Wu,Y.-S., Cheng,G., He,Y.-S. and Hou,M.\n  TITLE     Direct Submission\n  JOURNAL   Submitted (14-MAY-2006) National Center for Clinical Laboratory,\n            Beijing Hospital, 1 Da Hua Road, Dong Dan, Beijing 100730, China\nFEATURES             Location/Qualifiers\n     source          1..29746\n                     /organism=\"SARS coronavirus GDH-BJH01\"\n                     /mol_type=\"genomic RNA\"\n                     /strain=\"GDH-BJH01\"\n                     /isolation_source=\"Homo sapiens lung\"\n                     /host=\"Homo sapiens\"\n                     /db_xref=\"taxon:388737\"\n                     /country=\"China\"\nORIGIN      \n        1 ggcttccagg aaaagccaac

Curiously, though, requesting the same record from the API directly through a browser also returns a blank file.
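For reference, a browser request of this kind can be reproduced by building the E-utilities efetch URL by hand. This is a sketch under assumptions: efetch_url() is a hypothetical helper (not part of rentrez), and the URL it builds is the standard efetch form, which is presumably equivalent to, but not necessarily identical to, the request behind the browser test.

```r
# efetch_url() is a hypothetical helper, not part of rentrez.
# It builds the E-utilities efetch URL matching the entrez_fetch() arguments
# used in this post (db, id, rettype).
efetch_url <- function(id, db = "nuccore", rettype = "fasta_cds_na") {
  base <- "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
  paste0(base, "?db=", db, "&id=", id, "&rettype=", rettype)
}

# URL for the record that comes back blank; paste it into a browser to compare.
efetch_url("108597802")

# Same record with the GenBank flat-file rettype, which does return content.
efetch_url("108597802", rettype = "gb")
```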

If anyone is able to shed some light on why these sequences aren't being returned in FASTA format properly, I'd be very grateful!


