I want to extract the "Country of isolation" from GenBank records. I tried to run the following code in Google Colab. Accessions.txt contains accession numbers, i.e. ['GCA_001719305.1',

from Bio import Entrez

# Read the accessions from a file
with open(accessions_file) as f:
    ids = f.read().split('\n')

# Fetch the entries from Entrez
Entrez.email = "your.name@example.com"  # Insert your email here
handle = Entrez.efetch('nuccore', id=ids, retmode="xml")
response = Entrez.read(handle)

# Parse the entries to get the country
def extract_countries(entry):
    sources = [feature for feature in entry['GBSeq_feature-table']
               if feature['GBFeature_key'] == 'source']

    for source in sources:
        qualifiers = [qual for qual in source['GBFeature_quals']
                      if qual['GBQualifier_name'] == 'country']

        for qualifier in qualifiers:
            yield qualifier['GBQualifier_value']

for entry in response:
    accession = entry['GBSeq_primary-accession']
    for country in extract_countries(entry):
        print(accession, country, sep=',')
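
The parsing step can be checked without any network access. Below is a minimal sketch that runs a slightly more defensive version of extract_countries on a hand-built dictionary mimicking the GBSeq layout the code above assumes. The accession and country values in mock_entry are invented for illustration, and accepting 'geo_loc_name' alongside 'country' is an assumption (newer records may use that qualifier name instead).

```python
def extract_countries(entry, names=('country', 'geo_loc_name')):
    """Yield country qualifier values from the 'source' features of one entry.

    Uses .get() so a feature without qualifiers does not raise KeyError.
    The 'geo_loc_name' alternative is an assumption, not taken from the post.
    """
    for feature in entry.get('GBSeq_feature-table', []):
        if feature.get('GBFeature_key') != 'source':
            continue
        for qual in feature.get('GBFeature_quals', []):
            if qual.get('GBQualifier_name') in names:
                yield qual.get('GBQualifier_value')


# Hypothetical record: all values here are made up for the test.
mock_entry = {
    'GBSeq_primary-accession': 'XX000000',
    'GBSeq_feature-table': [
        {
            'GBFeature_key': 'source',
            'GBFeature_quals': [
                {'GBQualifier_name': 'organism', 'GBQualifier_value': 'Example organism'},
                {'GBQualifier_name': 'country', 'GBQualifier_value': 'Exampleland'},
            ],
        },
        # A non-source feature with no qualifiers at all.
        {'GBFeature_key': 'gene'},
    ],
}

found = list(extract_countries(mock_entry))
print(mock_entry['GBSeq_primary-accession'], found, sep=',')
```

If this prints the expected country, the parsing logic is fine and the problem lies in the efetch request itself.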

I am getting the following error. Please help me resolve it. Thanks in advance.

HTTPError                                 Traceback (most recent call last)
<ipython-input-17-4518f5766224> in <module>()
      1 Entrez.email="[email protected]"
----> 2 handle = Entrez.efetch('nuccore', id=ids, retmode="xml")
      3 response = Entrez.read(handle)

7 frames
/usr/lib/python3.7/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
    647 class HTTPDefaultErrorHandler(BaseHandler):
    648     def http_error_default(self, req, fp, code, msg, hdrs):
--> 649         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    651 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 400: Bad Request
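
An HTTP 400 from efetch usually points at a malformed request, and the id list is a likely suspect. Below is a minimal sketch for cleaning the IDs before they are sent; the helper names clean_accessions and is_assembly_accession are hypothetical, and the sample text (including NC_000913.3) is only an illustration. It also separates out GCA_/GCF_ accessions: as far as I know these are assembly accessions rather than nuccore records, so treating them separately is an assumption worth verifying.

```python
def clean_accessions(text):
    """Split raw file text into accession IDs.

    Handles one-per-line or comma-separated input and strips stray
    brackets and quotes left over from a pasted Python list.
    """
    ids = []
    for token in text.replace(',', ' ').split():
        token = token.strip("[]'\"")
        if token:
            ids.append(token)
    return ids


def is_assembly_accession(acc):
    """True for GCA_/GCF_ prefixes (assumed to be assembly accessions)."""
    return acc.startswith(('GCA_', 'GCF_'))


# Illustrative input: a pasted list fragment plus a plain line and blank lines.
raw = "['GCA_001719305.1',\nNC_000913.3\n\n"
ids = clean_accessions(raw)

assembly_ids = [a for a in ids if is_assembly_accession(a)]
nuccore_ids = [a for a in ids if not is_assembly_accession(a)]

print("send to nuccore:", nuccore_ids)
print("assembly accessions, handle separately:", assembly_ids)
```

Only the cleaned nuccore-style IDs would then be passed to Entrez.efetch; an empty string or a bracketed token in the list is enough to make the request invalid.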
