Posted 2 hours ago by rjqmantaring

I'm pretty new to Biopython and I'm trying to use it to extract all of the CDS features from a .embl file. This is my code:

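#!/usr/bin/python3.7

from Bio import SeqIO

# Print the location, protein ID and nucleotide sequence of every CDS feature.
for rec in SeqIO.parse("file.embl", "embl"):
    for feature in rec.features:
        if feature.type == "CDS":
            print(feature.location)
            # Not every CDS carries a protein_id qualifier, so avoid a KeyError.
            print(feature.qualifiers.get("protein_id"))
            print(feature.location.extract(rec).seq)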

When I run my code I get the following error:

Traceback (most recent call last):
  File "extractor.py", line 5, in <module>
    record = SeqIO.read("file.embl", "embl")
  File "/usr/lib/python2.7/dist-packages/Bio/SeqIO/__init__.py", line 720, in read
    first = next(iterator)
  File "/usr/lib/python2.7/dist-packages/Bio/SeqIO/__init__.py", line 655, in parse
    for r in i:
  File "/usr/lib/python2.7/dist-packages/Bio/GenBank/Scanner.py", line 489, in parse_records
    record = self.parse(handle, do_features)
  File "/usr/lib/python2.7/dist-packages/Bio/GenBank/Scanner.py", line 473, in parse
    if self.feed(handle, consumer, do_features):
  File "/usr/lib/python2.7/dist-packages/Bio/GenBank/Scanner.py", line 440, in feed
    self._feed_first_line(consumer, self.line)
  File "/usr/lib/python2.7/dist-packages/Bio/GenBank/Scanner.py", line 661, in _feed_first_line
    raise ValueError('Did not recognise the ID line layout:\n' + line)
ValueError: Did not recognise the ID line layout:
ID                   file ; ; ; ; ; 29902 BP.
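Since the error points at the ID line, one way to narrow things down might be to inspect that line directly. The sketch below is only a rough diagnostic, not how Biopython itself parses the file. It assumes the current EMBL flat-file layout puts seven semicolon-separated fields on the ID line (accession; sequence version; topology; molecule type; data class; taxonomic division; length), and `id_line_fields` and `EXPECTED_FIELDS` are hypothetical names made up for illustration.

```python
# Sketch: count the semicolon-separated fields on an EMBL ID line.
# Assumption: the current EMBL flat-file layout has seven fields, e.g.
#   ID   X56734; SV 1; linear; mRNA; STD; PLN; 1859 BP.
EXPECTED_FIELDS = 7  # assumed from the layout shown above


def id_line_fields(line):
    """Return the stripped fields of an EMBL ID line (hypothetical helper)."""
    if not line.startswith("ID"):
        raise ValueError("not an ID line: %r" % line)
    # Drop the "ID" tag, surrounding whitespace and the trailing full stop.
    body = line[2:].strip().rstrip(".")
    return [field.strip() for field in body.split(";")]


# The ID line exactly as reported in the traceback.
reported = "ID                   file ; ; ; ; ; 29902 BP."
fields = id_line_fields(reported)
print(len(fields), fields)
print("matches assumed layout:", len(fields) == EXPECTED_FIELDS)
```

Under that assumption, the line from the traceback splits into six fields, most of them empty, which is one fewer than the seven-field layout and may be why the parser does not recognise it.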

I can't seem to find any relevant documentation or forum post on that specific error message. Can anyone help me figure out what's going on?

Thanks in advance.


