I am extracting secondary structure information from PDB files using DSSP through Biopython. For one specific PDB file (1YYJ), I get KeyError: ('A', (' ', 80, ' ')). Minimal code to reproduce the error:

from Bio.PDB import PDBParser, Selection
from Bio.PDB.DSSP import DSSP

parser = PDBParser(QUIET=True)
structure = parser.get_structure(id='struct',file="1YYJ.pdb") 
model = structure[0]
dssp = DSSP(model,"1YYJ.pdb")
dssp["A",(" ",80," ")]

I obtain (" ",80," ") from the get_id() method of the Residue class in Bio.PDB.Residue. After some debugging I found that the KeyError does not occur for residues up to (" ",79," "). Using the debugging code below, I noticed that dssp is missing residue 80 (79 and 81 are present), and after residue 80 the dssp entries no longer line up with the PDB parser entries. The alignment between the two appears to be broken from the missing residue 80 onward.

residues = Selection.unfold_entities(model['A'], 'R')
for residue,element in zip(residues,dssp):
    print(residue.get_id(),element)
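
Since zip() pairs entries by position, a single missing residue shifts everything after it, so the loop above does not directly show which residue is absent. A minimal sketch of a key-based comparison follows. The helper name find_missing_residues is hypothetical (it is not part of Biopython), and the data is a stand-in that mimics the situation described here, not real DSSP output for 1YYJ.

```python
def find_missing_residues(chain_id, residue_ids, dssp_keys):
    """Return the residue ids of one chain that have no entry among the DSSP keys."""
    known = set(dssp_keys)
    return [res_id for res_id in residue_ids if (chain_id, res_id) not in known]


# Stand-in data (assumption): parser sees 79, 80, 81; DSSP map lacks 80.
residue_ids = [(" ", 79, " "), (" ", 80, " "), (" ", 81, " ")]
dssp_keys = [("A", (" ", 79, " ")), ("A", (" ", 81, " "))]

missing = find_missing_residues("A", residue_ids, dssp_keys)
print(missing)  # [(' ', 80, ' ')]
```

With the real objects, the inputs would presumably be [r.get_id() for r in model["A"]] and dssp.keys(), which looks residues up by key instead of by position.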

My script works fine on around 2000 other PDB files but shows this behavior for 1YYJ. Any help is appreciated. Thanks a lot.


