Sequences of varying length in an MSA generated by hhblits (hhsuite 3.3)


Hi everyone,

I ran hhblits from hhsuite 3.3 to generate MSAs for some protein sequences. However, some of the aligned sequences have a different length (gaps included) from the seed sequence. On closer inspection, the sequences whose length differs contain lowercase letters, and these account for the difference: if the lowercase letters are removed, all sequences have exactly the same length. Some of the sequences are pasted here:

[Image: excerpt of the alignment showing the seed sequence and the first hits; not reproduced here]

In the excerpt, the second and third sequences each contain an extra lowercase r, which makes their length differ from that of the seed sequence (the first one).
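The check described above can be sketched in a few lines of Python. This is only a minimal sketch: the helper names (`read_fasta_like`, `strip_lowercase`) and the toy sequences are made up for illustration, and it assumes the hhblits output is a FASTA-like text file (for example A3M, where lowercase letters conventionally mark insertions relative to the query).

```python
def read_fasta_like(text):
    """Parse FASTA/A3M-style text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def strip_lowercase(seq):
    """Drop lowercase letters; uppercase letters and '-' gaps are kept."""
    return "".join(c for c in seq if not c.islower())


# Toy alignment (invented data, not taken from the real MSA).
example = """>seed
MKVLAT
>hit1
MKVrLAT
>hit2
M-VrLAT
"""

records = read_fasta_like(example)
raw_lengths = [len(seq) for _, seq in records]
stripped_lengths = [len(strip_lowercase(seq)) for _, seq in records]

print(raw_lengths)       # [6, 7, 7]
print(stripped_lengths)  # [6, 6, 6]
```

With the real file, `example` would be replaced by the contents of the alignment file; if the stripped lengths all agree, the length difference comes entirely from the lowercase characters.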

Is this expected behavior? I would expect all aligned sequences to have the same length once gaps are counted.

Thank you.




