Is it acceptable to pool the annotations from the various sources InterProScan offers, and annotate a sequence with a subset of these?

For example, suppose I have the following hits:

id    annot  src   start  stop
seq1  dom1   Pfam  100    120
seq1  dom1a  CDD   101    128
seq1  dom2   Pfam  60     80

Is it acceptable to take dom1a from CDD and dom2 from Pfam, and leave out dom1 from Pfam (since it's redundant with dom1a)?







When you ask whether it is acceptable, what purpose do you have in mind?

If you are talking about positional redundancy, dom1 and dom1a clearly overlap. Before dropping either one, though, I would check that the Pfam and CDD annotations describe the same domain. If they do, the two hits are functionally redundant as well, and it is safe to drop one of them.

For the sake of consistency, pick one rule and apply it everywhere. You could always retain the Pfam annotation and drop the others whenever they overlap, or you could always drop the shorter domain, which in the example above would be the Pfam hit (dom1).
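
Here is a minimal Python sketch of those two rules, in case it helps. It is not anything InterProScan does for you. The `Hit` tuple, the `resolve` function and the `entry` field are names I made up for illustration. `entry` stands in for whatever you use to decide that two signatures describe the same domain (for instance a shared InterPro entry), and the values "entryA" and "entryB" are placeholders, not real accessions.

```python
from collections import namedtuple

# `entry` is a hypothetical "same domain" label (assumption, not InterProScan output as-is).
Hit = namedtuple("Hit", "seq_id annot src start stop entry")


def length(hit):
    """Domain length in residues, treating start and stop as inclusive."""
    return hit.stop - hit.start + 1


def redundant(a, b):
    """True if two hits overlap on the same sequence AND share the same entry label."""
    same_region = a.seq_id == b.seq_id and a.start <= b.stop and b.start <= a.stop
    return same_region and a.entry == b.entry


def resolve(hits, policy="prefer_source", preferred="Pfam"):
    """Greedily keep the best-ranked hit from each group of redundant hits."""
    if policy == "prefer_source":
        # Preferred source first; ties broken by longer domain.
        rank = lambda h: (h.src != preferred, -length(h))
    elif policy == "longest":
        # Longest domain first, whatever its source.
        rank = lambda h: -length(h)
    else:
        raise ValueError("unknown policy: %r" % policy)

    kept = []
    for hit in sorted(hits, key=rank):
        if not any(redundant(hit, k) for k in kept):
            kept.append(hit)
    return sorted(kept, key=lambda h: (h.seq_id, h.start))


hits = [
    Hit("seq1", "dom1", "Pfam", 100, 120, "entryA"),   # placeholder entry labels
    Hit("seq1", "dom1a", "CDD", 101, 128, "entryA"),
    Hit("seq1", "dom2", "Pfam", 60, 80, "entryB"),
]

print([h.annot for h in resolve(hits, "prefer_source")])  # ['dom2', 'dom1']
print([h.annot for h in resolve(hits, "longest")])        # ['dom2', 'dom1a']
```

Note that a hit is only dropped when it both overlaps a kept hit and carries the same `entry` label, so overlapping hits that describe different domains are both retained.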

