I currently have two GTF files, each of which is incomplete. What do I mean by this? For example, one of them has 5000 genes and the other has 7000, but 3000 genes are shared between them (they are identical). I would therefore like to create a single GTF containing the 5000 + 7000 genes without duplicating the 3000 identical ones, so that I end up with a GTF of 9000 genes (not 12000).

What could I do? I think both Cuffmerge and Cuffcompare are useless in these cases.

To give more precision to the GenoMax comment: will tell you the differences.

Then you can choose to complement the annotation using, that means you use one annotation as the reference and pick from the second annotation only the non-overlapping predictions, adding them to the reference annotation (i.e. for overlapping loci, only the gene prediction from the reference is kept).
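To make the "complement" idea concrete, here is a minimal Python sketch of the logic. It is not the code of any particular tool: the `Gene` record, the `overlaps` and `complement` helpers, and the rule that overlap means "same sequence, same strand, at least one shared base" are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical minimal gene record; a real GTF/GFF parser would carry more fields.
Gene = namedtuple("Gene", "seqid start end strand gene_id")


def overlaps(a, b):
    """True if two genes share at least one base on the same sequence and strand.

    Requiring the same strand is an assumption of this sketch.
    """
    return (a.seqid == b.seqid and a.strand == b.strand
            and a.start <= b.end and b.start <= a.end)


def complement(reference, other):
    """Keep every reference gene; add genes from `other` that overlap no reference gene."""
    extra = [g for g in other if not any(overlaps(g, r) for r in reference)]
    return list(reference) + extra


reference = [
    Gene("chr1", 100, 500, "+", "refA"),
    Gene("chr1", 1000, 1500, "-", "refB"),
]
other = [
    Gene("chr1", 300, 700, "+", "othA"),    # overlaps refA, so it is dropped
    Gene("chr1", 2000, 2500, "+", "othB"),  # no overlap, so it is added
]

result = complement(reference, other)
print([g.gene_id for g in result])  # ['refA', 'refB', 'othB']
```

The pairwise comparison is quadratic, which is fine for a toy example; for full annotations you would probably want to sort the genes or use an interval index first.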

Or to merge the annotations using, that means you take the non-overlapping loci from both annotations and merge the annotations where they overlap (overlapping genes are merged into one locus, different mRNAs are treated as isoforms, and duplicate mRNAs are removed so that only one copy is kept).
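The "merge" behaviour can be sketched in Python as well. Again, this is only an illustration under stated assumptions, not how any specific tool does it: transcripts are represented as hypothetical `(seqid, strand, exons)` tuples, "duplicate" means an identical exon structure on the same sequence and strand, and overlap is judged on transcript spans on the same strand.

```python
def merge_loci(transcripts):
    """Group overlapping transcripts into loci and drop exact duplicates.

    Each transcript is (seqid, strand, exons), where exons is a tuple of
    (start, end) pairs sorted by start. This layout is an assumption of the sketch.
    """
    # set() collapses exact duplicates; sorting orders by seqid, strand, then start.
    unique = sorted(set(transcripts))
    loci = []
    for seqid, strand, exons in unique:
        start, end = exons[0][0], exons[-1][1]
        last = loci[-1] if loci else None
        if (last and last["seqid"] == seqid and last["strand"] == strand
                and start <= last["end"]):
            # Overlaps the current locus: extend it and record a new isoform.
            last["end"] = max(last["end"], end)
            last["isoforms"].append(exons)
        else:
            loci.append({"seqid": seqid, "strand": strand,
                         "start": start, "end": end, "isoforms": [exons]})
    return loci


annotation_a = [
    ("chr1", "+", ((100, 200), (300, 400))),
    ("chr1", "+", ((1000, 1200),)),
]
annotation_b = [
    ("chr1", "+", ((100, 200), (300, 400))),  # exact duplicate of a transcript in A
    ("chr1", "+", ((150, 200), (300, 450))),  # overlaps it, becomes a second isoform
    ("chr1", "+", ((5000, 5200),)),           # locus found only in B
]

merged = merge_loci(annotation_a + annotation_b)
for locus in merged:
    print(locus["seqid"], locus["start"], locus["end"], len(locus["isoforms"]))
```

With this toy input you get three loci: one spanning 100-450 with two isoforms, plus the two single-transcript loci.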

I wrote this Perl script to compare two GFF files. It might be useful after some modifications (changing the GFF version to 2?). It needs BioPerl installed.

Usage: file1.gff, file2.gff

