I have 5 replicates of RNA-seq data. I aligned the reads with STAR and then assembled transcripts with StringTie for each replicate individually, giving me 5 gtf files. I then ran StringTie merge on the 5 gtf files to get a single merge.gtf.
When I run wc -l on the individual gtf files, I get an average of 700,000+ transcripts. When I do the same for the merge.gtf, I get 1.7 million transcripts. Since these are all replicates, shouldn't I get roughly the same number of transcripts in the merged gtf as in the individual gtfs?
Did I do something wrong in the merge step?
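
One caveat about the counts above: wc -l counts every line in a gtf, and StringTie output contains exon records and comment header lines in addition to transcript records, so the line count is larger than the transcript count. Below is a minimal sketch for counting only transcript records, assuming the standard tab-separated gtf layout with the feature type in column 3. The function name count_transcripts and the file names rep1.gtf through rep5.gtf are hypothetical placeholders, not names from my actual run.

```shell
# Count only records whose feature type (column 3) is "transcript".
# Comment lines starting with '#' and exon records are skipped.
count_transcripts() {
    awk -F'\t' '!/^#/ && $3 == "transcript" { n++ } END { print n + 0 }' "$1"
}

# Hypothetical file names; replace with the real per-replicate gtf files.
for f in rep1.gtf rep2.gtf rep3.gtf rep4.gtf rep5.gtf merge.gtf; do
    if [ -f "$f" ]; then
        printf '%s\t%s\n' "$f" "$(count_transcripts "$f")"
    fi
done
```

Comparing these per-file numbers, rather than raw line counts, should give a fairer picture of whether the merged gtf really has more transcripts than the individual ones.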