I have a lot of paired-end RNA-seq data of very good quality, but some of the files contain many overrepresented sequences that are not adapters.
I ran BLAST on these sequences. Some of them didn't match anything, and others seem to be rRNA.
I understand that opinions are divided: some people say it is better to remove overrepresented sequences, and others say there is no need to.
This time I decided to remove them with cutadapt, because the overrepresented sequences vary from one file to another. But after removing them, the FastQC basic statistics of these files changed (sequence length is now 1-150) and NEW overrepresented sequences appeared (I wasn't expecting to get any beyond the initial ones).
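To illustrate what I think happened to the read lengths, here is a minimal sketch, assuming the overrepresented sequence is treated like a 3' adapter, so that the match and everything after it is removed. The function name, the sequence, and the toy reads are all made up for illustration; this is plain Python with exact matching, not cutadapt itself, which also allows mismatches and partial matches.

```python
def trim_3prime(read: str, seq: str) -> str:
    """Cut the read at the first exact occurrence of seq,
    dropping seq and everything that follows it."""
    idx = read.find(seq)
    return read if idx == -1 else read[:idx]


# Hypothetical overrepresented sequence and toy reads (not real data).
overrepresented = "ACGTACGTAC"
reads = [
    "TTTTTTTTTTGGGGGGGGGG",               # no match: read is left untouched
    "TTTTT" + overrepresented + "GGGGG",  # match in the middle of the read
    "T" + overrepresented + "GGGGGGGGG",  # match right after the first base
]

trimmed = [trim_3prime(r, overrepresented) for r in reads]
lengths = [len(t) for t in trimmed]
print(lengths)  # [20, 5, 1]

# Optional length filter: drop reads that became too short to be useful.
min_length = 5  # arbitrary threshold for this toy example
kept = [t for t in trimmed if len(t) >= min_length]
print(len(kept))  # 2
```

If this is what happened, it would explain the 1-150 length range: reads where the sequence sits near the start are cut down to a few bases. A minimum length filter (cutadapt has a `--minimum-length` option) would discard those leftovers, although for paired-end data both mates would have to be filtered together.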
I think I may have made a mistake with cutadapt and want to try Trimmomatic instead, but I can't find an option in the manual that lets me specify the sequence I want to remove from a specific file (my impression is that with Trimmomatic I can only remove adapters that the software already recognizes).
Can anyone give me advice on what to do in order to proceed with the (de novo) assembly?