I am in the process of analyzing a collection of small RNA datasets from flies (Drosophila melanogaster). So far this is what I have done:
- Trim adaptors using cutadapt and keep only reads that are 18-30 nt long.
- Align to the Drosophila genome using Bowtie (not Bowtie2); 80-85% of reads have at least one alignment.
- Count miRNAs with featureCounts, using miRBase annotations.
- I also use ShortStack to identify potential novel small RNAs.
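
For reference, the length selection in the first step amounts to something like the sketch below. This is only an illustration and not my exact command: the helper names `parse_fastq` and `filter_by_length` and the example reads are made up, and it assumes well-formed 4-line FASTQ records. In practice cutadapt's own minimum/maximum length options (`-m`/`-M`) can do the same thing during trimming.

```python
from typing import Iterable, Iterator, Tuple

# (header, sequence, quality) for one read
FastqRecord = Tuple[str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield (header, sequence, quality) from well-formed 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line, not needed here
        qual = next(it).rstrip("\n")
        yield header, seq, qual


def filter_by_length(
    records: Iterable[FastqRecord], min_len: int = 18, max_len: int = 30
) -> Iterator[FastqRecord]:
    """Keep reads whose trimmed length lies in [min_len, max_len], inclusive."""
    for header, seq, qual in records:
        if min_len <= len(seq) <= max_len:
            yield header, seq, qual


# Made-up reads: 17 nt (too short), 22 nt (kept), 31 nt (too long).
example = [
    "@read1", "A" * 17, "+", "I" * 17,
    "@read2", "T" * 22, "+", "I" * 22,
    "@read3", "G" * 31, "+", "I" * 31,
]

kept = list(filter_by_length(parse_fastq(example)))
print([header for header, _, _ in kept])
```

With a real file, the same functions could be fed an open file handle instead of the `example` list.
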
Now I want to identify piRNAs and siRNAs in my data. I see there are a couple of piRNA databases (such as piRNAdb and piRBase). Can I use a GTF from one of these databases to identify piRNAs? And how should I go about identifying siRNAs?