After aligning my mRNA (cDNA) reads to the GENCODE human genome with Minimap2 and counting them against the GENCODE GTF file with featureCounts, I get a ridiculously large number of reads assigned to MALAT1 in every sample (up to 5% of all reads go to this gene, ~5-25K non-normalised counts for the gene; sequenced on a Nanopore MinION).
My samples are bulk RNA from resected brain tissue. This gene is apparently upregulated in cancer, so it makes some sense to see a lot of counts. But this many? I am not sure.
Does anyone have experience with something like this, and can you give me some tips on how to decide whether I should exclude this gene or whether this points to an error somewhere in my pipeline?
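
For reference, here is a minimal sketch of how the per-sample fraction of counts going to one gene could be checked from a featureCounts table. It assumes the default tab-separated output (one `#` comment line, then `Geneid`, five more annotation columns, then one count column per BAM file). The function name `gene_fraction`, the gene ID `GENE_X` and the toy numbers are made up for illustration; the real MALAT1 gene ID should be looked up in the GTF. Note that this is the fraction of assigned counts, not of all sequenced reads.

```python
import csv
import io


def gene_fraction(counts_file, gene_id_prefix):
    """Return {sample column: fraction of assigned counts going to the gene}.

    counts_file is any iterable of lines in featureCounts table format.
    gene_id_prefix is matched against the start of the Geneid column, so a
    versioned ID such as "GENE_X.1" is found with the prefix "GENE_X".
    """
    # Skip the leading '#' comment line that featureCounts writes.
    rows = (line for line in counts_file if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    # Assumption: the first six columns are annotation, the rest are samples.
    samples = reader.fieldnames[6:]

    totals = dict.fromkeys(samples, 0.0)
    gene_counts = dict.fromkeys(samples, 0.0)
    for row in reader:
        is_target = row["Geneid"].startswith(gene_id_prefix)
        for sample in samples:
            count = float(row[sample])  # float also covers fractional counts
            totals[sample] += count
            if is_target:
                gene_counts[sample] += count

    return {
        sample: (gene_counts[sample] / totals[sample]) if totals[sample] else 0.0
        for sample in samples
    }


# Toy table with made-up gene IDs and counts, only to show the format.
toy = io.StringIO(
    "# Program:featureCounts (toy example)\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tsampleA.bam\tsampleB.bam\n"
    "GENE_X.1\tchr11\t1\t100\t+\t100\t50\t5\n"
    "GENE_Y.2\tchr1\t1\t100\t+\t100\t950\t95\n"
)
result = gene_fraction(toy, "GENE_X")
for sample, fraction in result.items():
    print(f"{sample}: {fraction:.1%} of assigned counts")
```

On a real table, the file would be opened with `open(path, newline="")` and passed in place of `toy`. Comparing this fraction across samples, and against the unassigned categories in the featureCounts `.summary` file, may help tell a consistent biological signal from a counting artefact.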