After quantifying my raw RNA-Seq reads with Salmon, I imported the quant files into R with tximeta, which gave me an annotated count matrix as a SummarizedExperiment object.

Afterwards I built a DGEList object to pass on to edgeR, using the tximeta function makeDGEList. This creates an offset matrix, which I believe normalizes with respect to average transcript length.
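
For reference, here is a minimal sketch of the edgeR recipe from the tximport vignette, which, as far as I understand, is what makeDGEList follows. I have not confirmed that against the tximeta source, so treat it as an assumption. The matrices cts and len are simulated, hypothetical stand-ins for the "counts" and "length" assays, not my data. If the assumption holds, the offset would combine average transcript length with TMM-based effective library sizes.

```r
library(edgeR)

# Hypothetical toy data standing in for the "counts" and "length" assays
set.seed(1)
cts <- matrix(rnbinom(600, mu = 100, size = 10), nrow = 100,
              dimnames = list(paste0("gene", 1:100), paste0("s", 1:6)))
len <- matrix(runif(600, min = 500, max = 3000), nrow = 100,
              dimnames = dimnames(cts))

# Scale lengths so that each gene's geometric mean across samples is 1
normMat <- len / exp(rowMeans(log(len)))
normCts <- cts / normMat

# Effective library sizes: TMM factors times the column sums of the
# length-corrected counts
eff.lib <- calcNormFactors(normCts) * colSums(normCts)

# Combine length and effective library size into one log-scale offset
normMat <- log(sweep(normMat, 2, eff.lib, "*"))

# Store the offset in the DGEList; edgeR then uses it in place of
# lib.size and norm.factors
y <- DGEList(cts)
y <- scaleOffset(y, normMat)
```

In this sketch the TMM step is applied before the offset is built, which would explain why calling calcNormFactors on the resulting object afterwards only produces the warning.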

Now, when I try to normalize the data in edgeR using calcNormFactors, it gives me this warning:

LL_normalized <- calcNormFactors(LL_genefiltered, method = "TMM")

Warning message:
In calcNormFactors.DGEList(LL_genefiltered, method = "TMM") :
object contains offsets, which take precedence over library
sizes and norm factors (and which will not be recomputed).

It appears that the offset matrix from tximeta conflicts with the normalization attempted by edgeR, and the two yield different normalization factors.

I would like to normalize by sequencing depth, RNA composition (effective library size), and gene length.

So my question is: does the offset matrix created by tximeta's makeDGEList function account for all of these factors? I don't think it does; I believe it only accounts for gene length. If it does not, do I need to pass the DGEList object on without an offset matrix and let edgeR take care of the normalization (and if so, how?), or is there a way to make the two work together?

Of note, the tximport vignette warns against manually passing the original gene-level counts to downstream methods without an offset. Perhaps this warning assumes that no other normalization is done after tximport/tximeta, in which case can it be disregarded here?


