CPM threshold for RNASeq count data


Hi, I have a DGEList object created with edgeR. Its dimensions are 57820 x 1013 (genes x samples). I need to choose a filtering threshold and I am not sure my choice is correct. The normalization factors in x$samples$norm.factors are all 1, and summary(x$samples$lib.size) gives:

   Min.   1st Qu.   Median   Mean    3rd Qu.     Max. 
 6557050 31326322 36019156 35935285 40766618 79411964

I tried two filters: keep.exprs <- rowSums(cpm(x) > 0.4) >= 5 and keep.exprs <- filterByExpr(x). After running x_filtered <- x[keep.exprs,], the first leaves 52082 x 1013 and the second leaves 24045 x 1013.
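To put the two filters on the same scale, here is a rough base-R sketch of what each one demands per gene. It assumes the defaults documented for filterByExpr (min.count = 10, large.n = 10, min.prop = 0.7) and assumes that no group or design was supplied, so all 1013 samples are treated as one group. Both are assumptions and may not match the actual call or the installed edgeR version. The variable names are only illustrative.

```r
# Library sizes taken from summary(x$samples$lib.size) above.
lib_size <- c(min = 6557050, median = 36019156, max = 79411964)

# Raw count that corresponds to a CPM value at a given library size.
# Norm factors are all 1, so the effective library size equals lib.size.
cpm_to_count <- function(cpm, lib_size) cpm * lib_size / 1e6

# Manual filter: CPM > 0.4 in at least 5 samples.
manual_counts <- cpm_to_count(0.4, lib_size)
print(round(manual_counts, 1))

# filterByExpr (assumed defaults): the CPM cutoff is derived from
# min.count and the median library size.
min_count <- 10
fbe_cpm_cutoff <- min_count / lib_size[["median"]] * 1e6
print(round(fbe_cpm_cutoff, 3))

# Assumption: one group of 1013 samples. Above large.n, the required
# number of samples grows with min.prop, giving roughly 712 samples.
n_samples <- 1013
large_n <- 10
min_prop <- 0.7
fbe_min_samples <- large_n + (n_samples - large_n) * min_prop
print(fbe_min_samples)
```

Under these assumptions the CPM cutoffs are of similar size, and the main difference is the number of samples that must pass: 5 for the manual filter versus roughly 712 for filterByExpr. That difference would be consistent with the large gap between 52082 and 24045 retained genes.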

Which filtering is better, and why?


Count • RNASeq • cpm • filtering • edgeR
