CPM threshold for RNASeq count data
Hi, I have a DGEList object created using edgeR. Its dimensions are 57820 x 1013. I have to choose a filtering strategy and I am not sure that my choice is correct. The norm.factors in x$samples are all 1, and the summary of the library sizes is:
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
 6557050 31326322 36019156 35935285 40766618 79411964
I tried two filters:
keep.exprs <- rowSums(cpm(x) > 0.4) >= 5
and
keep.exprs <- filterByExpr(x)
When I then run
x_filtered <- x[keep.exprs, ]
the dimensions become 52082 x 1013 with the first filter and 24045 x 1013 with the second.
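To make the comparison concrete, here is a rough sketch of what each threshold implies for my library sizes. It assumes that filterByExpr is running with its documented defaults (min.count = 10, large.n = 10, min.prop = 0.7) and that, because I supplied no group or design, all 1013 samples are treated as a single group. The variable names are only illustrative and the block uses base R only.

```r
# Values copied from the library-size summary above
lib_min    <- 6557050
lib_median <- 36019156
n_samples  <- 1013

# Manual filter: CPM > 0.4 in at least 5 samples.
# Convert the CPM cutoff back to raw counts for two library sizes.
cpm_manual       <- 0.4
counts_at_median <- cpm_manual * lib_median / 1e6  # roughly 14.4 counts
counts_at_min    <- cpm_manual * lib_min / 1e6     # roughly 2.6 counts

# filterByExpr (assumed defaults, single group of all samples):
# the CPM cutoff is min.count scaled by the median library size,
# and the required number of samples grows with the group size.
min_count <- 10
large_n   <- 10
min_prop  <- 0.7
cpm_fbe         <- min_count / lib_median * 1e6                 # roughly 0.28
min_samples_fbe <- large_n + (n_samples - large_n) * min_prop   # 712.1

print(c(cpm_manual = cpm_manual, cpm_fbe = cpm_fbe))
print(c(min_samples_manual = 5, min_samples_fbe = min_samples_fbe))
```

If these assumptions hold, the two CPM cutoffs are similar (0.4 versus about 0.28), so most of the difference in retained rows would come from the number of samples that must pass the cutoff: 5 in my manual filter versus roughly 712 in filterByExpr.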
Which filtering is better, and why?