I have raw data from a GEO dataset (GSE103965), for which I downloaded the CEL files. The data come from the Affymetrix Human Clariom D Assay, with no biological replicates. I used RMA for normalization; then, to filter out lowly expressed genes, I drew a histogram of the median expression values. I followed this article for this filtering method.
My question: I expected my data to have a normal distribution because
1) it has been normalized, and
2) the central limit theorem should apply (I think).
However, the distribution is skewed.
Here is the histogram of my data: ibb.co/0ynx0SF
What should I do in this case?
library(pd.clariom.d.human)

GSE103965_norm <- oligo::rma(GSE103965, target = "core")

# filtering low intensity genes
GSE103965_f <- rowMedians(Biobase::exprs(GSE103965_norm))

dev.off()
hist_res <- hist(GSE103965_f, 100, col = "cornsilk1", freq = FALSE,
                 main = "Histogram of the median intensities",
                 border = "antiquewhite4",
                 xlab = "Median intensities")

emp_mu <- hist_res$breaks[which.max(hist_res$density)]
emp_sd <- BiocGenerics::mad(GSE103965_f) / 2
prop_cental <- 0.50

lines(sort(GSE103965_f),
      prop_cental * dnorm(sort(GSE103965_f), mean = emp_mu, sd = emp_sd),
      col = "grey10", lwd = 4)