Posted 3 hours ago by firstorthopedicdoctor

Dear colleagues,
I have the raw data (CEL files) of a GEO dataset, generated on the Affymetrix Human Clariom D Assay with no biological replicates. I normalized it with RMA and then, to filter out lowly expressed genes, drew a histogram of the per-gene median expression. I followed this article for the filtering method.
My question: I expected the data to be normally distributed because 1) it has been normalized and
2) the central limit theorem should apply (?). However, the histogram is skewed.

Here is the histogram of my data.
What should I do in this case?

The code is:


# RMA: background correction, quantile normalization and summarization
GSE103965_norm <- oligo::rma(GSE103965, target = "core")

# Filtering low-intensity genes: median intensity of each probeset across arrays
GSE103965_f <- Biobase::rowMedians(Biobase::exprs(GSE103965_norm))
hist_res <- hist(GSE103965_f, 100, col = "cornsilk1", freq = FALSE,
            main = "Histogram of the median intensities",
            border = "antiquewhite4",
            xlab = "Median intensities")

# Empirical mode (left edge of the densest bin) and a MAD-based spread
emp_mu <- hist_res$breaks[which.max(hist_res$density)]
emp_sd <- BiocGenerics::mad(GSE103965_f) / 2
prop_central <- 0.50

# Overlay a scaled normal density on the histogram
x_sorted <- sort(GSE103965_f)
lines(x_sorted, prop_central * dnorm(x_sorted, mean = emp_mu, sd = emp_sd),
      col = "grey10", lwd = 4)
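
As a side note, the skew in such a histogram can be given a number instead of being judged by eye. The following is only a minimal sketch in base R: `expr_sim` is a simulated stand-in for `Biobase::exprs(GSE103965_norm)` (not the real data), the mixture proportions and means are arbitrary assumptions, and `skewness()` is a small helper defined here rather than a function from any package.

```r
# Simulated stand-in for Biobase::exprs(GSE103965_norm); NOT the real data.
set.seed(1)
n_arrays <- 6
n_low    <- 3500   # assumed number of low-intensity (background-like) probesets
n_high   <- 1500   # assumed number of clearly expressed probesets

low  <- matrix(rnorm(n_low  * n_arrays, mean = 3, sd = 0.5), ncol = n_arrays)
high <- matrix(rnorm(n_high * n_arrays, mean = 7, sd = 2),   ncol = n_arrays)
expr_sim <- rbind(low, high)

# Per-probeset median across arrays (base R equivalent of rowMedians)
med <- apply(expr_sim, 1, median)

# Sample skewness: mean of the cubed standardized values.
# Values above 0 indicate a longer right tail.
skewness <- function(x) {
  z <- (x - mean(x)) / sd(x)
  mean(z^3)
}
skew_med <- skewness(med)

# Robust centre and spread of the medians
centre <- median(med)
spread <- mad(med)

print(c(skewness = skew_med, centre = centre, spread = spread))
```

With the real data, `expr_sim` would be replaced by the normalized expression matrix. A clearly positive skewness would simply confirm what the histogram shows: a large peak of low-intensity probesets with a tail of expressed ones.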
