2 hours ago by Phanest

I have a dataset of 5 subjects whose DNA methylation was measured at two time points (time 0 and time 1) using the 850k Illumina chip.
I want to find the differentially methylated sites. For that I used the R limma package: I fit a linear model and computed moderated t- and F-statistics. I also applied a multiple-testing correction to control false positives.

However, even with all this, I end up with 750,000 differentially methylated sites, which seems far too many to be true. I think I used the functions correctly, but I want to be sure, and in particular I want to know whether the design matrix is right.

Here's the script I've been using:

"Uses the limma library to find Differentially Methylated Sites"
library(limma)
library(multtest)
src = "./Data/Methylation M Values"
t0File = "t0_M.csv"
t1File = "t1_M.csv"

t0File = paste(src, t0File, sep = "/")
t1File = paste(src, t1File, sep = "/")

t0 = read.csv(t0File)
t1 = read.csv(t1File)

Data = merge(t0, t1, by = "row.names", all = TRUE)
rownames(Data) = Data$Row.names
Data = Data[, -1]
design = c(rep(0, 5), rep(1, 5))
#design = data.frame( t0 = c(rep(1, 5), rep(0, 5)), t1 = c(rep(0, 5), rep(1, 5)))
linearFit = lmFit(Data, design)

#contrasts = makeContrasts(contrasts = "t0/t1", levels = design)
#linearFit = contrasts.fit(linearFit, contrasts)

BayesFit = eBayes(linearFit, proportion = 0.05)


result = decideTests(BayesFit, p.value = 0.05)
type = c("BH")
multTestResult = mt.rawp2adjp(BayesFit$p.value, type)
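
To check what a one-column 0/1 design without an intercept actually estimates, here is a minimal base-R sketch on a single made-up probe. It assumes `lmFit` treats the bare vector as a one-column design matrix, so `lm(y ~ 0 + time)` is only an analogy; the toy values and the names `y`, `time`, `no_intercept` and `with_intercept` are hypothetical.

```r
# Toy M-values for one probe: 5 samples at t0, then 5 at t1, with no real change over time.
set.seed(42)
y <- c(rnorm(5, mean = 3, sd = 0.2), rnorm(5, mean = 3, sd = 0.2))
time <- c(rep(0, 5), rep(1, 5))

# Same structure as a bare 0/1 design vector: one column, no intercept.
no_intercept <- lm(y ~ 0 + time)
# With an intercept, the time coefficient is the t1 - t0 difference instead.
with_intercept <- lm(y ~ time)

coef_no_int <- unname(coef(no_intercept)["time"])
coef_with_int <- unname(coef(with_intercept)["time"])
print(c(mean_t1 = mean(y[6:10]), coef_no_int = coef_no_int, coef_with_int = coef_with_int))
```

Without the intercept, the coefficient equals the mean of the t1 samples, so the test asks whether that mean differs from zero rather than whether t1 differs from t0. M-values are rarely centred on zero, which could explain why almost every site comes out significant.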

Here are the distributions of both groups:

imgur.com/a/sddRDkS
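
Since the same 5 subjects are measured at both time points, a design that blocks on subject may be closer to what is needed. The sketch below is not a verified analysis: it assumes the columns are ordered S1..S5 at t0 followed by S1..S5 at t1, uses simulated M-values in place of the real data, and the names `subject`, `time` and `M` are hypothetical. The limma step only runs if the package is installed.

```r
# Hypothetical layout: columns 1-5 are subjects S1..S5 at t0, columns 6-10 the same subjects at t1.
subject <- factor(rep(paste0("S", 1:5), times = 2))
time <- factor(rep(c("t0", "t1"), each = 5), levels = c("t0", "t1"))

# Intercept + subject blocking terms + a single time coefficient ("timet1").
design <- model.matrix(~ subject + time)
print(colnames(design))

# Simulated M-values with no true time effect, for illustration only.
set.seed(1)
M <- matrix(rnorm(1000 * 10, mean = 2), nrow = 1000,
            dimnames = list(paste0("cg", seq_len(1000)), paste0(subject, "_", time)))

if (requireNamespace("limma", quietly = TRUE)) {
  fit <- limma::eBayes(limma::lmFit(M, design))
  # coef = "timet1" tests the within-subject t1 - t0 difference;
  # adjust.method = "BH" applies Benjamini-Hochberg FDR correction.
  top <- limma::topTable(fit, coef = "timet1", number = Inf, adjust.method = "BH")
  print(sum(top$adj.P.Val < 0.05))
}
```

With this design the tested coefficient is the t1 versus t0 difference after accounting for each subject's baseline, and `topTable` reports BH-adjusted p-values directly in `adj.P.Val`, so a separate `mt.rawp2adjp` call would not be needed.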
