Hello everyone,
I am trying to perform a differential expression analysis. I imported the featureCounts output, removed the unwanted columns, and converted the counts to a matrix. After running the code that creates the DESeqDataSet, I got this warning message:
"Warning message:
In DESeqDataSet(se, design = design, ignoreRank) :
all genes have equal values for all samples. will not be able to perform differential analysis"
I more or less ignored it because I did not really understand what it meant. I went ahead and estimated the size factors, and I got a value of 1 for every sample.
Below is what I did:
library(DESeq2)
library(Biobase)
# load the count table from featureCounts.txt
countData <- read.table("/home/mlsi/RNASeq/countTable/featureCounts.txt", header = TRUE, row.names = 1)
# delete columns 1-5
deleteColumnCountdata <- countData[-(1:5)]
colnames(deleteColumnCountdata)
# remove the path prefix and the .bam or .sam suffix from the column names
colnames(deleteColumnCountdata) <- gsub("^X\\.home\\.mlsi\\.RNASeq\\.mapping\\.", "", colnames(deleteColumnCountdata))
colnames(deleteColumnCountdata) <- gsub("\\.(bam|sam)$", "", colnames(deleteColumnCountdata))
colnames(deleteColumnCountdata)
class(deleteColumnCountdata)
head(deleteColumnCountdata)
# convert 'deleteColumnCountdata' to a matrix
newCountsData <- as.matrix(deleteColumnCountdata)
head(newCountsData)
group<- factor(c(rep("UHR",3), rep("HBR",3)))
con<- factor(c(rep("cancer",3), rep("normal",3)))
# construct a data frame describing the samples
countDataDataFrame <- data.frame(row.names = colnames(newCountsData), group, con)
head(countDataDataFrame)
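The group and con vectors above are typed by hand, so they only line up with the counts if the columns really are in the order UHR, UHR, UHR, HBR, HBR, HBR. A minimal sketch of deriving the labels from the column names instead is shown here. The sample names are a toy stand-in for colnames(newCountsData) and their order is an assumption; conditionByGroup and sampleTable are hypothetical names used only for illustration.

```r
# Toy stand-in for colnames(newCountsData); the order here is an assumption.
sampleNames <- c("HBR_1", "HBR_2", "HBR_3", "UHR_1", "UHR_2", "UHR_3")

# Strip the trailing replicate number to get the group label of each column.
group <- factor(sub("_[0-9]+$", "", sampleNames))

# Map each group to its condition (UHR = cancer, HBR = normal, as in the post).
conditionByGroup <- c(UHR = "cancer", HBR = "normal")
con <- factor(unname(conditionByGroup[as.character(group)]))

# One row per sample, in the same order as the count matrix columns.
sampleTable <- data.frame(row.names = sampleNames, group, con)
print(sampleTable)
```

Because the labels are computed from the names, they follow the columns whatever order featureCounts wrote them in.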
# instantiate the DESeq dataset
dds <- DESeqDataSetFromMatrix(countData = newCountsData, colData = countDataDataFrame, design = ~ con)
"Warning message:
In DESeqDataSet(se, design = design, ignoreRank) :
all genes have equal values for all samples. will not be able to perform differential analysis"
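Taken literally, the warning says that every gene has the same count in every sample. A minimal base R sketch of checking that condition on a count matrix is shown here. The toy matrices and the helper name allRowsConstant are hypothetical, and the check is only my reading of what the warning describes, not code copied from DESeq2.

```r
# Hypothetical helper: TRUE when every gene (row) has the same value in all samples.
allRowsConstant <- function(counts) {
  all(rowSums(counts == counts[, 1]) == ncol(counts))
}

# Toy matrix where each gene has identical counts across the three samples.
toyConstant <- matrix(c(5, 5, 5,
                        0, 0, 0),
                      nrow = 2, byrow = TRUE,
                      dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3")))

# Toy matrix where geneA differs in the second sample.
toyVarying <- toyConstant
toyVarying["geneA", "s2"] <- 7

print(allRowsConstant(toyConstant))
print(allRowsConstant(toyVarying))

# Number of genes whose counts differ between at least two samples.
nVarying <- sum(apply(toyVarying, 1, function(x) length(unique(x)) > 1))
print(nVarying)
```

Running the same helper on newCountsData, together with head(newCountsData) and colSums(newCountsData), would show whether the imported table really contains identical columns.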
dds<-estimateSizeFactors(dds)
sizeFactors(dds)
HBR_1 HBR_2 HBR_3 UHR_1 UHR_2 UHR_3: all the samples have a size factor of 1.
Is this correct? I ask because I read that size factors are usually close to 1.
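For context, as far as I understand it, DESeq2's default size factors come from a median-of-ratios calculation: each sample's counts are divided by the per-gene geometric mean across samples, and the median ratio is taken. A minimal base R sketch of that idea is shown here. medianOfRatios is a hypothetical name, the matrices are toy data, and this is a simplified illustration, not DESeq2's actual implementation.

```r
# Hypothetical helper: median-of-ratios size factors for a count matrix.
medianOfRatios <- function(counts) {
  # Log of the per-gene geometric mean across samples.
  logGeoMeans <- rowMeans(log(counts))
  # Genes with a zero count in any sample have a non-finite log mean; skip them.
  keep <- is.finite(logGeoMeans)
  apply(counts[keep, , drop = FALSE], 2, function(sampleCounts) {
    exp(median(log(sampleCounts) - logGeoMeans[keep]))
  })
}

geneCounts <- c(10, 200, 35)

# Three samples with identical counts for every gene.
identicalSamples <- cbind(s1 = geneCounts, s2 = geneCounts, s3 = geneCounts)
# Three samples sequenced at different depths (1x, 2x, 4x).
scaledSamples <- cbind(s1 = geneCounts, s2 = 2 * geneCounts, s3 = 4 * geneCounts)

sfIdentical <- medianOfRatios(identicalSamples)  # all equal to 1 (up to floating point)
sfScaled <- medianOfRatios(scaledSamples)        # increases with sequencing depth
print(sfIdentical)
print(sfScaled)
```

Under this calculation, size factors that are close to 1 are normal, but size factors that are all exactly 1 are what identical columns produce, which is consistent with the warning above.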
Then I went ahead with pre-filtering: dds <- dds[rowSums(counts(dds)) > 1, ]
I tried rlogTransformation because I want a heatmap/clustering of the data:
rld <- rlogTransformation(dds)
I got this error:
Error in estimateDispersionsFit(object, fitType, quiet = TRUE) :
all gene-wise dispersion estimates are within 2 orders of magnitude
from the minimum value, and so the standard curve fitting techniques will not work.
One can instead use the gene-wise estimates as final estimates:
dds <- estimateDispersionsGeneEst(dds)
dispersions(dds) <- mcols(dds)$dispGeneEst
...then continue with testing using nbinomWaldTest or nbinomLRT
I also tried rlog and log, but got the same error. I tried to follow the suggestion in the error message, but I don't think it will get me to my goal in the end.
I even tried DESeq(dds) and got this error:
estimating size factors
estimating dispersions
Error in .local(object, ...) :
all genes have equal values for all samples. will not be able to perform differential analysis
Where am I going wrong?
I would appreciate any solutions and suggestions.
Regards,
Anthony