I'm currently attempting to perform a differential expression analysis on scRNA-Seq data using limma-trend, but I'm unsure of the correct design.

The following data frame represents my actual data:

df <- data.frame(Population = c("A","A","A","A","A","A","B","B","B","B"),
           Stage = c(4,4,4,5,5,5,7,7,7,8),
           Region = c("X","Y","X","X","X","X","Y","Y","Y","Y"),
           Cell_type = c("I","J","K","I","I","J","J","K","I","J"))
df$Group <- paste(df$Population,df$Cell_type,sep = "_")

I have two populations of cells, each extracted from two regions, made up mostly of three different cell types. The "Stage" column is the number of days of development of the cell: population A consists of days 4 and 5, population B of days 7 and 8. The day value is probably not comparable across populations, i.e. had we collected cells from population B at day 4, these would not be equivalent to day 4 population A cells. This is inherent to the design of our experiment. For this analysis, the age of the cell is not something we care about; cell type is what matters.

I am therefore unsure whether it is necessary to account for the days in the design matrix in limma. My aim is to compare cell types across populations, so I'm currently using the design ~0 + Group + Region.
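To make the question concrete, here is a minimal sketch in base R of the two designs I am weighing, built on the toy data frame above. Treating Stage as a categorical covariate via factor(Stage) is only an assumption about how the days might be modelled, and the names design_stage and contrast_I are made up for illustration. The contrast vector is written by hand; in a real analysis it would be handed to limma (for example via contrasts.fit, followed by eBayes with trend = TRUE).

```r
# Toy metadata, copied from the data frame above
df <- data.frame(Population = c("A","A","A","A","A","A","B","B","B","B"),
                 Stage = c(4,4,4,5,5,5,7,7,7,8),
                 Region = c("X","Y","X","X","X","X","Y","Y","Y","Y"),
                 Cell_type = c("I","J","K","I","I","J","J","K","I","J"))
df$Group <- paste(df$Population, df$Cell_type, sep = "_")

# Current design: one column per Population_Cell_type group, plus a Region effect
design <- model.matrix(~0 + Group + Region, data = df)

# Hypothetical alternative (assumption): additionally model Stage as a factor
design_stage <- model.matrix(~0 + Group + Region + factor(Stage), data = df)

# Compare the number of columns with the matrix rank of each design;
# a rank below the column count means some coefficients are not estimable
rank_current <- qr(design)$rank
rank_stage <- qr(design_stage)$rank
print(c(columns = ncol(design), rank = rank_current))
print(c(columns = ncol(design_stage), rank = rank_stage))

# Cross-tabulate Stage against Population to show the nesting
stage_by_pop <- table(df$Stage, df$Population)
print(stage_by_pop)

# Hand-written contrast: cell type I in population A versus population B
contrast_I <- setNames(numeric(ncol(design)), colnames(design))
contrast_I[c("GroupA_I", "GroupB_I")] <- c(1, -1)
print(contrast_I)
```

On this toy data the design with factor(Stage) has fewer independent columns than it has columns: each stage occurs in only one population, so the columns for stages 7 and 8 add up to the population B indicator, which the Group columns already encode. That nesting is what makes me unsure whether Stage can, or should, be added alongside Group.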


