Posted 3 hours ago by Hamel-patel

Dear all,

I need some advice on controlling for the variable "days since disease onset" for differential expression analysis using limma. I have the following sample groups which I want to compare:

  1. Group 0 - control patients, disease-free
  2. Group 1 - disease patients with mild symptoms
  3. Group 2 - disease patients with moderate symptoms
  4. Group 3 - disease patients with severe symptoms

I plan to do a general "disease vs healthy" analysis and then look at individual comparisons, i.e. 1 vs 0, 2 vs 0, 2 vs 1, etc.

I have age and gender for all samples, which are simple to add to the limma design as covariates. However, I also have "days since disease onset": the number of days between the onset of disease and the day the sample was taken for analysis. Unfortunately, this differs significantly between group 1 and the other disease groups (groups 2 and 3). Here is a summary of the "days since disease onset" variable:

  1. Group 0 - NA
  2. Group 1 - samples were taken on average 20 (95% CI 17-24) days after the first symptoms
  3. Group 2 - samples were taken on average 12 (95% CI 10-13) days after the first symptoms
  4. Group 3 - samples were taken on average 11 (95% CI 9-13) days after the first symptoms

Within groups 1, 2 and 3, "days since disease onset" is also correlated with gene expression.

When comparing the control group to the disease groups, both individually and as a whole (case vs control, where groups 1, 2 and 3 are merged and treated as one group), how do I control for the variation in "days since disease onset"? Is it simply a case of assigning 0 to all the controls for the "days since disease onset" variable and then including that variable in the model design? Does the same apply when comparing an individual disease group to control, i.e. group 0 vs group 1?
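
To make the question concrete, here is a minimal sketch of the kind of design I have in mind. The data are simulated and every object name (`pheno`, `days0`, `expr`, the `g0`..`g3` labels) is hypothetical. The "disease vs healthy" contrast averages the three disease-group coefficients rather than merging the groups, which is an assumption on my part and not necessarily the right approach.

```r
set.seed(1)

# Simulated phenotype table; all names and values are hypothetical.
n <- 10  # samples per group
pheno <- data.frame(
  group = factor(rep(c("g0", "g1", "g2", "g3"), each = n)),
  age   = sample(20:70, 4 * n, replace = TRUE),
  sex   = factor(rep(c("F", "M"), length.out = 4 * n)),
  days  = c(rep(NA, n),                       # controls: no onset date
            sample(14:26, n, replace = TRUE), # group 1
            sample(9:15,  n, replace = TRUE), # group 2
            sample(8:14,  n, replace = TRUE)) # group 3
)

# The idea in question: set "days since disease onset" to 0 for controls.
pheno$days0 <- ifelse(is.na(pheno$days), 0, pheno$days)

# One column per group (no intercept), plus age, sex and days as covariates.
design <- model.matrix(~ 0 + group + age + sex + days0, data = pheno)
colnames(design) <- sub("^group", "", colnames(design))
print(colnames(design))

# With this coding the days0 slope is informed by disease samples only,
# and each disease-group coefficient refers to that group at days0 = 0.

if (requireNamespace("limma", quietly = TRUE)) {
  # Random expression matrix standing in for real data (genes x samples).
  expr <- matrix(rnorm(200 * nrow(pheno)), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), NULL))
  fit <- limma::lmFit(expr, design)
  cont <- limma::makeContrasts(
    g1_vs_g0           = g1 - g0,
    g2_vs_g1           = g2 - g1,
    disease_vs_healthy = (g1 + g2 + g3) / 3 - g0,
    levels = design
  )
  fit2 <- limma::eBayes(limma::contrasts.fit(fit, cont))
  print(limma::topTable(fit2, coef = "g1_vs_g0", number = 5))
}
```

If I understand this coding correctly, a contrast such as `g1 - g0` compares group 1 extrapolated to day 0 against controls, which is part of what I am unsure about.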

When comparing group 1 to group 2, I can add "days since disease onset" as a covariate in the limma design to account for variation within groups. However, since it differs significantly between the groups, could the DE results reflect gene expression changes over the course of the disease rather than symptom severity?
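
As a rough way to quantify how entangled the two variables are, I could look at the correlation between the group indicator and the covariate in the group 1 vs group 2 design. The sketch below uses simulated values and hypothetical names (`sub`, `design12`); it only illustrates the check and says nothing about my real data.

```r
set.seed(2)

# Hypothetical days-since-onset values for groups 1 and 2 only.
sub <- data.frame(
  group = factor(rep(c("g1", "g2"), each = 15)),
  days  = c(sample(14:26, 15, replace = TRUE),
            sample(9:15,  15, replace = TRUE))
)
design12 <- model.matrix(~ group + days, data = sub)

# Correlation between the group indicator and the covariate:
# values near +/-1 mean the two effects are hard to separate.
r <- cor(design12[, "groupg2"], design12[, "days"])

# Variance inflation of the group coefficient from including days
# (two-predictor case: 1 / (1 - r^2)).
vif_group <- 1 / (1 - r^2)

# Range of days covered by both groups; lower > upper means no overlap.
overlap <- c(lower = max(tapply(sub$days, sub$group, min)),
             upper = min(tapply(sub$days, sub$group, max)))

print(c(r = r, vif_group = vif_group))
print(overlap)
```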

Thanks in advance!


modified 3 hours ago
