I have a count matrix from an RNA-seq experiment that I'd like to normalize using DESeq2 and perform DE analysis on. My code is below:

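library(DESeq2)
# cts: raw count matrix (genes x samples); coldata: sample table, one row per column of cts
dds <- DESeqDataSetFromMatrix(countData = cts,
                              colData = coldata,
                              design = ~ condition)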

My experiment ran over two time periods: week 1 (treated vs untreated) and week 2 (untreated vs untreated). Samples were collected at the end of week 1 and at the end of week 2, without replacement. So in week 2 we should essentially see a reversal of any genes that were upregulated in week 1 (and the data is clustering this way).

I have two possible coldata files:

coldata1

sample_id   condition   week
treated1    treated     1
treated2    treated     1
treated3    treated     1
untreated1  untreated   1
untreated2  untreated   1
untreated3  untreated   1
treated4    treated     2
treated5    treated     2
treated6    treated     2
untreated4  untreated   2
untreated5  untreated   2
untreated6  untreated   2

coldata2

sample_id   condition   week
treated1    treatedA    1
treated2    treatedA    1
treated3    treatedA    1
untreated1  untreated   1
untreated2  untreated   1
untreated3  untreated   1
treated4    treatedB    2
treated5    treatedB    2
treated6    treatedB    2
untreated4  untreated   2
untreated5  untreated   2
untreated6  untreated   2

So coldata2 would have three condition levels instead of two. I'm a bit lost on which is better, and on what the best way is to fill in the design argument. I was thinking about treating it as a time series, but since the treatment was reversed, I'm not sure that's appropriate.
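
To make the two options concrete, here is a minimal sketch of how I understand each coldata would map onto a design formula. It only uses base R to build the sample tables and inspect the resulting model matrices. The two formulas (an interaction design for coldata1, a single three-level factor for coldata2) are my assumptions about what each layout implies, not something I know to be the recommended approach, and the names design1, design2, mm1 and mm2 are just for illustration.

```r
# Rebuild both sample tables in base R (values mirror the tables above).
sample_id <- c(paste0("treated", 1:3), paste0("untreated", 1:3),
               paste0("treated", 4:6), paste0("untreated", 4:6))
week <- factor(rep(c("1", "2"), each = 6))

coldata1 <- data.frame(
  condition = factor(rep(rep(c("treated", "untreated"), each = 3), times = 2),
                     levels = c("untreated", "treated")),  # untreated = reference
  week = week,
  row.names = sample_id
)

coldata2 <- data.frame(
  condition = factor(c(rep("treatedA", 3), rep("untreated", 3),
                       rep("treatedB", 3), rep("untreated", 3)),
                     levels = c("untreated", "treatedA", "treatedB")),
  week = week,
  row.names = sample_id
)

# Option 1 (assumption): coldata1 with a week-by-condition interaction.
# "conditiontreated" is then treated vs untreated at week 1, and
# "week2:conditiontreated" is how that difference changes in week 2.
design1 <- ~ week + condition + week:condition
mm1 <- model.matrix(design1, data = coldata1)
print(colnames(mm1))

# Option 2 (assumption): coldata2 with one three-level factor.
# Untreated samples from both weeks are pooled into a single reference group.
design2 <- ~ condition
mm2 <- model.matrix(design2, data = coldata2)
print(colnames(mm2))

# Only runs if DESeq2 is installed and a count matrix `cts` exists whose
# columns are in the same order as the rows of coldata1.
if (requireNamespace("DESeq2", quietly = TRUE) && exists("cts")) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = cts,
                                        colData = coldata1,
                                        design = design1)
}
```

With option 1, the week effect is modelled separately from the treatment effect, while option 2 ignores week entirely unless it is added to the formula.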

Any help would be greatly appreciated! Apologies if this is not clear; please let me know and I'll try to re-explain.

Edit, for clarification:

During week 1: treated vs untreated samples. End of week 1: harvested half of the samples and isolated RNA, etc.
During week 2: untreated (treated in week 1) vs untreated (untreated in week 1). End of week 2: harvested the rest of the samples and terminated the experiment.


