I have a single-cell RNA-seq dataset. I am trying to find out whether cells from time-point 1 are more or less abundant than cells from time-point 2 in cluster 1. The time-points are imbalanced, so I think the hypergeometric test would be suitable here, but I am not sure how to apply it.

Cluster 1
- time-point 1 (93 cells)
- time-point 2 (261 cells)

Total across all clusters
- time-point 1 (597 cells)
- time-point 2 (2014 cells)
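
To make the imbalance concrete, here is a minimal sketch that compares the share of time-point 1 cells inside cluster 1 with their share in the whole dataset. The variable names are made up for illustration, and it assumes the totals (597 and 2014) already include the cluster 1 cells.

```r
# Counts copied from the lists above; variable names are illustrative only
cluster1_tp1 <- 93    # cluster 1 cells from time-point 1
cluster1_tp2 <- 261   # cluster 1 cells from time-point 2
total_tp1    <- 597   # all cells from time-point 1 (assumed to include cluster 1)
total_tp2    <- 2014  # all cells from time-point 2 (assumed to include cluster 1)

cluster1_size <- cluster1_tp1 + cluster1_tp2   # cells in cluster 1
all_cells     <- total_tp1 + total_tp2         # cells in the dataset

# Fraction of time-point 1 cells in cluster 1 versus in the whole dataset
frac_tp1_cluster1 <- cluster1_tp1 / cluster1_size
frac_tp1_overall  <- total_tp1 / all_cells

# Number of time-point 1 cells expected in cluster 1 if cells were drawn at random
expected_tp1 <- cluster1_size * frac_tp1_overall

print(c(observed = cluster1_tp1, expected = expected_tp1))
print(c(cluster1 = frac_tp1_cluster1, overall = frac_tp1_overall))
```

If the observed count is above the expected one, the interesting question is usually enrichment rather than depletion.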

I found an example of how to apply it (linked below), but I am not sure whether I plugged in the values for my case correctly.

**Test for under-representation (depletion)**

[http://mengnote.blogspot.com/2012/12/calculate-correct-hypergeometric-p.html][1]

`phyper(hitInSample, hitInPop, failInPop, sampleSize, lower.tail= TRUE)`

`phyper(93, 597, 2014, 354, lower.tail= TRUE)`

`[1] 0.9548432`

So that would mean that time-point 1 is not under-represented in cluster 1? Is that correct?
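
One way to sanity-check this reading is to compute both tails and compare them with a one-sided Fisher's exact test, which is based on the same hypergeometric distribution. The sketch below is not a definitive analysis: it assumes the totals 597 and 2014 include the cluster 1 cells, and the object names (`p_depletion`, `p_enrichment`, `tab`) are made up for illustration.

```r
# Lower tail: P(X <= 93), the depletion test used above
p_depletion <- phyper(93, 597, 2014, 354, lower.tail = TRUE)

# Upper tail: P(X >= 93), the enrichment test.
# With lower.tail = FALSE phyper returns P(X > q), hence the 93 - 1.
p_enrichment <- phyper(93 - 1, 597, 2014, 354, lower.tail = FALSE)

# The same counts as a 2x2 table: rows are time-points, columns are cluster membership
tab <- matrix(c(93, 261,                # in cluster 1
                597 - 93, 2014 - 261),  # in all other clusters
              nrow = 2,
              dimnames = list(c("tp1", "tp2"),
                              c("cluster1", "other")))

# One-sided Fisher tests should reproduce the two phyper tails
p_fisher_less    <- fisher.test(tab, alternative = "less")$p.value
p_fisher_greater <- fisher.test(tab, alternative = "greater")$p.value

print(c(depletion = p_depletion, fisher_less = p_fisher_less))
print(c(enrichment = p_enrichment, fisher_greater = p_fisher_greater))
```

A large lower-tail p-value only says there is no evidence of depletion; whether time-point 1 is enriched in cluster 1 is answered by the upper tail.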

[1]: http://mengnote.blogspot.com/2012/12/calculate-correct-hypergeometric-p.html


