I am running a few bioinformatics analyses on freely available data sets, and we will then validate the results in vivo. To increase our chances of finding something relevant, we decided to select transcripts using the following criteria:
- the transcript is differentially expressed in at least one group compared to the control group (FC>2, FDR<0.05)
- the transcript has high overall expression across the samples
- the transcript is part of a coexpression network
- the transcript has a high network centrality
My main concern is the second criterion (high overall expression). The data set has 12 groups and about 300 samples. How should I determine whether a transcript has high overall expression? My first thought was to use the geometric mean, the arithmetic mean, or the median across all samples, but I am skeptical about it.
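
To make the options concrete, here is a minimal sketch of the summaries I have in mind for a single transcript. The function name `expression_summaries`, the pseudocount of 1, the two per-group summaries, and the toy values are all made up for illustration; none of them come from my actual data set or from a specific package.

```python
import numpy as np

def expression_summaries(expr, groups, pseudocount=1.0):
    """Summarise one transcript's expression across samples.

    expr   : non-negative expression values, one per sample (hypothetical input)
    groups : group label for each sample, same length as expr
    """
    expr = np.asarray(expr, dtype=float)
    groups = np.asarray(groups)

    # Geometric mean computed on the log2 scale; the pseudocount avoids log(0).
    log_expr = np.log2(expr + pseudocount)
    geometric_mean = 2 ** log_expr.mean() - pseudocount

    # One median per group, so that large groups do not dominate the summary.
    group_medians = np.array(
        [np.median(expr[groups == g]) for g in np.unique(groups)]
    )

    return {
        "arithmetic_mean": float(expr.mean()),
        "geometric_mean": float(geometric_mean),
        "median": float(np.median(expr)),
        "median_of_group_medians": float(np.median(group_medians)),
        "min_group_median": float(group_medians.min()),
    }

# Toy example: the transcript is high in only one of three groups.
toy_expr = np.array([1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 255.0, 255.0, 255.0])
toy_groups = np.array(["a", "a", "a", "b", "b", "b", "c", "c", "c"])
summary = expression_summaries(toy_expr, toy_groups)

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

In this toy case the arithmetic mean is much larger than the geometric mean, which in turn is larger than the median, so the ranking of transcripts by "overall expression" would depend heavily on which summary is used.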