Written 2 hours ago by raywong.chn

We're analyzing RNA-seq data with a pipeline consisting of Salmon, tximeta, and DESeq2.

We have a multi-factorial experimental design, and the experiment was performed on cell lines.

One thing that surprised us is that in the results output, we observe many gene polymorphisms.

For example, for the gene NLRP2 we observed multiple entries, each associated with a unique Ensembl ID: ENSG00000022556, ENSG00000275082, ENSG00000275843, etc.

ensembl_id       baseMean     log2FoldChange  pvalue       padj         gene   CTRL_1     CTRL_2     A_1        A_2        B_1        B_2        A+B_1      A+B_2
ENSG00000022556  559.2711127  -1.709470173    5.51E-09     2.16E-07     NLRP2  33.063154  17.498608  23.790824  28.562371  6.421092   6.755627   29.858583  23.977158
ENSG00000275082  349.6580809  2.406888875     0.592471935  0.817837758  NLRP2  0          7.920205   10.814798  0          18.640884  18.543885  0          3.545411
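
To show what we mean by repeated entries, here is a minimal sketch (Python, standard library only) of how the affected symbols can be listed. It assumes the Ensembl ID and gene symbol of each result row are available as pairs. The function name `ids_per_symbol` is only illustrative and is not part of Salmon, tximeta, or DESeq2. It only reports which symbols are repeated and does not merge or modify any counts.

```python
from collections import defaultdict

def ids_per_symbol(rows):
    """Group Ensembl IDs by gene symbol and keep symbols seen more than once.

    rows: iterable of (ensembl_id, gene_symbol) pairs (assumed input shape).
    """
    grouped = defaultdict(list)
    for ensembl_id, symbol in rows:
        grouped[symbol].append(ensembl_id)
    # Keep only symbols that are carried by more than one Ensembl ID.
    return {symbol: ids for symbol, ids in grouped.items() if len(ids) > 1}

# The three NLRP2 entries mentioned above.
rows = [
    ("ENSG00000022556", "NLRP2"),
    ("ENSG00000275082", "NLRP2"),
    ("ENSG00000275843", "NLRP2"),
]

for symbol, ids in ids_per_symbol(rows).items():
    print(symbol, ", ".join(ids))
```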

My question is: how do we interpret data like this, and how should we deal with this kind of situation? Can we sum or average the different entries associated with the same gene?
