I am fairly new to programming, so a problem that is probably simple has been hard for me to solve. I am trying to compute the Pearson/Spearman/Kendall correlation coefficients between 12328 genes measured under 900 conditions. I can do this in R, but I would like to do it in Python. The first column holds the conditions, which are different treatments of the same cancer cell line, and the other columns are gene expression profiles. So conditions are in rows and genes are in columns.
I used this code to calculate both the correlation and the p-value for each pair of genes:
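For the coefficients alone, something like the following sketch might be enough. It relies on pandas' `DataFrame.corr`, which accepts `method="pearson"`, `"spearman"` or `"kendall"` but returns no p-values. The small random table and its column names are made up for illustration and only stand in for the real file.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the real table: 10 conditions (rows) x 3 genes (columns).
rng = np.random.default_rng(1)
demo = pd.DataFrame(rng.normal(size=(10, 3)), columns=["geneA", "geneB", "geneC"])

# One gene-by-gene coefficient matrix per method (no p-values are returned).
matrices = {m: demo.corr(method=m) for m in ("pearson", "spearman", "kendall")}
```

Each entry of `matrices` is a square DataFrame indexed by gene name on both axes, so `matrices["spearman"].loc["geneA", "geneB"]` would give the coefficient for that pair.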
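    import pandas as pd
    import numpy as np
    import scipy
    from scipy.stats import pearsonr
    from scipy.stats import spearmanr
    from scipy.stats import kendalltau

    LFC_t = pd.read_csv("book1_t.csv")
    column_list = LFC_t.columns   # column 0 holds the conditions, genes start at 1
    df_out = pd.DataFrame()

    c = 1
    while c < 12328:
        d = 1                     # restart the inner index for every gene c
        while d < 12328:
            g1 = LFC_t[column_list[c]]
            g2 = LFC_t[column_list[d]]
            p_r, p_p = pearsonr(g1, g2)   # correlation and p-value for this pair
            d = d + 1
            # my attempt to store the result; this is the part that does not work
            df_out = pd.merge[p_r, p_p]
            # df_out = p_r.append(p_p)
        c = c + 1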
As you can see, I can compute both the correlation (p_r) and the p-value (p_p) for each pair of genes, but I do not know how to save them in a DataFrame, because the values for each new pair overwrite the values from the previous pair.
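One way around the overwriting might be to append each pair's result to a plain Python list and build the DataFrame once at the end. This is only a minimal sketch: the helper name `pairwise_pearson`, the output column names (`gene_1`, `gene_2`, `r`, `p_value`) and the output file name are assumptions for illustration, and a small random table stands in for the real data.

```python
from itertools import combinations

import numpy as np
import pandas as pd
from scipy.stats import pearsonr


def pairwise_pearson(df):
    """Return one row per unordered gene pair with Pearson r and its p-value."""
    rows = []
    for g1, g2 in combinations(df.columns, 2):
        r, p = pearsonr(df[g1], df[g2])
        # Appending to a list keeps every pair instead of overwriting the last one.
        rows.append({"gene_1": g1, "gene_2": g2, "r": r, "p_value": p})
    return pd.DataFrame(rows, columns=["gene_1", "gene_2", "r", "p_value"])


# Made-up stand-in for the real table (conditions in rows, genes in columns).
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(20, 4)),
                    columns=["geneA", "geneB", "geneC", "geneD"])
result = pairwise_pearson(demo)

# Hypothetical output file name; uncomment to write the table to disk.
# result.to_csv("pairwise_pearson.csv", index=False)
```

With the real file, the condition column would have to be excluded first, for example by passing `index_col=0` to `pd.read_csv`. Swapping `pearsonr` for `spearmanr` or `kendalltau` should work the same way, since each returns a statistic and a p-value. With 12328 genes there are roughly 76 million unique pairs, so a pure Python loop like this is likely to be slow.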
Also I need to have a file like this:
Thank you very much in advance.