Hi guys,

I'd like to create a new column in a data frame using a for loop and if/else statements. My data frame ds1 has the following 4 columns:

Sample_Name   Sample_Well   Sentrix_ID    Sentrix_Position
pool_women    A01           2,04426E+11   R01C01
213141        B01           2,04426E+11   R02C01
pool_men      C01           2,04426E+11   R03C01
253141        D01           2,04426E+11   R04C01
202196        E01           2,04426E+11   R05C01
200569        F01           2,04426E+11   R06C01
242196        G01           2,04426E+11   R07C01

Now I want to create a new column, Sample_group, that contains 0 for samples whose Sample_Name starts with "21" or "20", 1 for samples whose Sample_Name starts with "25" or "24", and 2 for all the others (pool_women and pool_men), as follows:

Sample_Name   Sample_Well   Sentrix_ID    Sentrix_Position   Sample_group
pool_women    A01           2,04426E+11   R01C01             2
213141        B01           2,04426E+11   R02C01             0
pool_men      C01           2,04426E+11   R03C01             2
253141        D01           2,04426E+11   R04C01             1
202196        E01           2,04426E+11   R05C01             0
200569        F01           2,04426E+11   R06C01             0
242196        G01           2,04426E+11   R07C01             1
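For reference, the rule described above can be written without a loop. This is only a minimal sketch under some assumptions: the data frame below is a toy copy of the Sample_Name column shown in the post (the real ds1 is not available), Sample_Name is assumed to be stored as character, and trimws() is added in case some names carry stray whitespace.

```r
# Toy copy of the Sample_Name column from the post (assumption: character type)
ds1 <- data.frame(
  Sample_Name = c("pool_women", "213141", "pool_men", "253141",
                  "202196", "200569", "242196"),
  stringsAsFactors = FALSE
)

# First two characters of each name, computed for the whole column at once;
# trimws() guards against leading/trailing spaces (an assumption about the data)
prefix <- substr(trimws(ds1$Sample_Name), 1, 2)

# 0 for "20"/"21", 1 for "24"/"25", 2 for everything else
ds1$Sample_group <- ifelse(prefix %in% c("20", "21"), 0,
                    ifelse(prefix %in% c("24", "25"), 1, 2))

print(ds1)
```

Because substr(), %in% and ifelse() all work on whole vectors, each row gets its own value instead of one value being written to the entire column.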

I wrote the following code:

variables <- colnames(ds1[,which(colnames(ds1)=="Sample_Name")])

for(i in variables){
  if(gsub("(^\\d{2}).*", "\\1", i) == "21" | gsub("(^\\d{2}).*", "\\1", i) == "20") {ds1$Sample_group1 <- 0}

  if(gsub("(^\\d{2}).*", "\\1", i) == "24" | gsub("(^\\d{2}).*", "\\1", i) == "25") {ds1$Sample_group1 <- 1}

  else {ds1$Sample_group1 <- 2}
}

However, the resulting Sample_group column contains only 1 for all samples.
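One detail that may be relevant to this symptom: in R, assigning a single value to a data frame column writes that value to every row. The sketch below uses a hypothetical toy data frame (not the real ds1) just to show that behavior.

```r
# Hypothetical toy data frame, not the real ds1
toy <- data.frame(Sample_Name = c("213141", "253141", "pool_men"),
                  stringsAsFactors = FALSE)

# A single value on the right-hand side is recycled to every row
toy$Sample_group1 <- 0

# A later scalar assignment overwrites the whole column again
toy$Sample_group1 <- 1

print(toy$Sample_group1)  # every row is now 1
```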

What's wrong with my code?

Thank you!


