Hi, I have 1000 txt files, each with two columns: a gene symbol column and a mutation status column. I want to join all of these files into one file that contains the gene symbol column first, followed by 1000 sample columns of mutation status. For example, I want to join the following two input files:
Gene  Sample1
A     yes
B     yes
D     yes

Gene  Sample2
B     yes
C     yes
E     yes

into the output file
Gene  Sample1  Sample2
A     yes      NA
B     yes      yes
C     NA       yes
D     yes      NA
E     NA       yes

I know the full_join solution in R using the dplyr package, but it needs to read all of the files into R. Does anyone have a simple solution in Unix to do this?
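To make the intended join concrete, here is a minimal Python sketch of the same full outer join. It is not the Unix one-liner asked for, only an illustration of the expected behaviour. It assumes each file has a header line followed by exactly two whitespace-separated columns, and that each gene appears at most once per file. The function name `merge_status_files` and the `*.txt` pattern in the usage comment are hypothetical. Files are read one at a time, but the merged table is still held in memory.

```python
def merge_status_files(paths, out_path, missing="NA"):
    """Full outer join of two-column (gene, status) files on the gene column.

    Assumes a header line such as "Gene Sample1" in every input file.
    """
    samples = []  # sample names, in the order the files are given
    table = {}    # gene -> {sample name: status}

    for path in paths:
        with open(path) as handle:
            # Skip blank lines and split each remaining line on whitespace.
            rows = [line.split() for line in handle if line.strip()]
        sample = rows[0][1]  # second header field is the sample name
        samples.append(sample)
        for fields in rows[1:]:
            gene, status = fields[:2]
            table.setdefault(gene, {})[sample] = status

    with open(out_path, "w") as out:
        out.write("\t".join(["Gene"] + samples) + "\n")
        for gene in sorted(table):
            # Fill in the placeholder where a gene is absent from a sample.
            cells = [table[gene].get(s, missing) for s in samples]
            out.write("\t".join([gene] + cells) + "\n")


# Hypothetical usage, assuming the inputs sit in the current directory:
#   import glob
#   merge_status_files(sorted(glob.glob("*.txt")), "merged.tsv")
```

With the two example files above, this should write a tab-separated table with genes A to E in sorted order and NA wherever a gene is missing from a sample.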
Thanks a lot!