Asked 2 hours ago by hosein_salehi6

Hello, I have a large file, list.txt, that contains 2000 samples and 18000 coordinates. It is in the same format as the example below (file 1).

Coordinates                 Sample     Values
chr1:110238914-110324454    SampleB    1
chr1:110238914-110324454    SampleC    3
chr1:110238914-110324454    SampleD    1
chr5:65562670-65627908      SampleD    1
chr5:65562670-65627908      SampleA    1
chr5:65562670-65627908      SampleB    4
chr5:65562670-65627908      SampleC    1
chr2:158248715-158335919    SampleB    1
chr2:158248715-158335919    SampleA    0
chr2:158248715-158335919    SampleC    1

I want to build a matrix from this file, with the coordinates as row names and the samples as column names. If a coordinate has a value for a sample, that value goes in the matrix; if a coordinate has no value for a sample, the cell should be filled with 2. The result should look like this:

Coordinates                 SampleA    SampleB    SampleC    SampleD
chr1:110238914-110324454    2          1          3          1
chr5:65562670-65627908      1          4          1          1
chr2:158248715-158335919    0          1          1          2

I would really appreciate a script for Linux/bash (preferably) or R that produces this result.
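For reference, the reshaping described above (long format to a coordinate-by-sample matrix, with 2 as the filler for missing pairs) could be sketched roughly as follows. This is only a minimal illustration written in Python rather than the bash or R asked for. The function name `long_to_matrix`, the `fill` parameter, and the choices to sort sample columns alphabetically and keep coordinates in order of first appearance are all assumptions made to reproduce the example output, not a known solution.

```python
import sys


def long_to_matrix(lines, fill="2"):
    """Pivot 'Coordinates Sample Value' lines into a list of matrix rows.

    Hypothetical helper: rows keep the order in which coordinates first
    appear, columns are the sample names sorted alphabetically, and any
    (coordinate, sample) pair absent from the input gets `fill`.
    """
    rows = {}        # coordinate -> {sample: value}; dicts keep insertion order
    samples = set()  # every sample name seen in the input
    for line in lines:
        fields = line.split()  # splits on any run of spaces or tabs
        # Skip blank or malformed lines and the header line.
        if len(fields) != 3 or fields[0] == "Coordinates":
            continue
        coord, sample, value = fields
        # If a pair occurs more than once, the last value wins (assumption).
        rows.setdefault(coord, {})[sample] = value
        samples.add(sample)

    columns = sorted(samples)
    matrix = [["Coordinates"] + columns]
    for coord, by_sample in rows.items():
        matrix.append([coord] + [by_sample.get(s, fill) for s in columns])
    return matrix


# Only runs when a file name is given, e.g.: python pivot.py list.txt
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for row in long_to_matrix(handle):
            print("\t".join(row))
```

If this were saved under a hypothetical name such as pivot.py, running `python pivot.py list.txt > matrix.txt` would write a tab-separated matrix. The whole table is held in memory, which should be manageable for 2000 samples by 18000 coordinates but has not been measured.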

