I have a .bed file, data, obtained by concatenating two .bed files. This was done with the BEDOPS --everything option, so all four columns (chr, start, end, gene_ID) are preserved. For each gene ID there are several rows of coordinates that may or may not overlap.
I would like to merge the coordinates of each gene separately: intervals belonging to the same gene that overlap should be merged, and intervals that do not overlap should remain separate. Intervals should never be merged across different genes.
I have tried bedtools merge and BEDOPS merge, but could not get this to work because both treat the whole file as a single set of intervals.

> data
chr1   206721  208928  ENSG00000951
chr1   207322  208145  ENSG00000951
chr1   312006  314918  ENSG00000885
chr1   312077  312277  ENSG00000885
chr1   313423  314611  ENSG00000885
chr1   315128  315716  ENSG00000885
chr1   235826  238431  ENSG00000082
chr1   242929  244929  ENSG00000627
chr1   247107  249107  ENSG00000627
chr1   249284  252043  ENSG00000627

The expected output would be like this:

> data.output
chr1   206721  208928  ENSG00000951
chr1   312006  314918  ENSG00000885
chr1   315128  315716  ENSG00000885
chr1   242929  244929  ENSG00000627
chr1   247107  249107  ENSG00000627
chr1   249284  252043  ENSG00000627
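
To make the intended behaviour concrete, here is a minimal Python sketch of the per-gene merge described above. It is not something bedtools or BEDOPS provides as-is: the function name merge_per_gene is hypothetical, rows are assumed to be (chr, start, end, gene_ID) tuples, and "overlap" is assumed to mean sharing at least one base in half-open BED coordinates, so book-ended intervals stay separate. Note that the sketch keeps genes with a single interval (such as ENSG00000082), which the expected output above does not list.

```python
def merge_per_gene(rows):
    """Merge overlapping intervals within each (chr, gene_ID) group.

    rows: iterable of (chrom, start, end, gene_id); start/end may be str or int.
    Returns a list of (chrom, start, end, gene_id) tuples with integer
    coordinates, grouped by gene in order of first appearance.
    Intervals are never merged across different gene IDs.
    """
    # Group intervals by chromosome and gene ID (dicts keep insertion order).
    by_gene = {}
    for chrom, start, end, gene in rows:
        by_gene.setdefault((chrom, gene), []).append((int(start), int(end)))

    merged = []
    for (chrom, gene), intervals in by_gene.items():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            # Assumption: strict overlap (at least one shared base).
            # Use `start <= cur_end` to also merge book-ended intervals.
            if start < cur_end:
                cur_end = max(cur_end, end)
            else:
                merged.append((chrom, cur_start, cur_end, gene))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end, gene))
    return merged


# Small demo on two rows of the same gene.
demo = [
    ("chr1", "206721", "208928", "ENSG00000951"),
    ("chr1", "207322", "208145", "ENSG00000951"),
]
for record in merge_per_gene(demo):
    print("\t".join(str(field) for field in record))
```

Grouping first and merging afterwards means the input does not need to be sorted, at the cost of holding the whole file in memory.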

Thank you.

Written 2 hours ago by r.tor; modified 2 hours ago by ATpoint.