gravatar for Bosberg

2 hours ago by

Germany, MPI

In the genomic ranges package for R, is there an option in setdiff that prevents the collapsing of adjacent ranges? for example, If I have the following:

gr1 = GRanges object 
  [1]     chr1      1-10      *
  [2]     chr1     11-20      *
  [3]     chr1     21-30      *

gr2 = GRanges object
  [1]     chr1     18-25      *

and I take setdiff:

What I get:

setdiff( gr1, gr2 )
  [1]     chr1      1-17      *      # <---- reduce happens automatically.
  [2]     chr1     26-30      *

The output stops at 17 and picks up again at 26 to avoid gr2 from 18-25, so at least that much is correct, but unfortunately, my output now has a single continuous range from 1 to 17 instead of 1-10, and a different GRange directly adjacent from 11-17. There's a reduce operation being done automatically that I want to suppress.

What I want:

setdiff( gr1, gr2, <Some_option_to_suppress_reduce> )
  [1]     chr1      1-10      *
  [2]     chr1     11-17      *
  [3]     chr1     26-30      *

I don’t want these first two regions to be reduced into one.

What I've tried:

The best solution I've come up with so far is to convert to a list and then back to GRange, like this:

unlist( GRangesList( lapply( 1:length(gr1), function(i) setdiff ( gr1[i], gr2) ) ))

..which does what I want but with all the converting between data types it's really slow and inefficient. Is there an option to turn off reduce directly (or some other more elegant solution)?

link

modified 1 hour ago

written
2 hours ago
by

Bosberg0



Source link