BEDOPS: splitting BED coordinates in zero-based, half-open format


I wonder if I made a mistake in the code below. I tried to split my long 3 kb promoter sequences, extracted using bedops, into shorter 1000 nt sequences with --chop, but I end up with an extra sequence that contains only 1 base. I am doing this because, according to the tutorial, STREME seems to perform better when sequences are no longer than 1000 bases.

Example range of interest:

ND_123.1:74801-76802

bedops --everything promoters.forward.bed promoters.reverse.bed | bedops --chop 1000 -

I have been stuck on this piece of code for a few days. I hope to get some help.

Undesired output:

ND_123.1:74801-75801

ND_123.1:75801-76801

ND_123.1:76801-76802 <== the sequence with only one base
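For reference, the arithmetic behind the extra piece can be reproduced without bedops. The following is a minimal awk sketch, assuming the example range is read as a half-open BED interval (start 74801, end 76802, so 76802 - 74801 = 2001 bases). chop_bed is a hypothetical helper written for illustration; it only mimics the chopping arithmetic and is not a BEDOPS command.

```shell
# chop_bed SIZE: hypothetical helper (not part of BEDOPS) that splits
# half-open BED intervals read from stdin into pieces of at most SIZE bases.
chop_bed() {
  awk -v FS='\t' -v OFS='\t' -v size="$1" '{
    for (s = $2; s < $3; s += size) {
      e = (s + size < $3) ? s + size : $3   # the last piece may be shorter
      print $1, s, e, e - s                 # 4th column: piece length
    }
  }'
}

# The example range from the post, written as a BED line (chrom, start, end).
printf 'ND_123.1\t74801\t76802\n' | chop_bed 1000
```

Under that assumption the interval is one base longer than a multiple of 1000, which is why a third piece of length 1 appears after two full 1000-base pieces.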

To solve it, I think I can probably run the following on the forward strand:

awk -vFS="\t" -vOFS="\t" '($6 == "+"){ print $1, ($2 - 1), $2, $4, $5, $6; }' gene.bed | bedops --range -$WINDOW:-1 --everything - > promoters.forward.bed

And also change the range on the reverse strand from 0:$WINDOW to 1:$WINDOW. Does this make sense?
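One way to sanity-check the proposed ranges is to compute the resulting interval lengths by hand. This is a rough sketch, assuming the current forward-strand command uses --range -$WINDOW:0, that each input element is a single base wide, and that WINDOW=2000 (a value picked only because it reproduces the example range above). pad_bed is a hypothetical helper that applies the documented --range L:R arithmetic (add L to the start, R to the end); it is not a BEDOPS command, and position 5000 is made up.

```shell
# pad_bed L R: hypothetical stand-in for the arithmetic of `bedops --range L:R`
# (adds L to every start and R to every end); prints the new length as column 4.
pad_bed() {
  awk -v FS='\t' -v OFS='\t' -v l="$1" -v r="$2" '{
    s = $2 + l
    e = $3 + r
    print $1, s, e, e - s
  }'
}

WINDOW=2000   # assumed value, chosen to match the example range

# Forward strand, single-base element ending at 76802:
printf 'ND_123.1\t76801\t76802\n' | pad_bed "-$WINDOW" 0    # length WINDOW + 1
printf 'ND_123.1\t76801\t76802\n' | pad_bed "-$WINDOW" -1   # length WINDOW

# Reverse strand, single-base element at a made-up position:
printf 'ND_123.1\t5000\t5001\n' | pad_bed 0 "$WINDOW"       # length WINDOW + 1
printf 'ND_123.1\t5000\t5001\n' | pad_bed 1 "$WINDOW"       # length WINDOW
```

If these assumptions hold, both adjusted ranges give windows of exactly WINDOW bases, at the cost of leaving the original single-base element out of the window.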


DNA


bedops


STREME
