Extract reads overlapping a specific region in a BAM file

Hello,

ref --------------start-----------------------------------------stop-------------------------------
r1                   -------------------------------------------------------
r2             ---------------------------------------------------------------
r3                  -----------------    --------------------------------------
r4              -----------------------------------------------------------------------------

I would like to extract reads from a BAM file that span the entire region between start and stop, regardless of direction (from start to stop or from stop to start). In the example above, I need to keep only r1, r2 and r4, but not r3.

How can I do this?

I tried this, but I still have some small fragmented reads...

samtools view -h -q 10 input.bam chrX:230-330 | awk 'BEGIN{OFS="\t"}{if($1 ~ /^@/) {print} else {if($4 >= 230) {print} else {}}}' | samtools view -Sbo output.bam -

I also tried:

samtools view -h -q 10 input.bam chrX:230-330 | awk 'BEGIN{OFS="\t"}{if($1 ~ /^@/) {print} else {if($4 >= 230 && length($10) >= 100) {print} else {}}}' | samtools view -Sbo output.bam -
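For comparison, here is a rough sketch of how the "spans the whole region" condition could be written as a test on each alignment: the leftmost position (column 4) must be at or before start, and the last reference position covered, computed from the CIGAR string (column 6), must be at or after stop. The function name span_filter is made up for illustration, and treating the gap in r3 as a CIGAR N operation is an assumption; if the gap is really a deletion or a split alignment, that rule would need adjusting. Column 4 is always the leftmost reference coordinate, so the same test should apply to forward and reverse-strand reads.

```shell
# span_filter START STOP  (hypothetical helper, not part of samtools)
# Reads SAM text on stdin and prints the header plus every alignment that
# covers START..STOP (1-based, inclusive) without a skipped region.
span_filter() {
  awk -v start="$1" -v stop="$2" 'BEGIN { FS = OFS = "\t" }
    /^@/ { print; next }              # pass header lines through unchanged
    $6 == "*" || $6 ~ /N/ { next }    # assumption: the gap in r3 shows up as N in the CIGAR
    {
      cigar = $6; reflen = 0
      # Walk the CIGAR and add up the operations that consume the reference
      while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
        n  = substr(cigar, 1, RLENGTH - 1) + 0
        op = substr(cigar, RLENGTH, 1)
        if (op ~ /[MDN=X]/) reflen += n
        cigar = substr(cigar, RLENGTH + 1)
      }
      last = $4 + reflen - 1          # last reference position covered by the read
      if ($4 <= start && last >= stop) print
    }'
}
```

It would slot into the same pipeline as the attempts above, for example:

samtools view -h -q 10 input.bam chrX:230-330 | span_filter 230 330 | samtools view -b -o output.bam -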


awk


samtools
