Extracting reads from BAM file based on partial read name
I am trying to use the FilterSamReads tool from Picard to subset reads from BAM files. The reads in my BAM files are named in the following way:
The "_i1" at the end of this name represents a differentially expressed version of the read "TRINITY_DN9898_c0_g1". I need to subset based on the first part of the name ("TRINITY_DN9898_c0_g1"), so that the output BAM contains all differentially expressed versions of that read (i.e. the suffix could be "i1", "i3", etc.).
This is just one example from a list of 500 differentially expressed reads I am trying to extract from whole-transcriptome libraries. I ran FilterSamReads, but the output contained all of the reads, unfiltered. I suppose it did not match the partial read names I gave it. Does anyone know how I can accomplish my task?
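To make the matching I am after concrete, here is a minimal Python sketch. It assumes every read name is the first part of the name followed by "_i" and a number; the function names `gene_id` and `select_read_names` are hypothetical and only for illustration, not part of Picard.

```python
import re

# Assumption: a full read name is the first part of the name followed by
# "_i" and a number, e.g. "<first part>_i1" or "<first part>_i3".
ISOFORM_SUFFIX = re.compile(r"_i\d+$")


def gene_id(read_name):
    """Return the read name with a trailing "_i<number>" suffix removed."""
    return ISOFORM_SUFFIX.sub("", read_name)


def select_read_names(read_names, wanted_ids):
    """Keep the full read names whose first part is in wanted_ids."""
    wanted = set(wanted_ids)
    return [name for name in read_names if gene_id(name) in wanted]


if __name__ == "__main__":
    # Hypothetical names for illustration only.
    names = [
        "TRINITY_DN9898_c0_g1_i1",
        "TRINITY_DN9898_c0_g1_i3",
        "TRINITY_DN9898_c0_g10_i1",
        "TRINITY_DN1_c0_g1_i1",
    ]
    for name in select_read_names(names, ["TRINITY_DN9898_c0_g1"]):
        print(name)
```

Stripping the suffix and comparing the whole first part, rather than testing whether a name merely starts with the partial name, keeps "TRINITY_DN9898_c0_g10" from being picked up alongside "TRINITY_DN9898_c0_g1". As far as I understand, FilterSamReads matches the names in READ_LIST_FILE exactly, so the selected full names could be written one per line to a file and passed with FILTER=includeReadList.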