Starting with a set of features in transcript coordinates, along with an alignment of the transcript to the genome, is there a way to get those features in genomic coordinates?
For example, I have a bed file with features of interest as follows:
tx1 10 25 feat1 100 +
tx1 45 95 feat2 100 +
And I have an alignment file, say, in BAM format, with tx1 aligned to chr1. tx1 is a multi-exon transcript and aligns to chr1 with intronic regions. What I am trying to get to is an output BED file with my features in chromosome coordinates that looks something like:
chr1 1500 1525 feat1 100 +
chr1 1945 1995 feat2 100 +
- I am flexible with input, output and alignment formats.
- I would prefer a solution that does not rely on any existing annotation, as both tx1 and chr1 may be arbitrary sequences that are outside the scope of the standard databases.
- tx1 is multi-exonic and the features can span two or more adjacent exons, so the output should have multiple rows for such split features.
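
To make the split-feature requirement concrete, here is a minimal sketch of the behaviour I have in mind. It is not a known tool, only an illustration. It assumes the alignment has already been reduced to ungapped blocks of the form (tx_start, tx_end, genome_start), which could in principle be derived from the CIGAR string of the BAM record, that everything is 0-based half-open as in BED, and that tx1 aligns to the plus strand. The function name `lift_feature` and the block coordinates are invented for the example and are not taken from real data.

```python
from typing import List, Tuple

# One ungapped alignment block: (tx_start, tx_end, genome_start),
# 0-based half-open, plus strand only (assumption for this sketch).
Block = Tuple[int, int, int]
BedRow = Tuple[str, int, int, str, int, str]


def lift_feature(chrom: str, blocks: List[Block], start: int, end: int,
                 name: str, score: int, strand: str) -> List[BedRow]:
    """Map one transcript-space interval onto the genome.

    Emits one row per alignment block that the feature overlaps, so a
    feature spanning an exon junction yields several rows.
    """
    rows = []
    for tx_start, tx_end, g_start in blocks:
        # Clip the feature to the part that falls inside this block.
        lo = max(start, tx_start)
        hi = min(end, tx_end)
        if lo < hi:
            offset = g_start - tx_start
            rows.append((chrom, lo + offset, hi + offset, name, score, strand))
    return rows


# Hypothetical alignment of tx1 to chr1: two exons separated by an intron.
blocks = [(0, 40, 1490), (40, 100, 1940)]

for row in lift_feature("chr1", blocks, 30, 50, "feat3", 100, "+"):
    print(*row, sep="\t")
```

With these made-up blocks, a feature at tx1 30-50 crosses the exon junction at transcript position 40 and comes out as two rows, one per exon, which is the kind of output I would expect for split features.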