Easy on paper, but I cannot crack it!

Say I have

  • a BAM file
  • a genomic position I am interested in (eg. chr20:4690579)
  • a specific read ID (QNAME) I am interested in (eg. c29bd60b-6f46-4c65-873c-4b74afc5ad87)

Using Rsamtools, how do I get the nucleotide from that read at that position?


I am fairly close using pileup(). Here is where I am so far;

bampoint <- BamFile(here('.../46345_hp.bam'))
pos <- ScanBamParam(which=GRanges('chr20', IRanges(4690579, 4690579)))
pilpa <- PileupParam(distinguish_strands=FALSE,
                     distinguish_nucleotides=TRUE)

pileup(bampoint, scanBamParam=pos, pileupParam=pilpa)

returns

  seqnames     pos nucleotide count           which_label
1    chr20 4690579          -     4 chr20:4690579-4690579
2    chr20 4690579          A     8 chr20:4690579-4690579
3    chr20 4690579          C     4 chr20:4690579-4690579
4    chr20 4690579          G   219 chr20:4690579-4690579
5    chr20 4690579          T   254 chr20:4690579-4690579

Not too bad – now how do I do this for one specific read?


Thoughts so far

Solution1: I could delete all other reads except my read ID from the bam file before I give it to pileup(), but I do not seem to be able to give it anything else than the BamFile() pointer. For example,

bam <- scanBam(bampoint)

then same as above, but pileup like:

pileup(bam, scanBamParam=pos, pileupParam=pilpa)

Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘pileup’ for signature ‘"missing"’

If I find how to feed a bam to pileup, then I can probably filter the bam beforehand to keep only my read ID...

Solution2: if there was something like a distinguish_read parameter in pileup(), I could simply look for my read in the output, but I understand that would not be really called 'piling up'...

Any idea?



Source link