Easy on paper, but I cannot crack it!
Say I have
- a BAM file
- a genomic position I am interested in (eg. chr20:4690579)
- a specific read ID (QNAME) I am interested in (eg. c29bd60b-6f46-4c65-873c-4b74afc5ad87)
Using Rsamtools, how do I get the nucleotide from that read at that position?
I am fairly close using pileup(). Here is where I am so far;
bampoint <- BamFile(here('.../46345_hp.bam')) pos <- ScanBamParam(which=GRanges('chr20', IRanges(4690579, 4690579))) pilpa <- PileupParam(distinguish_strands=FALSE, distinguish_nucleotides=TRUE) pileup(bampoint, scanBamParam=pos, pileupParam=pilpa)
seqnames pos nucleotide count which_label 1 chr20 4690579 - 4 chr20:4690579-4690579 2 chr20 4690579 A 8 chr20:4690579-4690579 3 chr20 4690579 C 4 chr20:4690579-4690579 4 chr20 4690579 G 219 chr20:4690579-4690579 5 chr20 4690579 T 254 chr20:4690579-4690579
Not too bad – now how do I do this for one specific read?
Thoughts so far
Solution1: I could delete all other reads except my read ID from the bam file before I give it to pileup(), but I do not seem to be able to give it anything else than the BamFile() pointer. For example,
bam <- scanBam(bampoint)
then same as above, but pileup like:
pileup(bam, scanBamParam=pos, pileupParam=pilpa)
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘pileup’ for signature ‘"missing"’
If I find how to feed a bam to pileup, then I can probably filter the bam beforehand to keep only my read ID...
Solution2: if there was something like a
distinguish_read parameter in pileup(), I could simply look for my read in the output, but I understand that would not be really called 'piling up'...