Asked 3 hours ago by gmora, University of Calgary

I am analyzing the mitochondrial genome, specifically examining variants. The experiment was 10X WGS with Illumina-based sequencing. As part of my analysis, I take a mitochondrial alignment file and generate a pileup with samtools mpileup. All of the code runs without error; however, I am getting very high base quality scores in the pileup, and I cannot find evidence that these scores exist in the BAM. For example:

In my pileup, I notice the character "o", which by ASCII conversion using perl -E 'say ord("o")-33' corresponds to a base quality of 78. A shortened example of the reported base quality scores is:
"F2JJFF7JJoFCJFA7A<f=ffjaafaaff<djafa<a8"

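The Phred+33 conversion used above can also be done in plain shell. This is a minimal sketch; the function name `phred33` is only illustrative and is not part of samtools.

```shell
#!/bin/sh
# Convert one Phred+33 quality character to its numeric score.
# POSIX printf treats an argument that starts with a quote as
# "the numeric value of the following character", so "'o" yields 111.
phred33() {
    code=$(printf '%d' "'$1")
    echo $((code - 33))
}

phred33 o   # prints 78, matching the perl one-liner
phred33 J   # prints 41
```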
When I examine the BAM used to generate this pileup, I do not observe any instance of the "o" character.
The command used to examine the BAM is:
samtools view SM_chrM_test.bam | cut -f 11 | egrep 'o' | wc -l
The result is 0.
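A broader check than searching for one character would be to report the highest quality value stored in the QUAL column. The sketch below assumes Phred+33 encoding; the helper name `max_phred33` is hypothetical, and the demo input is made up rather than taken from the BAM.

```shell
#!/bin/sh
# Read Phred+33 quality strings (one per line) on stdin and print the
# highest quality score seen. Lines that are just "*" (no stored
# qualities) are skipped.
max_phred33() {
    LC_ALL=C awk '
        # Build a character -> ASCII code lookup for printable characters.
        BEGIN { for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i }
        $0 == "*" { next }
        {
            n = length($0)
            for (j = 1; j <= n; j++) {
                q = ord[substr($0, j, 1)] - 33
                if (q > max) max = q
            }
        }
        END { print max + 0 }
    '
}

# Small self-contained check with made-up quality strings:
printf 'FJJ\nA7<\n' | max_phred33   # prints 41 (from "J")
```

On the real data this could be run as samtools view SM_chrM_test.bam | cut -f 11 | max_phred33. A maximum well below 78 would suggest that the high values are not stored in the BAM itself.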

Any suggestions as to why I am observing this difference would be appreciated.
