Hello all,
I am aligning paired-end reads from a 2x150 Illumina run, but most of my DNA fragments are shorter than the read length. Based on the flags, I assume they are correctly mapped. After trimming and aligning with Bowtie 2, I noticed that in the majority of my pairs both mates have the same negative TLEN value. In addition, the sequence column (SEQ, column 10) is exactly the same for both mates.
As I understand TLEN, the leftmost segment gets a positive value and the rightmost segment gets a negative value, but according to the SAM manual, "If segments cover the same coordinates then the choice of which is leftmost and rightmost is arbitrary, but the two ends must still have differing signs". Assuming my fragments fall into this scenario, both mates have the same sequence, yet both always receive the negative sign. Is this normal?
Also, regarding the sequence: why is it the same in those cases? I need to retrieve from the SAM file the exact sequence that was aligned from each read (pair1 and pair2), and because of this problem I am losing information from one side. Does anyone have a suggestion for what I could do?
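To make the sequence question concrete, here is a minimal Python sketch of recovering each mate's sequence in its as-sequenced orientation from a SAM record. It relies on the SAM convention that SEQ is stored on the forward reference strand, so a record with flag bit 0x10 set holds the reverse complement of the original read. The function names `reverse_complement` and `original_read` are hypothetical helpers, and the two short records are made-up illustrations, not lines from my file.

```python
# Sketch only: helper names and the example records are hypothetical.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def original_read(sam_line):
    """Return (read name, mate number, sequence as it came off the sequencer)."""
    fields = sam_line.rstrip("\n").split("\t")
    qname, flag, seq = fields[0], int(fields[1]), fields[9]
    # 0x40 marks the first segment (mate 1), 0x80 the last segment (mate 2).
    mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
    # 0x10 means SEQ is stored reverse-complemented, so undo that.
    if flag & 0x10:
        seq = reverse_complement(seq)
    return qname, mate, seq


# Two made-up mates that cover exactly the same 4 bp fragment.
mate1 = "r1\t99\tchr\t10\t255\t4M\t=\t10\t4\tAACG\tFFFF"
mate2 = "r1\t147\tchr\t10\t255\t4M\t=\t10\t-4\tAACG\tFFFF"

for line in (mate1, mate2):
    print(original_read(line))
```

If this reading of the convention is right, two mates that span the same short fragment would be expected to show an identical SEQ in the SAM file even though the original reads differ. The quality string (column 11) would need to be reversed in the same way, and any bases removed by trimming before alignment are not recoverable from the SAM record.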

Here is a proper pair with different TLEN signs:

MN00409:35:000H2KJ2J:1:11102:12030:17923        99      CP047231        3573289 255     151M    =       3573330 **192**     **AACTTTTCCGGCTTCCCGTTCGTCAGTACCTCGGGAAGCCGCCAACCAGGATAAAATGTCAGCCCTAATCAGCGTTGCAGGATAAAGCACCGCTCACTCTTCAACAGACCGATTTGCACCCCAGCAAATGTAGCGTTATTGTTACCTTCCT** FFFFFFFFFFFFFF/F/FFFFFFFFFF/FFAFFFFFFFFFFF/AFFFFFFFFFFFFFFFFFFF/FFFFFFFFFFFFFFFFFFFFFFFFAFFFF6F6/6F=FAF/FFFFFFFFFFF=F=FF=FFFFFFAFFFFFFFFFFF=/FFFFAFFFFF AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:151        YS:i:0  YT:Z:CP
MN00409:35:000H2KJ2J:1:11102:12030:17923        147     CP047231        3573330 255     151M    =       3573289 **-192**    **CCAACCAGGATAAAATGTCAGCCCTAATCAGCGTTGCAGGATAAAGCACCGCTCACTCTTCAACAGACCGATTTGCACCCCAGCAAATGTAGCGTTATTGTTACCTTCCTTGCTACAGAGTTCGACAGATATCCCGCTATGACATTCTCCC** AA=F/FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF/AFFFFFAAFAFFF=FFFFFFFFFFFFFFFF=FF/6/FFFFFF/FF=FFAFFFFFF=FFAFFFFF6F/AFFFFFF6FFFFFFFF6FFF/F/6A=F6 AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:151        YS:i:0  YT:Z:CP

Here is a "problematic" pair with the same TLEN sign:

MN00409:35:000H2KJ2J:1:11102:12474:20162        99      CP047231        322941  255     112M    =       322941  **-112**    **GGTGATTAAACGTGTGGCGAAGCAGCTCTCGCAGGAAGGCGGCTCGCTGAAGATGTACAACATCGCCGATCGCCTGGAAACGGTGATGTGGGAGAGCAAAAAGATGTTCCCC**        AFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFAFFFF/AFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF        AS:i:-10        XN:i:0  XM:i:2  XO:i:0  XG:i:0  NM:i:2  MD:Z:6C5C99     YS:i:-10        YT:Z:CP
MN00409:35:000H2KJ2J:1:11102:12474:20162        147     CP047231        322941  255     112M    =       322941  **-112**    **GGTGATTAAACGTGTGGCGAAGCAGCTCTCGCAGGAAGGCGGCTCGCTGAAGATGTACAACATCGCCGATCGCCTGGAAACGGTGATGTGGGAGAGCAAAAAGATGTTCCCC**        =FFFFFFFFFFFFFFFFFFFFFFFFFFF/FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFAFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF/FFFAFF        AS:i:-10        XN:i:0  XM:i:2  XO:i:0  XG:i:0  NM:i:2  MD:Z:6C5C99     YS:i:-10        YT:Z:CP
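The check I am applying to these pairs can be sketched in Python as follows. This is a rough illustration, assuming a two-mate template where the two TLEN values should be equal in magnitude and opposite in sign (or both 0 when undefined); `tlen_of` and `tlen_signs_consistent` are hypothetical helper names, and the TLEN values are the ones from the two pairs above.

```python
# Sketch only: helper names are hypothetical.
def tlen_of(sam_line):
    """Extract TLEN (column 9) from a tab-separated SAM record."""
    return int(sam_line.rstrip("\n").split("\t")[8])


def tlen_signs_consistent(tlen_a, tlen_b):
    """True if the two mates have opposite-signed TLEN of equal magnitude.

    A pair with both values 0 (TLEN not available) is also accepted.
    """
    return tlen_a == -tlen_b


# TLEN values from the two pairs shown above.
print(tlen_signs_consistent(192, -192))    # proper pair
print(tlen_signs_consistent(-112, -112))   # "problematic" pair
```

With this check the first pair passes and the second does not, which is what prompted the question.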

Thank you so much 🙂

written 2 hours ago by nmargu • modified 2 minutes ago by genomax