Posted 2 hours ago by debitboro (Belgium)

Dear all,

I have paired-end reads generated by an ABI SOLiD 4 system, in two FASTQ files: R1.fastq and R2.fastq. Looking at the content of the two files, I found that the read names (headers) do not match between them, as shown below. This causes problems in the analysis, for example when trimming the reads with cutadapt.

R1.fastq:

@SRR3159522.1 2_33_78 length=50
GGGATCAAAGGTGCCTAAGAAAGTTCTCACTAAGGGNATCTTCTACGCC
+SRR3159522.1 2_33_78 length=50
CCCDFFFFHHHHHJJJGJJJJJJJJIIGIIIIIJJJ#1?CGHDHHGIJI
@SRR3159522.2 2_36_51 length=50
CTGGTGCGAAAAGGTGAAATAAAAAAGAAGAACGAAGAAGCCGGTGCCA
+SRR3159522.2 2_36_51 length=50
BBCFDFFFHHHHHJGHHIJIJJJJJJIGIIJJIIIJJIGGIJJJHIHHH
@SRR3159522.3 2_36_551 length=50
CCACACCGGGTAAGCTGGTTTGGCGATGCGGGATGATCCGAACGTGGAG
...
...

R2.fastq:

@SRR3159522.27470956 2_33_78 length=35
TGTTTNNNNNNNNNNNNAAATGCCAGATCCACAA
+SRR3159522.27470956 2_33_78 length=35
BCBFF############23AGHHHIJJIHIJJJJ
@SRR3159522.27470957 2_36_51 length=35
GTATGCTCCGTNANAGTCTACCAGCACTGACCAG
+SRR3159522.27470957 2_36_51 length=35
[email protected]#2#3AEHIJJIIJJIJJJJJIJJ
@SRR3159522.27470958 2_36_551 length=35
GTCCTGNTNNNNNNNTGAACCAACACCTTTTGTG
...
...

As you can see, the read headers differ between the two files: the spot names (e.g. 2_33_78) agree, but the SRR read numbers do not.

When I used cutadapt to trim the reads, I got a name-matching error. To get rid of it, I would like to replace the headers of R2.fastq with those of R1.fastq so that both files carry the same read names, but I don't know how to do it. I want to transform R2.fastq as follows:

@SRR3159522.1 2_33_78 length=35
TGTTTNNNNNNNNNNNNAAATGCCAGATCCACAA
+SRR3159522.1 2_33_78 length=35
BCBFF############23AGHHHIJJIHIJJJJ
@SRR3159522.2 2_36_51 length=35
GTATGCTCCGTNANAGTCTACCAGCACTGACCAG
+SRR3159522.2 2_36_51 length=35
[email protected]#2#3AEHIJJIIJJIJJJJJIJJ
@SRR3159522.3 2_36_551 length=35
GTCCTGNTNNNNNNNTGAACCAACACCTTTTGTG
...
...

Can someone help me?
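
A minimal Python sketch of the transformation described above, under a few assumptions that are not confirmed by the data shown: both files list the reads in the same order, contain the same number of reads, and use plain 4-line FASTQ records (no wrapped sequences). The function names `read_records`, `split_header` and `rename_r2` are made up for illustration. Only the read ID (the first header field) is taken from R1; the spot name and `length=` field of R2 are kept, and the spot names are compared as a sanity check on the pairing.

```python
from itertools import zip_longest


def read_records(handle):
    """Yield (header, sequence, plus, quality) from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or a blank line) stops the iteration
            return
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def split_header(header):
    """Split '@ID spot length=N' into (ID, [remaining fields])."""
    fields = header[1:].split()
    return fields[0], fields[1:]


def rename_r2(r1_handle, r2_handle, out_handle):
    """Write R2 records to out_handle, using the read IDs found in R1."""
    pairs = zip_longest(read_records(r1_handle), read_records(r2_handle))
    for rec1, rec2 in pairs:
        if rec1 is None or rec2 is None:
            raise ValueError("R1 and R2 contain different numbers of reads")
        id1, rest1 = split_header(rec1[0])
        id2, rest2 = split_header(rec2[0])
        # Assumption: the second field (e.g. 2_33_78) identifies the spot
        # and must agree between mates.
        if rest1[:1] != rest2[:1]:
            raise ValueError(f"Spot names differ: {rec1[0]!r} vs {rec2[0]!r}")
        new_name = " ".join([id1] + rest2)  # R1 ID + R2 spot and length
        # Keep a bare '+' separator bare; otherwise repeat the new name.
        plus = "+" + new_name if len(rec2[2]) > 1 else "+"
        out_handle.write(f"@{new_name}\n{rec2[1]}\n{plus}\n{rec2[3]}\n")
```

It could be called with three open file handles, for example `rename_r2(open("R1.fastq"), open("R2.fastq"), open("R2.renamed.fastq", "w"))`, where `R2.renamed.fastq` is a hypothetical output name. Writing to a new file rather than overwriting R2.fastq keeps the original available if the pairing check fails partway through.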


