Determine where an interleaved FASTQ record starts


FASTQ files have a record length of 4 lines. You can also determine where a record starts from the middle of a file by looking at '@' and the lines around it (see stackoverflow.com/a/41707920/363028).
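To make the '@' trick concrete, here is a minimal sketch of how I understand it. It is not necessarily what the linked answer does. It assumes strict 4-line records (no wrapped sequences), and the function name `find_record_start` is made up for illustration. A quality line may also begin with '@', so the check also requires the line two positions later to start with '+'.

```python
def find_record_start(lines, start=0):
    """Return the index of the first header line at or after `start`, or None.

    Assumes strict 4-line FASTQ records. A line starting with '@' is only
    accepted as a header if the line two positions later starts with '+',
    because a quality line can start with '@' as well.
    """
    for i in range(start, len(lines) - 2):
        if lines[i].startswith("@") and lines[i + 2].startswith("+"):
            return i
    return None


if __name__ == "__main__":
    # Chunk that begins mid-record: the first line is a leftover quality line.
    chunk = [
        "IIII",
        "@read1 1:N:0:28",
        "ACGT",
        "+",
        "@III",  # quality line that happens to start with '@'
        "@read1 2:N:0:28",
        "TTTT",
        "+",
        "IIII",
    ]
    print(find_record_start(chunk))
```

In the chunk above, the quality line `@III` is skipped because the line two positions after it is a sequence line, not '+'.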

Can we do something similar with interleaved FASTQ-files?

Based on stackoverflow.com/a/68707816/363028: is there something that tells us where an interleaved FASTQ-record starts?

@M10991:61:000000000-A7EML:1:1101:14011:1001 1:N:0:28
NGCTCCTAGGTCGGCATGATGGGGGAAGGAGAGCATGGGAAGAAATGAGAGAGTAGCAA
+
#[email protected];[email protected]<EE<@FFC,CEGCCGGFF<FGF
@M10991:61:000000000-A7EML:1:1101:14011:1001 2:N:0:28
NGCTCCTAGGTCGGCATGACGCTAGCTACGATCGACTACGCTAGCATCGAGAGTAGCAA
+
#[email protected];[email protected]<EE<@FFC,CEGCCGGFF<FGF
@M10991:61:000000000-A7EML:1:1201:15411:3101 1:N:0:28
NGCTCCTAGGTCGGCATGATGGGGGAAGGAGAGCATGGGAAGAAATGAGAGAGTAGCAA
+
#[email protected];FFGG[email protected]<EE<@FFC,CEGCCGGFF<FGF
@M10991:61:000000000-A7EML:1:1201:15411:3101 2:N:0:28
CGCTAGCTACGACTCGACGACAGCGAACACGCGATCGATCGGAAATGAGAGAGTAGCAA
+
#[email protected];[email protected]<EE<@FFC,CEGCCGGFF<FGF

In the example above, you can combine the '@' trick with the pattern '.* 1:N' to determine that a sequence name belongs to an R1 read. But does this always work? And if not, is there something else that tells us whether a FASTQ record is for R1 or R2?
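For reference, this is roughly the check I have in mind, as a sketch under assumptions rather than a definitive method. It assumes the header either carries an Illumina CASAVA 1.8+ style comment (such as ` 1:N:0:28`) or ends in the older `/1` or `/2` suffix. The names `CASAVA`, `LEGACY`, `read_number`, `read_name` and `is_pair_start` are made up for illustration. Headers with neither marker return `None`, which is exactly the case I am unsure how to handle.

```python
import re

# Assumed CASAVA 1.8+ style: "@name 1:N:0:28" (read number, filter flag, control bits, index).
CASAVA = re.compile(r"^@\S+\s+([12]):[YN]:\d+(?::\S*)?")
# Assumed older style: "@name/1" or "@name/2".
LEGACY = re.compile(r"^@\S+/([12])(?:\s|$)")


def read_number(header):
    """Return 1 or 2 if the header marks R1 or R2, else None."""
    for pattern in (CASAVA, LEGACY):
        match = pattern.match(header)
        if match:
            return int(match.group(1))
    return None


def read_name(header):
    """Return the name shared by both mates: first token without '@' or a /1, /2 suffix."""
    return re.sub(r"/[12]$", "", header[1:].split()[0])


def is_pair_start(lines, i):
    """True if lines[i] is an R1 header and the record 4 lines later is its R2 mate."""
    if i + 4 >= len(lines):
        return False
    first, second = lines[i], lines[i + 4]
    return (
        read_number(first) == 1
        and read_number(second) == 2
        and read_name(first) == read_name(second)
    )


if __name__ == "__main__":
    header = "@M10991:61:000000000-A7EML:1:1101:14011:1001 1:N:0:28"
    print(read_number(header))
```

Comparing the names of two consecutive records, rather than looking only for `1:N`, also guards against a file that is not actually interleaved.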


fastq
