Something appears to be wrong with one of my fastq files: Blood_ACAGTG_L002_R2_010.fastq.gz

I first noticed an error when trying to trim this file (with its R1 counterpart) with trimmomatic:

java -jar /home/shared/programs/Trimmomatic-0.39/trimmomatic-0.39.jar PE -threads 15 Blood_ACAGTG_L002_R1_010.fastq.gz Blood_ACAGTG_L002_R2_010.fastq.gz /mnt/bdata/shared/SF10711_exome/gbm_14_009_trimmed.fastq.gz ILLUMINACLIP:NexteraPE-PE.fa:2:30:12 LEADING:8 TRAILING:8 SLIDINGWINDOW:4:20 MINLEN:60

java.io.EOFException: Unexpected end of ZLIB input stream
at java.base/java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:245)
at java.base/java.util.zip.InflaterInputStream.read(InflaterInputStream.java:159)
at java.base/java.util.zip.GZIPInputStream.read(GZIPInputStream.java:118)
at org.usadellab.trimmomatic.util.ConcatGZIPInputStream.read(ConcatGZIPInputStream.java:73)
at java.base/sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
at java.base/sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
at java.base/sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
at java.base/java.io.InputStreamReader.read(InputStreamReader.java:181)
at java.base/java.io.BufferedReader.fill(BufferedReader.java:161)
at java.base/java.io.BufferedReader.readLine(BufferedReader.java:326)
at java.base/java.io.BufferedReader.readLine(BufferedReader.java:392)
at org.usadellab.trimmomatic.fastq.FastqParser.parseOne(FastqParser.java:71)
at org.usadellab.trimmomatic.fastq.FastqParser.next(FastqParser.java:179)
at org.usadellab.trimmomatic.threading.ParserWorker.run(ParserWorker.java:42)
at java.base/java.lang.Thread.run(Thread.java:829)
Exception in thread "Thread-1" java.lang.RuntimeException: java.io.EOFException: Unexpected end of ZLIB input stream
at org.usadellab.trimmomatic.threading.ParserWorker.run(ParserWorker.java:56)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.EOFException: Unexpected end of ZLIB input stream
at java.base/java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:245)
at java.base/java.util.zip.InflaterInputStream.read(InflaterInputStream.java:159)
at java.base/java.util.zip.GZIPInputStream.read(GZIPInputStream.java:118)
at org.usadellab.trimmomatic.util.ConcatGZIPInputStream.read(ConcatGZIPInputStream.java:73)
at java.base/sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
at java.base/sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
at java.base/sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
at java.base/java.io.InputStreamReader.read(InputStreamReader.java:181)
at java.base/java.io.BufferedReader.fill(BufferedReader.java:161)
at java.base/java.io.BufferedReader.readLine(BufferedReader.java:326)
at java.base/java.io.BufferedReader.readLine(BufferedReader.java:392)
at org.usadellab.trimmomatic.fastq.FastqParser.parseOne(FastqParser.java:71)
at org.usadellab.trimmomatic.fastq.FastqParser.next(FastqParser.java:179)
at org.usadellab.trimmomatic.threading.ParserWorker.run(ParserWorker.java:42)
... 1 more
Input Read Pairs: 3860000 Both Surviving: 3102127 (80.37%) Forward Only Surviving: 456443 (11.82%) Reverse Only Surviving: 125247 (3.24%) Dropped: 176183 (4.56%)
TrimmomaticPE: Completed successfully

Looking into this error, I was led to this thread: Error: Help understand Trimmomatic ZLIB input stream error

Trying to unzip the file, I get an 'unexpected end of file' error.
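
For reference, the same check can be scripted. This is a minimal sketch, assuming Python 3.8+ and only the standard library; the function name `gzip_is_intact` and the chunk size are illustrative choices, not part of any tool mentioned here. It decompresses the whole stream and reports whether the end-of-stream marker was reached.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the whole gzip stream decompresses cleanly.

    A truncated file raises EOFError once the compressed data runs out
    before the end-of-stream marker; corrupt data raises BadGzipFile
    or zlib.error.
    """
    try:
        with gzip.open(path, "rb") as handle:
            # Read and discard the data; only the error status matters here.
            while handle.read(chunk_size):
                pass
    except (EOFError, gzip.BadGzipFile, zlib.error):
        return False
    return True
```

Calling `gzip_is_intact("Blood_ACAGTG_L002_R2_010.fastq.gz")` would be expected to return False for a file that gives the 'unexpected end of file' error, since both checks fail for the same reason: the compressed stream ends early.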

When I try to view the contents:

zcat Blood_ACAGTG_L002_R2_010.fastq.gz | tail

gzip: Blood_ACAGTG_L002_R2_010.fastq.gz: unexpected end of file
@HWI-D00328:58:H7EAEADXX:2:2215:11524:35696 2:N:0:ACAGTG
ATCTTGCCCTGCCGCACTGACTACGGCTGCTGCCGCCTTTCTATGGCTGTGCGTCTCATCCCCGCTGTCCATCTGGGAGATGGGGTCTTCCTTGTGGCGCC
+
  CCCFFFFFHHHGHJJJJBIGJGIIJJJJJJFI9BFFHIJIIGGGIIGEGE;AA?B>CDEEEDD'3=BBCDAFDCDDD2<5?CCBD9<C:@[email protected]@B
@HWI-D00328:58:H7EAEADXX:2:2215:11723:35707 2:N:0:ACAGTG
TAGATTGTTAGAAAGATCCAAGTATTAAGATCTAGGGTGGCTAACTTTTCACAGACAAAAAGCTTGTTTGTAAGGTCATTTACTATACCCTTAATTCAGGA
+
==+2<@AAB?<A?BBBBB9+3=34>A,>CB4?=AC?9110;AA>ABBBB7*=AA3=>BBB2;>3A76>>BBABAA=7>[email protected]@@>>@>@@B>>=;;?>B=;?3
@HWI-D00328:58:H7EAEADXX:2:2215:11603:35719 2:N:0:ACAGTG
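
The output above stops at a read header with no sequence or quality lines, which is consistent with the compressed stream being cut off mid-record. The sketch below shows one way to salvage whatever decompresses cleanly. It assumes Python 3 with the standard library only, assumes strict 4-line FASTQ records, and uses a hypothetical function name (`salvage_complete_records`) and hypothetical output path. It copies complete records to a new gzip file and discards the trailing partial record.

```python
import gzip


def salvage_complete_records(src_path, dst_path):
    """Copy complete 4-line FASTQ records from a truncated .gz file.

    Returns the number of records written. Lines are read until the
    decompressor raises EOFError; an unfinished record at the end is
    dropped rather than written.
    """
    kept = 0
    record = []
    with gzip.open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        try:
            for line in src:
                record.append(line)
                if len(record) == 4:
                    dst.writelines(record)
                    kept += 1
                    record = []
        except EOFError:
            # Compressed data ended early; keep what was written so far.
            pass
    return kept
```

A call such as `salvage_complete_records("Blood_ACAGTG_L002_R2_010.fastq.gz", "R2_salvaged.fastq.gz")` returns the number of reads recovered. For paired-end trimming, the R1 file would then have to be cut down to the same reads so the pairs stay in sync. Re-copying or re-downloading the original file, if it is still available, is likely the more reliable fix, because the reads lost to truncation cannot be recovered from the damaged file.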

When I do the same for a different, working fastq file, I get:

zcat Blood_ACAGTG_L002_R1_010.fastq.gz | tail

+
  @@BFFFFFHHHHHJJJIIIJCHIIJEGHIJGJJGHJJIIJJJJJJJFGHIJJJJJJEHJJJJIJHHHFFFFFE>>>BCDDBCCDDDDDDDDDDDDC9CCDC
@HWI-D00328:58:H7EAEADXX:2:2215:18033:58714 1:N:0:ACAGTG
CTTCTTTCCTTTTAGGTGGTTCTAGATGTTGGTTGTGGATCAGGAATCCTGTCATTTTTTGCTGTACAGGCTGGAGCTAGGACAGTTTATGCAGTTGAAGC
+
  @@@FFFFFHFGHH>FG<[email protected][email protected];?ECDBB66;[email protected]>[email protected]>CD5::AC>>>@
  @HWI-D00328:58:H7EAEADXX:2:2215:18170:58720 1:N:0:ACAGTG
GCAAAGTAGTCAGGAATCGATCTCGTGAAGCCCGCAAGGACCGAACACCCCCACCCCGATTTAGACCTACGGGTGCTGCCCCATGTCTCCCACCAAAGCCC
+
  [email protected]<[email protected]@AE<FFFFIIFFBDFD:AFEEEC4ABDC<@BBBBB?BBBBBBBB9>B9<[email protected][email protected]@B9?(:@@AA?BBB(<39?<

Is there anything obvious that differs between the ends of these two files and that can be manually fixed?

Thanks in advance!


