Asked 3 hours ago by aman.akash2008
I'm trying to use BBMap's clumpify to remove duplicates from WGS Illumina paired-end reads, but it's throwing an error:

    Exception in thread "Thread-6" Exception in thread "Thread-5" Exception in thread "Thread-5" Exception in thread "Thread-6" java.lang.AssertionError: SRR9845570.201 D00656:415:HYN72BCX2:1:1108:1112:3886 length=151
        at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:53)
        at clump.ReadKey.<init>(ReadKey.java:46)
        at clump.ReadKey.<init>(ReadKey.java:33)
        at clump.ReadKey.makeKey(ReadKey.java:23)
        at clump.KmerComparator.hash_inner(KmerComparator.java:79)
        at clump.KmerComparator.hash(KmerComparator.java:70)
        at clump.KmerComparator.hash(KmerComparator.java:66)
        at clump.KmerSort$FetchThread1.run(KmerSort.java:394)
    java.lang.AssertionError: SRR9845570.1 D00656:415:HYN72BCX2:1:1108:1385:2052 length=151
        at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:53)
        at clump.ReadKey.<init>(ReadKey.java:46)
        at clump.ReadKey.<init>(ReadKey.java:33)
        at clump.ReadKey.makeKey(ReadKey.java:23)
    java.lang.AssertionError: SRR9845571.1 D00656:382:HT5CFBCX2:1:1106:1366:2160 length=101 at clump.KmerComparator.hash_inner(KmerComparator.java:79)
        at clump.KmerComparator.hash(KmerComparator.java:70)
        at clump.KmerComparator.hash(KmerComparator.java:66)
        at clump.KmerSort$FetchThread1.run(KmerSort.java:394)
        at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:53)
        at clump.ReadKey.<init>(ReadKey.java:46)
        at clump.ReadKey.<init>(ReadKey.java:33)
        at clump.ReadKey.makeKey(ReadKey.java:23)
        at clump.KmerComparator.hash_inner(KmerComparator.java:79)
        at clump.KmerComparator.hash(KmerComparator.java:70)
        at clump.KmerComparator.hash(KmerComparator.java:66)
        at clump.KmerSort$FetchThread1.run(KmerSort.java:394)

and it goes on like this, with the output of several threads interleaved.

The code I am using is:

    for ((i=0; i<$TotalSamples; i++))
    do
        printf "\n Removing duplicate reads in sample %s and sample %s \t" "${fileIA1[$i]}" "${fileIA2[$i]}"
        # The trailing & sends each job to the background, so all samples start at once
        clumpify.sh in1="${fileIA1[$i]}" in2="${fileIA2[$i]}" \
            out1="deduped_${fileIA1[$i]}.fq" out2="deduped_${fileIA2[$i]}.fq" \
            dedupe=t dupedist=2500 subs=5 -Xmx50g &
    done

Is this a common issue with clumpify, or is there a problem with my code? The system I am running the script on has 56 CPU cores and 256 GB of RAM.
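
A possible lead, offered as an assumption rather than a confirmed diagnosis: every assertion is raised in `hiseq.FlowcellCoordinate.setFrom`, and the message it prints is a read header that starts with an SRA accession (for example `SRR9845570.201`) ahead of the Illumina coordinates. If that prefix is what breaks the coordinate parsing used with `dupedist`, removing it from the header lines might help. The sketch below uses a hypothetical helper, `strip_sra_prefix`, which is not part of BBMap, and the demo record's sequence and quality strings are made up; only the header is copied from the error message.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BBMap): removes a leading SRA accession token
# such as "SRR9845570.201 " from FASTQ header lines and leaves everything else as is.
strip_sra_prefix() {
    # In a standard 4-line FASTQ record the header is line 1, 5, 9, ...
    # Only those lines are edited, so quality lines starting with "@" are untouched.
    awk 'NR % 4 == 1 { sub(/^@[SED]RR[0-9]+\.[0-9]+ /, "@") } { print }'
}

# Demo on one made-up record; the header comes from the error message above.
printf '%s\n' \
    '@SRR9845570.201 D00656:415:HYN72BCX2:1:1108:1112:3886 length=151' \
    'ACGT' \
    '+' \
    'IIII' | strip_sra_prefix
```

On real data the helper would sit in a pipe, e.g. `strip_sra_prefix < sample_1.fastq > renamed_1.fastq` (file names hypothetical), and both mates would need the same treatment. Whether the trailing `length=151` field also matters to clumpify is not something the trace shows.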


