I am trying to use BBMap's clumpify to remove duplicates from WGS Illumina paired-end reads, but it's throwing an error:
Exception in thread "Thread-6" Exception in thread "Thread-5" Exception in thread "Thread-5" Exception in thread "Thread-6" java.lang.AssertionError: SRR9845570.201 D00656:415:HYN72BCX2:1:1108:1112:3886 length=151
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:53)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash_inner(KmerComparator.java:79)
at clump.KmerComparator.hash(KmerComparator.java:70)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread1.run(KmerSort.java:394)
java.lang.AssertionError: SRR9845570.1 D00656:415:HYN72BCX2:1:1108:1385:2052 length=151
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:53)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
java.lang.AssertionError: SRR9845571.1 D00656:382:HT5CFBCX2:1:1106:1366:2160 length=101
at clump.KmerComparator.hash_inner(KmerComparator.java:79)
at clump.KmerComparator.hash(KmerComparator.java:70)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread1.run(KmerSort.java:394)
at hiseq.FlowcellCoordinate.setFrom(FlowcellCoordinate.java:53)
at clump.ReadKey.<init>(ReadKey.java:46)
at clump.ReadKey.<init>(ReadKey.java:33)
at clump.ReadKey.makeKey(ReadKey.java:23)
at clump.KmerComparator.hash_inner(KmerComparator.java:79)
at clump.KmerComparator.hash(KmerComparator.java:70)
at clump.KmerComparator.hash(KmerComparator.java:66)
at clump.KmerSort$FetchThread1.run(KmerSort.java:394)
and it goes on.
The code I am using is:
for ((i=0;i<$TotalSamples;i++))
do
printf "\n Removing duplicate reads in sample ${fileIA1[$i]} and sample ${fileIA2[$i]} \t"
clumpify.sh in1=${fileIA1[$i]} in2=${fileIA2[$i]} out1=deduped_"${fileIA1[$i]}".fq out2=deduped_"${fileIA2[$i]}".fq \
dedupe=t dupedist=2500 subs=5 -Xmx50g &
done
Is this a common issue with clumpify, or is there a problem with my code? The system I am running the script on has 56 CPU cores and 256 GB of RAM.