Hi,
I am trying to trim some reads I've been given, using Trimmomatic, and I need to run it on several hundred sets of paired-end reads. The files are organised so that the directory "samples" contains one subdirectory per patient ID, and each subdirectory holds the forward and reverse fastq.gz files.
I am using a submit script, so the arguments it receives are ${1}, the patient ID, and ${2}, the code for the fastq.gz file.
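To make the layout concrete, the argument pairs could be listed with something like the sketch below. The function name list_pairs is made up for illustration; the _R1_001/_R2_001 suffixes are the ones used in my script.

```shell
# Hypothetical helper: print "patientID sampleCode" for every read pair found
# under a directory laid out as samples/<patientID>/<code>_R1_001.fastq.gz
list_pairs() {
    local samples_dir="$1"
    local r1 patient code
    for r1 in "$samples_dir"/*/*_R1_001.fastq.gz; do
        [ -e "$r1" ] || continue    # the glob matched nothing
        patient="$(basename "$(dirname "$r1")")"
        code="$(basename "$r1" _R1_001.fastq.gz)"
        # Only report the pair if the reverse file is present as well
        if [ -e "$(dirname "$r1")/${code}_R2_001.fastq.gz" ]; then
            printf '%s %s\n' "$patient" "$code"
        fi
    done
}
```

Each line of output would then correspond to one call of the script, with the first field as ${1} and the second as ${2}.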
The script I have is:
FASTQ_DIR="Path to samples"/samples/${1}
OUT_DIR="Path to out"/TRIM/${1}
mkdir -p ${OUT_DIR}
module add Java/1.8.0_144
java -jar /"Path to software"/software/Trimmomatic-0.39/trimmomatic-0.39.jar PE \
    -phred33 \
    ${FASTQ_DIR}/${2}_R1_001.fastq.gz \
    ${FASTQ_DIR}/${2}_R2_001.fastq.gz \
    -baseout ${OUT_DIR}/${2}.fastq.gz \
    ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 \
    LEADING:3 \
    TRAILING:3 \
    SLIDINGWINDOW:4:15 \
    MINLEN:36
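As a sanity check before the java call, something like the following could confirm that the inputs are actually there. This is only a sketch: preflight is a hypothetical helper name, not part of Trimmomatic, and it assumes the adapter file is looked up relative to the current directory, as the bare TruSeq3-PE.fa in the command suggests.

```shell
# Hypothetical pre-flight check: fail early if any listed file is missing or empty
preflight() {
    local f
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            return 1
        fi
    done
    return 0
}
```

In the script it could be called as preflight "${FASTQ_DIR}/${2}_R1_001.fastq.gz" "${FASTQ_DIR}/${2}_R2_001.fastq.gz" TruSeq3-PE.fa || return 1. I would use return rather than exit there, because the script is run with ". TRIM.sh" and an exit would close the shell it is sourced into.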
This is what happens when I execute the script:
$ . TRIM.sh "PatientID" "SampleID"
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/scratch
TrimmomaticPE: Started with arguments:
-phred33 "Path to first fastq" "Path to second fastq" -baseout "Path to out" ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
Using PrefixPair: 'TACACTCTTTCCCTACACGACGCTCTTCCGATCT' and 'GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT'
ILLUMINACLIP: Using 1 prefix pairs, 0 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
It then hangs at this point indefinitely and does nothing further until I cancel it. The output files are created, but they are not correct and appear to be corrupted.
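To put "appear corrupted" on firmer ground, the outputs could be tested with gzip -t, roughly as sketched below. check_gz is a made-up helper name, and the glob in the usage line assumes the output names end in .fastq.gz.

```shell
# Hypothetical helper: test the gzip integrity of each file given as an argument.
# Prints one status line per file and returns non-zero if any file fails.
check_gz() {
    local f status=0
    for f in "$@"; do
        if gzip -t "$f" 2>/dev/null; then
            echo "OK      $f"
        else
            echo "BROKEN  $f"
            status=1
        fi
    done
    return "$status"
}
```

It could be run as check_gz "${OUT_DIR}"/*.fastq.gz. A file that was still being written when the job was cancelled would also be reported as BROKEN, since its gzip stream was never finished.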
Any advice would be greatly appreciated. I'm not great at bioinformatics, so any explanation would be welcome. Obviously, I have replaced the patient IDs, paths, etc. with placeholders.
Thanks