Posted 2 hours ago by russell.stewart.j

Hi all!

I'm painfully inexperienced when it comes to coding. I know it's possible to run cutadapt on all my samples without writing a separate line of code for each one, but I'm not sure how. I have 24 paired-end samples, all with variations on the following names:


So I've got separate cutadapt lines to trim each:

cutadapt -a AGATCGGAAGAG -A AGATCGGAAGAG -o A1_S12_R1_001_trimmed.fastq -p A1_S12_R2_001_trimmed.fastq A1_S12_R1_001.fastq A1_S12_R2_001.fastq > A1_S12_cutadapt.txt

cutadapt -a AGATCGGAAGAG -A AGATCGGAAGAG -o A3_S13_R1_001_trimmed.fastq -p A3_S13_R2_001_trimmed.fastq A3_S13_R1_001.fastq A3_S13_R2_001.fastq > A3_S13_cutadapt.txt

I know there is a way to list my fastqs and drop the root of the file name into a loop command, something like this:

for i in $(ls *fastq | sed 's/_R[12]_001\.fastq$//' | sort -u); do cutadapt -a AGATCGGAAGAG -A AGATCGGAAGAG -o "${i}_R1_001_trimmed.fastq" -p "${i}_R2_001_trimmed.fastq" "${i}_R1_001.fastq" "${i}_R2_001.fastq" > "${i}_cutadapt.txt"; done
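For reference, here is a rough sketch of the same loop written with a shell glob and parameter expansion instead of `ls | sed`. It assumes every sample follows the `<root>_R1_001.fastq` / `<root>_R2_001.fastq` naming shown above. The function name `trim_pairs` and the `CUTADAPT` override variable are made up for illustration and are not part of cutadapt. In the loop, `$r1` holds one R1 file name per iteration, and `${r1%_R1_001.fastq}` removes that suffix to leave the sample root.

```shell
# Sketch only. CUTADAPT is a made-up override variable (defaults to "cutadapt");
# setting it to "echo" gives a crude dry run that writes the arguments to the report files.
CUTADAPT=${CUTADAPT:-cutadapt}

# trim_pairs is a hypothetical helper name, not something cutadapt provides.
trim_pairs() {
    for r1 in *_R1_001.fastq; do
        [ -e "$r1" ] || continue          # the glob matched no files
        i=${r1%_R1_001.fastq}             # A1_S12_R1_001.fastq -> A1_S12
        r2=${i}_R2_001.fastq
        if [ ! -e "$r2" ]; then
            echo "skipping $i: $r2 not found" >&2
            continue
        fi
        "$CUTADAPT" -a AGATCGGAAGAG -A AGATCGGAAGAG \
            -o "${i}_R1_001_trimmed.fastq" -p "${i}_R2_001_trimmed.fastq" \
            "$r1" "$r2" > "${i}_cutadapt.txt"
    done
}

# trim_pairs    # run from the directory that holds the fastq files
```

Because the glob only matches names ending in `_R1_001.fastq`, files that were already trimmed (`*_trimmed.fastq`) should not be picked up again on a second run.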

Actually, I'd ideally run it using GNU Parallel, but I know the syntax is slightly different. I've used something like this for single-end samples before, but don't know how to adapt it for paired-end reads:

ls | time parallel -j+0 --eta 'fastx_clipper -a TGGAATTCTCGGG -c -v -i {} -o ../processing/{.}.clip'
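For paired-end reads, one possible adaptation (an untested sketch, not a confirmed recipe) is to feed GNU Parallel the sample roots rather than the file names, so that `{}` stands for a root such as `A1_S12` and can be reused for both mates. In the single-end command above, `{}` is the whole input line and `{.}` is the same line with its extension removed. The function names `sample_roots` and `trim_pairs_parallel` are made up for illustration, and the sketch assumes file names without whitespace.

```shell
# Hypothetical helper: print one sample root per line (A1_S12, A3_S13, ...).
sample_roots() {
    ls *_R1_001.fastq | sed 's/_R1_001\.fastq$//'
}

# Hypothetical helper: run one cutadapt job per sample root.
# Extra arguments (for example --dry-run) are passed through to GNU Parallel.
trim_pairs_parallel() {
    sample_roots |
        parallel -j+0 --eta "$@" \
            'cutadapt -a AGATCGGAAGAG -A AGATCGGAAGAG -o {}_R1_001_trimmed.fastq -p {}_R2_001_trimmed.fastq {}_R1_001.fastq {}_R2_001.fastq > {}_cutadapt.txt'
}

# trim_pairs_parallel --dry-run   # only print the commands that would run
# trim_pairs_parallel             # actually run them, one job per CPU core
```

Checking the `--dry-run` output first seems a sensible way to confirm that each `{}` expands to the expected file names before committing to all 24 samples.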

Any suggestions or further reading would be appreciated. I'd love to understand these variables better.
