How to parallelize trim_galore for a single sample?


It takes about 15 hours to trim my paired-end human genome sample. There are ways to run trim_galore in parallel across multiple samples, but could you suggest a method for a single sample?



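You could just split your FASTQ files into multiple chunks (assuming you know the total number of reads) and then run multiple trim_galore commands, one per chunk. For paired-end data, split both files the same way so that mates end up in matching chunks.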

You could try this if you have seqkit:

seqkit split2 -1 reads_1.fq.gz -2 reads_2.fq.gz -p 2 -O out

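Or with standard Unix tools, keeping the line count a multiple of 4 so that no read is split across chunks (you have to re-gzip the chunks at the end):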

zcat XXX.recal.fastq.gz | split -l 4000000 - prefix
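
As a rough sketch of the split-and-trim approach for a paired-end sample: the two helper functions below (split_fastq and trim_chunks, both names made up for illustration) split each mate file into chunks with the same number of reads and then run trim_galore on matching chunk pairs, a few at a time. The chunk size, prefixes, job count and output directory are assumptions to adapt to your data and machine.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and arguments are illustrative only.

# Split a gzipped FASTQ into chunks of <reads_per_chunk> reads each.
# Usage: split_fastq <in.fastq.gz> <reads_per_chunk> <out_prefix>
split_fastq() {
    local in=$1 reads=$2 prefix=$3 f
    # A FASTQ record is 4 lines, so split on a multiple of 4 lines.
    gzip -dc "$in" | split -l "$(( reads * 4 ))" -a 3 - "$prefix"
    # Give the chunks a .fq extension (prefixaaa -> prefixaaa.fq).
    for f in "$prefix"???; do
        if [ -e "$f" ]; then mv "$f" "$f.fq"; fi
    done
}

# Run trim_galore on matching R1/R2 chunks, at most <jobs> at a time.
# Usage: trim_chunks <r1_prefix> <r2_prefix> <jobs> <outdir>
trim_chunks() {
    local p1=$1 p2=$2 jobs=$3 outdir=$4 r1 r2 n=0
    mkdir -p "$outdir"
    for r1 in "$p1"???.fq; do
        [ -e "$r1" ] || continue
        r2="$p2${r1#"$p1"}"   # same chunk suffix, other mate
        trim_galore --paired -o "$outdir" "$r1" "$r2" &
        n=$(( n + 1 ))
        # Simple throttle: let the current batch finish before starting more.
        if [ $(( n % jobs )) -eq 0 ]; then wait; fi
    done
    wait
}
```

For example: split_fastq reads_1.fq.gz 25000000 chunk_R1_, the same for reads_2.fq.gz with chunk_R2_, then trim_chunks chunk_R1_ chunk_R2_ 8 trimmed. Afterwards, concatenate the trimmed chunks per mate, in the same order for both mates, and re-gzip. Both files must be split with the same chunk size, otherwise the pairs go out of sync.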

Trim Galore! is a wrapper around cutadapt. You could therefore use cutadapt directly, which has a multithreading option (-j / --cores). If pigz, a multithreaded version of gzip, is in your PATH, it will be used for (de)compression instead of the default gzip. Together, this should speed things up.
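
A minimal sketch of calling cutadapt directly on the whole sample with several cores. The wrapper name trim_pair_cutadapt, the output file names and the core count are made up for illustration. The adapter shown is the common Illumina adapter prefix, and the quality and length cutoffs of 20 are meant to approximate the usual trim_galore defaults. Check these against your library prep and your trim_galore settings, since the results may not be identical to a trim_galore run.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around a multi-core cutadapt call for one paired-end sample.
# Usage: trim_pair_cutadapt <reads_1.fq.gz> <reads_2.fq.gz> [cores]
trim_pair_cutadapt() {
    local r1=$1 r2=$2 cores=${3:-8}
    # Assumption: standard Illumina adapter; replace it for other kits.
    local adapter=AGATCGGAAGAGC
    # -j: worker cores; -q: 3' quality cutoff; -m: minimum read length;
    # -a/-A: adapter for read 1 / read 2; -o/-p: outputs for read 1 / read 2.
    cutadapt -j "$cores" \
        -q 20 -m 20 \
        -a "$adapter" -A "$adapter" \
        -o trimmed_1.fq.gz -p trimmed_2.fq.gz \
        "$r1" "$r2"
}
```

For example: trim_pair_cutadapt reads_1.fq.gz reads_2.fq.gz 8. This keeps the sample in one piece, so there is no splitting or merging step.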
