Bash loop for bbduk

Hello everyone!
I'm trying to run a loop to trim my RNA-seq reads with bbduk. However, I can't seem to read the input files from a specific directory or write the output to another directory.
This is probably very easy to solve. I'm currently using:

for i in `ls -1 /home/gabriel.gama/Dados_CD_genomics/TrueSeq_dezembro/*_1.fq.gz | sed 's/_1.fq.gz//'`
do
bbduk.sh -Xmx1g in1=$i_1.fq.gz in2=$i_2.fq.gz out1=/home/gabriel.gama/Análises/Teste1/bbduk/$i_clean_1.fq.gz out2=/home/gabriel.gama/Análises/Teste1/bbduk/$i_clean_2.fq.gz ref=/home/gabriel.gama/bbduk/bbmap/resources/adapters.fa ktrim=r k=23 mink=11 hdist=1 tpe tbo qtrim=r trimq=10 maq=10 
done

I'm not getting the sample names in the output, even though the command runs. I'm getting the absolute paths to the files instead.

Maybe I should use something such as

for a in 'basename $i'

to get the basename of the file, and then reference it as such:

out1=/home/gabriel.gama/Análises/Teste1/bbduk/$a_clean_1.fq.gz 



Please do not use spaces in folder names. It is simpler to use a _ wherever you feel like using a space. For example, it would be better to rename "my machine" to "my_machine".

There can be no spaces in bbduk.sh options. You seem to have a space between ref= and the directory after it.
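To see why a stray space matters, here is a minimal sketch of how the shell splits the command line before bbduk.sh ever sees it. The helper `count_args` and the path `/path/to/adapters.fa` are hypothetical, used only for illustration; what bbduk.sh does with the extra argument is not shown here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# With a space, "ref=" and the path arrive as two separate arguments.
with_space=$(count_args ref= /path/to/adapters.fa)

# Without a space, the option and its value arrive as one argument.
no_space=$(count_args ref=/path/to/adapters.fa)

echo "with space: $with_space argument(s)"
echo "no space:   $no_space argument(s)"
```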

Try this:

 for i in `ls -1 /home/gabriel.gama/Dados_CD_genomics/TrueSeq_dezembro/*_1.fq.gz`;
    do dname=$(dirname "${i}"); name=$(basename "${i}" _1.fq.gz);
    # The trailing backslashes keep the whole bbduk.sh call as one command.
    bbduk.sh -Xmx1g in1="${dname}/${name}_1.fq.gz" in2="${dname}/${name}_2.fq.gz" \
    out1=/home/gabriel.gama/Análises/Teste1/bbduk/${name}_clean_1.fq.gz out2=/home/gabriel.gama/Análises/Teste1/bbduk/${name}_clean_2.fq.gz \
    ref=/home/gabriel.gama/bbduk/bbmap/resources/adapters.fa ktrim=r k=23 mink=11 hdist=1 tpe tbo qtrim=r trimq=10 maq=10 ;
done
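The braces in `${name}_1.fq.gz` are not cosmetic. Underscores and digits are valid in variable names, so bash reads `$i_1` as a variable called `i_1` rather than `$i` followed by `_1`. A minimal sketch of the difference, assuming a hypothetical sample prefix and that no variable named `i_1` is set:

```shell
#!/usr/bin/env bash
# Hypothetical sample prefix, used only for illustration.
i=/data/reads/sampleA

# Bash looks up the variable "i_1", which is unset here,
# so the result is just ".fq.gz".
wrong="$i_1.fq.gz"

# Braces end the variable name at "i", so the suffix is appended as intended.
right="${i}_1.fq.gz"

echo "without braces: $wrong"
echo "with braces:    $right"
```

This is consistent with the loop in the question running without an error while still not finding the expected file names.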

Try applying basename in your for command, so that $i holds only the sample name (the input paths then need the directory prepended):

for i in `ls -1 /home/gabriel.gama/Dados_CD_genomics/TrueSeq_dezembro/*_1.fq.gz | sed 's/_1.fq.gz//' | xargs -I {} basename {}`
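Before running the real job, it can help to do a dry run that only prints the arguments. The sketch below is an assumption-laden variation, not the answer's exact method: it loops over a glob instead of parsing `ls`, uses a throwaway directory with hypothetical empty files `s1` and `s2`, and echoes the input arguments instead of calling bbduk.sh.

```shell
#!/usr/bin/env bash
# Hypothetical input directory, created only for this demonstration.
indir=$(mktemp -d)
touch "$indir/s1_1.fq.gz" "$indir/s1_2.fq.gz" "$indir/s2_1.fq.gz" "$indir/s2_2.fq.gz"

names=""
for f in "$indir"/*_1.fq.gz; do
    # basename with a suffix argument strips both the directory and "_1.fq.gz".
    name=$(basename "$f" _1.fq.gz)
    names="$names $name"
    # Dry run: the sample name alone is not a path, so the directory is prepended.
    echo "in1=$indir/${name}_1.fq.gz in2=$indir/${name}_2.fq.gz"
done

# Clean up the temporary directory.
rm -r "$indir"
```

Once the printed arguments look right, the echo line can be swapped for the actual bbduk.sh call.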

After a lot of help, I learned one of the fastest and easiest ways to do this in a for loop.
It turns out that manipulating $i directly is better, like this:

for i in `ls -1 /home/gabriel.gama/Dados_CD_genomics/TrueSeq_dezembro/*_1.fq.gz | sed 's/_1.fq.gz//'` 
do
bbduk.sh -Xmx1g in1=${i}_1.fq.gz in2=${i}_2.fq.gz out1=/home/gabriel.gama/Análises/Teste1/bbduk/${i##*/}_clean_1.fq.gz out2=/home/gabriel.gama/Análises/Teste1/bbduk/${i##*/}_clean_2.fq.gz ref=/home/gabriel.gama/bbduk/bbmap/resources/adapters.fa ktrim=r k=23 mink=11 hdist=1 tpe tbo qtrim=r trimq=10 maq=10
done

The expansion

${i##*/}

removes everything up to and including the last slash, which leaves the basename of the file without its directory!
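For reference, here is a small sketch of `${i##*/}` next to two related parameter expansions, using a hypothetical file path. None of them start an external process, unlike basename, dirname, or sed.

```shell
#!/usr/bin/env bash
# Hypothetical path to a forward-read file, for illustration only.
f=/data/reads/sampleA_1.fq.gz

# "%" removes the shortest matching suffix: /data/reads/sampleA
prefix="${f%_1.fq.gz}"

# "##" removes the longest matching prefix, here everything up to the last "/": sampleA
name="${prefix##*/}"

# "%/*" removes the last "/" and what follows it: /data/reads
dir="${f%/*}"

echo "prefix=$prefix name=$name dir=$dir"
```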

Thanks stackoverflow.com/users/140750/william-pursell for the response

