I am trying to download SRA data and create paired-end FASTQ files per read group. Can someone please share how I can get this done? I would really appreciate it if you could share a shell script that does this.

So far I have tried the script below, but it only splits the FASTQ by read group (RG); it does not separate the mate pairs within each group:

SRR="SRR1350739"
IFS=$'\n'
RGLINES=($(sam-dump --ngc XXXX.ngc ./${SRR} | sed -n '/^[^@]/!p;//q' | grep ^@RG))
args=(tee)
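for RGLINE in "${RGLINES[@]}"; do
  unset IFS
  # Word-split the @RG header line; RG[1] is expected to be the ID:<name> field
  RG=(${RGLINE})
  # One gzipped output per read group; the pattern matches both /1 and /2 reads
  args+=(>(grep -A3 --no-group-separator "\.${RG[1]#ID:}/[12]$" | gzip > "./${SRR}.${RG[1]#ID:}.fastq-dump.split.defline.z.tee.fq.gz"))
done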

echo "Splitting ${SRR} into ${#RGLINES[@]} ReadGroups"
# tee also copies the stream to stdout, so discard that copy
fastq-dump --ngc XXXX.ngc --split-e --defline-seq '@$ac.$si.$sg/$ri' --defline-qual '+' -Z "${SRR}" | eval ${args[@]} > /dev/null
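
One possible way to also separate the mates, as a minimal sketch that has not been validated on real SRA data: because the defline format above ends every read name in `.<RG>/<read number>`, a single awk pass can route each 4-line record by both parts. The function name `split_by_rg_and_mate`, the output naming `<prefix>.<RG>_<mate>.fq.gz`, and the `noRG` fallback are assumptions made for illustration. It also assumes unwrapped 4-line FASTQ records and read group IDs that contain no `/` or shell-special characters.

```shell
# Hypothetical helper: read FASTQ on stdin and write one gzipped file per
# read group and mate, named <prefix>.<RG>_<mate>.fq.gz.
# Assumes deflines of the form @<accession>.<spot>.<RG>/<mate>.
split_by_rg_and_mate() {
  local prefix="$1"
  awk -v prefix="$prefix" '
    NR % 4 == 1 {
      # The mate number is whatever follows the last slash.
      n = split($0, parts, "/")
      mate = parts[n]
      # Drop "/<mate>" and the leading "@<accession>.<spot>." to get the RG.
      rg = substr($0, 1, length($0) - length(mate) - 1)
      sub(/^@[^.]*\.[^.]*\./, "", rg)
      if (rg == "") rg = "noRG"   # assumed label for reads without a group
      out = "gzip > \"" prefix "." rg "_" mate ".fq.gz\""
      cmds[out] = 1
    }
    # All four lines of a record go to the pipe chosen by its defline.
    { print | out }
    # Close every gzip pipe so the files are complete when awk exits.
    END { for (c in cmds) close(c) }
  '
}
```

With this in place, the stream from the script could be piped into the function instead of `tee`, for example `fastq-dump --ngc XXXX.ngc --split-e --defline-seq '@$ac.$si.$sg/$ri' --defline-qual '+' -Z "${SRR}" | split_by_rg_and_mate "${SRR}"`. A run with very many read groups may hit awk's limit on simultaneously open pipes.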


written 8 hours ago by MAPK (1.6k), modified 7 hours ago