Dear all,

I downloaded some previously published raw data (FASTQ files). An initial QC run showed adapter content in both the forward and reverse reads.

Below is the FastQC report for the forward and reverse reads before adapter trimming:

[Image: FastQC report before adapter trimming]

To remove the adapter content, I ran cutadapt as follows:

cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCA -A AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT -o tr_sample_R1.fastq.gz -p tr_sample_R2.fastq.gz sample_R1.fastq.gz sample_R2.fastq.gz
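To make the effect of the `-a` option concrete, here is a minimal Python sketch of 3' adapter trimming on toy reads. It is an illustration under simplifying assumptions, not cutadapt's actual algorithm: it uses exact matching only (cutadapt also tolerates mismatches), and the helper names `trim_3prime_adapter` and `make_read`, the 50 bp read length, and the toy inserts are all hypothetical. It shows that reads whose insert is shorter than the read length come out shorter after trimming, so the trimmed reads no longer share a single length.

```python
from collections import Counter

# Adapter passed to -a in the cutadapt command above (forward-read adapter).
ADAPTER_R1 = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA"


def trim_3prime_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Remove a 3' adapter and everything after it (exact matching only).

    Simplification: real trimmers also allow mismatches in the adapter match.
    """
    # Case 1: the full adapter occurs somewhere inside the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: only a prefix of the adapter fits at the 3' end of the read.
    for overlap in range(min(len(adapter) - 1, len(read)), min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[: len(read) - overlap]
    # No adapter found: the read is returned unchanged.
    return read


def make_read(insert: str, adapter: str, read_length: int) -> str:
    """Simulate a fixed-length read: insert, adapter read-through, then padding."""
    return (insert + adapter + "A" * read_length)[:read_length]


# Hypothetical inserts of 60, 40, 20 and 12 bases, sequenced as 50 bp reads.
inserts = ["ACGT" * 15, "ACGT" * 10, "ACGT" * 5, "ACGT" * 3]
reads = [make_read(seq, ADAPTER_R1, 50) for seq in inserts]
trimmed = [trim_3prime_adapter(read, ADAPTER_R1) for read in reads]

# Length distribution before and after trimming.
print("before:", sorted(Counter(len(r) for r in reads).items()))
print("after: ", sorted(Counter(len(t) for t in trimmed).items()))
```

In this toy case every read is 50 bases before trimming, while after trimming only the read with the 60-base insert keeps its full length; the others shrink to the length of their insert.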

After adapter trimming, the FastQC report looks like this:

[Image: FastQC report after adapter trimming]

So, I have some questions:

1) Before adapter trimming, the sequence length distribution looked fine, but after trimming it looks as if something went wrong. Why is that?

2) I see some bias in the first 10-15 bases. What should I do about it? Is it really a problem?

3) Why does the GC content have multiple peaks?

Please help me understand these points. Thanks in advance.

