Hi,
I have reads from DNA sequencing (5 pools). The reads look like this:
@NS500455:80:HG7TNBGXB:1:11101:17723:1055 1:N:0:ATCACG
ACTTANGTGTATGTAAACTTCCGACTTCAACTGTATAGGGATCCNAGCTCCAATTCGCCCTATAGTGAGTCGTAT
+
/AAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
Each read is the Illumina primer followed by the transposon construct, which I had to filter out. So, I used cutadapt:
for file in /FASTQ/*
do
cutadapt -g ACTTAAGTGTATGTAAACTTCCGACTTCAACTG --discard-untrimmed --minimum-length=35 "$file" -o "`basename -s .fastq &
done
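Since the cutadapt step keeps only reads of at least 35 bases after trimming, the length distribution of the trimmed reads may be relevant to the alignment result. A minimal sketch of how it could be checked is below. The function name read_length_histogram is hypothetical (not part of cutadapt or Bowtie2), and it assumes plain 4-line FASTQ records with unwrapped sequence lines.

```shell
# Print a "length count" table for the sequences in a FASTQ file.
# Assumption: every record is exactly 4 lines, so the sequence is line 2 of each group.
read_length_histogram() {
    awk 'NR % 4 == 2 { n[length($0)]++ }
         END { for (len in n) print len, n[len] }' "$1" | sort -n
}

# Example call (the path is a placeholder for one trimmed output file):
# read_length_histogram trimmed_output.fastq
```

Each output line is a read length followed by the number of reads of that length, sorted by length.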
Then I tried aligning the reads with Bowtie2 using the command:
./bowtie2-align-s -x rat/rn4 -U A1_S1_R1_001.fastq --sensitive-local -S A_align.sam
Bowtie2 fails to align the majority of them:
2648325 reads; of these:
2648325 (100.00%) were unpaired; of these:
2242006 (84.66%) aligned 0 times
309307 (11.68%) aligned exactly 1 time
97012 (3.66%) aligned >1 times
15.34% overall alignment rate
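As a sanity check on the summary above, the overall rate is just the uniquely aligned plus multiply aligned reads divided by the total. A small sketch that recomputes it from the reported counts (the variable names are only for illustration):

```shell
# Counts copied from the Bowtie2 summary above
total=2648325
unique=309307   # aligned exactly 1 time
multi=97012     # aligned >1 times

# Overall alignment rate = (unique + multi) / total, as a percentage
rate=$(awk -v t="$total" -v u="$unique" -v m="$multi" \
    'BEGIN { printf "%.2f", 100 * (u + m) / t }')

echo "aligned reads: $((unique + multi))"
echo "unaligned reads: $((total - unique - multi))"
echo "overall alignment rate: ${rate}%"
```

The numbers are internally consistent: 406319 reads aligned at least once, and the remaining 2242006 did not align.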
I took the aligned and the unaligned sequences and used BLAT to check for contamination and to see whether the sequences are from the rat genome. All positive.
I don't have any clue as to why this is happening.
Can someone help, please?
Regards to all.